Bayesian classifier for TTRSS

Development-related discussion, including bundled plugins
rknobbe-other
Bear Rating Trainee
Posts: 13
Joined: 11 Jun 2015, 22:37

Bayesian classifier for TTRSS

Postby rknobbe-other » 11 Jun 2015, 22:41

There have been other threads about reproducing the GReader "sort by magic" feature or other kinds of automatic classification. I took a stab at it with an external Perl script that uses AI::Categorize, plus a couple of canned labels, to train on and score articles. Please take a look and provide feedback.

https://github.com/rknobbe/tt-rss-bayes-tools

Note: I'm also "rknobbe", but the password reset response isn't coming to me for some reason.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Bayesian classifier for TTRSS

Postby fox » 11 Jun 2015, 22:51

this kind of thing would work great as a filter plugin, although i have no idea if there's any bayes stuff for php

Re: Bayesian classifier for TTRSS

Postby rknobbe-other » 11 Jun 2015, 22:55

I can take a stab at porting the glue logic to php; there are bayesian classifiers for php. What I would appreciate is if somebody could take a look at turning the label indicators in the article viewer into buttons, or some other slightly less clicky way of marking interesting/uninteresting. Labels are adequate, but there are too many steps for a simple training button.

Re: Bayesian classifier for TTRSS

Postby fox » 11 Jun 2015, 23:35

take a look at button plugins

a simple one that adds essentially like/dislike buttons to every article would suffice

JustAMacUser
Bear Rating Overlord
Posts: 373
Joined: 20 Aug 2013, 23:13

Re: Bayesian classifier for TTRSS

Postby JustAMacUser » 12 Jun 2015, 03:46

This would be such a great feature...

Re: Bayesian classifier for TTRSS

Postby rknobbe-other » 14 Jun 2015, 05:29

Do plugins have a hook to modify article score? I see I can register an article filter, but it looks like the filter logic to set scores is in the mainline rssfuncs.php and not exposed to plugins.

Re: Bayesian classifier for TTRSS

Postby fox » 15 Jun 2015, 19:53

you're right, it's not exposed to plugins for some reason. most likely i forgot about it. :)

e: i think i'll only be able to add a special score modifier which would be added to the base score calculated by filters. adding persistent stuff seems error-prone, i.e. what if a plugin constantly incremented it on each run
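
To illustrate why a non-persistent modifier is the safer option, here is a minimal Python sketch. It is not tt-rss code: `final_score` and `persistent_update` are hypothetical names, and the numbers are made up for illustration.

```python
from typing import List


def final_score(filter_score: int, plugin_modifiers: List[int]) -> int:
    """Recompute the score from the filter base score plus per-run plugin modifiers."""
    return filter_score + sum(plugin_modifiers)


def persistent_update(stored_score: int, modifier: int) -> int:
    """The error-prone alternative: the modifier is written back onto the stored score."""
    return stored_score + modifier


base = 10  # hypothetical score produced by the user's filters

# Recomputing from the base score gives the same result on every update run.
runs = [final_score(base, [50]) for _ in range(3)]

# Writing the modifier back compounds it on every update run.
stored = base
for _ in range(3):
    stored = persistent_update(stored, 50)

print(runs, stored)
```

With the modifier applied on top of the base score, three runs give the same score each time; with the write-back variant, the score keeps growing.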

e2: https://github.com/gothfox/Tiny-Tiny-RS ... b764e49582

Re: Bayesian classifier for TTRSS

Postby rknobbe-other » 15 Jun 2015, 22:25

This looks great. I'll try it out tonight. While stalled on the PHP port, I realized that my current external (label-based) script really shouldn't be limited to 2 labels (interesting vs. uninteresting). I'm relaxing that constraint so that it does Bayesian learning on any labels the user has provided, then applies the likely labels to new articles. I'll push the updated script to GitHub momentarily.
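
For readers who want to see the idea of learning on arbitrary labels in code, here is a rough Python sketch of a multinomial naive Bayes classifier with add-one smoothing. It is not the Perl script from the repo and does not use AI::Categorize; `LabelClassifier`, `tokenize`, the 0.5 threshold and the sample texts are all assumptions made for illustration. It also treats labels as mutually exclusive classes, which is a simplification.

```python
import math
import re
from collections import Counter, defaultdict
from typing import Dict, List, Set


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class LabelClassifier:
    """Multinomial naive Bayes over any number of user-defined labels."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.doc_counts: Counter = Counter()
        self.vocabulary: Set[str] = set()

    def train(self, label: str, text: str) -> None:
        """Record one labelled article."""
        tokens = tokenize(text)
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocabulary.update(tokens)

    def probabilities(self, text: str) -> Dict[str, float]:
        """Return a normalised probability per label (empty if untrained)."""
        if not self.doc_counts:
            return {}
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        # One extra vocabulary slot is reserved for words never seen in training.
        vocab_size = len(self.vocabulary) + 1
        log_scores: Dict[str, float] = {}
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # label prior
            for token in tokens:
                # Add-one (Laplace) smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            log_scores[label] = score
        # Subtract the maximum before exponentiating to avoid underflow.
        peak = max(log_scores.values())
        weights = {label: math.exp(s - peak) for label, s in log_scores.items()}
        norm = sum(weights.values())
        return {label: w / norm for label, w in weights.items()}

    def likely_labels(self, text: str, threshold: float = 0.5) -> List[str]:
        """Labels whose probability reaches the (hypothetical) threshold."""
        return [l for l, p in self.probabilities(text).items() if p >= threshold]


clf = LabelClassifier()
clf.train("linux", "kernel release adds new filesystem driver")
clf.train("linux", "debian package update fixes kernel bug")
clf.train("cooking", "slow roasted tomato soup recipe")
clf.train("cooking", "easy bread recipe with sourdough starter")
labels = clf.likely_labels("new kernel driver released")
print(labels)
```

With the sample data above, only the "linux" label clears the threshold. Lowering the threshold would let more than one label be applied to an article.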

Re: Bayesian classifier for TTRSS

Postby fox » 15 Jun 2015, 22:28


Re: Bayesian classifier for TTRSS

Postby rknobbe-other » 15 Jun 2015, 23:37

yeah, I saw that one and avoided it when I saw that it needed shit-tons of extra stuff for a database backend.

this one:
https://github.com/atyks/PHP-Naive-Bayesian-Filter

only needs mysql and some foreign language skills

Re: Bayesian classifier for TTRSS

Postby fox » 15 Jun 2015, 23:55

errors in french, readme in half japanese, looks like a lot of fun

himynameschris
Bear Rating Trainee
Posts: 2
Joined: 17 Jun 2015, 03:24

Re: Bayesian classifier for TTRSS

Postby himynameschris » 17 Jun 2015, 03:37

Hi all, I am interested in this as well. I found the web service "uclassify", which could easily be integrated into tt-rss in the short term. It seems to work pretty well, and the developers give an example news classifier. The service is free up to 5000 API calls per day; past that there is a subscription (I am not affiliated in any way, I found them through a web search). They also have a local server version of their software, but I do not see any individual/open source licensing, only paid services.




As for PHP tools, the following repo seems to be the most complete for what would be needed (POS tagger, stemmer, classifier with k-means or naive Bayes).

or


Sadly, PHP does not seem to be the best choice when it comes to natural language processing. I would personally like to implement this either in JavaScript on Node.js, so that performance could be improved with a native / C++ module if needed, or in Java, so that an Apache Spark instance could be used if the computing requirements become unmanageable. The process would run independently of the PHP processes, updating the MySQL database with analytics and results; the views for those would need to be implemented in a PHP plugin.

Excellent javascript NLP library:

Re: Bayesian classifier for TTRSS

Postby fox » 17 Jun 2015, 15:17

well, there's this now (postgresql only at the moment, although it's just a matter of writing the needed script in init_database() for mysql)

https://github.com/gothfox/Tiny-Tiny-RS ... sort_bayes

it probably doesn't work correctly and/or requires tons of further tweaking, comments welcome

* how it works *

there are two article buttons: one files the article in the good category, the other in neutral. there's no specific BAD category, so it only rates things up at the moment.

when processing articles, it checks if the database is more or less filled for both categories and files stuff accordingly. if the database is not filled, it just puts everything into the neutral category.

when it rates up, either automatically or manually, the score is bumped 50 points up.
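
The flow described above can be sketched roughly as follows. This is a Python illustration only, not the plugin's PHP code: `categorize`, `score_modifier`, the `classify` callback and the `MIN_TRAINED_DOCS` value are assumptions, and only the 50-point bump comes from the post.

```python
from typing import Callable, Dict

SCORE_BUMP = 50         # from the post: rating an article up adds 50 points
MIN_TRAINED_DOCS = 10   # hypothetical cutoff for "more or less filled"


def categorize(text: str,
               trained_counts: Dict[str, int],
               classify: Callable[[str], str]) -> str:
    """File an article as "good" or "neutral".

    Until both categories have enough training data, everything is neutral.
    """
    for category in ("good", "neutral"):
        if trained_counts.get(category, 0) < MIN_TRAINED_DOCS:
            return "neutral"
    return classify(text)


def score_modifier(category: str) -> int:
    """There is no "bad" category, so the score only ever goes up."""
    return SCORE_BUMP if category == "good" else 0


def always_good(text: str) -> str:
    """Stand-in classifier used only to exercise the flow."""
    return "good"


cold = categorize("some article", {"good": 2, "neutral": 40}, always_good)
warm = categorize("some article", {"good": 25, "neutral": 40}, always_good)
print(cold, score_modifier(cold), warm, score_modifier(warm))
```

In the cold case the classifier is never consulted because the "good" category has too little training data, so the article stays neutral and gets no bump.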

disclaimer: i have literally no idea how any of this shit is supposed to work or if it's the right way to do this lol

nameless
Bear Rating Master
Posts: 126
Joined: 28 Aug 2013, 20:33

Re: Bayesian classifier for TTRSS

Postby nameless » 17 Jun 2015, 18:18

Is there anything to pay special attention to when updating?
I git pulled and activated the plugin, which is when ttrss began throwing errors; I didn't get redirected to the database updater afterwards.
Ttrss is still throwing errors even though the plugin is disabled now.
Did I miss anything?

Re: Bayesian classifier for TTRSS

Postby fox » 17 Jun 2015, 18:46

maybe post some errors idk

