classification

mschuett
Bear Rating Trainee
Posts: 2
Joined: 07 Mar 2008, 21:35

Post by mschuett » 07 Mar 2008, 22:14

Hello,
is anyone interested in combining tt-rss with automatic classification (like in a spam filter)?

I wrote a script that uses Python and CRM114 for classification. If anyone would like to play with it, just send me an email.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: classification

Post by fox » 07 Mar 2008, 23:31

I've been interested for quite some time, but I'd prefer a native implementation that I could cleanly integrate into tt-rss. I'm not sure adding a dependency on external scripts is something I can have in trunk.

Can you elaborate on how it works?

Re: classification

Post by mschuett » 09 Mar 2008, 19:00

fox wrote: "Can you elaborate on how it works?"

Since there is no PHP module, I use this Python interface to CRM114 for classification: http://www.elegantchaos.com/node/129

In the tt-rss database I added two columns to `ttrss_user_entries`: a `crm114score` for the classification score and a `crm114norm` to train an entry as good or bad.
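
A minimal sketch of that schema change, for anyone who wants to try it. Assumptions: SQLite stands in for the real tt-rss database, the table is reduced to its two key columns, and the column types (a numeric score, a short text label) are guesses, since the post does not state them.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the tt-rss database,
# and the table is reduced to the two key columns named in the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ttrss_user_entries (ref_id INTEGER, owner_id INTEGER)")

# The two added columns. Types are assumed: a numeric score and a short
# text label ('good' / 'bad'). Both stay NULL until they are set.
conn.execute("ALTER TABLE ttrss_user_entries ADD COLUMN crm114score REAL")
conn.execute("ALTER TABLE ttrss_user_entries ADD COLUMN crm114norm TEXT")

# List the column names to confirm the change.
columns = [row[1] for row in conn.execute("PRAGMA table_info(ttrss_user_entries)")]
print(columns)
```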

TT-RSS just shows the `crm114score` value and includes a red or green icon depending on the score. Each article gets a + and a - link for feedback. Clicking one of these links calls the backend, which then sets `crm114norm` for that ref_id and owner_id. (This is the feedback for training.)
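
The feedback step could look roughly like the sketch below. It is written in Python only for illustration (the real backend is PHP), and `set_feedback`, `icon_for`, the 'good'/'bad' labels, and the score threshold of 0 are all hypothetical names and values, not taken from the actual patch.

```python
import sqlite3

def set_feedback(conn, ref_id, owner_id, label):
    """Store a + / - click as training feedback for one user's entry."""
    if label not in ("good", "bad"):  # assumed label values
        raise ValueError("label must be 'good' or 'bad'")
    cur = conn.execute(
        "UPDATE ttrss_user_entries SET crm114norm = ? "
        "WHERE ref_id = ? AND owner_id = ?",
        (label, ref_id, owner_id),
    )
    conn.commit()
    return cur.rowcount  # rows updated; 0 if no such entry exists

def icon_for(score, threshold=0.0):
    """Pick the icon colour; the threshold is an assumed cut-off."""
    if score is None:
        return None  # entry has not been scored yet
    return "green" if score >= threshold else "red"

# Stand-in table with one scored, not yet rated entry.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ttrss_user_entries "
    "(ref_id INTEGER, owner_id INTEGER, crm114score REAL, crm114norm TEXT)"
)
conn.execute("INSERT INTO ttrss_user_entries VALUES (1, 1, 3.5, NULL)")

# Simulate the user clicking the + link on entry 1.
updated = set_feedback(conn, ref_id=1, owner_id=1, label="good")
```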

The scoring itself is done by an independent script (1. because I do not want it to slow down the UI and 2. because I cannot use system() on my webserver). It has its own database access and first learns from all articles with a `crm114norm`, then scores all new articles.
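
The learn-then-score loop could be sketched as follows. This is not the actual script: `ToyClassifier` is a hypothetical word-counting stand-in rather than CRM114, SQLite replaces the real database, "new" is assumed to mean "no score yet", the article text is assumed to live in a `ttrss_entries` table keyed by `id`, and everything is treated as belonging to a single user.

```python
import sqlite3
from collections import Counter

class ToyClassifier:
    """Hypothetical stand-in for CRM114: counts words seen in 'good' and
    'bad' training texts. This is NOT the CRM114 algorithm."""
    def __init__(self):
        self.words = {"good": Counter(), "bad": Counter()}

    def learn(self, label, text):
        self.words[label].update(text.lower().split())

    def score(self, text):
        # Positive when the text shares more words with 'good' examples.
        good, bad = self.words["good"], self.words["bad"]
        return float(sum(good[t] - bad[t] for t in text.lower().split()))

def run_batch(conn, classifier):
    # Step 1: learn from every entry that carries user feedback.
    trained = conn.execute(
        "SELECT e.content, u.crm114norm FROM ttrss_user_entries u "
        "JOIN ttrss_entries e ON e.id = u.ref_id "
        "WHERE u.crm114norm IS NOT NULL"
    ).fetchall()
    for content, label in trained:
        classifier.learn(label, content)

    # Step 2: score entries without a score (assumed meaning of "new").
    pending = conn.execute(
        "SELECT u.ref_id, u.owner_id, e.content FROM ttrss_user_entries u "
        "JOIN ttrss_entries e ON e.id = u.ref_id "
        "WHERE u.crm114score IS NULL"
    ).fetchall()
    for ref_id, owner_id, content in pending:
        conn.execute(
            "UPDATE ttrss_user_entries SET crm114score = ? "
            "WHERE ref_id = ? AND owner_id = ?",
            (classifier.score(content), ref_id, owner_id),
        )
    conn.commit()
    return len(pending)

# Stand-in data: two rated entries and one new, unscored entry.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ttrss_entries (id INTEGER, content TEXT)")
conn.execute(
    "CREATE TABLE ttrss_user_entries "
    "(ref_id INTEGER, owner_id INTEGER, crm114score REAL, crm114norm TEXT)"
)
conn.executemany("INSERT INTO ttrss_entries VALUES (?, ?)", [
    (1, "python release notes"),
    (2, "celebrity gossip"),
    (3, "python tutorial"),
])
conn.executemany("INSERT INTO ttrss_user_entries VALUES (?, ?, ?, ?)", [
    (1, 1, 2.0, "good"),
    (2, 1, -1.0, "bad"),
    (3, 1, None, None),
])

scored = run_batch(conn, ToyClassifier())
new_score = conn.execute(
    "SELECT crm114score FROM ttrss_user_entries WHERE ref_id = 3"
).fetchone()[0]
```

A script like this can be run from cron, which keeps the classification work out of the web request path.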

I'm currently experimenting with instant learning/scoring after a user clicks or after tt-rss downloads an article. But I still fear the added latency will be too high, and that an independent script plus a cron job is better than stuffing more functions into the AJAX dialogue.

Re: classification

Post by fox » 11 Mar 2008, 12:34

That's interesting. I could stick it into a separate update daemon task to process the database periodically. I don't know how it would scale, though.

There is a native PHP Bayesian filter here:

http://www.xhtml.net/php/PHPNaiveBayesianFilter

I have no idea whether it is similar to CRM114 to work with.

