Bayesian classifier for TTRSS

Development-related discussion, including bundled plugins
fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Bayesian classifier for TTRSS

Postby fox » 17 Jun 2015, 18:55

there are three classifier categories now, so after updating to https://github.com/gothfox/Tiny-Tiny-RS ... 878231a17b you will all need to do the following:

Code:

delete from ttrss_plugin_af_sort_bayes_categories;


and reload tt-rss

nameless
Bear Rating Master
Posts: 126
Joined: 28 Aug 2013, 20:33

Postby nameless » 17 Jun 2015, 19:04

Code:

E_USER_ERROR (256)   classes/db/pgsql.php:46   Query INSERT INTO ttrss_plugin_af_sort_bayes_categories (category, owner_uid) VALUES ('NEUTRAL', 1) failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist LINE 1: INSERT INTO ttrss_plugin_af_sort_bayes_categories (category,... ^   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query INSERT INTO ttrss_plugin_af_sort_bayes_categories (category, owner_uid) VALUES ('GOOD', 1) failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist LINE 1: INSERT INTO ttrss_plugin_af_sort_bayes_categories (category,... ^   admin   18:00
E_WARNING (2)   classes/db/pgsql.php:57   pg_num_rows() expects parameter 1 to be resource, boolean given   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query SELECT id FROM ttrss_plugin_af_sort_bayes_categories WHERE owner_uid = 1 LIMIT 1 failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist LINE 1: SELECT id FROM ttrss_plugin_af_sort_bayes_categories WHERE o... ^   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query CREATE TABLE IF NOT EXISTS ttrss_plugin_af_sort_bayes_wordfreqs ( word varchar(100) NOT NULL DEFAULT '', category_id INTEGER NOT NULL REFERENCES ttrss_plugin_af_sort_bayes_categories(id) ON DELETE CASCADE, owner_uid INTEGER NOT NULL REFERENCES ttrss_users(id) ON DELETE CASCADE, count BIGINT NOT NULL DEFAULT '0') failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query CREATE TABLE IF NOT EXISTS ttrss_plugin_af_sort_bayes_references ( id SERIAL NOT NULL PRIMARY KEY, document_id VARCHAR(255) NOT NULL, category_id INTEGER NOT NULL REFERENCES ttrss_plugin_af_sort_bayes_categories(id) ON DELETE CASCADE, owner_uid INTEGER NOT NULL REFERENCES ttrss_users(id) ON DELETE CASCADE, content text NOT NULL) failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query CREATE TABLE IF NOT EXISTS ttrss_plugin_af_sort_bayes_categories ( id SERIAL NOT NULL PRIMARY KEY, category varchar(100) NOT NULL DEFAULT '', probability DOUBLE NOT NULL DEFAULT '0', owner_uid INTEGER NOT NULL REFERENCES ttrss_users(id) ON DELETE CASCADE, word_count BIGINT NOT NULL DEFAULT '0') failed: ERROR: type "double" does not exist LINE 4: probability DOUBLE NOT NULL DEFAULT '0', ^   admin   18:00
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query INSERT INTO ttrss_plugin_af_sort_bayes_categories (category, owner_uid) VALUES ('NEUTRAL', 1) failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist LINE 1: INSERT INTO ttrss_plugin_af_sort_bayes_categories (category,... ^   admin   17:59
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query INSERT INTO ttrss_plugin_af_sort_bayes_categories (category, owner_uid) VALUES ('GOOD', 1) failed: ERROR: relation "ttrss_plugin_af_sort_bayes_categories" does not exist LINE 1: INSERT INTO ttrss_plugin_af_sort_bayes_categories (category,... ^


so ttrss failed to create ttrss_plugin_af_sort_bayes_categories in the first place!?
i will take a closer look when i get off work.

Postby fox » 17 Jun 2015, 19:06

plugin should create the tables when it loads, strange.

Postby nameless » 17 Jun 2015, 19:12

it is.
can anybody confirm this as a bug or did i fuck up my postgres setup?

Postby fox » 17 Jun 2015, 19:23

try updating to latest trunk, it's very unlikely that something is wrong with your database

i mean you can always disable the plugin and it'll go back to normal

Postby nameless » 17 Jun 2015, 19:32

I just updated to latest trunk, things are still broken though.
Also disabling the plugin doesn't revert things back to normal, errors still occur.

Postby fox » 17 Jun 2015, 19:35

if you disable the plugin while updates are ongoing you are going to continue having errors until all ttrss processes terminate

i just checked on mysql too and table was created automatically btw

e: also your log doesn't show anything about plugin failing to create the table, only that relation doesn't exist, strange

Postby nameless » 17 Jun 2015, 19:38

so it looks as if i broke my database :(

Postby fox » 17 Jun 2015, 19:38

like I said it is very unlikely, i'd even say impossible

Postby nameless » 17 Jun 2015, 19:42

mhm.
is there anything else i can provide that could help debug this?

Postby fox » 17 Jun 2015, 19:43

oh wait there was a typo in the psql schema script, update and it should fix itself

e: there was another thing in there, gonna be fixed in a few minutes

not the smoothest rollout so far :)

e2: done

Postby nameless » 17 Jun 2015, 19:47

yep fixed.
i am left with this error message now though

Code:

E_WARNING (2)   classes/db/pgsql.php:38   pg_query(): Query failed: ERROR: null value in column "content" violates not-null constraint DETAIL: Failing row contains (1, SHA1:43cdfc3ce412d0b85cc3e371a358936edf6cf145, 1, 1, null).   admin   18:46
E_USER_ERROR (256)   classes/db/pgsql.php:46   Query INSERT INTO ttrss_plugin_af_sort_bayes_references (document_id, category_id, owner_uid) VALUES ('SHA1:43cdfc3ce412d0b85cc3e371a358936edf6cf145', '1', 1) failed: ERROR: null value in column "content" violates not-null constraint DETAIL: Failing row contains (1, SHA1:43cdfc3ce412d0b85cc3e371a358936edf6cf145, 1, 1, null).   admin   18:46

Postby fox » 17 Jun 2015, 19:49

you can just run

Code:

alter table ttrss_plugin_af_sort_bayes_references drop column content;


it was that thing i just fixed

e3: just a footnote, plugin starts automatic classification when ugly aka neutral category reaches 10k words stored
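
To illustrate that footnote, here is a minimal sketch of such a gate, written in Python rather than the plugin's PHP. The function name, the constant, and the dictionary of per-category word counts are all made up for illustration; only the 10k figure and the NEUTRAL category name come from this thread, and the real plugin may check this differently.

```python
# Hypothetical sketch, not the plugin's code: hold back automatic
# classification until the neutral category has enough stored words.

AUTO_CLASSIFY_MIN_WORDS = 10_000  # the "10k words" figure from the post


def should_auto_classify(word_counts, neutral="NEUTRAL",
                         min_words=AUTO_CLASSIFY_MIN_WORDS):
    """Return True once the neutral category has reached the word threshold.

    word_counts maps a category name to the number of words stored for it
    (an assumed structure, chosen only for this example).
    """
    return word_counts.get(neutral, 0) >= min_words
```

With this shape, a heavily trained GOOD category alone would not switch classification on; only the neutral count matters.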

Postby fox » 17 Jun 2015, 21:45

i only have a few articles in BAD category and i'm seeing it match way too much, maybe it's more emphasized because there's not much data so every word counts more or whatever

maybe limiting to two categories would work better, idk
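
One way to see why a sparsely trained category might match too much is a toy multinomial naive Bayes scorer. This is a generic textbook formulation in Python with add-one smoothing and equal priors, which are assumptions made for the example; it is not taken from the plugin, and the word counts are invented.

```python
import math
from collections import Counter

# Invented training data: GOOD has 1000 stored words, BAD only 10.
TRAINING = {
    "GOOD": Counter({"linux": 500, "kernel": 490, "update": 10}),
    "BAD": Counter({"celebrity": 9, "update": 1}),
}
VOCAB = {word for counts in TRAINING.values() for word in counts}


def log_score(words, counts, prior):
    """Log of prior * product of smoothed P(word | category)."""
    total = sum(counts.values())
    score = math.log(prior)
    for word in words:
        # Add-one (Laplace) smoothing; Counter returns 0 for unseen words.
        score += math.log((counts[word] + 1) / (total + len(VOCAB)))
    return score


def classify(words):
    """Pick the category with the highest score, assuming equal priors."""
    prior = 1.0 / len(TRAINING)
    return max(TRAINING, key=lambda cat: log_score(words, TRAINING[cat], prior))
```

Here "update" was seen ten times in GOOD and once in BAD, yet `classify(["update"])` picks BAD: one occurrence out of 10 stored words is a much larger share than ten out of 1000. Under these assumptions, more training data in the small category, or priors that reflect how many articles each category holds, would dampen the effect.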

himynameschris
Bear Rating Trainee
Posts: 2
Joined: 17 Jun 2015, 03:24

Postby himynameschris » 18 Jun 2015, 01:03

I would say that not having enough articles trained and then used to update the model is going to be the biggest reason for your results. You should try with 10 or 20 in each category.

I've also noticed that the plugin is only removing the words 'the', 'that', 'you', 'for', 'and'. This list of stopwords (words that don't carry enough meaning to be included in Bayesian classification) could be expanded to include other common words.
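
As a rough sketch of what an expanded list could look like, in Python rather than the plugin's PHP: the first set is the list quoted above, while the extra words, the tokenizer regex, and the function name are assumptions picked only for this example.

```python
import re

# The five words quoted above as the plugin's current stopwords.
PLUGIN_STOPWORDS = {"the", "that", "you", "for", "and"}

# Hypothetical additions; any real list would need tuning per language.
EXTRA_STOPWORDS = {"this", "with", "from", "have", "are", "was"}


def tokenize(text, stopwords):
    """Lowercase the text, split it into words, and drop stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [word for word in words if word not in stopwords]
```

For example, `tokenize("The kernel that you have", PLUGIN_STOPWORDS)` still keeps "have", while passing `PLUGIN_STOPWORDS | EXTRA_STOPWORDS` leaves only "kernel".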

For example, I've noticed that some RSS feeds such as those from Ars Technica, Forbes and BBS will contain text such as "Read more" or "Continue reading", or even "this is an article about...". These alone could throw your results off. I think filtering them out would greatly improve any model based on RSS feeds.
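
Since those are phrases rather than single words, a plain stopword list would not catch them as a unit. A minimal sketch of stripping them before tokenizing, in Python, where the phrase list holds just the two examples above and the function name is hypothetical:

```python
import re

# Feed boilerplate mentioned above; a real list would be longer.
BOILERPLATE_PHRASES = ["read more", "continue reading"]

# Match whole phrases only, ignoring case.
_BOILERPLATE_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in BOILERPLATE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def strip_boilerplate(text):
    """Remove known boilerplate phrases and collapse leftover whitespace."""
    cleaned = _BOILERPLATE_RE.sub(" ", text)
    return " ".join(cleaned.split())
```

Running this on the article text before counting words would keep "read", "more" and "continue" from being credited to whichever category the article was trained into.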

