Here's what I figured out so far for my feed with id = 40 -- I followed one article from 2014-02-02 sitting in the DB:
Code:
mysql> select id,owner_uid,update_interval,purge_interval from ttrss_feeds where id='40'\G;
*************************** 1. row ***************************
id: 40
owner_uid: 2
update_interval: 0
purge_interval: 0
1 row in set (0.00 sec)
mysql> select owner_uid,pref_name,value from ttrss_user_prefs where owner_uid='2' and pref_name='PURGE_OLD_DAYS'\G;
*************************** 1. row ***************************
owner_uid: 2
pref_name: PURGE_OLD_DAYS
value: 7
1 row in set (0.00 sec)
mysql> select id,title,updated,date_entered,date_updated from ttrss_entries limit 1\G;
*************************** 1. row ***************************
id: 28016
title: Resident / Episode 143 / February 01 2014
updated: 2014-02-02 06:00:12
date_entered: 2014-02-02 16:53:00
date_updated: 2016-02-21 15:15:32
1 row in set (0.00 sec)
...so I immediately noticed that date_updated was the outlier, and indeed found that the purging code uses that column as its key. So that means the bug has to be in the update_rss_feed() function in include/rssfuncs.php, but it's really hard to follow if you don't know the code well, so I used the --debug-feed option of update.php to give it a run:
Code:
$ /usr/bin/php ./update.php --debug-feed 40
...
[15:15:32/30174] guid 2,http://podcast.hernancattaneo.com/2014/02/02/resident-episode-143-february-01-2014/ / SHA1:f41ef61836ea96508585872ee896a328cbd9c6c3
[15:15:32/30174] orig date: 1391320812
[15:15:32/30174] date 1391320812 [2014/02/02 06:00:12]
[15:15:32/30174] title Resident / Episode 143 / February 01 2014
[15:15:32/30174] link http://podcast.hernancattaneo.com/e/resident-episode-143-february-01-2014/
[15:15:32/30174] author Hernan Cattaneo
[15:15:32/30174] num_comments: 0
[15:15:32/30174] looking for tags...
[15:15:32/30174] tags found: podcast
[15:15:32/30174] done collecting data.
[15:15:32/30174] article hash: b93e9b7a27b4fc6a305a32a8f176159b00a333de [stored=b93e9b7a27b4fc6a305a32a8f176159b00a333de]
[15:15:32/30174] stored article seems up to date [IID: 28016], updating timestamp only
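To make the numbers concrete, here is a minimal sketch of the age check as I understand it: an entry only becomes a purge candidate once the date it is keyed on is more than PURGE_OLD_DAYS days old. The helper is_purge_candidate() is made up for illustration and is not tt-rss code. I'm also assuming that purge_interval = 0 on the feed falls back to the user pref (7 days here), and the real purge query may well have further conditions.

```php
<?php
// Hypothetical helper, not part of tt-rss: models a purge check that keys on
// a single timestamp column and an effective purge limit in days.
function is_purge_candidate(string $key_date, int $purge_days, string $now): bool {
    if ($purge_days <= 0) {
        return false; // guard for this sketch only: treat as "do not purge"
    }
    // Anything older than this cutoff would qualify for purging.
    $cutoff = (new DateTimeImmutable($now))->modify("-{$purge_days} days");
    return new DateTimeImmutable($key_date) < $cutoff;
}

// Values from the row above, checked a few minutes after the debug run.
$now = '2016-02-21 15:20:00';
var_dump(is_purge_candidate('2016-02-21 15:15:32', 7, $now)); // keyed on date_updated: bool(false)
var_dump(is_purge_candidate('2014-02-02 06:00:12', 7, $now)); // keyed on updated: bool(true)
```

So with date_updated as the key, the two-year-old article looks brand new to the purge, while the same check against updated would have flagged it.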
So the error seems to be somewhere around that "updating timestamp only" step. This is the specific block:
Code:
_debug("article hash: $entry_current_hash [stored=$entry_stored_hash]", $debug_enabled);
if ($entry_current_hash == $entry_stored_hash && !isset($_REQUEST["force_rehash"])) {
    _debug("stored article seems up to date [IID: $base_entry_id], updating timestamp only", $debug_enabled);
    // we keep encountering the entry in feeds, so we need to
    // update date_updated column so that we don't get horrible
    // dupes when the entry gets purged and reinserted again e.g.
    // in the case of SLOW SLOW OMG SLOW updating feeds
    $base_entry_id = db_fetch_result($result, 0, "id");
    db_query("UPDATE ttrss_entries SET date_updated = NOW()
        WHERE id = '$base_entry_id'");
    continue;
}
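If I'm reading that comment right, the effect can be shown with a toy model: as long as the feed keeps listing an article, every update run resets date_updated, so the purge clock never gets past zero. This is only a sketch under my own assumptions (one update run per day, a made-up simulate() function, nothing taken from tt-rss itself).

```php
<?php
// Toy model, not tt-rss code: one entry, one update run per day.
// Returns the first day on which the entry would qualify for purging,
// or null if it never qualifies within the simulated window.
function simulate(int $days_in_feed, int $total_days, int $purge_days): ?int {
    $last_seen = 0; // day on which date_updated was last set
    for ($day = 1; $day <= $total_days; $day++) {
        if ($day <= $days_in_feed) {
            $last_seen = $day; // hash matches: "updating timestamp only"
        }
        if ($day - $last_seen > $purge_days) {
            return $day;
        }
    }
    return null;
}

var_dump(simulate(750, 750, 7)); // still listed by the feed on every run: NULL
var_dump(simulate(30, 750, 7));  // dropped out of the feed after day 30: int(38)
```

In this model an article is only purged once the feed itself stops listing it, which would match what I see for a podcast feed that still carries its 2014 episodes.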
...I mean, I think. Whatever is wrong needs someone more familiar with the code to help sort out what's not working right. I've left my database intact. Does anyone have a clue? Is this an old bug that was fixed, and I just need to flush all these stale entries out of my DB somehow? (There are about 900 of them in ttrss_entries, going back to 2013.) Thx!