Remove duplicates per category - How?

Request new functionality here
mkl
Bear Rating Trainee
Posts: 20
Joined: 21 Mar 2013, 01:27

Postby mkl » 05 Apr 2013, 23:50

The Wiki entry page lists "detecting and filtering duplicate articles" as a TT-RSS feature.
How does that work?
I found a reference to N-gram Duplicate Checking. It relies on Postgres, which my web hosting package does not include.
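
For readers unfamiliar with the idea, N-gram duplicate checking compares two strings by the short character sequences they share. The sketch below is a simplified illustration in Python, not TT-RSS code: it assumes character trigrams and a Jaccard ratio, which is similar in spirit to PostgreSQL's pg_trgm extension but does not reproduce its exact padding rules. Whether TT-RSS uses pg_trgm is an assumption here, and the names `trigrams`, `similarity`, `is_duplicate` and the 0.7 threshold are hypothetical.

```python
def trigrams(text: str) -> set:
    """Return the set of character trigrams of a lowercased string
    whose runs of whitespace are collapsed to single spaces."""
    normalised = " ".join(text.lower().split())
    return {normalised[i:i + 3] for i in range(len(normalised) - 2)}


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    union = ta | tb
    if not union:
        # Strings shorter than three characters yield no trigrams.
        return 0.0
    return len(ta & tb) / len(union)


def is_duplicate(a: str, b: str, threshold: float = 0.7) -> bool:
    """Treat two titles as duplicates above a hypothetical threshold."""
    return similarity(a, b) >= threshold
```

Two titles that differ only in case or spacing score 1.0 here, while unrelated titles score close to 0.0. The threshold would need tuning against real feed data.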

I am asking because I have the following problem:
I subscribe to feeds on flightglobal.com. They are all in one category.
They re-publish content from one feed to others, so I see the same item twice, but from different feeds.
Two example GUIDs from the ttrss_entries table:

1,http://www.flightglobal.com/Articles/2013/03/21/383742/saab-2000-aew-customer-signs-five-year-support-deal.html
1,http://www.flightglobal.com/Articles/2013/04/05/383742/saab-2000-aew-customer-signs-five-year-support-deal.html

The content and content hash values are identical.
It would be easy to weed out the second post based on the article ID in the GUID (383742 in both examples), in combination with title and content.
I have deactivated "allow duplicates" in the settings already.
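
The idea described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not TT-RSS code: the URL pattern is inferred only from the two example GUIDs, and `Entry`, `dedup_key` and `drop_duplicates` are hypothetical names. SHA-1 stands in for whatever content hash the database actually stores.

```python
import hashlib
import re
from typing import NamedTuple

# Assumed pattern, inferred from the two example GUIDs:
# .../Articles/YYYY/MM/DD/<article id>/<slug>.html
ARTICLE_ID = re.compile(r"/Articles/\d{4}/\d{2}/\d{2}/(\d+)/")


class Entry(NamedTuple):
    guid: str
    title: str
    content: str


def dedup_key(entry):
    """Return (article id, title, content hash), or None if the GUID
    does not match the assumed URL pattern."""
    match = ARTICLE_ID.search(entry.guid)
    if match is None:
        return None
    content_hash = hashlib.sha1(entry.content.encode("utf-8")).hexdigest()
    return (match.group(1), entry.title, content_hash)


def drop_duplicates(entries):
    """Keep the first entry per key; entries without a key are always kept."""
    seen = set()
    unique = []
    for entry in entries:
        key = dedup_key(entry)
        if key is not None:
            if key in seen:
                continue
            seen.add(key)
        unique.append(entry)
    return unique
```

Applied to the two GUIDs above with identical title and content, only the entry dated 2013/03/21 would remain, because the publication date in the URL is not part of the key.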

Is duplicate detection possible per category, or across multiple feeds?
If not, how could that be done?
Can I help?

Regards
Michael

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Remove duplicates per category - How?

Postby fox » 06 Apr 2013, 00:29

I like how you never noticed the "allow duplicate articles" preference, which is the first one in the list.

Re: Remove duplicates per category - How?

Postby mkl » 06 Apr 2013, 02:13

fox wrote: I like how you never noticed the "allow duplicate articles" preference, which is the first one in the list.

Well, I did notice and deactivated it.
But I could not see any effect for my problem.

Re: Remove duplicates per category - How?

Postby mkl » 06 Apr 2013, 03:07

Can a plug-in also suppress entire articles?
Looking at ./include/rssfuncs.php, I suspect I can use HOOK_FEED_PARSED.
Are there any other plug-ins using this interface?

Thanks
Michael

Re: Remove duplicates per category - How?

Postby mkl » 06 Apr 2013, 18:07

I checked out HOOK_FEED_PARSED.
The hook receives a SimplePie object structure.
As far as I can see, SimplePie can fetch, parse and deliver feeds, but not modify them.
I have not found a way to delete an item from a feed.

HOOK_ARTICLE_FILTER can modify feed items, but cannot remove them entirely.

I'd say I am out of luck.
Have I missed something?
