Feed causes tt-rss to go crazy?

HunterZ
Bear Rating Disaster
Posts: 60
Joined: 21 Mar 2013, 03:30
Location: Seattle

Postby HunterZ » 13 Aug 2013, 20:38

So I subscribed to the following forum-generated feed yesterday: http://prospector.freeforums.org/feed.php

This is an extremely low-volume forum, but when I checked tt-rss this morning I saw over 800 new "articles" listed. Turns out that it was listing dozens of copies of each RSS article as if they were different, and I'm not sure why. I don't see any redundant articles or unreasonable-looking timestamps in the current feed source, but I'm not an RSS guru.

xtaz
Bear Rating Master
Posts: 174
Joined: 24 Dec 2009, 16:48

Postby xtaz » 13 Aug 2013, 20:47

Because it looks like some idiot has put a session id in the id field, which is going to change every time the session is renewed, which is probably every time the feed is fetched. The id should be permanent and never change. It's the id that is used to track articles.

HunterZ
Bear Rating Disaster
Posts: 60
Joined: 21 Mar 2013, 03:30
Location: Seattle

Postby HunterZ » 13 Aug 2013, 20:56

I must be blind - can you point me to a specific example? When I look at the feed code, I'm only seeing id tags with post URLs containing only topic and post numbers.

xtaz
Bear Rating Master
Posts: 174
Joined: 24 Dec 2009, 16:48

Postby xtaz » 13 Aug 2013, 21:05

That's weird! When I look at it now there's no sid. Looks like the feed content changes between requests, then. Even better! Basically, when I looked at it 10 minutes ago it looked like this:

Code:

<id>http://prospector.freeforums.org/viewtopic.php?t=371&amp;p=1640&amp;sid=1bbcf113816943435593fd5943159cf7#p1640</id>
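To illustrate why a changing id produces duplicates, here is a toy sketch in Python. It assumes an aggregator that remembers articles purely by their id; the function name and the short placeholder sids are made up for the example, and this is not tt-rss's actual code.

```python
# Toy model of an aggregator that tracks articles by id.
# Assumption for illustration only; not tt-rss's real implementation.
def new_articles(seen_ids, fetched_entries):
    """Return entries whose id has not been seen before, recording their ids."""
    fresh = []
    for entry in fetched_entries:
        if entry["id"] not in seen_ids:
            seen_ids.add(entry["id"])
            fresh.append(entry)
    return fresh

seen = set()
base = "http://prospector.freeforums.org/viewtopic.php?t=371&p=1640"
# The same post fetched twice; only the (placeholder) session id differs.
first = [{"id": base + "&sid=aaaa#p1640", "title": "Post 1640"}]
second = [{"id": base + "&sid=bbbb#p1640", "title": "Post 1640"}]

print(len(new_articles(seen, first)))   # 1
print(len(new_articles(seen, second)))  # 1 again: the same post counts as new
```

Every fetch that hands out a fresh sid would add another copy of each post, which matches the dozens of copies described in the first post.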

HunterZ
Bear Rating Disaster
Posts: 60
Joined: 21 Mar 2013, 03:30
Location: Seattle

Postby HunterZ » 13 Aug 2013, 21:19

Awesome. I've unsubscribed, as the maintainer of the forum hasn't even logged in all year - and probably has no idea that the RSS feature even exists in the first place. I just wanted to make sure it wasn't something on tt-rss' end.

feader
Bear Rating Master
Posts: 160
Joined: 26 Dec 2012, 20:03

Postby feader » 13 Aug 2013, 21:31

We have had this before: you don't see the sids once cookies are set. Compare a first request that only stores cookies with a second one that sends them back:

Code:

$ curl -c /tmp/cookieJar.txt prospector.freeforums.org/feed.php
[…]
<link rel="self" type="application/atom+xml" href="http://prospector.freeforums.org/feed.php?sid=57fd73e259655a6199740b6ff771491d" />
<feed data with sids>

$ curl -b /tmp/cookieJar.txt prospector.freeforums.org/feed.php
[…]
<link rel="self" type="application/atom+xml" href="http://prospector.freeforums.org/feed.php" />
<feed data without sids>

You can use the ff_FeedCleaner plugin to erase the sids. It was originally created for such a case (and back then, it also was a forum feed that showed this behaviour).
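As a rough idea of what such a cleanup has to do, here is a Python sketch that strips the sid parameter from the URLs quoted in this thread. The pattern and the helper name strip_sid are assumptions for illustration, not a rule shipped with ff_FeedCleaner.

```python
import re

# Matches a "sid=<32 hex chars>" query parameter plus the separator in front
# of it. Illustrative assumption; not the plugin's own pattern.
SID_PARAM = re.compile(r"(?:&amp;|&|\?)sid=[0-9a-f]{32}")

def strip_sid(url):
    """Remove the session id parameter so the URL stays stable across fetches."""
    return SID_PARAM.sub("", url)

with_sid = ("http://prospector.freeforums.org/viewtopic.php"
            "?t=371&amp;p=1640&amp;sid=1bbcf113816943435593fd5943159cf7#p1640")
print(strip_sid(with_sid))
# http://prospector.freeforums.org/viewtopic.php?t=371&amp;p=1640#p1640

feed_link = ("http://prospector.freeforums.org/feed.php"
             "?sid=57fd73e259655a6199740b6ff771491d")
print(strip_sid(feed_link))
# http://prospector.freeforums.org/feed.php
```

One limitation of this sketch: if sid were the first of several query parameters, dropping "?sid=..." would leave the next parameter without its "?". That case does not occur in the URLs shown here.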

ml78
Bear Rating Trainee
Posts: 2
Joined: 13 Aug 2013, 21:01

Postby ml78 » 13 Aug 2013, 22:52

Same problem with this feed: http://www.romandie.com/rss/flux.xml

Each refresh brings in 150 new articles, most of them identical to earlier ones.
If I check the feed a few hours later, I can get dozens of articles with the same subject and time.

HunterZ
Bear Rating Disaster
Posts: 60
Joined: 21 Mar 2013, 03:30
Location: Seattle

Postby HunterZ » 13 Aug 2013, 23:05

Ooh, a regex plugin. Thanks!

feader
Bear Rating Master
Posts: 160
Joined: 26 Dec 2012, 20:03

Postby feader » 14 Aug 2013, 15:50

ml78 wrote: Same problem with this feed: http://www.romandie.com/rss/flux.xml

The problem here is that the guid changes between fetches:

Code:

[…]
<guid>http://www.romandie.com/news/n.asp?n=Nouvelles_regles_de_cybersecurite_pour_l_administration_federale22140820131310.asp-16959</guid>
[…]
[a bit later]
[…]
<guid>http://www.romandie.com/news/n.asp?n=Nouvelles_regles_de_cybersecurite_pour_l_administration_federale22140820131310.asp-16606</guid>
[…]

More precisely, only the numeric suffix matching /-[0-9]+$/ changes. The best course is to ask the content provider to drop these suffixes, or to omit the <guid> tag entirely, since the <link> values look fine.
In the meantime, you could remove the guid yourself, or use ff_FeedCleaner with

Code:

[
    {
        "URL": "www.romandie.com/rss/",
        "type": "xpath_regex",
        "xpath": "//item/guid",
        "pattern": "/-[0-9]+$/",
        "replacement": ""
    }
]

(Disclaimer: I didn't test it).
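For anyone who wants to check the pattern before touching the plugin config, here is a small Python sketch applying the same regex to the two guids quoted above. The helper name normalize_guid is made up for the example; the plugin itself runs in PHP, so this only checks the regex logic.

```python
import re

# Same pattern as in the config above: a trailing "-<digits>" suffix.
GUID_SUFFIX = re.compile(r"-[0-9]+$")

def normalize_guid(guid):
    """Drop the changing numeric suffix so the guid stays stable."""
    return GUID_SUFFIX.sub("", guid)

base = ("http://www.romandie.com/news/n.asp?n=Nouvelles_regles_de_"
        "cybersecurite_pour_l_administration_federale22140820131310.asp")
earlier = base + "-16959"
later = base + "-16606"

print(earlier == later)                                  # False
print(normalize_guid(earlier) == normalize_guid(later))  # True
```

With the suffix removed, both fetches yield the same guid, so the article should only be stored once.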

ml78
Bear Rating Trainee
Posts: 2
Joined: 13 Aug 2013, 21:01

Postby ml78 » 16 Aug 2013, 01:48

Thanks. Works fine for me.

