howto use af_feedmod


Postby thermionic » 17 May 2013, 13:01

This is a worked example for two sites: http://msmvps.com/blogs/ehlo/default.aspx and http://theoatmeal.com/

It was tested on ttrss 1.7.9, and it is presumed that the af_feedmod plugin is already enabled.

To use af_feedmod to expand a synopsis feed into a full-article feed, you need two things. The first is an "array key": a unique identifier for the feed, so that af_feedmod can distinguish this feed's posts from those of other feeds. The second is an xpath for the article. This howto covers only a basic xpath; more complex ones can be created, but they are outside its scope.

Starting with http://msmvps.com/blogs/ehlo/default.aspx

The feed URL is http://msmvps.com/blogs/ehlo/rss.aspx, and an example full article URL is http://msmvps.com/blogs/ehlo/archive/20 ... 30371.aspx

After checking that none of my other feeds use ehlo in the URL, I will use ehlo as the "array key"
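
The uniqueness check can also be done with a few lines of Python rather than by eye. This is only a sketch: it assumes the key is compared as a plain substring of a URL, which is what the check above implies, and the helper name `feeds_matching` is made up for illustration. The two URLs are the feed URLs used in this howto.

```python
def feeds_matching(key, feed_urls):
    """Return the feed URLs that contain `key` as a plain substring."""
    return [url for url in feed_urls if key in url]


# The two feed URLs from this howto; replace with your own subscription list.
feed_urls = [
    "http://msmvps.com/blogs/ehlo/rss.aspx",
    "http://feeds.feedburner.com/oatmealfeed",
]

# A good array key matches exactly one feed.
print(feeds_matching("ehlo", feed_urls))
# A key that is too generic matches several feeds and should be avoided.
print(len(feeds_matching("http", feed_urls)))
```

If the first list has more than one entry, pick a longer or more specific key.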

The xpath takes a little more work, but a basic one needs only a few moments in Chrome.

Open an article page in Chrome, right click and choose "Inspect element".

Scroll down and expand the elements until you can identify the div that most accurately covers the article. For this example the best candidate looks to be <div class="post">

Now you need to create the JSON to be posted to the FeedMod preferences tab.

Code: Select all

{
    "ehlo": {
        "type": "xpath",
        "xpath": "div[@class='post']"
    }
}
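
Before saving the rule, it can help to try the xpath against a saved copy of the page. The sketch below uses Python's standard library and a hypothetical, simplified page (`SAMPLE_PAGE` is invented, not the real msmvps.com markup). It assumes the plugin searches the whole page for the expression, so the rule is prefixed with `.//` here. Note that ElementTree needs well-formed markup and supports only a subset of XPath, so treat this as a rough check.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for an article page (well-formed on purpose).
SAMPLE_PAGE = """
<html>
  <body>
    <div class="sidebar">Archives</div>
    <div class="post">
      <h2>Example title</h2>
      <p>Example article body.</p>
    </div>
  </body>
</html>
"""


def select_text(page, rule_xpath):
    """Return the whitespace-normalised text of every element matched by rule_xpath."""
    root = ET.fromstring(page.strip())
    # ".//" searches the whole tree below the root element.
    matches = root.findall(".//" + rule_xpath)
    return [" ".join("".join(m.itertext()).split()) for m in matches]


print(select_text(SAMPLE_PAGE, "div[@class='post']"))
```

An empty list means the xpath selected nothing, which is the usual reason a feed stays in synopsis form.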


Now to add http://theoatmeal.com/

The published feed URL is http://feeds.feedburner.com/oatmealfeed and an example full article is http://theoatmeal.com/blog/tesla_museum_saved

To get XML from feedburner, append ?format=xml to the URL, so for the example above use http://feeds.feedburner.com/oatmealfeed?format=xml. You might need similar changes for other feed providers.
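
If you have several feedburner feeds, the parameter can be added programmatically. A minimal sketch using only Python's standard library; the function name `with_format_xml` is made up for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_format_xml(url):
    """Return url with format=xml set in its query string, keeping other parameters."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["format"] = "xml"
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_format_xml("http://feeds.feedburner.com/oatmealfeed"))
```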

For the "array key", I will use oatmealfeed

For the xpath, the most appropriate divs are <div id="comic"> and <div id="blog">, which together cover both types of article.

Now to create the JSON:

Code: Select all

{
    "oatmealfeed": {
        "type": "xpath",
        "xpath": "div[@id='comic'] | //div[@id='blog']"
    }
}
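
The `|` in this xpath is a union: an element is selected if either branch matches. Only the second branch carries its own `//`. My assumption (not confirmed in this thread) is that the plugin puts `//` in front of the whole expression, which would cover only the first branch. The sketch below emulates the union in Python, because ElementTree has no `|` operator; `SAMPLE_PAGE` is a hypothetical page and the naive split would break on a `|` inside a quoted value.

```python
import xml.etree.ElementTree as ET

# Hypothetical page that has a blog div but no comic div.
SAMPLE_PAGE = """<html><body>
  <div id="header">Menu</div>
  <div id="blog"><p>Example blog text.</p></div>
</body></html>"""


def select_union(page, rule_xpath):
    """Evaluate each '|' branch of rule_xpath separately and collect the matched ids."""
    root = ET.fromstring(page)
    found = []
    for branch in rule_xpath.split("|"):
        # Normalise "div[...]" and "//div[...]" to ElementTree's ".//div[...]".
        branch = branch.strip().lstrip("/")
        found.extend(el.get("id") for el in root.findall(".//" + branch))
    return found


print(select_union(SAMPLE_PAGE, "div[@id='comic'] | //div[@id='blog']"))
```

On a comic page the first branch would match instead, so one rule serves both article types.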


However, because this is a second definition, it has to go into the same JSON object as the first one, so the complete setting should look like this:

Code: Select all

{
    "ehlo": {
        "type": "xpath",
        "xpath": "div[@class='post']"
    },
    "oatmealfeed": {
        "type": "xpath",
        "xpath": "div[@id='comic'] | //div[@id='blog']"
    }
}
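
Hand-merging definitions is where missing commas and braces tend to creep in. As a rough sketch, the merge can be done and checked with Python's `json` module; `merge_rules` is a hypothetical helper, and the two snippets are the per-site definitions from this howto.

```python
import json


def merge_rules(*snippets):
    """Parse each JSON snippet and combine them into one object, rejecting duplicate keys."""
    merged = {}
    for snippet in snippets:
        rules = json.loads(snippet)  # raises json.JSONDecodeError on malformed JSON
        for key, rule in rules.items():
            if key in merged:
                raise ValueError(f"duplicate array key: {key}")
            merged[key] = rule
    return merged


EHLO = """{"ehlo": {"type": "xpath", "xpath": "div[@class='post']"}}"""
OATMEAL = """{"oatmealfeed": {"type": "xpath", "xpath": "div[@id='comic'] | //div[@id='blog']"}}"""

# Pretty-print the combined object, ready to paste into the preferences tab.
print(json.dumps(merge_rules(EHLO, OATMEAL), indent=4))
```

A snippet with a missing closing brace fails at `json.loads`, which is easier to spot here than in the preferences tab.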


Many thanks to lotrfan who helped me to understand it and feader who provided some hints.

Postby dang » 20 May 2013, 20:01

Here are some rules that I use.

Code: Select all

{

"arstechnica": {
    "type": "xpath",
    "xpath": "article"
},
"oatmeal": {
    "type": "xpath",
    "xpath": "div[@id='comic'] | //div[@id='blog']"
},
"somethingpositive": {
    "type": "xpath",
    "xpath": "table[2]/tbody/tr[9]/td/table/tbody/tr[2]/td"
},
"tnemrot": {
    "type": "xpath",
    "xpath": "div[@id='comic']"
},
"hackaday": {
    "type": "xpath",
    "xpath": "div[@id='content']"
},
"girlswithslingshots": {
    "type": "xpath",
    "xpath": "div[@id='comicbody']"
}

}

Postby maerco » 04 Oct 2013, 12:30

I really can't get it working with wired.it

Code: Select all

http://www.wired.it/rss.xml


This is the configuration I used:

Code: Select all

{
"wired": {
    "type": "xpath",
    "xpath": "div[@class='leaf_contents_article']"
}
}


Thanks in advance for the help

Postby syh » 04 Oct 2013, 21:25

It looks like class "leaf_contents_article" is applied to an <article> tag and not a <div>. Change your xpath to article[@class='leaf_contents_article'] and give it a try.
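
To check which tag actually carries a class without clicking through the inspector, something like the following sketch can help. It uses Python's standard `html.parser`; the `SAMPLE` markup is a hypothetical reduction of the page, not a copy of wired.it, and `tags_with_class` is a made-up helper name.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Record the name of every tag whose class attribute contains `wanted`."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # A class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.tags.append(tag)


def tags_with_class(html, wanted):
    finder = ClassFinder(wanted)
    finder.feed(html)
    return finder.tags


# Hypothetical markup: the class sits on <article>, not on a <div>.
SAMPLE = '<div class="wrapper"><article class="leaf_contents_article"><p>Testo</p></article></div>'
print(tags_with_class(SAMPLE, "leaf_contents_article"))
```

Keep in mind that an xpath test like [@class='...'] compares the whole attribute value, so it will not match if the tag has additional classes.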

Postby maerco » 05 Oct 2013, 20:36

I also tested the suggested modification, but I still do not get the full feed:

Code: Select all

"wired": {
    "type": "xpath",
    "xpath": "article[@class='leaf_contents_article']"
}

Postby maximilian » 10 Oct 2013, 23:23

This is what I use for wired.com, and it works:

Code: Select all

"wired.com": {
    "type": "xpath",
    "xpath": "div[@class='entry']",
    "force_charset": "utf-8"
}

Looking at your Italian page, I would suggest using:

Code: Select all

"wired.it": {
    "type": "xpath",
    "xpath": "div[@class='article_content']",
    "force_charset": "utf-8"
}

Give it a try.
Max.

Postby maerco » 12 Oct 2013, 11:15

Many thanks, maximilian.
I finally got my full RSS feed thanks to your suggestion.

Postby fxneumann » 24 Oct 2013, 14:04

Hi,

I'm trying to use af_feedmod for http://www.toothpastefordinner.com/rss/rss.php but can't get it to work. My JSON:

Code: Select all

"toothpastefordinner.com": {
    "type": "xpath",
    "xpath": "html/body/table[2]/tbody/tr[5]/td/div/div[@class='headertext4']/div"
}


What am I doing wrong?

Postby feader » 24 Oct 2013, 19:29

fxneumann wrote:I'm trying to use af_feedmod for http://www.toothpastefordinner.com/rss/rss.php but can't get it to work. My JSON:

A few thoughts:
    • That is a horrible xpath.
    • The comics are horrible too.
    • There are URLs for the feed entries, but I can't see how they are related. IMO af_feedmod has trouble with this feed because it tacitly assumes that each feed entry's URL leads to a separate webpage.
So in conclusion: The xpath probably has to look somewhat like the one you used, maybe

Code: Select all

table[2]/tbody/tr[5]/td/div/div[@class='headertext4']/div

is a bit better. Note that this will only work for the very first entry in the feed, so if the feed is only updated one entry at a time, you fetch it with a higher frequency, and the content provider doesn't change the table layout too often, you should be fine.
If you want a better solution, you probably have to develop a plugin for this feed only.

Postby fxneumann » 27 Oct 2013, 16:48

Thanks for your help, but it didn't work – the site's design is probably just too fucked up to be processed automatically without way too much effort.

Postby mrhanman » 31 Oct 2013, 01:09

I've been messing with this plugin, and I've had some success on a couple of sites. On two sites, I've gotten the comic to come in, but not the blog post. On another, I can't get anything to show up. If anyone can help, I'd really appreciate it. Here are my settings:

Code: Select all

{
"bearmageddon": {
    "type": "xpath",
    "xpath": "div[@id='comic'] | //div[@id='entry']"
},
"brawlinthefamily": {
    "type": "xpath",
    "xpath": "div[@id='comic'] | //div[@id='entry']"
},
"campcomic": {
    "type": "xpath",
    "xpath": "div[@id='comic']"
}
}


The comics are http://bearmageddon.com/feed/, http://www.brawlinthefamily.com/?feed=rss2, and http://campcomic.com/rss

Thanks for the help!

