using af_feedmod

feader
Bear Rating Master
Posts: 160
Joined: 26 Dec 2012, 20:03

Re: using af_feedmod

Postby feader » 17 May 2013, 03:53

While we're on this topic: I have trouble with http://www.grandprix.com/rss.xml. The plugin refuses to recognize my beautifully crafted XPath expression:

Code: Select all

"grandprix.com": {
    "type": "xpath",
    "xpath": "h6[@class='wsw-storydate']/following-sibling::p"
}

Maybe it's time to try to tweak its code a bit …

lotrfan
Bear Rating Disaster
Posts: 73
Joined: 18 Mar 2013, 04:42

Re: using af_feedmod

Postby lotrfan » 17 May 2013, 08:55

Try adding

Code: Select all

                case 'xpath_all':
                    $doc = new DOMDocument();

                    if (version_compare(VERSION_STATIC, '1.7.9', '>=')) {
                        $html = fetch_file_contents($article['link']);
                        $content_type = $fetch_last_content_type;
                    } else {
                        // fallback to file_get_contents()
                        $html = file_get_contents($article['link']);

                        // try to fetch charset from HTTP headers
                        $headers = $http_response_header;
                        $content_type = false;
                        foreach ($headers as $h) {
                            if (substr(strtolower($h), 0, 13) == 'content-type:') {
                                $content_type = substr($h, 14);
                                // don't break here to find LATEST (if redirected) entry
                            }
                        }
                    }

                    if (!isset($config['force_charset'])) {
                        $charset = false;
                        if ($content_type) {
                            preg_match('/charset=(\S+)/', $content_type, $matches);
                            if (isset($matches[1]) && !empty($matches[1])) $charset = $matches[1];
                        }

                        if ($charset) {
                            $html = '<?xml encoding="' . $charset . '">' . $html;
                        }
                    } else {
                        // use forced charset
                        $html = '<?xml encoding="' . $config['force_charset'] . '">' . $html;
                    }

                    @$doc->loadHTML($html);

                    if ($doc) {
                        $xpath = new DOMXPath($doc);
                        $entries = $xpath->query('(//'.$config['xpath'].')');   // find main DIV according to config

                        if ($entries->length > 0) {
                            $article['content'] = '';
                            foreach ($entries as $entry) {
                                if ($entry) {
                                    $article['content'] .= $doc->saveXML($entry);
                                }
                            }
                            $article['plugin_data'] = "feedmod,$owner_uid:" . $article['plugin_data'];
                        }
                    }
                    break;

in the switch block, just before the "default:" line.
Then use

Code: Select all

"grandprix.com": {
    "type": "xpath_all",
    "xpath": "h6[@class='wsw-storydate']/following-sibling::p"
}


The original method only uses the first element "returned" from the XPath expression... I just added a loop.
There have been some reports of PHP returning elements in an unexpected order, but it seems to work for the feed you posted...

The above also uses the change I mentioned earlier (VERSION -> VERSION_STATIC). I'm not sure when VERSION_STATIC came into existence, so if you're running an older instance, you might want to update (or change VERSION_STATIC back to VERSION in the code above).
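
To show the difference between taking only the first node and looping over all of them, here is a minimal standalone sketch. The HTML is a made-up sample (the real grandprix.com markup may differ), and the variable names are only illustrative; the '(//' wrapping is the same one used in the code above.

```php
<?php
// Hypothetical sample page; the real site's markup may differ.
$html = '<html><body><div>'
      . '<h6 class="wsw-storydate">17 May 2013</h6>'
      . '<p>First paragraph.</p><p>Second paragraph.</p>'
      . '</div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Same wrapping as in the plugin code: '(//' . $config['xpath'] . ')'
$expr = "h6[@class='wsw-storydate']/following-sibling::p";
$entries = $xpath->query('(//' . $expr . ')');

// "xpath" behaviour: only the first matched node is kept.
$first_only = $doc->saveXML($entries->item(0));

// "xpath_all" behaviour: every matched node is serialized and appended.
$all = '';
foreach ($entries as $entry) {
    $all .= $doc->saveXML($entry);
}

echo $first_only, "\n";
echo $all, "\n";
```

With this sample, the first variant keeps only the first paragraph, while the loop keeps both.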

thermionic
Bear Rating Trainee
Posts: 42
Joined: 15 May 2013, 13:50

Re: using af_feedmod

Postby thermionic » 17 May 2013, 12:48

lotrfan wrote:Try

Code: Select all

"xpath": "div[@id='comic'] | //div[@id='blog']"


Not sure if it will work, but it's worth a shot...


That worked :-)
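
A possible explanation of why the second operand needs its own "//": the plugin code posted earlier wraps the configured expression as '(//' . $config['xpath'] . ')', so only the first operand of the union gets the "//" prefix. The sketch below uses a made-up page and illustrative variable names to show the expanded query.

```php
<?php
// Hypothetical sample page with a comic block and a blog block.
$html = '<html><body>'
      . '<div id="comic"><img src="strip.png" alt="strip"/></div>'
      . '<div id="sidebar">ignored</div>'
      . '<div id="blog"><p>Author notes</p></div>'
      . '</body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);

// Expands to: (//div[@id='comic'] | //div[@id='blog'])
$with_prefix = $xpath->query("(//" . "div[@id='comic'] | //div[@id='blog']" . ")");

// Expands to: (//div[@id='comic'] | div[@id='blog'])
// The second operand is now relative to the document root, whose only
// element child is <html>, so it matches nothing.
$without_prefix = $xpath->query("(//" . "div[@id='comic'] | div[@id='blog']" . ")");

// Collect the ids matched by the working expression.
$ids = [];
foreach ($with_prefix as $node) {
    $ids[] = $node->getAttribute('id');
}

echo implode(', ', $ids), "\n";
echo $without_prefix->length, "\n";
```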

thermionic
Bear Rating Trainee
Posts: 42
Joined: 15 May 2013, 13:50

Re: using af_feedmod

Postby thermionic » 17 May 2013, 13:02

basic howto posted at viewtopic.php?f=16&t=2045

feader
Bear Rating Master
Posts: 160
Joined: 26 Dec 2012, 20:03

Re: using af_feedmod

Postby feader » 17 May 2013, 17:22

lotrfan, you're incredibly helpful. That's the real community spirit: Making people with quarter knowledge feel good about themselves :mrgreen:
I decided to change the case 'xpath' itself, and put this

Code: Select all

if ($entries->length > 0) {
    $new_cont = '';
    foreach ($entries as $entry) {
        if ($entry) {
            $new_cont .= $doc->saveXML($entry);
        }
    }

    if ($new_cont !== '' ) {
        $article['content'] = $new_cont;
        $article['plugin_data'] = "feedmod,$owner_uid:" . $article['plugin_data'];
    }
}

in the place of

Code: Select all

if ($entries->length > 0) $basenode = $entries->item(0);

if ($basenode) {
    $article['content'] = $doc->saveXML($basenode);
    $article['plugin_data'] = "feedmod,$owner_uid:" . $article['plugin_data'];
}

Question: the JSON decode at the start of the hook: maybe it would be more elegant to do this in the save() function. That could also gain a bit of performance, but OTOH the overhead of a bit of text parsing is probably negligible in software that does heavy database duty and fetches network stuff? Edit: OK, thought about it. The object may not live long enough to put this in save() safely. Oh well.

yelfathi
Bear Rating Trainee
Posts: 20
Joined: 14 May 2013, 19:32

Re: using af_feedmod

Postby yelfathi » 06 Jul 2013, 22:01

Hi, I have an issue with some French characters that do not display correctly. I tried playing with the charset option, but with no luck:

"lequipe": {
    "type": "xpath",
    "xpath": "div[@class='paragr paragraf1']",
    "force_charset": "ISO-8859-1"
}

(I also tried UTF-8.)

On the webpage I should have this:
Elodie Thomis (attaquante des Bleues) : «On est toujours en préparation, on va dire que c'est un mal pour un bien. On a manqué d'efficacité mais on a été plutôt bonnes dans le jeu. Ca ne m'inquiète pas. A un moment ça paiera. On manque aussi de fraîcheur, même si ça n'explique pas tout.»

But in TT-RSS I've got this:
Elodie Thomis (attaquante des Bleues) : «On est toujours en préparation, on va dire que c'est un mal pour un bien. On a manqué d'efficacité mais on a été plutôt bonnes dans le jeu. Ca ne m'inquiète pas. A un moment ça paiera. On manque aussi de fraîcheur, même si ça n'explique pas tout.»

The corresponding feed is: http://www.lequipe.fr/Xml/Football/Titres/actu_rss.xml

Any clue please?

Thanks!

feader
Bear Rating Master
Posts: 160
Joined: 26 Dec 2012, 20:03

Re: using af_feedmod

Postby feader » 07 Jul 2013, 14:14

yelfathi wrote:Hi, I have an issue with some French characters that do not display correctly. I tried playing with the charset option, but with no luck:

"lequipe": {
    "type": "xpath",
    "xpath": "div[@class='paragr paragraf1']",
    "force_charset": "ISO-8859-1"
}

(I also tried UTF-8.)

[…]

Your problem is that your stored content is UTF-8 encoded but is interpreted by ttRSS as ISO-8859-1. You can see that because each non-ASCII character in the text (é, Â etc.) is replaced by two characters (corresponding to two bytes). You can verify it with a good text editor: save your second text as Latin-1, then reopen it as UTF-8.
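
The effect can be reproduced in a few lines of PHP. This is only a sketch, assuming the mbstring extension is available; the sample word comes from the quoted text and the variable names are illustrative.

```php
<?php
// "préparation" as UTF-8 bytes: the "é" is the two bytes 0xC3 0xA9.
$original = "pr\xC3\xA9paration";

// Misreading: treat every byte as one ISO-8859-1 character, then
// re-encode the result as UTF-8 for display. "é" becomes "Ã©".
$garbled = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Reversing the misreading restores the original bytes.
$repaired = mb_convert_encoding($garbled, 'ISO-8859-1', 'UTF-8');

echo $garbled, "\n";
echo mb_strlen($original, 'UTF-8'), ' -> ', mb_strlen($garbled, 'UTF-8'), "\n";
echo $repaired, "\n";
```

The garbled string is one character longer than the original because the single "é" has turned into two characters.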

I'm not sure what you have to do, though; I'd try it with

Code: Select all

"force_charset": "utf-8"

in the JSON options. Note that any changes you make will only affect newly fetched articles because that's the way af_feedmod rolls (you can modify the annotations made by af_feedmod in the database so it reimports the articles [that may be the effect of the option in the preferences, but I really don't know], or delete and resubscribe to the feed).

yelfathi
Bear Rating Trainee
Posts: 20
Joined: 14 May 2013, 19:32

Re: using af_feedmod

Postby yelfathi » 07 Jul 2013, 21:54

Hi feader, as I said in my last post I already tried the utf-8 option but with no success!

