af_gocomics plugin and xpath non-matching?


Postby frameskip » 12 Jul 2013, 18:26

I was looking at the gocomics plugin the other day and noticed that fox pulls the HTML from gocomics.com and uses XPath to scrape out the comic URL. With $entries = $xpath->query('(//img[@src])'); the plugin grabs the image from markup like the following:

<img width="600" src="http://assets.amuniversal.com/b116ca2047f1013010fe001dd8b71c47" onload="Meebo('makeSharable',{element:this, type:'image', shadow:'none', title:'Calvin and Hobbes', url:document.location.href, tweet:'Check out Calvin and Hobbes on GoComics', description:'Check out Calvin and Hobbes on GoComics'})" class="strip" alt="Calvin and Hobbes">


When I looked at the gocomics.com HTML source, I noticed that a link to a larger version of the comic follows immediately after, which I would like to display instead:

<div style="display: none;" id="mutable_972267"><img src="http://assets.amuniversal.com/b1fb3b1047f1013010fe001dd8b71c47?width=900.0" class="strip" alt="B1fb3b1047f1013010fe001dd8b71c47?width=900"></div>


However, after some debugging of the plugin, it seems that the XPath query "//img[@src]" does not find this entry. I was ultimately able to scrape it out using "/div/img", but could someone help me understand why it is not found by an "//img" query, which should match every <img> tag?

Cheers.

Re: af_gocomics plugin and xpath non-matching?

Postby lotrfan » 13 Jul 2013, 04:05

The XPath should find it, but the filtering loop (the "foreach ($entries as $entry) { ...") stops on the first img whose src URL matches "http://assets.amuniversal.com/...".
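To make that concrete, here is a minimal sketch, not the plugin's actual code. The markup is a trimmed copy of the two tags quoted above (the onload attribute is dropped and the second alt text is shortened), and the loop is only a hypothetical stand-in for the plugin's filter. It suggests the query returns both images, hidden div or not, and that the early break is what keeps the smaller one.

```php
<?php
// Trimmed copies of the two <img> tags quoted earlier in the thread.
$html = <<<'EOT'
<html><body>
<img width="600" src="http://assets.amuniversal.com/b116ca2047f1013010fe001dd8b71c47" class="strip" alt="Calvin and Hobbes">
<div style="display: none;" id="mutable_972267"><img src="http://assets.amuniversal.com/b1fb3b1047f1013010fe001dd8b71c47?width=900.0" class="strip" alt="large"></div>
</body></html>
EOT;

$doc = new DOMDocument();
@$doc->loadHTML($html); // silence warnings from non-strict markup
$xpath = new DOMXPath($doc);

// Same query the original post quotes from the plugin.
$entries = $xpath->query('(//img[@src])');
$matched = $entries->length;

// Hypothetical stand-in for the filtering loop: stop at the first matching src.
$picked = null;
foreach ($entries as $entry) {
    $src = $entry->getAttribute('src');
    if (strpos($src, 'http://assets.amuniversal.com/') === 0) {
        $picked = $src;
        break; // the later, larger image is never inspected
    }
}

echo "matched: $matched\n";
echo "picked: $picked\n";
```

XPath knows nothing about CSS, so style="display: none;" does not hide the second node from the query; it is simply later in document order.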

I updated the plugin to try to find the larger image first and fall back to the regular size if necessary: https://github.com/lotrfan/Tiny-Tiny-RSS/commit/dc4dbdf5e209b6966c313a1c7c1e760dc46b1589

Re: af_gocomics plugin and xpath non-matching?

Postby fox » 14 Jul 2013, 07:28

Thanks, I'll merge it. It would probably be faster to do specific xpath queries instead of iterating but eh.
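For what it's worth, a rough sketch of what "specific xpath queries" might look like. This is not the merged code. The helper names xpath_for and find_strip_src are made up for illustration, and the sketch assumes the larger image can be recognised by the "?width=" parameter and the "strip" class seen in the markup quoted above.

```php
<?php
// Hypothetical helper: build a DOMXPath over an HTML string.
function xpath_for(string $html): DOMXPath {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // silence warnings from non-strict markup
    return new DOMXPath($doc);
}

// Hypothetical helper: try the most specific query first, then fall back.
function find_strip_src(DOMXPath $xpath): ?string {
    $queries = [
        // Assumption: the larger version carries a "?width=" parameter.
        '//img[@class="strip" and contains(@src, "?width=")]',
        // Fallback: any strip image served from the assets host.
        '//img[@class="strip" and starts-with(@src, "http://assets.amuniversal.com/")]',
    ];
    foreach ($queries as $query) {
        $nodes = $xpath->query($query);
        if ($nodes !== false && $nodes->length > 0) {
            return $nodes->item(0)->getAttribute('src');
        }
    }
    return null; // nothing that looks like a comic strip
}

$small = '<img width="600" src="http://assets.amuniversal.com/b116ca2047f1013010fe001dd8b71c47" class="strip" alt="Calvin and Hobbes">';
$large = '<div style="display: none;" id="mutable_972267"><img src="http://assets.amuniversal.com/b1fb3b1047f1013010fe001dd8b71c47?width=900.0" class="strip" alt="large"></div>';

$withLarge = find_strip_src(xpath_for("<html><body>$small$large</body></html>"));
$withoutLarge = find_strip_src(xpath_for("<html><body>$small</body></html>"));

echo "with large version:    $withLarge\n";
echo "without large version: $withoutLarge\n";
```

Whether this is actually faster than iterating would need measuring; the main gain is that the preference order lives in the query list rather than in loop logic.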

