af_readability oddity on fresh install

Not a robot
Bear Rating Trainee
Posts: 8
Joined: 04 Oct 2016, 13:53

Postby Not a robot » 21 Apr 2017, 19:09

Hi, not sure if this should be in Support or Themes and plugins, but...

I'm wondering if anyone else is having problems with the built-in plugin af_readability? I have a fresh install (details below), and I wasn't having problems on another server using PHP 5.4, MySQL and whatever version of tt-rss was current on git in December.

The standard format for Arstechnica posts, for example, is a picture followed by text. When af_readability is enabled, one of two things happens in tt-rss. The initial picture appears, then either the post starts from the first <a href> link, missing all of the previous text, or all of the text up to that link appears as a link to code.google.com/<the actual post URL>. If I disable af_readability the post looks fine. I can live without af_readability, I'm just worried that something is broken on my install.

Feed: http://feeds.arstechnica.com/arstechnica/index

On io9 af_readability doesn't seem to be working at all, yet it was on the older install. As someone mentioned elsewhere, Gawker seem to have screwed up the "VIP" RSS feeds, so I switched to "full" last year.

Feed: http://feeds.gawker.com/io9/full

Myfeedsucks looks fine for both feeds.

Using: CentOS 7.3.1611, PostgreSQL 9.2.18, PHP 7.1.4, nginx/1.12.0, Tiny Tiny RSS v17.1 (0eed023) via git.

Is anyone else seeing the same or have I broken something? Could the OPML "with settings" import have caused an issue? I'm happy to do some debugging or try a different plugin.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: af_readability oddity on fresh install

Postby fox » 21 Apr 2017, 19:42

i've just subbed to your ars technica feed, and it looks normal to me, after readability:

https://fakecake.org/uploads/2017/04vd7Geb.png

maybe something else is screwing with things, but i'm not sure what it could be. it would help if you shared the post and linked it here so i could take a look at the html.

readability fails sometimes because it's all about guessing. however, in this case i don't think the library is at fault, it's something else.

>Is anyone else seeing the same or have I broken something? Could the OPML "with settings" import have caused an issue? I'm happy to do some debugging or try a different plugin.

all of this is *extremely* unlikely, especially the opml

Re: af_readability oddity on fresh install

Postby Not a robot » 21 Apr 2017, 21:54

Yeah, I couldn't see it being the OPML import, but something's up.

Example:
https://arstechnica.com/science/2017/04 ... e-warming/

The attached image shows what I get if I hover over the text at the start of the post; the html shows it too (below, as I can't attach it). Sorry, it is plus.google.com, not code.google.com. So it looks like maybe one of the "share this post" type things is breaking it.

Ah ha! Disabling af_zz_imgproxy resolves the Arstechnica issue! No idea why. I'll do some experimenting.

Code:

<div itemprop="articleBody" readability="116.93038316244">

<figure><img src="https://read.xxxxxx.co.uk/tt-rss/public.php?op=pluginhandler&amp;plugin=af_zz_imgproxy&amp;pmethod=imgproxy&amp;url=https%3A%2F%2Fcdn.arstechnica.net%2Fwp-content%2Fuploads%2F2017%2F04%2Fnasa_viz_model_proj-800x450.jpg"><figcaption></figcaption></figure><aside><div>
  <span>Share this story</span>
  <a href="https://www.facebook.com/sharer.php?u=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227" target="_blank" title="Share on Facebook" rel="noopener noreferrer">
  </a><a href="https://twitter.com/share?text=Once+more+with+feeling%3A+Climate+models+don%E2%80%99t+exaggerate+warming&amp;url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227" target="_blank" title="Share on Twitter" rel="noopener noreferrer">
  </a><a href="https://www.reddit.com/submit?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227&amp;title=Once+more+with+feeling%3A+Climate+models+don%E2%80%99t+exaggerate+warming" target="_blank" title="Share on Reddit" rel="noopener noreferrer">
  </a><a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227" target="_blank" title="Share on Google+" rel="noopener noreferrer">
</a></div><a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227" target="_blank" title="Share on Google+" rel="noopener noreferrer">
</a></aside><!-- cache hit 323:single/related:1e0942011d827bff7d205b333b5b9471 --><!-- empty --><p><a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1081227" target="_blank" title="Share on Google+" rel="noopener noreferrer">If you follow climate science news, you know that one of the hotter topics is “climate sensitivity”—the precise amount of warming you get for a given increase of greenhouse gases. A few years ago, a couple papers caused a stir by trying to estimate this sensitivity based on simple equations for the recent past, coming up with a lower warming sensitivity than numerous other studies based on climate models or paleoclimate records. The last IPCC report even </a><a href="https://arstechnica.com/science/2013/12/your-questions-about-the-new-ipcc-climate-change-report-answered/" rel="noopener noreferrer" target="_blank">widened its estimated range</a> slightly to encompass these studies, which proved controversial.</p>

etc...

I'll dig more to work out what's wrong with io9 too, but that's probably unrelated.
Attachments
Screenshot from 2017-04-21 19-21-19.png (539.94 KiB)

Re: af_readability oddity on fresh install

Postby fox » 21 Apr 2017, 22:35

i dunno, i have both plugins enabled and this post displays just fine. proxy plugin works on images only and - like other plugins in tt-rss - uses DOM. therefore, it can't possibly screw up document markup.

i suggest you try a clean browser profile / incognito to rule out a badly coded browser addon or maybe privoxy or something gone bad.
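
For anyone comparing image URLs while debugging, here is a minimal Python sketch of the rewrite shape visible in the pasted HTML above. It is not the plugin's actual code (tt-rss is PHP); it only mirrors the query parameters seen in the `<img src>` values, and the helper names `proxied_url` and `original_url` are made up for illustration. In the pasted HTML the `&` separators appear as `&amp;` because they are HTML-escaped.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def proxied_url(public_php, image_url):
    """Build a proxy URL shaped like the <img src> values in the pasted HTML.

    Assumption: the parameter names are copied from the thread, not from
    the plugin source.
    """
    query = urlencode({
        "op": "pluginhandler",
        "plugin": "af_zz_imgproxy",
        "pmethod": "imgproxy",
        "url": image_url,  # percent-encoded by urlencode
    })
    return f"{public_php}?{query}"


def original_url(proxied):
    """Recover the original image URL from a proxied one."""
    params = parse_qs(urlsplit(proxied).query)
    return params["url"][0]


if __name__ == "__main__":
    # read.example.org is a placeholder host, not the poster's server.
    src = proxied_url(
        "https://read.example.org/tt-rss/public.php",
        "https://cdn.arstechnica.net/wp-content/uploads/2017/04/DoW3-3-800x450.jpg",
    )
    print(src)
    print(original_url(src))
```

Only the `src` attribute of an image changes in a rewrite like this, which is consistent with the point that an image-only rewrite should not add or remove `<a>` elements.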

Re: af_readability oddity on fresh install

Postby Not a robot » 21 Apr 2017, 23:40

I was getting the same Arstechnica issue on Firefox & Chrome on multiple OS / PCs, so couldn't see a connection. You're spot on though, disabling the EFF's "HTTPS Everywhere" *seems* to have solved the Arstechnica issue. Quite what it is doing, I don't know, as it isn't showing anything as blocked. The same config was ok with the old server too.

Thanks for the push in the right direction!

Re: af_readability oddity on fresh install

Postby Not a robot » 22 Apr 2017, 18:00

Unfortunately, it isn't (just?) HTTPS Everywhere causing this. It seemed that various addins caused the "Share on Google+" linking at the start of some stories, so I tried a virgin addin-less Chromium Guest session and saw the issue on the two pages below.

If anyone is bored, what's different about these 2 links from most of the other Arstechnica posts?
https://arstechnica.com/information-tec ... -noticing/
https://arstechnica.com/gaming/2017/04/ ... -struggle/

Not a big problem, I'm just curious at this point. Disabling either af_readability or af_zz_imgproxy resolves the issue.

Another html example for the bored...

Code:

<div itemprop="articleBody" readability="191.84944550241">
         
<figure><img src="https://read.xxxxxx.co.uk/tt-rss/public.php?op=pluginhandler&amp;plugin=af_zz_imgproxy&amp;pmethod=imgproxy&amp;url=https%3A%2F%2Fcdn.arstechnica.net%2Fwp-content%2Fuploads%2F2017%2F04%2FDoW3-3-800x450.jpg"><figcaption readability="0.92222222222222"><div readability="29.511111111111"><a href="https://cdn.arstechnica.net/wp-content/uploads/2017/04/DoW3-3.jpg" data-height="1080" data-width="1920" rel="noopener noreferrer" target="_blank">Enlarge</a> <span>/</span> <em>Dawn of War 3</em>'s multiplayer takes a lot of ideas from MOBAs like <em>LoL</em> and <em>Dota 2</em>.</div></figcaption></figure><aside><div>
  <span>Share this story</span>
  <a href="https://www.facebook.com/sharer.php?u=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583" target="_blank" title="Share on Facebook" rel="noopener noreferrer">
  </a><a href="https://twitter.com/share?text=%3Cem%3EWarhammer+40K%3A+Dawn+of+War+3%3C%2Fem%3E+review%3A+Twilight+struggle&amp;url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583" target="_blank" title="Share on Twitter" rel="noopener noreferrer">
  </a><a href="https://www.reddit.com/submit?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583&amp;title=%3Cem%3EWarhammer+40K%3A+Dawn+of+War+3%3C%2Fem%3E+review%3A+Twilight+struggle" target="_blank" title="Share on Reddit" rel="noopener noreferrer">
  </a><a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583" target="_blank" title="Share on Google+" rel="noopener noreferrer">
</a></div><a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583" target="_blank" title="Share on Google+" rel="noopener noreferrer">
</a></aside><!-- cache miss 257:single/related:31bd3308024b2e24dacb2e811e8edfef --><!-- empty -->


And then the third "Share on Google+" which never receives a closing </a>

Code:

<a href="https://plus.google.com/share?url=https%3A%2F%2Farstechnica.com%2F%3Fpost_type%3Dpost%26p%3D1080583" target="_blank" title="Share on Google+" rel="noopener noreferrer">


Then the rest of the page:

Code:

<em>Dawn of War 3</em> is a pretty obvious callback to the original <em>Warhammer 40,000: Dawn of War</em>, the game that first interpreted the grim darkness of the far future through real-time strategy. It features the requisite base and unit building and resource collecting you’d expect from a real-time strategy game, but it's still not quite the pure, old-school expression of that formula you might expect.
<p><em>Dawn of War 2</em>’s extremely ambitious (though somewhat flawed) campaign was sort of like a multi-character <em>Diablo</em> with loot and branching missions. That plan is gone altogether in the latest iteration. In its place is the linear story of Acheron, the Wandering Planet, and the three factions <em>Warhammer 40K </em>faithful should probably expect at this point.</p>
etc, etc
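
The unclosed "Share on Google+" anchor described above can be checked for mechanically. The following is a rough Python sketch using only the standard library's `html.parser`. It is not part of tt-rss or either plugin, the class and function names are hypothetical, and the sample fragment is a shortened stand-in for the pasted HTML rather than the real page. It lists every `<a>` that is opened and never closed.

```python
from html.parser import HTMLParser


class AnchorBalanceChecker(HTMLParser):
    """Track <a> tags that are opened but never closed."""

    def __init__(self):
        super().__init__()
        # Stack of (href, (line, column)) for anchors still open.
        self.open_anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href", "")
            self.open_anchors.append((href, self.getpos()))

    def handle_endtag(self, tag):
        # A closing </a> matches the most recently opened anchor.
        if tag == "a" and self.open_anchors:
            self.open_anchors.pop()


def find_unclosed_anchors(fragment):
    """Return the anchors left open at the end of the fragment."""
    checker = AnchorBalanceChecker()
    checker.feed(fragment)
    checker.close()
    return checker.open_anchors


# Shortened stand-in for the pasted HTML: one closed share link inside
# <aside>, then the same link opened again and never closed.
sample = (
    '<aside><a href="https://plus.google.com/share?url=x" title="Share on Google+">\n'
    '</a></aside>'
    '<a href="https://plus.google.com/share?url=x" title="Share on Google+">\n'
    '<em>Dawn of War 3</em> is a pretty obvious callback'
)

if __name__ == "__main__":
    for href, (line, col) in find_unclosed_anchors(sample):
        print(f"unclosed <a> at line {line}, column {col}: {href}")
```

Running this over the article HTML saved with each plugin enabled in turn would show at which stage the stray anchor first appears. The sketch only reports the imbalance and says nothing about what causes it.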

