PHP execution hangs (?)

Posted: 22 Mar 2012, 17:56
by flosch
Since this morning, I've had problems with my tt-rss installation. Opening the main page (https://this.domain/tt-rss/), the loading bar hangs for 30 seconds at 75%. This seems to coincide with error messages in my apache error.log:

PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php


After those 30 seconds, tt-rss loads, but I am greeted by an empty article pane with the message "Could not update headlines (invalid object received - see error console for details)". The firefox error console doesn't seem to contain any useful information. No errors, at least, just some warnings about things like deprecated functions.

The apache error messages have a tendency to occur every 30 seconds from then on until I close the tab, but sometimes it stops at 1 error message. Article browsing works reasonably well afterwards, regardless of whether PHP spews error messages.

The error line noted above is by far the most common one (I guess that line in MakeWellFormed.php is computationally expensive?), but every now and then the script gets interrupted in other places. If you want, I can post a full error log; here's an excerpt:
[Thu Mar 22 11:54:00 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:54:30 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:55:00 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:55:37 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:55:38 2012] [notice] child pid 1787 exit signal Segmentation fault (11)
[Thu Mar 22 11:56:07 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:56:08 2012] [notice] child pid 3503 exit signal Segmentation fault (11)
[Thu Mar 22 11:56:37 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:57:07 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:57:37 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:58:12 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:58:42 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/ChildDef/Table.php on line 129, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:59:12 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 511, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 11:59:42 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 12:00:15 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 12:00:16 2012] [notice] child pid 3966 exit signal Segmentation fault (11)
[Thu Mar 22 12:00:49 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 12:01:19 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 12:01:49 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php
[Thu Mar 22 12:02:20 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 511, referer: DOMAIN/tt-rss/index.php


The segfaults worry me a bit, too. I already tried to increase the execution time to 60 seconds, which didn't solve the problem; the initial page load just hangs for 60 seconds at 75% instead of 30 seconds.
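
For what it's worth, the excerpt above can be tallied by file and line with a small script. This is only a rough Python sketch: the regex is written against the exact wording of the log lines shown here, and the function names are made up for illustration.

```python
import re
from collections import Counter

# Matches the PHP timeout lines in the wording used by the excerpt above.
TIMEOUT_RE = re.compile(
    r"PHP Fatal error: Maximum execution time of \d+ seconds exceeded "
    r"in (?P<path>\S+) on line (?P<line>\d+)"
)


def tally_timeouts(log_lines):
    """Count timeout errors per (file name, line number)."""
    counts = Counter()
    for entry in log_lines:
        match = TIMEOUT_RE.search(entry)
        if match:
            # Keep only the file name; the directory prefix is shared.
            name = match.group("path").rsplit("/", 1)[-1]
            counts[(name, int(match.group("line")))] += 1
    return counts


def count_segfaults(log_lines):
    """Count Apache child processes that exited with a segmentation fault."""
    return sum("exit signal Segmentation fault" in entry for entry in log_lines)


# Three lines copied from the excerpt, just to show the call pattern.
sample = [
    "[Thu Mar 22 11:55:37 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/Strategy/MakeWellFormed.php on line 493, referer: DOMAIN/tt-rss/index.php",
    "[Thu Mar 22 11:55:38 2012] [notice] child pid 1787 exit signal Segmentation fault (11)",
    "[Thu Mar 22 11:58:42 2012] [error] [client IP] PHP Fatal error: Maximum execution time of 30 seconds exceeded in /var/www/tt-rss/lib/htmlpurifier/library/HTMLPurifier/ChildDef/Table.php on line 129, referer: DOMAIN/tt-rss/index.php",
]

for (name, line), hits in tally_timeouts(sample).most_common():
    print(f"{hits:3d}  {name}:{line}")
print("segfaults:", count_segfaults(sample))
```

Feeding it the whole error.log instead of the sample would show whether the timeouts really cluster in MakeWellFormed.php or are spread across HTMLPurifier.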

I then went and replaced the database with last night's backup. It works again now. So I assume some post in one of my feeds must contain something that makes tt-rss hang and/or even crash. Any idea how to find out which one that is, easily? If possible, I'd rather not manually bisect my feed list (restore backup, delete half of my subscriptions, see whether those break, over and over again).

Re: PHP execution hangs (?)

Posted: 23 Mar 2012, 21:05
by fox
It seems that some HTML code in the article makes HTMLPurifier go crazy. It's probably something in your Fresh feed, otherwise it wouldn't be loaded on startup.

You can open tt-rss with the developer console open (F12 in Chrome) and see the last query that times out.

Re: PHP execution hangs (?)

Posted: 24 Mar 2012, 12:35
by flosch
That was it. I experimented a bit more and could narrow it down to a specific feed relatively quickly. It was a news feed that contained several posts with a LOT of links. After I realized that, I decided to just try my luck, set max_execution_time to 3600 and let it run. It took about 20 minutes or so, then it was done. I marked those posts as read, and now tt-rss is as fast as ever (I just shouldn't "show all" while these posts are still in the DB).

Sorry for the false alarm.

Though, many links or not, I'm still surprised those few posts made the difference between a couple of seconds of loading time, and 20 minutes.
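
For anyone hitting the same thing: a rough way to spot link-heavy posts without bisecting is to count anchor tags per article. Below is a minimal Python sketch, assuming you have already pulled (title, HTML) pairs out of the database yourself. The names `LinkCounter`, `count_links` and `rank_by_links` are made up for illustration and are not part of tt-rss.

```python
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts opening <a> tags in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this.
        if tag == "a":
            self.links += 1


def count_links(html_text):
    """Return the number of <a> tags found in html_text."""
    parser = LinkCounter()
    parser.feed(html_text)
    parser.close()
    return parser.links


def rank_by_links(articles):
    """Take (title, html) pairs; return (link count, title), heaviest first."""
    return sorted(
        ((count_links(body), title) for title, body in articles),
        reverse=True,
    )
```

Whatever ends up at the top of that list is a good first candidate to mark as read or delete.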

edit: Do you know whether it is normal that Apache children just segfault under these conditions?

Re: PHP execution hangs (?)

Posted: 21 Oct 2012, 17:13
by paddlaren
I have some feeds that make heavy use of links. The only thing that seems to solve my problems is to disable htmlpurifier. I tried a longer execution time, without success.

This is copied from my lighttpd error log:
2012-10-21 13:49:25: (mod_fastcgi.c.2543) unexpected end-of-file (perhaps the fastcgi process died): pid: 0 socket: unix:/run/php-fpm/php-fpm.sock
2012-10-21 13:49:25: (mod_fastcgi.c.3329) response not received, request sent: 1386 on socket: unix:/run/php-fpm/php-fpm.sock for /backend.php?, closing connection


The question I now have is whether it is possible to run htmlpurifier offline or as a background task to allow longer execution times. Can it be part of update.php?
When is the actual feed read? Is the feed stored in the database, or only the item IDs? If the feed is read and buffered in the database until purging, I suppose it would be possible to do the purification as a separate task or as part of the update task.

I am currently running updates every 4 minutes instead of every 60 seconds.

BR
Erik

Re: PHP execution hangs (?)

Posted: 21 Oct 2012, 18:06
by fox
I can of course use purifier while updating, the only difference would be that it would hang the daemon or your cronjob, which is not exactly better.

The correct solution would be for htmlpurifier to fix the hanging or for me to replace it with something else, but I'm not aware of any alternatives and I don't think anyone ever submitted the bug to them.

Re: PHP execution hangs (?)

Posted: 21 Oct 2012, 20:03
by paddlaren
What are the risks I take by not using it?

Re: PHP execution hangs (?)

Posted: 21 Oct 2012, 20:13
by fox
XSS attacks, this sort of thing. If you don't read questionable feeds, you'll be fine.

Re: PHP execution hangs (?)

Posted: 28 Oct 2012, 12:42
by fox
As of current trunk, I have replaced htmlpurifier with htmLawed, http://www.bioinformatics.org/phplabwar ... /htmLawed/

Also, sanitizing of content now happens while updating which should speed up things a bit.
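
The general idea of sanitizing at update time rather than on every page load can be sketched like this. It is an illustrative Python sketch only, not the actual tt-rss code: `ArticleStore` is a hypothetical class, and `html.escape` is a crude stand-in for a real sanitizer such as htmLawed.

```python
import html


class ArticleStore:
    """Keeps already-sanitized article HTML, keyed by article id."""

    def __init__(self, sanitizer):
        self._sanitizer = sanitizer
        self._clean = {}
        self.sanitize_calls = 0  # only here to show how often the work happens

    def update(self, entries):
        """Sanitize each new (article_id, raw_html) pair once and store it."""
        for article_id, raw_html in entries:
            if article_id in self._clean:
                continue  # already sanitized during an earlier update run
            self._clean[article_id] = self._sanitizer(raw_html)
            self.sanitize_calls += 1

    def render(self, article_id):
        """Return the stored result; no sanitizing happens at display time."""
        return self._clean[article_id]


store = ArticleStore(html.escape)
store.update([(1, "<b>hello</b>")])
store.update([(1, "<b>hello</b>")])  # second update run skips the known entry
for _ in range(3):
    store.render(1)
print(store.sanitize_calls)
```

The expensive step runs once per article in the updater, so a slow article can delay an update run but no longer blocks the page load.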

Re: PHP execution hangs (?)

Posted: 28 Oct 2012, 17:44
by paddlaren
Miss the like button in the forum ;)

Looking forward to the next release.