Fix proposal for update daemon memory leaks

landure
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Fix proposal for update daemon memory leaks

Postby landure » 23 Jan 2008, 01:10

My proposal is to split the update daemon's run across processes.

The memory leaks come from the RSS update code. Since this code runs inside a PHP loop, the PHP script never exits until it hits the memory limit.

My fix is to run the RSS update code in a new PHP process on each loop iteration:

Put the RSS update code in an update_daemon_loop.php file, then edit update_daemon.php to replace the loop contents with:

Code: Select all

passthru('/usr/bin/php update_daemon_loop.php');


Since the PHP process where the memory leak occurs is renewed on each loop iteration, the leak no longer has any impact.
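
For illustration, a minimal sketch of what the resulting update_daemon.php outer loop could look like (the 300-second sleep interval is an assumption, not taken from the archive):

Code: Select all

#!/usr/bin/php
<?php
   // Outer loop: run each update pass in a fresh PHP process, so any
   // memory leaked during the pass dies with that child process.
   while (true) {
      passthru('/usr/bin/php update_daemon_loop.php');
      sleep(300); // assumed interval between update passes
   }
?>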

This update_daemon.php does not seem to terminate unexpectedly.

Here is the source for the fix:

http://www.landure.org/uploads/ttrss-daemon.tar.gz

archive contents:
    update_daemon.php : the new update_daemon.php file
    update_daemon_loop.php : the RSS updating code that update_daemon.php loops over

    updatedaemon/tt-rss : an init.d script for update_daemon.php
    updatedaemon/default : the /etc/default/tt-rss configuration file for /etc/init.d/tt-rss
    updatedaemon/install.sh : an install script for the init.d script



I know that it is quite a dirty fix, but it works, and I find it nicer than a cron job that restarts the update_daemon.php script each time it fails.

What do you think of this solution?

Pierre-Yves Landuré
http://howto.landure.fr/

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Fix proposal for update daemon memory leaks

Postby fox » 23 Jan 2008, 01:47

That's an interesting idea. It would be even better if we could run several such loops in parallel (locking issues aside for now). I think it should be possible with some fork() magic.

landure
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

looping

Postby landure » 23 Jan 2008, 08:11

What is the goal of forking? This process does not seem to be CPU or memory intensive, and it runs every 5 minutes with a cap on the number of updated feeds.

Forking could be used to have a separate update process for each Tiny Tiny RSS user, but it is better to have a single global one that smartly updates a feed's contents once for all the users subscribed to that feed.

I don't know how the update daemon really works, but I don't think forking is a good idea.
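
A hedged sketch of that deduplication idea, downloading each distinct feed once and fanning the result out to every subscriber (the table and column names, the connection string, and the fetch_feed/store_articles helpers are illustrative assumptions, not tt-rss's actual schema or API):

Code: Select all

<?php
   // Fetch each distinct feed URL once, then attach the new articles
   // to every user subscribed to it. Schema names and helpers assumed.
   $db = new PDO('pgsql:dbname=ttrss', 'user', 'pass');

   foreach ($db->query("SELECT DISTINCT feed_url FROM feeds") as $feed) {
      $articles = fetch_feed($feed['feed_url']); // hypothetical: one download per URL

      $subs = $db->prepare("SELECT owner_uid FROM feeds WHERE feed_url = ?");
      $subs->execute(array($feed['feed_url']));

      foreach ($subs as $sub) {
         store_articles($articles, $sub['owner_uid']); // hypothetical per-user fan-out
      }
   }
?>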

Pierre-Yves Landuré
http://howto.landure.fr/

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: looping

Postby fox » 23 Jan 2008, 09:39

landure wrote: What is the goal of forking? This process does not seem to be CPU or memory intensive, and it runs every 5 minutes with a cap on the number of updated feeds.


I think that having just one synchronous update cycle could let update latency grow too much with a lot of users/feeds. Running several update processes would be a (relatively) easy way to combat the problem.

Forking could be used to have a separate update process for each Tiny Tiny RSS user, but it is better to have a single global one that smartly updates a feed's contents once for all the users subscribed to that feed.


Yeah, reworking the processing for users subscribed to one feed is also something I have in mind. It won't help when there are few or no duplicates, though.

landure
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 12:21

Forking for many users/feeds is a solution that would only work on a multi-CPU or multi-core system.

On a single-core/single-CPU machine, this would have no effect at all. In my opinion, if you have many users, you need to tweak the update_daemon settings.

Pierre-Yves Landuré
http://howto.landure.fr/

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Postby fox » 23 Jan 2008, 13:01

Forking for many users/feeds is a solution that would only work on a multi-CPU or multi-core system.


Yep. It would allow tt-rss to scale better on SMP machines. That's the plan.

On a single-core/single-CPU machine, this would have no effect at all.


Actually, I don't think so. The daemon wastes quite a lot of time fetching stuff from the internet synchronously. Having two tasks doing that in parallel would speed things up quite a bit, even on a single-core machine.

In my opinion, if you have many users, you need to tweak the update_daemon settings.


The latency would go up anyway. That's why I'm very careful about increasing user quota on online.tt-rss.org.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Postby fox » 23 Jan 2008, 14:13

OK, this is more or less the model I have in mind:

Code: Select all

#!/usr/bin/php
<?php
   declare(ticks = 1);

   define('MAX_JOBS', 5);
   define('CLIENT_PROCESS', './client.php');
   define('SPAWN_INTERVAL', 10);

   $running_jobs = 0;
   $last_checkpoint = -1;

   // SIGCHLD handler: decrement the running-jobs counter and reap the
   // exited child without blocking (WNOHANG).
   function sigchld_handler($signal) {
      global $running_jobs;
      $running_jobs--;
      print posix_getpid() . ": SIGCHLD received, jobs left: $running_jobs\n";
      pcntl_waitpid(-1, $status, WNOHANG);
   }

   pcntl_signal(SIGCHLD, 'sigchld_handler');

   while (1) {

      print "running jobs: $running_jobs\n";

      if ($last_checkpoint + SPAWN_INTERVAL < time()) {

         for ($j = $running_jobs; $j < MAX_JOBS; $j++) {
            print "spawning client $j...";
            $pid = pcntl_fork();
            if ($pid == -1) {
               die("fork failed!\n");
            } else if ($pid) {
               // parent
               $running_jobs++;
               print "OK [$running_jobs]\n";
            } else {
               // child
               pcntl_signal(SIGCHLD, SIG_IGN);
               passthru(CLIENT_PROCESS);
               exit(0);
            }
         }

         $last_checkpoint = time();
      }

      sleep(1);
   }
?>


client.php is a dummy script which sleep()s for a random amount of time (1..20 seconds) to simulate an update job.
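
For reference, a minimal sketch of such a dummy client, matching the description above (the actual file is not in the archive, so this is an assumption):

Code: Select all

#!/usr/bin/php
<?php
   // Dummy update job: sleep for a random 1..20 seconds, then exit.
   $duration = rand(1, 20);
   print posix_getpid() . ": simulating an update job for {$duration}s\n";
   sleep($duration);
?>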

landure
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 15:07

Hmm, well, this is interesting. Actually, you no longer need a second PHP file in this case. update_daemon.php can be self-contained.

The last_checkpoint variable can be replaced by a sleep(SPAWN_INTERVAL).

I've also changed the loop to use a saved copy of the running_jobs value. Otherwise it keeps on forking endlessly.

So here is the result of my changes. What do you think of it?

Code: Select all

#!/usr/bin/php
<?php
   declare(ticks = 1);

   define('MAX_JOBS', 5);
   define('SPAWN_INTERVAL', 10);
   define('DAEMON_SLEEP_INTERVAL', 120);

   $running_jobs = 0;

   function sigchld_handler($signal) {
      global $running_jobs;
      $running_jobs--;
      print posix_getpid() . ": SIGCHLD received, jobs left: $running_jobs\n";
      pcntl_waitpid(-1, $status, WNOHANG);
   }

   pcntl_signal(SIGCHLD, 'sigchld_handler');

   while (1) {

     print "running jobs: $running_jobs\n";

     $currently_running_jobs = $running_jobs;

     for ($j = 0; $j < MAX_JOBS - $currently_running_jobs; $j++) {
        print "spawning client $j...";
        $pid = pcntl_fork();
        if ($pid == -1) {
           die("fork failed!\n");
        } else if ($pid) {
           // parent
           $running_jobs++;
           print "OK [$running_jobs]\n";
        } else {
           // child
           pcntl_signal(SIGCHLD, SIG_IGN);

           sleep(rand(5, 60)); // simulate a variable-length update job

           // Replace this with the actual RSS update code.

           exit(0);
        }

        sleep(SPAWN_INTERVAL);
     }

     print sprintf("Spawn loop complete, sleeping for %ds\n", DAEMON_SLEEP_INTERVAL);
     for ($j = 0; $j < DAEMON_SLEEP_INTERVAL; $j++) {
        # Sleep in one-second slices: a single long sleep() would be cut
        # short by any incoming SIGCHLD signal.
        sleep(1);
     }
   }
?>


edit: fixed the problem of sleep() being interrupted by the SIGCHLD signal. It may not be the best way to fix it (looping on sleep(1)), but it works.
Last edited by landure on 23 Jan 2008, 15:46, edited 1 time in total.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Postby fox » 23 Jan 2008, 15:44

Here is what actually went into trunk:

http://tt-rss.org/trac/browser/update_daemon2.php
http://tt-rss.org/trac/browser/update_d ... client.php

Feel free to experiment. It requires a schema update since .19, though. It has been running for half an hour and seems to work okay.

update_daemon.php can be self-contained.

Pretty much so. Since PHP can do the pcntl functions anyway, it's easier for me to keep stuff in one language when possible, though.

The last_checkpoint variable can be replaced by a sleep(SPAWN_INTERVAL).

Yeah, that was the first thing I tried. It doesn't seem to work: SIGCHLD interrupts the sleep() and it gets to fork more than necessary.

I've also changed the loop to use a saved copy of the running_jobs value. Otherwise it keeps on forking endlessly.

As far as I understand, I did a similar correction in trunk.

Edit: NB, the client is still separate because of the memory leak issues discussed above. It seems to segfault when exiting, which puzzles me.

landure
Bear Rating Trainee
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 15:48

Here is a fix proposal for the sleep interruption problem.

Code: Select all

 
for ($j = 0; $j < DAEMON_SLEEP_INTERVAL; $j++) {
   # Fix for sleep() being interrupted by the SIGCHLD signal.
   sleep(1);
}



(crossposting)

landure
Bear Rating Trainee
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 15:56

OK, I can see one quick fix for a start:

Code: Select all

define('CLIENT_PROCESS', '/usr/bin/php update_daemon2_client.php SRV_RUN_OK');


I do not like the idea of a PHP script with a +x flag.

We should run the script by calling it with the php binary.

I will try the new code tonight.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Postby fox » 23 Jan 2008, 16:01

I do not like the idea of a PHP script with a +x flag.


Why not? It's not like it's going to be executed by the web server.

landure
Bear Rating Trainee
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 18:00

Because I'm a paranoid freak :D

For example, my tt-rss install is on a filesystem mounted with the noexec flag, so an executable PHP script will never work there.

In a very, very paranoid setup, only the bin directories are executable.

So, if it is possible, do not use the +x flag. :D

Pierre-Yves
http://howto.landure.fr/

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Postby fox » 23 Jan 2008, 20:28

In my book, noexec is not worth the trouble. It used to be stupidly easy to work around by calling ld-linux.so directly; no idea if that was ever fixed. :)

Either way works for me, though, so let it be your way. I'll fix this in trunk.

landure
Bear Rating Trainee
Bear Rating Trainee
Posts: 25
Joined: 15 Jan 2008, 12:09

Postby landure » 23 Jan 2008, 21:58

OK, I've had time to take a look at update_daemon2.php.

And here is the result:

http://www.landure.org/uploads/update_daemon3.php.txt

What have I done?

- merged update_daemon2.php and update_daemon2_client.php into a single file
- removed the stamp file code, which is of no use now
- removed the $last_purge code, which is also of no use now (unless it is done by the main daemon)
- added a cache so that last_update_started is updated BEFORE we actually start to download feeds: it avoids having two forked processes update the same feed (I think) - see the sketch after this list
- added a database test before starting to fork, to avoid forking when there are database problems
- added a sleep(1) before exit(0) in the fork so the child does not exit too quickly (if it exits too quickly, the exit is not taken into account by the main process)
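
A hedged sketch of that "claim before fetch" idea (the table and column names, the connection string, and the update_feed helper below are illustrative assumptions, not the actual code of update_daemon3.php):

Code: Select all

<?php
   // Claim a batch of feeds by setting last_update_started BEFORE the
   // slow network fetches, so two forked children do not pick the same
   // feeds. Schema names and update_feed() are assumed.
   $db = new PDO('pgsql:dbname=ttrss', 'user', 'pass');

   $ids = $db->query("SELECT id FROM feeds ORDER BY last_update_started LIMIT 10")
             ->fetchAll(PDO::FETCH_COLUMN);

   if (count($ids) > 0) {
      // Claim the whole batch first...
      $in = implode(",", array_map('intval', $ids));
      $db->exec("UPDATE feeds SET last_update_started = NOW() WHERE id IN ($in)");

      // ...then do the slow downloads afterwards.
      foreach ($ids as $id) {
         update_feed($db, $id); // hypothetical: fetch and store one feed
      }
   }
?>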

I'm running it now, and it seems to be OK.

Pierre-Yves
http://howto.landure.fr/

