Update daemon scalability ?

ldidry
Bear Rating Trainee
Posts: 27
Joined: 11 Jun 2013, 20:43

Postby ldidry » 11 Jun 2013, 21:44

Hello,

I'm the manager of framanews.org, a new public instance of Tiny Tiny RSS, and we've got a scalability problem.
Here's the hardware and software situation:
  • Ubuntu 12.04.2 LTS
  • Tiny Tiny RSS 1.7.9
  • MySQL 5.5.29
  • 24 GB RAM
  • CPU: i7, 4 x 2.80 GHz

As you can see, we don't have shitty hardware :D

We have 300 users now, with 23,131 feeds (16,994 distinct).

The problem is the update daemon. I can't get all the feeds updated within the last half hour (the default setting).
My update daemon uses 7 threads, with an interval of 1.

Is there some magic trick I missed? Would switching to PostgreSQL help? (I have already tuned MySQL a lot; it was my first bottleneck, due to disk access.)
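
As a rough sanity check on these numbers, the sketch below works out how much time each thread can spend per feed. It assumes every distinct feed has to be refreshed once per half-hour window and that the work splits evenly across threads; both are simplifications for illustration, not something tt-rss guarantees.

```shell
#!/bin/sh
# Back-of-the-envelope update budget, using the figures from this post.
feeds=16994     # distinct feeds
window=1800     # update window in seconds (half an hour)
threads=7       # update daemon threads

# Milliseconds available per feed, per thread, if every feed must be
# refreshed once per window (integer division, rounded down).
budget_ms=$(( window * 1000 * threads / feeds ))
echo "per-feed budget: ${budget_ms} ms"   # prints "per-feed budget: 741 ms"
```

That is roughly three quarters of a second per feed for the fetch, the parsing and the database writes together, which leaves little room for slow remote servers.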

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Update daemon scalability ?

Postby fox » 11 Jun 2013, 22:06

Your feed update speed could be limited by network throughput and mysql being shit.

Also, take note that the daemon will not handle arbitrarily large workloads. This is by design. The ideal use case for tt-rss is, as far as I'm concerned, small to medium groups or family/single-user instances, potentially united by a mesh of syndicated data.

If someone wants to make another feedly or whatever off my work, they will have to invest heavily in backend scalability; trunk is not going to give it to them on a silver platter. I don't like cloud services where people are the product and I am not going to waste my personal time helping people create more of them.

HTH.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Update daemon scalability ?

Postby fox » 11 Jun 2013, 22:30

I just want to add that obviously this has nothing to do with you personally or your service, it's just the reasoning behind me not bothering with tt-rss scalability beyond a specific limit.

ldidry
Bear Rating Trainee
Posts: 27
Joined: 11 Jun 2013, 20:43

Re: Update daemon scalability ?

Postby ldidry » 11 Jun 2013, 22:55

I don't think the network is the problem, since it's a dedicated server (100 Mb/s up and down). MySQL is shit, yeah, that's what I thought.

We are not a company; we are an association that campaigns for an open internet and for free software (free as in free speech). We offer services such as etherpad-lite, ethercalc, opensondage and others so that people can discover them and install them themselves.

Don't worry: if I were you, writing software for myself in the first place, I don't think I would work much beyond the limits of my own use of it either.

Anyway, thanks for tt-rss, it's awesome!

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Update daemon scalability ?

Postby fox » 11 Jun 2013, 23:17

You can try enabling additional debugging by defining DAEMON_EXTENDED_DEBUG (warning: this will break feed updates initiated from the frontend, e.g. by double-clicking on a feed). In my experience feed parsing was mostly limited by the network and by simplepie being terribly slow shit; simplepie is not used anymore, so it should now be network- and DB-bound. If you are network-bound (which is a possibility: you might be on a great pipe, but feedburner will rate-limit you anyway) you can try running more tasks in parallel.

Personally I would switch to postgres because it has always performed faster for me. Also, you can gain some performance by mandating article purging (FORCE_ARTICLE_PURGE) so that your database doesn't get too big, which also affects feed importing.

You can tweak caching to be more aggressive, but this would require actual code changes.

sleeper_service
Bear Rating Overlord
Posts: 884
Joined: 30 Mar 2013, 23:50
Location: Dallas, Texas

Re: Update daemon scalability ?

Postby sleeper_service » 12 Jun 2013, 00:29

ldidry wrote: The problem is the update daemon. I can't get all the feeds updated within the last half hour (the default setting).
My update daemon uses 7 threads, with an interval of 1.


why only 7 threads? they're not very cpu intensive, given how much network latency you've got waiting for the other end to spit back data.

ldidry
Bear Rating Trainee
Posts: 27
Joined: 11 Jun 2013, 20:43

Re: Update daemon scalability ?

Postby ldidry » 12 Jun 2013, 02:28

update.php is not CPU-intensive, indeed, but MySQL is. That's why I don't use 8 threads: I would not be able to do anything else on the server.

sleeper_service
Bear Rating Overlord
Posts: 884
Joined: 30 Mar 2013, 23:50
Location: Dallas, Texas

Re: Update daemon scalability ?

Postby sleeper_service » 12 Jun 2013, 05:27

ldidry wrote: update.php is not CPU-intensive, indeed, but MySQL is. That's why I don't use 8 threads: I would not be able to do anything else on the server.

huh, I just forked off 20 threads on mine, and didn't see server util hit over 25%, and my system's slower than yours.

though, I did recently convert over to postgres... i never tried that when running mysql. might be worth looking into a conversion.

ldidry
Bear Rating Trainee
Posts: 27
Joined: 11 Jun 2013, 20:43

Re: Update daemon scalability ?

Postby ldidry » 12 Jun 2013, 11:40

And the bottleneck is… the cache! I put it on tmpfs and wow! Hit the lightspeed, Chewie!
Here's the relevant line from my /etc/fstab:

Code:

none            /var/www/ttrss/cache/ tmpfs defaults,uid=1000,gid=1000,mode=750,size=4G 0 0


(Yeah, 4G is a lot, but we have a lot of users and feeds)

Here's a post on how not to lose the files when rebooting: http://blog.sebian.fr/linux-sur-disque-ssd/
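
The general idea behind that kind of setup can be sketched as a pair of shell functions that copy the tmpfs contents to disk at shutdown and back again at boot. This is only a minimal illustration, not the script from the linked post: `BACKUP_DIR`, `copy_tree`, `save_cache` and `restore_cache` are hypothetical names, and hooking them into the init system is left out.

```shell
#!/bin/sh
# CACHE_DIR is the tmpfs mount point from the fstab line above.
# BACKUP_DIR is an assumed on-disk location; pick whatever suits your server.
CACHE_DIR="${CACHE_DIR:-/var/www/ttrss/cache}"
BACKUP_DIR="${BACKUP_DIR:-/var/backups/ttrss-cache}"

# Copy the contents of directory $1 into directory $2 (created if missing),
# preserving modes and timestamps, and ownership when run as root.
copy_tree() {
    mkdir -p "$2" && cp -Rp "$1"/. "$2"/
}

save_cache()    { copy_tree "$CACHE_DIR" "$BACKUP_DIR"; }   # call at shutdown
restore_cache() { copy_tree "$BACKUP_DIR" "$CACHE_DIR"; }   # call at boot, after the mount
```

Anything written to the cache between the last `save_cache` and a crash is still lost, which is usually acceptable for a cache.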

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Update daemon scalability ?

Postby fox » 12 Jun 2013, 14:19

Strange, cache writing shouldn't really generate that much I/O.

ldidry
Bear Rating Trainee
Posts: 27
Joined: 11 Jun 2013, 20:43

Re: Update daemon scalability ?

Postby ldidry » 12 Jun 2013, 19:34

Maybe it's the heavy cache reading? Once I put the cache on tmpfs, a lot of feeds got updated really fast.

Anyway, I just set up another instance with PostgreSQL. It's night and day. I can use more threads for my update daemon (6 or 7 was the max with MySQL) and the load is steadier than with MySQL.
Well, not all 300 users have migrated yet, but it seems to be on the right track.

fox
^ me reading your posts ^
Posts: 6318
Joined: 27 Aug 2005, 22:53
Location: Saint-Petersburg, Russia

Re: Update daemon scalability ?

Postby fox » 12 Jun 2013, 19:43

Yeah innodb is shit and its performance is terrible.

frzguida
Bear Rating Disaster
Posts: 50
Joined: 14 May 2013, 18:59

Re: Update daemon scalability ?

Postby frzguida » 19 Jul 2013, 07:00

Is there a way to schedule several update.php instances to run concurrently?
For example, I noticed that after two instances have been running for a while, they update the same feed at the same time, which generates an SQL duplicate-key error for a hash.
I think the second instance doesn't know which feeds the first one is going to update...
How can I do it?

AngryChris
Bear Rating Master
Posts: 135
Joined: 08 Apr 2013, 02:42

Re: Update daemon scalability ?

Postby AngryChris » 19 Jul 2013, 15:50

frzguida wrote: Is there a way to schedule several update.php instances to run concurrently?
For example, I noticed that after two instances have been running for a while, they update the same feed at the same time, which generates an SQL duplicate-key error for a hash.
I think the second instance doesn't know which feeds the first one is going to update...
How can I do it?

You need to be using update_daemon2.php then. You can change the number of concurrent children by changing the MAX_JOBS value inside the script itself. You should not be running parallel copies of update.php. If you change the value of MAX_JOBS in update_daemon2.php, then I'd suggest renaming the script to match the new value (e.g. update_daemon5.php).
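
If update.php is started from cron anyway, one way to keep two copies from overlapping is a simple lock around the call. The sketch below uses an atomic `mkdir` as the lock. `LOCK_DIR` and `run_exclusive` are made-up names, the `--feeds` invocation in the closing comment is an assumption about how update.php is being run, and a run that crashes would leave a stale lock that has to be removed by hand.

```shell
#!/bin/sh
# Hypothetical lock location; any path writable by the cron user will do.
LOCK_DIR="${LOCK_DIR:-/tmp/ttrss-update.lock}"

# Run the given command only if no other caller holds the lock.
# mkdir either creates the directory or fails, so two callers can never
# both succeed in taking the lock.
run_exclusive() {
    if mkdir "$LOCK_DIR" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$LOCK_DIR"
        return "$status"
    fi
    echo "another update is already running, skipping" >&2
    return 75
}

# Example cron usage (assumed invocation):
#   run_exclusive php /var/www/ttrss/update.php --feeds
```

This only serializes the runs; it does not make them faster, so update_daemon2.php remains the way to get real parallelism.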

sleeper_service
Bear Rating Overlord
Posts: 884
Joined: 30 Mar 2013, 23:50
Location: Dallas, Texas

Re: Update daemon scalability ?

Postby sleeper_service » 19 Jul 2013, 16:09

AngryChris wrote: You can change the number of concurrent children by changing the MAX_JOBS value inside the script itself. You should not be running parallel copies of update.php.


you can also use the --tasks switch if you don't feel like having to edit every time you update.
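
For example, a launch line with the task count passed on the command line might look like the sketch below. The install path is borrowed from the fstab line earlier in the thread, the value 20 is arbitrary, and the exact argument form (`--tasks 20` rather than `--tasks=20`) is an assumption; check the script itself for your version.

```shell
#!/bin/sh
# Assumed install path and an arbitrary task count; adjust both.
TTRSS_DIR="${TTRSS_DIR:-/var/www/ttrss}"
TASKS="${TASKS:-20}"

# Print the daemon command instead of running it, so it can be reviewed
# or pasted into an init script.
daemon_cmd() {
    printf 'php %s/update_daemon2.php --tasks %s\n' "$TTRSS_DIR" "$TASKS"
}

daemon_cmd
```

Keeping the count outside the script means an upgrade that overwrites update_daemon2.php does not reset it.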

