Denise ([staff profile] denise) wrote in [site community profile] dw_maintenance, 2011-07-28 03:56 pm

site slowness: it's complicated!

The site slowness for the past day or so has been due to a bug somewhere in our code that's causing our webserver processes to run out of memory too quickly and lock up the machine.

[personal profile] alierak has been staying on top of things and tweaking the webserver settings to keep things running and to make sure that the settings we're using have the best chance of not running into the "run out of memory, lock up machine" problem. Unfortunately, this means that -- in order to minimize the chance that the site is down entirely -- we've had to seriously lower the number of webserver processes that are running at any time and lower the amount of time before they restart by themselves (and free up the locked-up memory). This means that there are fewer webserver processes available to accept your requests and serve you pages from the site.
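To make that tradeoff concrete, here's a rough back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the function name, the parameters, and all of the numbers are hypothetical, and none of it is the site's actual code or settings. It only models the general idea that a leaking process grows with every request it serves, so the worst-case memory use depends on how many processes there are and how long each one lives before restarting.

```python
# Back-of-the-envelope model of the tradeoff described above.
# Every name and number here is hypothetical, chosen only for illustration.

def peak_memory_mb(workers, base_mb, leak_mb_per_request, max_requests):
    """Worst-case memory use: every worker is one request away from restarting."""
    per_worker_peak = base_mb + leak_mb_per_request * max_requests
    return workers * per_worker_peak


# Many long-lived workers: lots of capacity, but the leak has room to grow.
before = peak_memory_mb(workers=20, base_mb=100, leak_mb_per_request=0.5, max_requests=1000)

# Fewer workers that restart sooner: less capacity, but bounded memory.
after = peak_memory_mb(workers=10, base_mb=100, leak_mb_per_request=0.5, max_requests=200)

print(before, after)
```

With these made-up numbers the worst case drops from 12,000 MB to 2,000 MB, but only half as many processes are available to serve pages at any moment, which is where the slowness comes from.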

Basically, at this point it's a case of "down because of the problem or slow because of the steps we're taking to fix the problem"!

Since it's obvious at this point that webserver tweaks alone aren't going to cut it, we're doing two things to get the site back to its usual zippy self:

a) Trying to find the root cause of the bug that's making our webserver processes freak out. Memory leaks are really hard to find and debug, which is why it's taking so long. We have a few ideas on how to find what's causing it, and [personal profile] fu is concentrating on that end.

b) Seeing what we can do to get more resources into the webserver pool. That way, even though the webservers are running out of memory quickly and we have to resource-starve them to keep them from checking out entirely, we'll still be able to get pages to load quickly, without the delay we're experiencing right now. There's an easy way and a hard way to do this, too. (And hopefully, the easy way will help enough that we won't have to get to the hard way.)

(This sort of thing always happens when [staff profile] mark is literally unreachable -- he's on vacation for two weeks in remotest Alaska, with no cell phone reception -- but I wanted to specifically give a massive thank you to [personal profile] alierak, our backup sysadmin, who is doing wonders with the problem.)

[personal profile] keris 2011-07-29 09:19 am (UTC)
My Perl isn't so much rusty as idiosyncratic *g*. I just looked at the Bugzilla instance...

[personal profile] pauamma 2011-07-29 02:49 pm (UTC)
I think I remember you from somewhere, for some reason. You ever hung out in a.c, a.s.r, or #c?

[personal profile] keris 2011-07-29 03:44 pm (UTC)
The Scary Devil Monastery? A little, many years ago. Very briefly in alt.callahans (most recently in 2004 via a crosspost from several other newsgroups); I found the volume of traffic there overwhelming. And several times, intermittently, on alt.comp*, comp.std.c and comp.std.c++ and the like (and in all sorts of odd places in alt.* and rec.* and uk.*, particularly the filk and SF fandom related ones). Almost always with keris or keristor somewhere in the email address (and if anyone is using one of those, the odds are good that it's me; on LJ keris is someone else, and .org I think is a German medical organisation).

Mind you, in the heyday of Usenet I didn't do Perl, I was C/C++ only (well, plus various assemblers and odd bits of scripting in bash and awk).