Denise ([staff profile] denise) wrote in [site community profile] dw_maintenance, 2012-02-09 11:08 pm

Record breaking traffic day!

I'll lead with the good news: we've been setting new records for traffic constantly for the past month or so, and just now we broke 30Mbps (megabits of traffic transferred per second) for the first time, which is an awesome milestone that we are totally dancing around and celebrating. People are using our baby!

So, if we're transferring record amounts of data, and if (as mentioned on our offsite status Twitter) we added two new webservers today to help push out traffic, why has the site been so sluggish today?

As I mentioned in the 2 Feb dw-news post, the answer is very complex. There are a lot of underlying causes that can look to you guys (the people who are just trying to load your reading page and comment on various posts) like the site is just plain sucking, and I know it must be tempting to wonder: hey, this keeps happening, why can't they just fix it?



To over-simplify the explanation and use a metaphor: think of the site as the counter at a really busy deli at lunch rush. There's a bunch of people behind the counter taking customer orders, making sandwiches, etc., but there's a whole bunch of people waiting in line for their sandwiches, and every person who walks in joins a long line of people waiting. The deli owner can add more people behind the counter making sandwiches (more webservers, which we added today), they can improve the process of taking orders to speed it up by letting you order commonly-requested sandwiches by number (better caching of content that doesn't change often, like images and CSS and JavaScript), and they can rearrange the stuff behind the counter so sandwich makers can work more efficiently and get the sandwiches made faster (code optimization so the webservers complete requests faster) -- but sometimes there's still going to be a line anyway!

Today (the answer changes from day to day, and sometimes from hour to hour), the biggest slowdown comes from entries with large numbers of comments, because the code that generates those pages has a bunch of inefficiencies that don't start showing up until you see a whole lot of people loading those entries and interacting on them. The code that generates the page takes time to run, and for every additional microsecond it takes, it ties up a process on the webserver it's running on. The webservers can only run so many processes at once, so when there are no free webserver processes available, you ask the site for the page and it has to yell "Hang on a second, I'll get to you as soon as I've got a process free!" at you. Even if you aren't viewing or commenting on an entry with lots of comments, somebody else is, and so the servers wind up being busy. (This is like how if you walk into the deli and just want a root beer, you still have to wait in line with all the people who are buying sandwiches.)
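For the curious, here's a toy model of that "no free process" effect. It's only a sketch, not our actual code: the function name `simulate`, the first-come-first-served ordering, and all the timings are made up for illustration. Each request is an (arrival time, time to build the page) pair, and the pool has a fixed number of worker processes.

```python
import heapq

def simulate(requests, workers):
    """Return how long each request waits for a free worker process.

    requests: list of (arrival_time, service_time) pairs, sorted by arrival.
    workers:  number of processes that can run at once (hypothetical pool size).
    Requests are served first-come, first-served.
    """
    # Min-heap of the times at which each worker next becomes free.
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    waits = []
    for arrival, service in requests:
        earliest = heapq.heappop(free_at)   # the next worker to free up
        start = max(arrival, earliest)      # can't start before arriving
        waits.append(start - arrival)
        heapq.heappush(free_at, start + service)
    return waits

# Two slow, comment-heavy page loads, then one quick request (the root beer).
requests = [(0.0, 5.0), (0.0, 5.0), (1.0, 0.1)]
print(simulate(requests, workers=2))  # [0.0, 0.0, 4.0]
print(simulate(requests, workers=3))  # [0.0, 0.0, 0.0]
```

With two workers, the quick request waits four seconds even though it needs almost no work itself. Adding a third worker (more webservers) or shrinking the five-second page builds (code optimization) both make that wait go away.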

We've known this was going to be a problem for a while, but fixing the underlying cause is going to take a lot of work, because there are a bunch of tiny inefficiencies that add up to big problems when they're taken all together.

[staff profile] mark has been making a bunch of code fixes that will speed things up in the short term, and [personal profile] allen is working on the more sweeping code changes that will speed things up in the long term. The code changes Mark just made should help a lot, but high traffic periods (evenings, US time) may continue to be sluggish for a few more days.

We've also added more caching of frequently-accessed and unchanging data: CSS files, JavaScript files, images, and icons are all being cached so they load faster, and served from the fast static content frontend, which takes the burden of serving those off the webservers that are working to build pages. Meanwhile, [personal profile] alierak has been working on optimizing our server response so we can squeeze every microsecond out of the servers themselves.
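As a rough illustration of what "cache the stuff that doesn't change" can look like, here's one common approach: send a long-lived HTTP `Cache-Control` header for static files and a no-cache one for everything else. This is a sketch under assumptions, not a description of our actual setup: the function name `cache_control`, the list of extensions, and the cache lifetimes are all hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical policy: file extensions treated as static content,
# mapped to how many seconds a browser or proxy may keep a copy.
STATIC_MAX_AGE = {
    ".css": 86400,    # one day
    ".js": 86400,
    ".png": 604800,   # one week
    ".jpg": 604800,
    ".gif": 604800,
}

def cache_control(path: str) -> str:
    """Return a Cache-Control header value for the given request path."""
    suffix = PurePosixPath(path).suffix.lower()
    max_age = STATIC_MAX_AGE.get(suffix)
    if max_age is None:
        # Dynamic pages (reading pages, entries) must be rebuilt per request.
        return "private, no-cache"
    return f"public, max-age={max_age}"

print(cache_control("/stc/site.css"))  # public, max-age=86400
print(cache_control("/read"))          # private, no-cache
```

Every request answered from a cache is one that never has to take up a process on the webservers that build pages.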

So, in short: We're busy! We're all really sorry about the slowdowns at peak traffic times, and we're working really hard to increase capacity and speed up site performance. Thank you all for your patience (you really are the best users a site ownership team could ask for!) and for continuing to use Dreamwidth. This is a very exciting problem to have. :)
