Aug. 19th, 2011

[staff profile] denise
The downtime we had earlier this morning was due to a server upgrade at our hosting company -- we asked them to upgrade one of our machines from 4GB of RAM to 8GB, and this time to do it between 5AM and 6AM CDT (their timezone) with 12 hours' notice so we could let y'all know about the planned downtime. Well, they hit the requested upgrade window this time, but they missed the "with 12 hours' notice" part!

The machine that was taken out of the pool to be upgraded happened to be a critical part of our infrastructure (the load balancer -- the machine that "listens" for traffic and directs it to the appropriate places on the network). We have our servers set up so that an outage of one machine, even if it's the load balancer, shouldn't take the site down completely, but this morning exposed a glitch in our configuration that kept the "failover" to the backup load balancer from kicking in the way it was supposed to.
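
To illustrate the idea of failover in general terms, here is a toy sketch. It is not the actual Dreamwidth configuration; the `Balancer` class, the `pick_active` function, and the machine names are all hypothetical, made up for the example. The point is only that traffic should go to the backup whenever the primary is unavailable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Balancer:
    """Hypothetical stand-in for a load balancer machine."""
    name: str
    healthy: bool = True


def pick_active(primary: Balancer, backup: Balancer) -> Optional[Balancer]:
    """Return the balancer that should receive traffic.

    Prefer the primary; fall back to the backup if the primary is down.
    Return None if neither is available (the site would be down).
    """
    if primary.healthy:
        return primary
    if backup.healthy:
        return backup
    return None


# Assumed names, for illustration only.
primary = Balancer("lb-primary")
backup = Balancer("lb-backup")

# Simulate the primary being taken out of the pool for an upgrade.
primary.healthy = False
active = pick_active(primary, backup)
print(active.name if active else "no balancer available")
```

In a real deployment the "healthy" flag would come from some kind of health check, and a misconfiguration in that path is one way a failover can fail to trigger.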

So, once the upgrade was complete, [personal profile] alierak (who is rocking the sysadmin stuff this month) took some time out of his hectic morning to make sure everything went back to running along nicely, so I didn't have to wake up [staff profile] mark in the middle of his night. We'll do what we can to make sure that in the future, losing the main load balancer will cause traffic to fail over to our backup load balancer more seamlessly.

Thanks, as always, to y'all for your patience with us, and for your continued support!
