Denise (denise) wrote in dw_maintenance, 2013-02-12 04:39 am
Network service restored
At approximately 1 AM EST / 6 AM GMT, connectivity to the site was lost. The problem was purely one of connectivity: our service provider had a networking issue with their own providers. Our servers were just fine; they just couldn't talk to anybody!
Connectivity appears to have been restored at this point. We're really sorry for the inconvenience.
no subject
This.
I was fast asleep during the outage and saw the Twitter notices on my stream this morning. I was forewarned, I felt fully informed, and if DW had still been down, it wouldn't have been a problem at all - technology happens.
no subject
Thank you for keeping us updated on Twitter.
no subject
Thank you for Twitter, indeed!
no subject
You learn something every day...
It's a small (fannish) world after all...
Glad you're back!
I admit it felt very surreal posting to LJ because DW was down, instead of the reverse!
no subject
I did wonder where you'd gone first thing this morning (UK time).
no subject
In other words, the datacenter had previously made provisions for redundant circuits from separate providers so that an outage from one provider would not result in the datacenter being unreachable, but due to a not-yet-known cause, the maintenance that was conducted without notice by one of their circuit providers caused the datacenter to be unreachable. They are following up with their providers to determine what caused the problem and how to avoid it in the future.
No provider is able to guarantee 100% uptime, but we've always been extremely satisfied with Peer1's reliability and their responsiveness when issues do happen; they're definitely among the best options out there. In this case, they acted quickly and decisively, and we are confident they will handle the followup appropriately.
no subject
Perfection I do not expect. Response to problems, that I expect.
no subject
"The transport circuits that connect Dallas and San Antonio are supplied by different providers, and are specified as running on diverse fiber paths. Continuing our investigation, the Network Operations Center was able to determine that the connectivity loss was the result of a fiber maintenance being performed by an underlying long-haul carrier, for which PEER 1 Hosting was not notified. At 03:30 CT, the first transport circuit came back online, which fully restored network connectivity to the San Antonio Data Center.
At the moment it is unclear how separate providers with previously diverse circuits paths would both be impacted by the same fiber work. We will be following up with our San Antonio transport providers and assessing how this scope of work had the impact that it did and then take appropriate action to ensure circuit redundancy is reinstituted. Additionally, PEER 1 Hosting will be doing an audit of the current fiber paths being used throughout our transport links to ensure full network redundancy is available in all locations."
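As a rough illustration of what a fiber path audit like the one described above might check: if each circuit's route is written down as a list of physical segments, any segment that appears in two circuits is a shared point of failure. This is only a sketch. The provider names and segment labels below are made up for the example, and nothing in the statement says a shared segment was the actual cause.

```python
from itertools import combinations


def shared_segments(circuits):
    """Find physical segments that more than one circuit depends on.

    `circuits` maps a circuit name to the list of segments it runs over.
    Returns a dict keyed by (circuit_a, circuit_b) whose value is the set
    of segments the two circuits have in common. Pairs with nothing in
    common are left out, so an empty result means the paths are diverse.
    """
    overlaps = {}
    for (name_a, segs_a), (name_b, segs_b) in combinations(sorted(circuits.items()), 2):
        common = set(segs_a) & set(segs_b)
        if common:
            overlaps[(name_a, name_b)] = common
    return overlaps


# Hypothetical data: two circuits sold as diverse that still share one span.
example = {
    "provider_a": ["seg-1", "seg-2", "seg-9"],
    "provider_b": ["seg-4", "seg-5", "seg-9"],
}
print(shared_segments(example))
```

With the hypothetical data above, the only overlap reported is `seg-9`, which is the kind of finding an audit would flag for follow-up.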
no subject
This sounds very much like an angry network admin with a hammer who is trying to sound professional before starting to hit things.
no subject
Thanks for keeping us updated, both here and on Twitter!
no subject
Although yes, go ahead and try it in case it actually decides to like you and behave.
no subject
Never noticed a thing. Which shows how awesome you guys are.
no subject
I'm thinking an internal activity monitor that notices when load dips below a certain threshold for a certain length of time, and then starts doing some queued crunching that might have otherwise been saved for a maintenance window?
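To make the idea concrete, here is a minimal sketch of that kind of monitor, assuming load arrives as periodic numeric samples. The class name, the threshold, and the "number of quiet samples" parameter are all invented for illustration; this is not how Dreamwidth actually schedules work.

```python
from collections import deque


class IdleWorkScheduler:
    """Run queued jobs only after load has stayed low for a while."""

    def __init__(self, threshold, quiet_samples):
        self.threshold = threshold          # load below this counts as quiet
        self.quiet_samples = quiet_samples  # consecutive quiet samples required
        self.quiet_streak = 0
        self.queue = deque()

    def enqueue(self, job):
        """Queue a zero-argument callable to run during a quiet period."""
        self.queue.append(job)

    def observe(self, load):
        """Record one load sample and run at most one queued job.

        Returns the job's result, or None if nothing ran.
        """
        if load < self.threshold:
            self.quiet_streak += 1
        else:
            # Any busy sample resets the streak, so work stops when load returns.
            self.quiet_streak = 0

        if self.quiet_streak >= self.quiet_samples and self.queue:
            job = self.queue.popleft()
            return job()
        return None
```

Running one job per sample keeps the crunching easy to interrupt: as soon as a sample comes in above the threshold, the streak resets and the remaining jobs wait for the next quiet stretch or the regular maintenance window.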