Database maintenance
Hi all,
I did some database maintenance today -- moving our workers around! -- and this caused a glitch in the replication between our old databases and the new ones, so the new ones weren't getting all the updated data.
What this means to you: if you saw problems trying to update your access list or subscription filters, or with community invitations, or viewing support requests, that was caused by the glitch in replication. I'm really sorry for the inconvenience.
This particular issue won't recur, since it was caused by a very specific circumstance related to moving the workers around. Since I'm done moving them, the problem won't happen again.
Right now we're migrating from our old master databases (db01 and db02) to the new pair (db05 and db06). To do this sanely, I have it set up in a replication chain so that any changes made at the top will trickle down to the bottom ones, like this:
db01 -> db02 -> db05 -> db06
The idea is that, to migrate seamlessly from the old ones to the new ones, at some point in time I just change the configuration files that used to say 01/02 and make them say 05/06. Then, magically and nearly instantaneously, we're using the new databases and after some days I can get rid of the old ones.
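A toy sketch of how writes flow through that chain, in Python. The function name `servers_receiving` is made up for illustration, and this models only the direction of replication, not how MySQL actually ships changes:

```python
# Replication chain from the post, top to bottom.
CHAIN = ["db01", "db02", "db05", "db06"]

def servers_receiving(written_to, chain=CHAIN):
    """Return the servers that end up with a write made on `written_to`.

    Changes only trickle downward, so a write reaches the server it was
    made on plus everything after it in the chain, never anything above it.
    """
    start = chain.index(written_to)
    return chain[start:]

print(servers_receiving("db01"))  # ['db01', 'db02', 'db05', 'db06']
print(servers_receiving("db05"))  # ['db05', 'db06']
```

The second call is the important one for a migration: once anything writes directly to db05, the old servers above it never hear about that write.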
Anyway, today I moved our TheSchwartz-based workers (they do notifications, emails, and some other tasks). I switched them to the new database cluster -- but of course, nothing is actually instantaneous. What happened was that some of the web servers started using db05 a split-second (literally) before some of the others, so we had a few hundred milliseconds where db01 (the OLD master) received some writes after db05 (new one) did.
The problem was then that both databases assigned the same number to different jobs. (When a job gets inserted, it gets assigned an ID. Since both databases had a slight overlap where they both thought they were boss, both created the same ID!)
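Here is a minimal sketch of that overlap, in Python. The `Master` class, the starting ID of 1001, and the job payloads are all invented for illustration; the real job table is not modeled here, only the idea of two servers each handing out "the next number":

```python
class Master:
    """Toy stand-in for a database that assigns sequential job IDs."""

    def __init__(self, name, next_id):
        self.name = name
        self.next_id = next_id
        self.jobs = {}

    def insert_job(self, payload):
        # Hand out the next ID, exactly as an auto-incrementing column would.
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = payload
        return job_id

# Both servers are fully in sync just before the switch (assumed value).
db01 = Master("db01", next_id=1001)
db05 = Master("db05", next_id=1001)

# During the overlap, one web server writes to the new master...
id_new = db05.insert_job("send notification email")
# ...and a moment later another web server still writes to the old one.
id_old = db01.insert_job("process comment notification")

# Same ID, different jobs.
collision = id_old == id_new and db01.jobs[id_old] != db05.jobs[id_new]
print(id_old, id_new, collision)  # 1001 1001 True
```

Neither server did anything wrong on its own; the clash only exists when you put the two sets of rows side by side.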
This is where the sadness happened, because when db05 tried to replicate the commands that db01 had done in that split-second, there was a conflict: two jobs had the same ID. So, db05 stopped replicating from db01 (technically db02) and we didn't have an alert on it because it's going to be a master (i.e., it's not supposed to be replicating long term, so I never set up a replication alarm for it).
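For the curious, a replication alarm of the kind mentioned here can be quite small. The sketch below is in Python and is an assumption about what such a check might look like, not the monitoring we actually run. It takes a dict shaped like one row of MySQL's `SHOW SLAVE STATUS` output (the field names `Slave_IO_Running`, `Slave_SQL_Running`, `Last_Errno`, and `Seconds_Behind_Master` are real columns of that command); the function name, the lag threshold, and the sample rows are made up, and fetching the row from the server is left out:

```python
MAX_LAG_SECONDS = 60  # hypothetical threshold

def replication_problem(status, max_lag=MAX_LAG_SECONDS):
    """Return a short description of the problem, or None if all looks well."""
    if status.get("Slave_IO_Running") != "Yes":
        return "I/O thread is not running"
    if status.get("Slave_SQL_Running") != "Yes":
        return "SQL thread stopped (errno %s)" % status.get("Last_Errno")
    lag = status.get("Seconds_Behind_Master")
    if lag is None:
        return "replication lag is unknown"
    if lag > max_lag:
        return "replica is %d seconds behind" % lag
    return None

# Illustrative rows only. 1062 is MySQL's duplicate-key error code, which is
# the kind of error a clashing job ID would produce.
healthy = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "Yes",
           "Last_Errno": 0, "Seconds_Behind_Master": 0}
stopped = {"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No",
           "Last_Errno": 1062, "Seconds_Behind_Master": None}

print(replication_problem(healthy))  # None
print(replication_problem(stopped))  # SQL thread stopped (errno 1062)
```

A check like this would be run periodically against every server that is replicating, including ones that are only replicating temporarily.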
Anyway, someone reported an issue which I tracked down to a replication problem. It's been fixed, the database is now fully replicated, and the problem won't repeat because the switchover has already happened. db05 is the master for generating IDs for jobs now, and db01 is deprecated.
Thanks for reading.
no subject
[/rant] By all of which I just mean to say, every business should be required by law to model its customer service on DW's. :)
no subject
Aww. We try!
no subject
Oh -- yes, cross-posting is a different system than importing. Sorry about the confusion!
That error message means that our servers weren't able to reach LiveJournal's servers at the time you tried to crosspost. It can happen for a number of different reasons, but it's usually a transient error having to do with LJ being unavailable at the exact second the crosspost was attempted. The system will retry the crosspost up to five times, at progressively longer intervals, before failing (each attempt will be numbered in your inbox, so if you only get one failure rather than five, that means the second attempt was successful). After the fifth failure, it won't try again; if that happens, double-check that your LJ password is correct, then edit the post and check the unchecked crosspost box to get it to try again. (Then, if you still keep getting failures after that, open a support request.)
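Roughly, the retry behavior described above looks like the Python sketch below. This is an illustration under assumptions, not the site's actual code: the function and parameter names are invented, and the delay values are placeholders, since the only things stated above are "up to five times" and "progressively longer intervals".

```python
import time

MAX_ATTEMPTS = 5
# Hypothetical waits (in seconds) between attempts; the real intervals
# are not given above, only that they get longer each time.
RETRY_DELAYS = [60, 300, 900, 3600]

def crosspost_with_retries(attempt_crosspost, notify, sleep=time.sleep):
    """Try a crosspost up to MAX_ATTEMPTS times.

    `attempt_crosspost` returns True on success. `notify` receives one
    numbered message per failed attempt. Returns True if any attempt worked.
    """
    for attempt in range(1, MAX_ATTEMPTS + 1):
        if attempt_crosspost():
            return True
        notify("Crosspost attempt %d of %d failed" % (attempt, MAX_ATTEMPTS))
        if attempt < MAX_ATTEMPTS:
            sleep(RETRY_DELAYS[attempt - 1])
    return False

# Example: the remote site is down for the first try, then comes back.
outcomes = iter([False, True])
inbox = []
waits = []
ok = crosspost_with_retries(lambda: next(outcomes), inbox.append,
                            sleep=waits.append)  # record waits, don't sleep
print(ok, inbox, waits)
# True ['Crosspost attempt 1 of 5 failed'] [60]
```

That matches the "one failure rather than five" case: a single numbered failure notice followed by silence means the next attempt went through.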
no subject
That was an unrelated problem, and should be fixed now!
no subject
MySQL 5.5 is our platform; specifically, we use the Percona build. All of our tables are InnoDB.