Denise (denise) wrote in dw_maintenance, 2011-12-21 05:39 pm
import queue delays
Thanks to a few people reporting that their import jobs were stuck on the "verify" step for an extended period of time, we discovered a bottleneck in the import process that we hadn't realized was there. We've taken steps to fix it. If your import was showing as "ready to be inserted into the queue", those jobs are now being moved into the import queue more quickly. (That's why, if you've been watching the queue on the Import Journal page, the numbers just jumped like whoa.)
It will take time for the importer to process all the queued jobs -- whenever there's a surge in account creation, there's a corresponding surge in import jobs -- but fear not, once they're scheduled your import jobs will run. You don't have to leave the page open: just schedule the job and wander off, and sooner or later you will look at your journal and all of your stuff will be there like magic. :)
Basically, the problem was: we actually have two import queues. The first is the queue for the "import-scheduler" job: it verifies your username and password on the remote site (since there's no sense in retrying a job that's going to fail because the authentication is incorrect) and then puts the job into the scheduler queue. From there, a worker moves the job from the scheduler queue to the second queue, the actual import queue, where a TheSchwartz worker does the import. (The reason why our worker-manager is known as TheSchwartz is a long, long story. *G*) The import queue that was showing on the import page is the TheSchwartz import queue, not the import-scheduler queue.
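For the curious, here is a rough sketch of the two-queue setup described above. It is an illustration in Python, not our actual code: `ImportJob`, `verify_credentials`, `schedule`, `move_jobs`, and the sample accounts are all made-up names, and the real remote login check is replaced by a dictionary lookup.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ImportJob:
    username: str
    password: str


def verify_credentials(job: ImportJob, remote_accounts: dict) -> bool:
    """Stand-in for the login check against the remote site."""
    return remote_accounts.get(job.username) == job.password


def schedule(job, remote_accounts, scheduler_queue):
    """Stage 1: verify the login, then park the job in the scheduler queue."""
    if not verify_credentials(job, remote_accounts):
        return False  # bad credentials: the job never gets queued
    scheduler_queue.append(job)
    return True


def move_jobs(scheduler_queue, import_queue, limit=None):
    """Stage 2: hand jobs from the scheduler queue to the import queue.

    limit=None moves everything that is waiting; limit=1 mimics a mover
    that only takes one job per pass.
    """
    moved = 0
    while scheduler_queue and (limit is None or moved < limit):
        import_queue.append(scheduler_queue.popleft())
        moved += 1
    return moved


# Hypothetical sample data for the illustration.
remote_accounts = {"alice": "hunter2", "bob": "swordfish"}
scheduler_queue, import_queue = deque(), deque()

schedule(ImportJob("alice", "hunter2"), remote_accounts, scheduler_queue)
schedule(ImportJob("bob", "wrong"), remote_accounts, scheduler_queue)  # rejected
schedule(ImportJob("bob", "swordfish"), remote_accounts, scheduler_queue)

move_jobs(scheduler_queue, import_queue, limit=1)
print(len(scheduler_queue), len(import_queue))
```

In this sketch the number shown on the import page would correspond to `len(import_queue)` only, which is why jobs still sitting in `scheduler_queue` were invisible there.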
We discovered, when people reported problems with the "ready to be inserted into queue" issue, that the job that moves jobs from the import-scheduler queue to the TheSchwartz import queue was set to only move one job from queue to queue every 60 seconds. This is a delay we built into the system deliberately, and frankly, none of us remembered quite why -- either it was there from the beginning to avoid overwhelming things, or it was a kind of artificial delay that we put in during a period of LJ DDoS. Usually it's not a problem, because very few people are trying to import at any given time and both queues are usually at or close to 0.
So, as it turns out, that "one job every 60 seconds" combined with the high import traffic today meant that there were over 1200 jobs in the import-scheduler queue, being moved to the TheSchwartz import queue very very slowly while more and more came in. Hence the backlog!
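To put numbers on that: at one job per 60 seconds, 1200 queued jobs would need 1200 minutes, or 20 hours, to clear even if nobody else started an import. The small calculation below works that out. The function name `drain_minutes` is made up for this illustration, and the arrival rate of 2 jobs per minute in the second call is a hypothetical figure, not a measurement.

```python
def drain_minutes(backlog, moved_per_minute, arriving_per_minute=0.0):
    """Minutes until the scheduler queue is empty, or None if it never empties."""
    net = moved_per_minute - arriving_per_minute
    if net <= 0:
        return None  # jobs arrive at least as fast as they leave
    return backlog / net


minutes = drain_minutes(backlog=1200, moved_per_minute=1)
print(minutes / 60)  # 20.0 hours with no new arrivals

# Hypothetical arrival rate: if jobs come in faster than one per minute,
# the backlog only grows.
print(drain_minutes(1200, 1, arriving_per_minute=2))  # None
```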
We've removed the artificial delay, and jobs are now being moved from the import-scheduler queue into the TheSchwartz import queue as they come in and can be verified. So, the only limit now will be the speed at which the imports can run.
EDIT, 8:40PM EDT: Sorry about the rampant internal server error problems -- we thought it was a problem with the new webserver, but it turned out that imports were happening too fast and were locking up the database. Mark has throttled back the import speed enough that the errors should go away now. (This means that imports will be happening more slowly, but the queue's backed up enough right now that it probably won't make much difference anyway!)
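This post doesn't spell out how the throttling was done, so the following is a generic illustration only and not the actual change. One common way to keep workers from swamping a database is to cap how many jobs run at once. Everything here is assumed for the sketch: the cap of 2, the `run_import` function, and the short sleep standing in for database work.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT_IMPORTS = 2  # hypothetical cap, not a real setting

slots = threading.BoundedSemaphore(MAX_CONCURRENT_IMPORTS)
lock = threading.Lock()
running = 0  # imports currently holding a slot
peak = 0     # highest value "running" ever reached


def run_import(job_id):
    """Pretend import: waits for a free slot before touching the database."""
    global running, peak
    with slots:
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.01)  # stand-in for the database work
        with lock:
            running -= 1
    return job_id


# Eight worker threads compete, but only two can be importing at a time.
with ThreadPoolExecutor(max_workers=8) as pool:
    finished = list(pool.map(run_import, range(20)))

print(len(finished), peak)
```

The trade-off is the one described above: every job still finishes, just more slowly, and the peak load on the database never goes past the cap.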
EDIT, 4:30 PM EDT, 12/23: As always happens whenever we have an influx of new users, the import queue is very, very busy right now. Your import will almost certainly take at least a day to finish. Please be patient! Once your job is in the queue, it will complete eventually and you don't need to stay logged into the site or leave your computer on. Just start it and go do other things, and eventually your stuff will catch up with you. :)