Mark Smith ([staff profile] mark) wrote in [site community profile] dw_maintenance, 2019-12-02 11:50 am

Notifications slow -- but recovering

Hi all,

Due to some behind-the-scenes maintenance last night, our notifications system got delayed. I've fixed the issue now and it's working on catching up.

For details -- I've been experimenting with Kubernetes as a way to make managing production easier (and hopefully reduce costs!), but it turns out that one of our worker jobs that handles notifications doesn't use much CPU (it mostly spends time waiting on the database).

This caused the pod autoscaler to reduce the size of that particular deployment below what we needed to sustain throughput on our notifications service. The temporary fix is to pin that deployment size to something much larger; the better fix will be to integrate Kubernetes' pod autoscaler with the ability to monitor the queue depth on our task queue.
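
For the curious, here is a rough sketch of why this happens and what scaling on queue depth might look like. It is not our actual configuration. The first function follows the ratio formula the Kubernetes Horizontal Pod Autoscaler documents (current replicas times current metric over target metric, rounded up); the second is a hypothetical queue-depth rule. All names and numbers (target_per_worker, the 500-jobs-per-worker target, the replica limits) are made up for illustration.

```python
import math


def clamp(value, low, high):
    """Keep value inside the inclusive range [low, high]."""
    return max(low, min(high, value))


def cpu_based_replicas(current_replicas, cpu_percent, target_percent,
                       min_replicas, max_replicas):
    """Replica count from the documented HPA ratio formula.

    A worker that mostly waits on the database reports low CPU,
    so this shrinks the deployment even when the queue is backing up.
    """
    wanted = math.ceil(current_replicas * cpu_percent / target_percent)
    return clamp(wanted, min_replicas, max_replicas)


def queue_based_replicas(queue_depth, target_per_worker,
                         min_replicas, max_replicas):
    """Hypothetical rule: one worker per target_per_worker queued jobs."""
    if target_per_worker <= 0:
        raise ValueError("target_per_worker must be positive")
    wanted = math.ceil(queue_depth / target_per_worker)
    return clamp(wanted, min_replicas, max_replicas)


# Illustrative numbers only: 10 workers idling at 5% CPU against a 50% target,
# while 12,000 jobs sit in the queue.
cpu_answer = cpu_based_replicas(10, 5, 50, min_replicas=1, max_replicas=50)
queue_answer = queue_based_replicas(12000, 500, min_replicas=1, max_replicas=50)
print(cpu_answer, queue_answer)
```

With those made-up numbers the CPU rule asks for far fewer workers than the queue rule does, which is the gap that let notifications fall behind.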

Sorry for the trouble, and thank you to the person who pinged us on Twitter. When I checked last night, everything was working, but as traffic came back up we fell behind and I wasn't watching anymore. My bad.

[personal profile] ilyena_sylph 2019-12-03 01:24 am (UTC)
Hey, this is from a few months ago when Mark added an "/@username" feature in the Markdown step.

Last I heard, they're working on a fix for the stuff in textboxes and such, so, speaking as just another DW-izen... hang on a bit still?
Edited 2019-12-03 01:24 (UTC)

[personal profile] devilbear 2019-12-03 01:28 am (UTC)
Ooh, shiny, thanks for telling me. I didn't really keep track of that since I wasn't dealing with journal stuff back then, so I guess that's why I never remembered or caught on that it caused an issue. (In fact, I assumed it was brand new, oops.) Good to know they're working on a fix already!