Mark Smith ([staff profile] mark) wrote in [site community profile] dw_maintenance 2019-12-02 11:50 am

Notifications slow -- but recovering

Hi all,

Due to some behind-the-scenes maintenance last night, our notifications system got delayed. I've fixed the issue now and the system is working on catching up.

For details -- I've been experimenting with Kubernetes as a way to make managing production easier (and hopefully reduce costs!), but it turns out that one of our worker jobs that handles notifications doesn't use much CPU (it mostly spends time waiting on the database).

This caused the pod autoscaler to reduce the size of that particular deployment below what we needed to sustain throughput on our notifications service. The temporary fix is to pin that deployment size to something much larger; the better fix will be to integrate Kubernetes' pod autoscaler with the ability to monitor the queue depth on our task queue.
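
For the curious, here is a rough sketch of why this happens. It uses the scaling rule described in the Kubernetes Horizontal Pod Autoscaler documentation (desired replicas = ceil(current replicas × current metric / target metric)), simplified to leave out tolerances and stabilization windows. All of the numbers, the function name, and the replica limits are made up for illustration and are not our production values.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=50):
    """Simplified HPA-style rule: ceil(current * metric / target), clamped.

    min_replicas and max_replicas are hypothetical limits for this sketch.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical CPU-based scaling: 10 workers that mostly wait on the
# database average 5% CPU against a 50% target, so the rule scales *down*.
cpu_based = desired_replicas(10, current_metric=5, target_metric=50)

# Hypothetical queue-based scaling: the same 10 workers each have about
# 400 queued jobs against a target of 100 per worker, so the rule scales *up*.
queue_based = desired_replicas(10, current_metric=400, target_metric=100)

print("CPU-based replicas:", cpu_based)
print("Queue-based replicas:", queue_based)
```

With these made-up inputs the CPU signal asks for far fewer workers while the queue is actually growing, which is the mismatch described above. Scaling on queue depth ties the replica count to the backlog instead.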

Sorry for the trouble, and thank you to the person who pinged us on Twitter. When I checked last night, everything was working, but as traffic came back up we fell behind and I wasn't watching anymore. My bad.

Re: Sending emails

[personal profile] madgastronomer 2019-12-09 02:51 am (UTC)
Hey, look, I don't think you're coming across the way you want to. Maybe back off a bit and assume that people do know what they're doing, and ask questions that are phrased as being about helping you to understand what's happening. Many of the questions you've been asking sound confrontational, and as if you think you know better than the people who've been working with the codebase for years. It sounds like you could have some really good ideas, but right now I think you're alienating people.

Re: Sending emails

[personal profile] dennisgorelik 2019-12-09 03:46 am (UTC)
> Maybe back off a bit and assume that people do know what they're doing

I definitely assume that people who run a website with 3 million total users per month know what they are doing. Most developers do not reach that.

I also assume that these same competent developers/devops make occasional mistakes. There is no shame in making mistakes. I make mistakes too.

I do not mean that my suggestions are necessarily correct.
Actually, many of my suggestions are likely to be suboptimal for implementing on dreamwidth.org, for one reason or another.
I do not know these reasons, and I am looking forward to feedback from somebody who can point me to them.
That would allow me to adjust my suggestions so they better fit what dreamwidth.org needs.
So, hopefully, some of my suggestions would actually help make dreamwidth.org work better.

Re: Sending emails

[personal profile] kore 2019-12-12 03:47 pm (UTC)
Seconded.