Muli Ben-Yehuda's journal

February 28, 2019

Data center vignettes #1

Filed under: Uncategorized — Muli Ben-Yehuda @ 9:08 AM

It’s a slow Wednesday afternoon. The rain drips outside, collecting in large puddles. The data centers are humming along, the developers are drinking coffee and writing code, and the customers’ orders keep coming in through the web sites. All is well in the world of Foo Corp’s infrastructure.

Suddenly, PagerDuty starts paging. The dashboards are turning red. There was a massive spike in demand, and Foo Corp’s databases are struggling to meet it. Request latency is shooting through the roof. Demand is high and growing higher and the systems are unable to handle it.

The ops team jolts into action and the database guys start flooding the relevant Slack channels. In a few minutes, you see what happened: the storage systems everything is built on are no longer serving storage. It might be a network issue with the expensive RDMA network you put in; it might be an issue with the new NVMe SSDs you bought that take a looong tiiiime to run their garbage collection cycles. Maybe you’ll figure it out later and write a nice postmortem no one will read. But right now, whatever it is, it’s painful to leave customer orders on the floor because the infrastructure just can’t serve.

Sounds painful? We think so too. Good thing Lightbits LightOS is coming soon.

 
