Last week, OpenAI has suffered a several hours long outage and published a detailed postmortem about it. Highly recommend reading it. These technical reports are usually a gold mine for all large-scale Kubernetes users, as we all go through similar set of reliability issues running Kubernetes in production. Read More →