Рет қаралды 1,510
Don't miss out! Join us at our next Flagship Conference: KubeCon + CloudNativeCon North America in Salt Lake City from November 12 - 15, 2024. Connect with our current graduated, incubating, and sandbox projects as the community gathers to further the education and advancement of cloud native computing. Learn more at kubecon.io
The Party Must Go on - Resume Pods After Spot Instance Shut Down - Muvaffak Onuş, QA Wolf
Spot instances are about 60% cheaper but they frequently shut down and not every application can be resilient to handle it without data loss, especially long-running jobs like automated QA tests or data processing pipelines. What if you can migrate your container to another node with near zero-downtime when a shutdown signal is received? At QA Wolf, we heavily rely on spot instances due to their cost-effectiveness but the failures caused by shutdowns were significant enough for our customers to notice. We built a Kubernetes controller that orchestrates snapshot and recovery of containers of the failing nodes to another node where it can resume from the same state. In this talk, we will start with a demo, dive deep into the underlying mechanisms and see how much one can save in which scenario.