This episode is sponsored by Datadog - a single, unified platform for monitoring CoreDNS alongside the rest of your stack. Try it free for 14 days and get a free t-shirt: datadoghq.com/kubefm
===
In this KubeFM episode, Faris shares his experience managing CoreDNS and scaling Kubernetes clusters with 900 nodes and 15k pods.
He walks through the challenges and solutions encountered during an incident and offers insights into maintaining a robust Kubernetes environment.
You will learn:
- The importance of scaling the Kubernetes control plane for large clusters.
- Strategies for optimizing CoreDNS to ensure efficient DNS resolution and prevent incidents.
- The pros and cons of using VictoriaMetrics versus Prometheus for monitoring and observability.
- Tips for maintaining a calm and effective team dynamic during high-stress situations.
Find all the links and info for this episode here: kube.fm/coredns-scaling-farris
===
Interested in sponsoring a KubeFM episode? kube.fm/sponsorships
===
CHAPTERS
=========
00:00 Intro
00:00 Emerging Tools: Karpenter, Cluster API, and Cilium
02:09 From middleware engineer to DevOps and observability expert
03:44 Keeping up with Kubernetes
04:34 Start coding early
05:18 Scaling EKS with VictoriaMetrics
06:52 The journey from Prometheus to VictoriaMetrics
12:32 The incident
19:59 Stress-free problem solving
21:28 Optimizing CoreDNS for large clusters
24:19 Scaling and updating control planes
25:42 Challenges and costs of migrating clusters
28:07 Tips for CoreDNS scaling
29:52 Staying calm
31:27 Writing and community feedback
33:55 What's next?
34:15 Outro
LISTEN ON
=========
- Apple Podcasts kube.fm/apple
- Spotify kube.fm/spotify
- Amazon Music kube.fm/amazon
- Overcast kube.fm/overcast
- Pocket Casts kube.fm/pocket-casts
- Deezer kube.fm/deezer