David Josephsen's session from Monitorama PDX 2024.
At scale, all observability projects are data-engineering problems, which require big-data tools and techniques to solve. But many of these tools, particularly general-purpose streaming frameworks, make less-than-ideal trade-offs that hurt latency or make troubleshooting difficult. Meanwhile, many purpose-built monitoring toolchains either don't scale as well as they claim, or require nefarious hacks to reach your target ingestion rate. In either case, these systems invariably wind up costing more than you intended.
Whether you're designing a high-volume telemetry pipeline from scratch or shoring up an existing system that's having trouble scaling, I want to share with you a powerful, reductive thought pattern that has helped me build and maintain five different massive-scale telemetry pipelines in as many years. In this talk, I'll introduce you to my friends, Poe (point of enrichment) and Pug (Point uf aGGregation), and together we'll learn how they can help you define your tooling requirements, reduce your end-to-end latency, and, perhaps most importantly, control spend.