PromCon 2023 - Planetscale monitoring: Handling billions of active series with Prometheus and Thanos


Prometheus Monitoring


Speakers: Sebastian Rabenhorst & Mikołaj Liberski
Deploying and operating a highly available and distributed Prometheus setup at scale can present significant challenges. In this presentation, we will showcase an example of a globally distributed and highly scalable deployment at Shopify. This setup enables us to ingest billions of active time series with tens of millions of samples per second, coming from thousands of applications running in hundreds of Kubernetes clusters.
The main part of the presentation will cover the architecture of our current solution. We will demonstrate how we use Prometheus agents with custom service discovery to scrape and write metrics into regional Thanos receiver deployments. We will also explain how we leverage Thanos's long-term storage and distributed querying capabilities to enable long-term querying of billions of time series. Additionally, we will provide insight into how thousands of developers can query, explore, and configure their metrics and alerts through a customized Grafana deployment, and how our setup evaluates rules and alerts across our entire metrics dataset.
At the end of the presentation, we will emphasize some of the challenges we encountered during the time-intensive migration from a third-party monitoring vendor to our current solution.

Comments: 1
@powersurge5576, 8 months ago
You can use OpenTelemetry Target Allocators to dynamically distribute jobs to HPA collectors that will remote write to LTS.