Observability at Google: Massive LOG STREAMING at 11 Gbps

8,234 views

Gaurav Sen

1 day ago

Comments: 23
@gkcs 1 month ago
My website got a new homepage 😊 interviewready.io/
The system design series will have a new episode today at 7 PM IST: interviewready.io/learn/system-design-course/building-an-ecommerce-app-1-to-1m/1-what-is-system-design
Cheers!
@praveenkurapati7300 1 month ago
How will the collectors know which span or trace (if there are multiple spans) a log line belongs to? They receive logs from many clients, and there can be duplicate requests as well. Do we store the trace events directly in Bigtable keyed by trace ID? Given 1 TB of data per day and a huge number of logs, how does the trace ID scale? Won't there be collisions?
@gkcs 1 month ago
Every log line carries the span and trace ID added by the Dapper client, so the collectors can correlate them. Collisions are unlikely because the number of requests is in the billions while the ID space is far larger (even accounting for the birthday problem).
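A rough back-of-the-envelope check (an illustration with assumed numbers, not figures from the video): with 64-bit trace IDs as described in the Dapper paper and requests sampled at 1/1024, the birthday bound stays negligible.

import math

def collision_probability(num_ids: int, id_bits: int) -> float:
    # Birthday bound: P(at least one collision) ~= 1 - exp(-n^2 / (2 * N))
    id_space = 2 ** id_bits
    return 1.0 - math.exp(-(num_ids ** 2) / (2 * id_space))

# Assumed volume: 10 billion requests/day, sampled at 1/1024 -> ~10 million trace IDs/day
sampled_traces_per_day = 10_000_000_000 // 1024
print(collision_probability(sampled_traces_per_day, 64))  # ~2.6e-06, i.e. negligible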
@dipanshuc 1 month ago
If we sample requests at a rate of 1/1024, wouldn't we risk missing important spans or traces, such as those containing errors?
@gkcs 1 month ago
At Google's scale, 1/1024 was frequent enough. The paper argues that request volumes are so large that errors still show up in the sampled traces, even at such a low sampling rate.
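To make that concrete with assumed numbers (an illustration, not figures from the paper): even a rare error class still leaves plenty of sampled traces per day at this kind of volume.

# All numbers below are assumptions for illustration
requests_per_day = 1_000_000_000   # 1 billion requests/day to one service
error_rate = 0.0001                # 0.01% of requests hit a given error
sampling_rate = 1 / 1024           # Dapper-style uniform sampling

sampled_error_traces = requests_per_day * error_rate * sampling_rate
print(round(sampled_error_traces))  # ~98 traces/day that captured the error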
@coder3101 1 month ago
At work we also sample traces. Traces usually help you find latencies or bottlenecks; for errors we usually check logs.
@jitxhere 1 month ago
@coder3101 So what tools do you use for both of these cases? Something like Prometheus and Grafana?
@coder3101 1 month ago
We use Lightstep for traces, VictoriaMetrics for metrics (visualised in Grafana), and the typical Elastic stack for logs.
@itjustworks4824 1 month ago
Gaurav is going after Google aggressively
@shubhamdebnath714 1 month ago
How are Prometheus, Datadog, or Kamon different from this?
@gkcs 1 month ago
They all solve similar problems. The scale of Google Dapper is much larger though, with petabytes of request data sampled per day.
@sujoyhalder4735 1 month ago
Could you please make a video on SEPA payments architecture?
@gkcs 1 month ago
Sorry, but how is this related to software engineering?
@sujoyhalder4735 1 month ago
@gkcs I have recently joined an Ireland-based insurance project as a backend developer, where all payments are handled through SEPA. I would like to learn more about the high-level architecture of the system.
@er.sahilmd 1 month ago
Commenting to come back here
@maddymadanraj 1 month ago
To put it in simpler words, is it Google's Prometheus?
@gkcs 1 month ago
Prometheus is a time-series monitoring system for collecting and querying metrics. Dapper is a request tracing system. Prometheus is similar to Monarch: kzbin.info/www/bejne/hKmzhYmemJanfKM
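A toy contrast of the data shapes involved (field names here are illustrative assumptions, not the actual Prometheus or Dapper schemas): a metrics system stores numeric samples per labelled time series, while a tracing system stores spans linked into a tree by trace ID.

# Illustrative data shapes only; field names are assumptions, not real schemas
metric_sample = {                          # metrics (Prometheus/Monarch style):
    "name": "http_requests_total",         # one numeric value per labelled
    "labels": {"service": "checkout"},     # time series, at a timestamp
    "timestamp": 1700000000,
    "value": 42,
}

trace_span = {                             # tracing (Dapper style):
    "trace_id": "4bf92f3577b34da6",        # spans for one request share a trace ID
    "span_id": "00f067aa0ba902b7",
    "parent_span_id": None,                # parent links form the call tree
    "name": "GET /checkout",
    "start_us": 1_700_000_000_000_000,
    "duration_us": 1800,
}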
@praveenkurapati7300 1 month ago
Who assigns the trace ID to these spans? If a request enters a system with a chain of services (Service A -> Service B -> Service C -> Service A), will the logger service add the trace ID to these logs before the collectors process them (which consolidate based on the span IDs and trace IDs)?
@AnandKumar-cc3gs 1 month ago
The agent attached to the service adds this info.
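A minimal sketch of how that propagation commonly works (the header names and ID format below are assumptions, not Dapper's actual wire format): the first service mints the trace ID, every hop mints a new span ID, and both are stamped onto every log line and forwarded downstream.

import secrets

TRACE_HEADER = "x-trace-id"   # hypothetical header names, not Dapper's real format
SPAN_HEADER = "x-span-id"

def instrument_request(incoming_headers: dict) -> dict:
    """Reuse the caller's trace ID if present, else start a new trace; mint a new span ID per hop."""
    trace_id = incoming_headers.get(TRACE_HEADER) or secrets.token_hex(8)  # 64-bit trace ID
    span_id = secrets.token_hex(8)                                          # this hop's span
    parent_span_id = incoming_headers.get(SPAN_HEADER)                      # links the span tree

    # The IDs are stamped on every log line so collectors can group logs by trace and span.
    print(f"trace_id={trace_id} span_id={span_id} parent={parent_span_id} msg='handling request'")

    # Outgoing calls carry the same trace ID and this hop's span ID as the parent.
    return {TRACE_HEADER: trace_id, SPAN_HEADER: span_id}

# Service A (the edge) starts the trace; B and C inherit it.
headers_from_a = instrument_request({})
headers_from_b = instrument_request(headers_from_a)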
@manikmehta2727 1 month ago
Coincidentally, I was reading the Jaeger codebase and then saw this video :)
@gkcs 1 month ago
Cheers!
@timorrs 1 month ago
Does Dapper leverage OTel?
@gkcs 1 month ago
github.com/DapperLib/Dapper/issues/1355