Observability at Google: Massive LOG STREAMING at 11 Gbps

8,234 views

Gaurav Sen

1 day ago

Comments: 23
@gkcs 1 month ago
My website got a new homepage 😊 interviewready.io/
The system design series will have a new episode today at 7 PM IST: interviewready.io/learn/system-design-course/building-an-ecommerce-app-1-to-1m/1-what-is-system-design
Cheers!
@praveenkurapati7300 1 month ago
How will the collectors know which span or trace (if there are multiple spans) a log line belongs to? They receive logs from many clients, and there can be duplicate requests as well. Do we store the trace events directly in Bigtable keyed by trace ID? Given 1 TB of data per day and a huge number of logs, how does the trace ID scale? Won't there be collisions?
@gkcs 1 month ago
Every log line carries the span and trace ID added by the Dapper client, so the collectors can correlate them. Collisions are unlikely because the number of requests is in the billions while the ID space is far larger (even accounting for the birthday problem).
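A rough back-of-the-envelope check (an illustration with assumed numbers, not figures from the video): with 64-bit trace IDs as described in the Dapper paper and requests sampled at 1/1024, the birthday bound stays negligible.

import math

def collision_probability(num_ids: int, id_bits: int) -> float:
    # Birthday bound: P(at least one collision) ~= 1 - exp(-n^2 / (2 * N))
    id_space = 2 ** id_bits
    return 1.0 - math.exp(-(num_ids ** 2) / (2 * id_space))

# Assumed volume: 10 billion requests/day, sampled at 1/1024 -> ~10 million trace IDs/day
sampled_traces_per_day = 10_000_000_000 // 1024
print(collision_probability(sampled_traces_per_day, 64))  # ~2.6e-06, i.e. negligible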
@dipanshuc 1 month ago
If we sample requests at a rate of 1/1024, wouldn't we risk missing important spans or traces, such as those containing errors?
@gkcs 1 month ago
At Google's scale, 1/1024 was frequent enough. The paper argues that request volumes are so large that errors still show up in the sampled traces, even at such a low sampling rate.
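To make that concrete with assumed numbers (an illustration, not figures from the paper): even a rare error class still leaves plenty of sampled traces per day at this kind of volume.

# All numbers below are assumptions for illustration
requests_per_day = 1_000_000_000   # 1 billion requests/day to one service
error_rate = 0.0001                # 0.01% of requests hit a given error
sampling_rate = 1 / 1024           # Dapper-style uniform sampling

sampled_error_traces = requests_per_day * error_rate * sampling_rate
print(round(sampled_error_traces))  # ~98 traces/day that captured the error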
@coder3101 1 month ago
At work we also sample traces. Traces usually help you find latencies or bottlenecks; for errors we usually check logs.
@jitxhere 1 month ago
@coder3101 So what tools do you use for both of these cases? Something like Prometheus and Grafana?
@coder3101 1 month ago
We use Lightstep for traces, VictoriaMetrics for metrics (visualised in Grafana), and the typical Elastic stack for logs.
@itjustworks4824 1 month ago
Gaurav is going after Google aggressively
@shubhamdebnath714 1 month ago
How are Prometheus, Datadog, or Kamon different from this?
@gkcs 1 month ago
They all solve similar problems. The scale of Google Dapper is much larger though, with petabytes of request data sampled per day.
@sujoyhalder4735 1 month ago
Could you please make a video on SEPA payments architecture?
@gkcs 1 month ago
Sorry, but how is this related to software engineering?
@sujoyhalder4735 1 month ago
@gkcs I have recently joined an Ireland-based insurance project as a backend developer, where all payments are handled through SEPA. I would like to learn more about the high-level architecture of the system.
@er.sahilmd 1 month ago
Commenting to come back here
@maddymadanraj 1 month ago
To put it in simpler words, is it Google's Prometheus?
@gkcs 1 month ago
Prometheus is a time-series monitoring system for collecting and querying metrics. Dapper is a request tracing system. Prometheus is similar to Monarch: kzbin.info/www/bejne/hKmzhYmemJanfKM
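A toy contrast of the data shapes involved (field names here are illustrative assumptions, not the actual Prometheus or Dapper schemas): a metrics system stores numeric samples per labelled time series, while a tracing system stores spans linked into a tree by trace ID.

# Illustrative data shapes only; field names are assumptions, not real schemas
metric_sample = {                          # metrics (Prometheus/Monarch style):
    "name": "http_requests_total",         # one numeric value per labelled
    "labels": {"service": "checkout"},     # time series, at a timestamp
    "timestamp": 1700000000,
    "value": 42,
}

trace_span = {                             # tracing (Dapper style):
    "trace_id": "4bf92f3577b34da6",        # spans for one request share a trace ID
    "span_id": "00f067aa0ba902b7",
    "parent_span_id": None,                # parent links form the call tree
    "name": "GET /checkout",
    "start_us": 1_700_000_000_000_000,
    "duration_us": 1800,
}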
@praveenkurapati7300 1 month ago
Who assigns the trace ID to these spans? If a request enters a system with a chain of services (Service A -> Service B -> Service C -> Service A), will the logger service add the trace ID to these logs before the collectors process them (which consolidate based on the span IDs and trace IDs)?
@AnandKumar-cc3gs 1 month ago
The agent attached to the service adds this info.
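A minimal sketch of how that propagation commonly works (the header names and ID format below are assumptions, not Dapper's actual wire format): the first service mints the trace ID, every hop mints a new span ID, and both are stamped onto every log line and forwarded downstream.

import secrets

TRACE_HEADER = "x-trace-id"   # hypothetical header names, not Dapper's real format
SPAN_HEADER = "x-span-id"

def instrument_request(incoming_headers: dict) -> dict:
    """Reuse the caller's trace ID if present, else start a new trace; mint a new span ID per hop."""
    trace_id = incoming_headers.get(TRACE_HEADER) or secrets.token_hex(8)  # 64-bit trace ID
    span_id = secrets.token_hex(8)                                          # this hop's span
    parent_span_id = incoming_headers.get(SPAN_HEADER)                      # links the span tree

    # The IDs are stamped on every log line so collectors can group logs by trace and span.
    print(f"trace_id={trace_id} span_id={span_id} parent={parent_span_id} msg='handling request'")

    # Outgoing calls carry the same trace ID and this hop's span ID as the parent.
    return {TRACE_HEADER: trace_id, SPAN_HEADER: span_id}

# Service A (the edge) starts the trace; B and C inherit it.
headers_from_a = instrument_request({})
headers_from_b = instrument_request(headers_from_a)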
@manikmehta2727 1 month ago
Coincidentally, I was reading the Jaeger codebase and then saw this video :)
@gkcs 1 month ago
Cheers!
@timorrs 1 month ago
Does Dapper leverage OTel?
@gkcs 1 month ago
github.com/DapperLib/Dapper/issues/1355