"How NOT to Measure Latency" by Gil Tene

103,422 views

Strange Loop Conference

8 years ago

Time is Money. Understanding application responsiveness and latency is critical, but good characterization of bad data is useless. Gil Tene discusses some common pitfalls encountered in measuring latency and response time behavior. He introduces how simple, open source tools can be used to improve and gain higher confidence in both latency measurement and reporting.
Gil Tene
AZUL SYSTEMS
@giltene
Gil Tene is CTO and co-founder of Azul Systems. He has been involved with virtual machine and runtime technologies for the past 25 years. His pet focus areas include system responsiveness and latency behavior. Gil is a frequent speaker at technology conferences worldwide, and an official JavaOne Rock Star. He pioneered the Continuously Concurrent Compacting Collector (C4) that powers Azul's continuously reactive Java platforms. In past lives, he also designed and built operating systems, network switches, firewalls, and laser-based mosquito interception systems.

Comments: 16
@pranytt3485 1 year ago
Key takeaways for me:
1. Most tools that capture response times report the 99th percentile latency for each 30-second window. For example, Prometheus metrics are scraped every minute. But the real thing to look at is the max response time.
2. Gatling fixed the coordinated omission problem. Most other tools, such as JMeter, still have it. So use Gatling for load generation and reporting.
3. I didn't fully understand coordinated omission, but I now know that it is bad and needs to be watched for.
4. When a graph shows a sudden spike, it indicates "possible" coordinated omission. If a graph grows smoothly, it indicates that there is no bad data. There may be exceptions to this rule.
5. There is no point in looking at percentile graphs if you don't have performance goals set for your service. If you are comparing two systems and your target is 20 ms, you could plot graphs and see the maximum throughput each system supports while keeping latency at 20 ms.
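For readers who, like this commenter, find coordinated omission hard to picture, here is a minimal Python sketch. All numbers are illustrative assumptions chosen for the sketch: a hypothetical service answers in 1 ms for 100 s and then freezes for 100 s, and the intended load is one request every 10 ms. The function names are made up for this example. It compares what a tester records when it waits for each response before sending the next request with what requests sent on the intended schedule would have experienced.

```python
import math

# Illustrative assumptions, not measurements:
INTERVAL_MS = 10        # intended pacing: one request every 10 ms (100 req/s)
HEALTHY_MS = 100_000    # the service answers normally for the first 100 s
STALL_MS = 100_000      # then it freezes completely for 100 s
SERVICE_MS = 1          # response time while healthy
END_MS = HEALTHY_MS + STALL_MS


def latency_at(send_ms):
    """Response time (ms) of a request sent send_ms into the run."""
    if send_ms < HEALTHY_MS:
        return SERVICE_MS
    # Sent during the stall: wait until the stall ends, then get served.
    return (END_MS - send_ms) + SERVICE_MS


def closed_loop_samples():
    """Tester that waits for each response before sending the next request."""
    samples, now = [], 0
    while now < END_MS:
        lat = latency_at(now)
        samples.append(lat)
        done = now + lat
        # Next send: the first pacing slot at or after the response arrived.
        now = max(now + INTERVAL_MS,
                  math.ceil(done / INTERVAL_MS) * INTERVAL_MS)
    return samples


def open_loop_samples():
    """Every request is sent on the intended schedule, stall or not."""
    return [latency_at(t) for t in range(0, END_MS, INTERVAL_MS)]


def percentile(samples, pct):
    """Nearest-rank percentile of a list of samples."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

Under these assumptions the waiting tester records 10,001 samples and reports a 99.99th percentile of 1 ms, because the whole stall shows up as a single slow sample. The on-schedule view has 20,000 samples and a 75th percentile of about 50 s. Both views agree on the maximum, which is one reason the max is worth watching.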
@TheSuckerOfTheWorld 8 years ago
10 Minutes in and I already see the very obvious flaw that +Gil Tene pointed out in my day-to-day monitoring. Great talk!
@whitegelfling 8 years ago
Coordinated omission: one issue here is often encountered in business metrics, namely that the bosses want simple, easy, and reliable numbers to look at. To the person behind the project, it looks like a system that irons out a rare case, without an understanding of the maths behind it.
@timothydsears 8 years ago
Terrific talk about load testing and lazy thinking. The early part probably applies to anyone thinking about metrics for a complex system.
@TestAutomationTV 1 year ago
Nice talk, I've read good things about it. Now starting to listen, looking forward to finding some good stuff about performance testing.
@WilsonMar1 8 years ago
[6:52] I don't have the data. A common problem we have is we plot only what is convenient. We only plot what gives us nice colorful charts. We choose the noise to display.
@Turalcar 1 year ago
I'm more used to graphs being split by request kind. To me, the first thing that jumped out was the large difference between the 50th and 75th percentiles.
@minimaddu 8 years ago
Great talk! I'm curious, we get most of our production response time stats from AWS load balancer logs. Is that an accurate measure of response time?
@ruimeireles1695 3 years ago
Can anyone list all the tool names mentioned in the presentation? I can't find some of them, probably because I'm not spelling the names correctly.
@ericj1380 2 years ago
@12:04, is this because of 5 page loads/40 resources per page increasing the chance of hitting above p99? If that’s the case couldn’t you just adjust each graph to be on a per-resource or per-page basis? Which seems like it would directly reflect the percentile.
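The arithmetic behind this question can be sketched in a few lines of Python. It assumes the figures quoted in the comment (5 page loads of 40 resources each, so 200 requests) and, as a simplifying assumption, that response times are independent of one another. The function name is made up for this example.

```python
def chance_of_exceeding(pct, requests):
    """Probability that at least one of `requests` independent requests
    is slower than the pct-th percentile latency."""
    return 1.0 - (pct / 100.0) ** requests


# Figures quoted in the comment: 5 page loads x 40 resources per page.
session_requests = 5 * 40
p_session = chance_of_exceeding(99, session_requests)
```

Under the independence assumption, `p_session` comes out at roughly 0.87: a single request has a 1% chance of being slower than p99, but a session of 200 requests very likely contains at least one such request.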
@whitegelfling 8 years ago
OK, I'm only a few minutes in and my brain hurts. I can't believe that people seriously ignore the max in things like this. Scary.
@MikkoRantalainen 4 years ago
I agree. Only maximum (worst case latency) and median latency are worth watching. Everything else is just noise.
@MikkoRantalainen 4 years ago
Note that "median" is not the target; the difference between the worst case latency and the median latency is the part of the picture that could get better if you fix the bad stuff. Getting median latency down often requires LOTS of changes to the system.
@MikkoRantalainen 4 years ago
All well-made latency graphs should have the number of requests per second on the horizontal axis and the maximum response time on the vertical axis. The request rate at which the maximum response time becomes too high is the limit.
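Reading the limit off such a graph can be sketched as a small Python function. The data points and the 20 ms limit below are hypothetical, invented only for illustration, and the function name is made up for this example. It assumes one load-test run per request rate, each summarized by its maximum response time.

```python
def max_sustainable_rate(results, limit_ms):
    """Highest request rate whose worst-case latency stays within limit_ms.

    `results` is a list of (requests_per_second, max_latency_ms) pairs,
    one per load-test run. Scanning stops at the first rate that breaks
    the limit, so a lucky run at a higher rate is not counted.
    """
    best = None
    for rate, worst_ms in sorted(results):
        if worst_ms > limit_ms:
            break
        best = rate
    return best


# Hypothetical measurements: (requests per second, max response time in ms)
runs = [(100, 5), (200, 8), (400, 19), (800, 250)]
limit = max_sustainable_rate(runs, limit_ms=20)
```

With these invented numbers the function returns 400 requests per second. It returns `None` when even the lowest tested rate exceeds the limit.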
@GeorgeTsiros 1 year ago
That is why "how to measure" is, by itself, an entire class in (at least) physics courses.
@tirumaraiselvan1 6 months ago
19:34 should be 100 measurements of 100 s each, no? 100 requests will be sent that second and each will be stalled for 100 s.