I dramatically sped up my code

25,893 views

Tsoding Daily

1 month ago

Previous Episodes: • Seam Carving
References:
- Source Code: github.com/tsoding/seam-carving
- dl.acm.org/doi/10.1145/127580...

Comments: 105
@kirilldanshin 1 month ago
that issue tracker joke is just pure gold
@bebre_2288 1 month ago
Watching on 2x to help zozin
@Pi7on 1 month ago
🙏
@smashedpotato278 1 month ago
I think that kind of does the opposite for him - kzbin.info/www/bejne/npTFn4quiJ2Le7Msi=VC2ZmMRGZfq2tqSH
@user-ec3rm9wr1n 1 month ago
What's his name !?!?
@OCPyrit 1 month ago
thank
@rubyciide5542 1 month ago
@@user-ec3rm9wr1n TerryChev DavisKov
@tianned 1 month ago
17:20 finally someone has implemented this useful feature in c
@cobbcoding 1 month ago
this is the only secs that C programmers know of.
@Njinx_ 1 month ago
Been waiting for the sex RFC to get implemented
@sqlexp 1 month ago
He should have used division.
@TurtleKwitty 1 month ago
To be fair to chat about the Emacs terminal, it WAS 0.5 seconds, so ~13.5%, which is not entirely insignificant. It's worse UX, because having a progress indicator is nice, but it's not unfair to say that the Emacs terminal was causing some slowdown.
@Ash-qp2yw 1 month ago
Yo! I've never noticed before - the twitch chat is the video's captions! 10/10 work there mr zozin!
@damiankaczmarczyk9872 1 month ago
When you accidentally type `rm -rf /` on your machine and see increasing free disk space: 13:25
@angelcaru 1 month ago
lol i was there
@Marhaenism1930 1 month ago
best solution ever...😂
@lievenpetersen 1 month ago
When working on a grid, making a drawing can help quite a bit with the thought process. I learned that in high performance computing class, where we also did optimisations on grids with mock weather calculations.
@ProjectV95 1 month ago
0:10 RETURN OF THE RED CAR
@apppples 1 month ago
Construct tuples of indices for the Sobel points to recalculate as you remove the seam, cutting off the top of the eight-neighborhood; now you have a set of indices to recalculate the Sobel filter at.
@machine.angel.777 1 month ago
UIUA MENTIONED
@sossupummi 1 month ago
waiting for the episode
@fennecbesixdouze1794 1 month ago
I don't know if he ever corrected himself on this in the video but right off the bat: no, they were not resizing in real time in the 2005 demo. It was all pre-computed, all the seams were stored and the resizing just added and deleted from the array of pre-computed seams.
@rogo7330 1 month ago
32:03 or you can use the '-fwhole-program' GCC flag to let the compiler know that it's the final binary and objects do not need to be visible outside. Honestly, I can't imagine what use there is for an executable ELF file with a main function that has its objects available outside, unless you're doing a very weird thing like dlopen-ing this file despite it not being a shared object.
@yjlom 1 month ago
to embed it or part of it in other applications?
@RandomGeometryDashStuff 1 month ago
34:46 just recompute 2 pixels to the right and 2 pixels to the left of the seam? edit: 36:36 chat
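A minimal sketch of the "recompute only next to the seam" idea from the two comments above, not the code from the stream. It assumes a row-major float luminance matrix, out-of-bounds neighbours treated as 0, a squared Sobel magnitude as the energy, and a connected vertical seam (adjacent rows differ by at most one column). The names `sobel_at` and `remove_seam_and_patch` are hypothetical. Under those assumptions, only columns `seam[y]-2` through `seam[y]+1` of the narrowed row have a 3x3 window that straddles the seam, so everything else can simply be shifted.

```c
#include <assert.h>
#include <string.h>

/* Squared Sobel magnitude at (cx, cy). Out-of-bounds neighbours count as 0
   (an assumption; the real project may handle borders differently). */
static float sobel_at(const float *lum, int w, int h, int stride, int cx, int cy)
{
    static const float gx[3][3] = {{1, 0, -1}, {2, 0, -2}, {1, 0, -1}};
    static const float gy[3][3] = {{1, 2, 1}, {0, 0, 0}, {-1, -2, -1}};
    float sx = 0.0f, sy = 0.0f;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            int x = cx + dx, y = cy + dy;
            float c = (0 <= x && x < w && 0 <= y && y < h) ? lum[y * stride + x] : 0.0f;
            sx += c * gx[dy + 1][dx + 1];
            sy += c * gy[dy + 1][dx + 1];
        }
    }
    return sx * sx + sy * sy;
}

/* Remove one pixel per row from both matrices, then recompute the gradient
   only for the four columns next to the seam in each row. */
static void remove_seam_and_patch(float *lum, float *grad, int *w, int h,
                                  int stride, const int *seam)
{
    int width = *w;
    for (int y = 0; y < h; ++y) {
        int x = seam[y];
        assert(0 <= x && x < width);
        /* The patch below relies on the seam being connected. */
        assert(y == 0 || (seam[y] - seam[y - 1] <= 1 && seam[y - 1] - seam[y] <= 1));
        size_t tail = (size_t)(width - x - 1) * sizeof(float);
        memmove(&lum[y * stride + x], &lum[y * stride + x + 1], tail);
        memmove(&grad[y * stride + x], &grad[y * stride + x + 1], tail);
    }
    width -= 1;
    for (int y = 0; y < h; ++y) {
        for (int x = seam[y] - 2; x <= seam[y] + 1; ++x) {
            if (0 <= x && x < width) {
                grad[y * stride + x] = sobel_at(lum, width, h, stride, x, y);
            }
        }
    }
    *w = width;
}
```

Whether this beats recomputing everything depends on the image size and should be measured; the sketch only shows which cells are actually stale.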
@salim444 1 month ago
can someone explain why removing sqrtf would break the algorithm? I understand that sqrt is not linear but it should be a close enough energy function
@ryan-heath 1 month ago
Another counterintuitive optimization might be to remove the seam in separate loops. Right now, three different memory structures are loaded into and evicted from the CPU cache in one loop. With three separate loops the CPU has far fewer cache misses, which can improve the speed considerably, depending on the image size.
@TurtleKwitty 1 month ago
And can be parallelized
@ryan-heath 1 month ago
@@TurtleKwitty he tried that using OpenMP or something like that. But the overhead of thread creation actually slowed it all down.
@TurtleKwitty 1 month ago
@@ryan-heath He tried the single pass for all three with OpenMP, but throwing each in its own loop, so only creating two threads so one for each mat, would likely work better. The best would be for OpenMP to have a thread pool model since it's constantly being used, but I don't know OpenMP.
@ryan-heath 1 month ago
@@TurtleKwitty Yes, true, that might be possible. I don't know about OpenMP either.
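The "separate loops" suggestion in this thread could look roughly like the sketch below. It assumes three row-major matrices (pixels, luminance, gradient) sharing one `stride`, which is a guess at the layout rather than the project's actual structures; `remove_seam` and `remove_seam_separately` are hypothetical names. Each call walks a single matrix top to bottom, so only one buffer is being streamed through the cache at a time.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Delete column seam[y] from every row of one row-major matrix by sliding
   the tail of the row one element to the left. */
static void remove_seam(void *data, size_t elem_size, int width, int height,
                        int stride, const int *seam)
{
    unsigned char *bytes = data;
    for (int y = 0; y < height; ++y) {
        int x = seam[y];
        assert(0 <= x && x < width);
        unsigned char *row = bytes + (size_t)y * (size_t)stride * elem_size;
        memmove(row + (size_t)x * elem_size,
                row + (size_t)(x + 1) * elem_size,
                (size_t)(width - x - 1) * elem_size);
    }
}

/* Three independent passes instead of one loop that touches all three
   matrices per row. */
static void remove_seam_separately(uint32_t *pixels, float *lum, float *grad,
                                   int width, int height, int stride,
                                   const int *seam)
{
    remove_seam(pixels, sizeof(*pixels), width, height, stride, seam);
    remove_seam(lum, sizeof(*lum), width, height, stride, seam);
    remove_seam(grad, sizeof(*grad), width, height, stride, seam);
}
```

Because the three passes do not share any data except the read-only seam, they are also the natural unit to hand to separate threads, if the thread start-up cost mentioned above can be amortized.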
@RalphScott-wu8ei 1 month ago
That pull of the ponytail behind the back at 18:52 makes you look so mature I can't breathe
@mbarrio 1 month ago
29:26 There is some noise in the webcam. Does anyone know the cause of this noise?
@blackbeard3449 1 month ago
40:00 Removing logs made it go from 3.7 to 3.2. How is that not significant?
@remrevo3944 1 month ago
x86_64 only has a 48-bit (physical) memory space. (Though virtual memory addresses still need to be 64-bit.) There might also be some memory alignment bits you can save on. Especially when using a GC, it might be better to have a higher alignment.
@hubstrangers3450 1 month ago
Hi Tsoding, sorry to disturb you. Could you consider sharing your thoughts on an advanced database system... the Photon paper (presentation on yt =v=PwnUf6_0H0k)? Discuss the research paper and a few others that led to the Photon paper... these lead towards.... I think you will be able to deduce that scenario.... thx
@user-ec3rm9wr1n 1 month ago
The Cutest ever 🐱
@yt.Interest 1 month ago
17:21 Finally someone added this!!!
@ATAG-yn5pd 1 month ago
7:05 🤣 watching tsoding videos can also help :D
@hubstrangers3450 1 month ago
Thank you...
@xbelanch 1 month ago
Every time I watch him I think he's the Slavoj Žižek of programming: "and so on and so on"
@lievenpetersen 1 month ago
13:35 "Oh shit Oh Fuck" xD
@vicenteeduardo5598 1 month ago
Hey @tsoding, I think you can remove the sqrt because the algorithm only requires you to find the minimum value, not the accurate value.
@BramBolder 1 month ago
That's not really correct, as the terms are summed in the DP algorithm. The minimum of a sum of roots does not need to be the minimum of a sum of the original values. As an example, take the values [ 4,4,4,4,1 ]: their sum is 17 and the sum of roots is 9. Compare that to the values [ 1,1,1,1,16 ]: their sum is 20 and the sum of roots is 8. When using the sum of roots, the second list is the minimum (9 > 8), but with your suggestion the first list would win (17 < 20). [...] >= 0.0, so you could have used a simple negative float value instead of the integer NaN-cast. Even though sx and sy might be negative, sqrt(sx*sx+sy*sy) never is.
@vicenteeduardo5598 1 month ago
@@BramBolder Thanks for making me realize that. And I do think it doesn't change that much, as you said; tsoding just changed the repo and is not using sqrtf anymore, and it works just fine!
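The counterexample from this thread can be checked with a few lines of C. This is only an illustration with hypothetical helper names: each array holds the gradient magnitudes along one candidate seam, so summing the magnitudes corresponds to the version with sqrt and summing their squares corresponds to the version without it. The magnitudes [2,2,2,2,1] and [1,1,1,1,4] are the square roots of the lists quoted above.

```c
#include <assert.h>

/* Seam cost when the per-pixel energy is the gradient magnitude
   (the "with sqrt" variant). */
static float cost_with_sqrt(const float *magnitude, int n)
{
    assert(n >= 0);
    float total = 0.0f;
    for (int i = 0; i < n; ++i) total += magnitude[i];
    return total;
}

/* Seam cost when the per-pixel energy is the squared magnitude
   (the "sqrt removed" variant). */
static float cost_without_sqrt(const float *magnitude, int n)
{
    assert(n >= 0);
    float total = 0.0f;
    for (int i = 0; i < n; ++i) total += magnitude[i] * magnitude[i];
    return total;
}
```

With seam A = [2,2,2,2,1] and seam B = [1,1,1,1,4], `cost_with_sqrt` prefers B (8 against 9) while `cost_without_sqrt` prefers A (17 against 20), so dropping the root can change which seam is picked even though it keeps every individual comparison between two pixels intact.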
@XeZrunner 1 month ago
red car
@mndtr0 1 month ago
Do you work as a programmer in industry or just do it for fun?
@fg786 1 month ago
I think he's unemployed, at least he said so in his Q&A video.
@TheTormozz 1 month ago
How the fuck can he be unemployed? If he is unemployed, what are we?
@Ash-qp2yw 1 month ago
@@TheTormozz By choice, I think. He prefers to work on his own projects and will work for short periods to make enough money to sustain his frugal lifestyle for 6 months to a year
@mixtoor 1 month ago
Another optimization that comes to mind is to memmove only when you absolutely must, and only as much as you need.
@remrevo3944 1 month ago
Isn't that already what's happening? The most he could change is not to move the things that have to be recomputed anyway, but except for that I don't see anything that could be optimized that way.
@mixtoor 1 month ago
@remrevo3944 not exactly. You don't have to move the whole right side after the seam. Just the few pixels next to it. Or maybe none at all if you store rows of pixels in linked lists.
@remrevo3944 1 month ago
@@mixtoor I mean the picture gets slimmer, so *all* pixels have to be moved necessarily. Linked lists seem like a pretty big overhead, and memmove is generally pretty fast anyway, but I guess it would not hurt to generate a flame graph to look into where the program is actually spending its time, instead of just guessing. (I'm already through the phase of trying something without data backing me up. I tried removing the scanning for the 0xFFFFFF NaN floats and it barely made any impact.)
@mixtoor 1 month ago
@@remrevo3944 Of course you should measure all the stuff first, but hypothetically: imagine that instead of memmoves you put a tombstone where the seam was (alpha=0 seems to be a good enough tombstone). Also, for each row, record the position of the leftmost tombstone. Now when you need to access pixels to the left of the leftmost tombstone, you don't have to do anything special at all. Otherwise, move all the tombstones between the leftmost tombstone position and the point you're accessing to the right of what you're accessing, then update the leftmost tombstone position. In the end, move the rest of the tombstones to the right. Something like that. Also, I didn't think about it too much, but as soon as you create those tombstones, you actually free 3 bytes for your own use. You may want to put some speedup structure there as well. The version with linked lists sounds like too much overhead, I agree. But it might allow you to avoid this tombstone craziness.
@mixtoor 1 month ago
@remrevo3944 Of course you'd have to measure things. But still, I'll elaborate on the idea a little further. First, on your "all pixels" argument: of course eventually every pixel that needs to move would have to move. But in the version from the video they would move one position at a time, and with what I am suggesting, hopefully they can move in bigger steps. Instead of using memmove on the right part of the row, you would put a "tombstone" where the seam was, and also record the leftmost tombstone position for each row. Now continue the algorithm as usual with the following modifications. Whenever the algorithm needs to access a pixel, it can be to the left of the leftmost tombstone or not. If it is to the left, you don't need to do anything special. Otherwise, the pixel we're accessing is not what we wanted, so we have to fix the situation. To do that, move all the tombstones between the leftmost tombstone and the accessed pixel position to the right of the accessed pixel. Then update the position of the leftmost tombstone. When the algorithm ends, move all the tombstones to the end. Note that as the algorithm progresses, tombstones will tend to clump together in the middle of the image, so you can move pixels more than one position at a time. How to denote tombstones is up to the implementer. Bitsets are one option. Another is some special pixel value (with alpha=0 or something). The linked list idea does sound like too much overhead, but it is simpler. But you'd also have to measure to be sure :)
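A much simplified sketch of the tombstone idea, to make the "move in bigger steps" part concrete. It assumes live pixels are always opaque so that an all-zero pixel (alpha = 0) can serve as the marker, and it leaves out the per-row leftmost-tombstone bookkeeping and the on-access fix-up described above: here removed pixels are only marked, and each row is compacted in one pass at the end. `TOMBSTONE`, `bury_pixel` and `compact_row` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: live pixels are fully opaque, so a pixel with alpha == 0
   (here: all bits zero) never occurs naturally and can mark a removed cell. */
#define TOMBSTONE 0x00000000u

/* Mark the pixel at physical index x as removed instead of shifting the row. */
static void bury_pixel(uint32_t *row, int width, int x)
{
    assert(0 <= x && x < width);
    row[x] = TOMBSTONE;
}

/* Slide all live pixels left over every tombstone in a single pass.
   Returns the number of live pixels, i.e. the new width of the row. */
static int compact_row(uint32_t *row, int width)
{
    int live = 0;
    for (int x = 0; x < width; ++x) {
        if (row[x] != TOMBSTONE) {
            row[live++] = row[x];
        }
    }
    return live;
}
```

After k seams have been buried, a pixel to the right of all of them moves k positions in one copy instead of one position k times. The hard part that this sketch skips is that every read between burials has to translate a logical column into a physical index, which is exactly what the leftmost-tombstone record is meant to make cheap.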
@Pi7on 1 month ago
Raylib visualization next?? 🥹
@biohazardouszxellium 1 month ago
Has Tsoding ever tinkered with embedded systems?
@RafaelHarth 1 month ago
Probably asked many times before but, why C instead of C++?
@caio757 1 month ago
🏴󠁧󠁢󠁷󠁬󠁳󠁿
@jayceedee5807 1 month ago
Weak 5x developer stuff. I once wrote code so bad it could be sped up by a factor of 3000!
@dkkogmaw1311 1 month ago
I thought I would see some heavy SSE usage ngl prob a good use case for avx2
@rogo7330 1 month ago
It was fast enough 20 years ago without SSE and, obviously, before any AVX. Just delete code that does room heating operation and program will be faster.
@dkkogmaw1311 1 month ago
@@rogo7330 well ofc this is still a huge improvement by just refactoring but still would love to see some nice xmm/ymm register usage ;)
@lievenpetersen 1 month ago
Instead of writing 0.000000001 I like to write 1e9, because it eliminates the need to count zeros correctly.
@drdca8263 1 month ago
1e-9 ?
@anon_y_mousse 1 month ago
Agreed, scientific notation is very useful for speed, but unfortunately not for preventing errors. Always make sure to check that when you write your exponent that you don't neglect the minus symbol when you want a really small number.
@lievenpetersen 1 month ago
@@drdca8263 ehh, sorry yeah. Depending on / or *
@lievenpetersen 1 month ago
@@anon_y_mousse :D
@ryan-heath 1 month ago
I always forget if the zero before the floating point is included or not ...
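To make the `1e9` versus `1e-9` point in this thread concrete, here is a small sketch. It assumes the constant is used to turn a nanosecond count into seconds, as with the fields of a `struct timespec`; the helper names are hypothetical. `1e-9` is the same value as `0.000000001` and is the one to multiply by, while `1e9` is `1000000000.0` and is the one to divide by.

```c
#include <assert.h>

/* Seconds as a double, scaling the nanosecond part by multiplication.
   1e-9 replaces 0.000000001, so there are no zeros to miscount. */
static double secs_by_multiplying(long sec, long nsec)
{
    assert(0 <= nsec && nsec < 1000000000L);
    return (double)sec + (double)nsec * 1e-9;
}

/* Same conversion, scaling by division. 1e9 replaces 1000000000.0. */
static double secs_by_dividing(long sec, long nsec)
{
    assert(0 <= nsec && nsec < 1000000000L);
    return (double)sec + (double)nsec / 1e9;
}
```

The two variants can differ in the last bit because `1e-9` has no exact binary representation, which does not matter for timing a program but is a reason not to compare their results with `==`.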
@ivanjermakov 1 month ago
Perf optimization without using a profiler?
@diegorocha2186 1 month ago
He literally talked about it on the video. Have you watched the entire session?
@addcoding8150 1 month ago
You can do the kinds of performance improvements he does without a profiler. That's not even "real" optimization. He is just trying to reduce the obvious work points. This can be done on a whiteboard. The most common performance cost centers in that kind of program are memory layout anyway, so a normal profiler wouldn't help anyway.
@jimmyporter8941 1 month ago
@@addcoding8150 Hmm... memory layout. He's storing the bitmaps in horizontal lines. Perhaps he'd get more cache hits by having the bitmaps stored as vertical slices. Given that the seams are mostly vertical. (apols if he does that in the video. Haven't finished watching yet.)
@ivanjermakov 1 month ago
@@diegorocha2186 not the whole stream, but I left a comment seeing that he tries to measure alg speed by commenting out a part of a function. Not the most efficient way to find hot paths.
@XeZrunner 1 month ago
1:05:12
@TankorSmash 1 month ago
You'd make some good Uiua content if you ever did
@anon_y_mousse 1 month ago
I really don't like KZbin's new layout. It is total garbage. I got started over a decade earlier than you, and as far as I can see, the only real difference between then and now is that now most people only really get into programming because they want to make the big bucks, and they often have negative passion for it and write code about as well as a blind squirrel running across the keyboard. As far as how I was treated back in school, no one cared. Where I went to school no one used nerd as an insult or even as a word at all. Maybe it was because I showed them the value in knowing how to program by modifying QBasic games, but who knows. Probably the thing that we all enjoyed the most was giant explosions in Gorilla, although sending messages over the network was used quite a bit once I showed them how.
@seepilot8 1 month ago
Love his German scheiße word. 😂
@blackhaze3856 1 month ago
The code is written before pressing the keys
@Mozartenhimer 1 month ago
I put a pull request on bfjit that you may find entertaining.
@anon_y_mousse 1 month ago
Interesting idea. Although, it does decrease the portability some the way you've implemented it, as long as it runs on Linux I've stopped caring and I'm certainly trying my best to get others to not care either.
@Lewdovico 1 month ago
🚗
@r2com641 1 month ago
🇦🇿
@rvft 1 month ago
Whoa, what is an Azerbaijani doing here? I would never have expected it.
@r2com641 1 month ago
@@rvft you are underestimating us bro 😉
@rvft 1 month ago
@@r2com641 further investigating to your channel, I feel like you don't actually "live" in Azerbaijan, I might be wrong tho
@r2com641 1 month ago
@@rvft yeah, but I am still originally from there
@OlivierDALET 1 month ago
Why not use NAN and isnan from math.h?
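For reference, the standard-library route asked about here could look like the sketch below. It assumes the NaN is used as a "this cell needs recomputing" marker in a float matrix, which is how the comments describe it; `mark_stale`, `is_stale` and `count_stale` are hypothetical names. `NAN` and `isnan` are the standard C99 macros from `math.h`.

```c
#include <assert.h>
#include <math.h>

/* Flag one cell as needing recomputation by storing a quiet NaN in it. */
static void mark_stale(float *cells, int n, int i)
{
    assert(0 <= i && i < n);
    cells[i] = NAN;
}

/* A cell is stale exactly when it holds a NaN. */
static int is_stale(float value)
{
    return isnan(value) != 0;
}

/* Count how many cells still wait for recomputation. */
static int count_stale(const float *cells, int n)
{
    int stale = 0;
    for (int i = 0; i < n; ++i) {
        if (is_stale(cells[i])) stale += 1;
    }
    return stale;
}
```

One caveat: with GCC or Clang, building with `-ffast-math` lets the compiler assume NaNs never occur, so `isnan` checks may be optimized away. A negative sentinel, as suggested elsewhere in the comments for an energy that is never below zero, avoids that dependency.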