Impressive Anthony. That's quite the rabbit hole of debugging! Really enjoy your videos please keep posting.
@kRySt4LGaMeR 2 months ago
When you said you had a follow-up, I didn't expect an hour-long in-depth look into the issue. Thanks so much!
@NestiGX 2 months ago
Wow, that's impressive. You summed it up very well, quite enjoyable to watch :D
@Reecepbcups 2 months ago
Amazing!! Such an amazing debug session, thanks for the walkthrough here. You know it's good when you have to break out gdb across py versions, phew
@shadowviper95 2 months ago
very cool to see a "real world" case like this in such detail!
@ilovepeaceandplaying8917 2 months ago
An interesting bug to debug for you and it was really interesting to watch you debug it step by step like crime investigation, loved it, every time I watch your videos, I learn new things. Thanks for putting this as a video.
@MattLayman 2 months ago
That was epic @anthonywritescode! I had to grab some popcorn. :) I love that you not only fixed the problem, but then kept going until you truly understood why things were ok in 3.11 but not in 3.12.
@chadobrien3352 1 month ago
I definitely expected a pen flip in this session. Video brought me back to Mr Swoyer's AP Calc classes.
@anthonywritescode 1 month ago
yoooooo how are you doing!! I was actually doing some pen flips on my last stream :D hope you're doing well!
@chadobrien3352 1 month ago
@@anthonywritescode I'm doing pretty well! Thanks. Our group just picked up (inherited) a python project so been using your channel to catch up on the python goodness. I actually found your channel through @ThePrimeagen who did a reaction video.
@mrswats 2 months ago
This was super interesting! I watched the stream where you were setting all this up, happy to see the conclusion of it!
@teejaded 1 month ago
Great fun! Debugging interpreted languages can be very challenging! I spent quite a lot of time improving the compatibility of the gopherlua socket package. A lot of that reminds me of this, staring at the luasocket C code and comparing the behavior to the Go port.
@RealMineManUK 2 months ago
Great video, showing the whole process from the beginning till the end
@djchainz 2 months ago
Wow, this was epic. Thank you for recording all this, I hope it was cathartic.
@applePrincess 2 months ago
Amazing video as always. Yes, fflush is required since the buffering mode of stdout is platform dependent as per the C spec or POSIX. Most *modern* platforms do line buffering though. Or you could call setvbuf before calling printf to set line buffering.
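The comment above is about C stdio (`fflush`, `setvbuf`). As a rough Python analogue only, the sketch below uses the `io` module to show the same two options: flush explicitly, or turn on line buffering up front. The helper name `make_stream` and the in-memory `BytesIO` sink are assumptions made for illustration; they do not come from the video.

```python
import io


def make_stream(line_buffering: bool):
    """Build a buffered text stream over an in-memory sink (hypothetical helper)."""
    raw = io.BytesIO()                   # stands in for the real file descriptor
    buffered = io.BufferedWriter(raw)    # block buffer, like a fully buffered stdout
    text = io.TextIOWrapper(
        buffered,
        encoding="utf-8",
        newline="\n",                    # no platform newline translation
        line_buffering=line_buffering,   # loosely comparable to setvbuf(..., _IOLBF, ...)
    )
    return raw, text


# Block-buffered: the write stays in the buffer until an explicit flush.
raw_block, block_stream = make_stream(line_buffering=False)
block_stream.write("before fork\n")
pending = raw_block.getvalue()       # bytes that reached the sink so far
block_stream.flush()                 # the counterpart of fflush(stdout)
flushed = raw_block.getvalue()

# Line-buffered: a write containing a newline is pushed down immediately.
raw_line, line_stream = make_stream(line_buffering=True)
line_stream.write("before fork\n")
immediate = raw_line.getvalue()
```

For a real process, `print(..., flush=True)` or `sys.stdout.reconfigure(line_buffering=True)` are the usual ways to get the same effect on `sys.stdout`.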
@redark7 2 months ago
This was quite deep! Changing the thread state and doing fork sounds very dangerous… without a very clear idea of what and why.
@malteplath 2 months ago
Thanks for getting to the bottom of this. I watched your first video on chasing this bug, and I could not wait to see who the culprit was.
@weekendforever 2 months ago
Thanks for sharing. It's a shame one can only subscribe once to a channel and not do it twice for more content. ;)
@anthonywritescode 2 months ago
there's always twitch and @anthonywritescode-vods ;)
@xfilinxcl2481 2 months ago
And then a manager will see this tiny change in the code and think you weren't working, because the code written to find the bug isn't counted anywhere :)
@GiovanniBarillari-z9f 22 days ago
On one side, this was super-interesting to watch, and a good "excuse" to learn about several CPython internals. On the other side, I don't get why such a vast portion of the Python community still relies on uwsgi given the amount of weird stuff it does with interpreter state and the GIL - not to mention the usual segfaults - in place of some modern alternatives :/
@anthonywritescode 22 days ago
there isn't a non-async performant replacement.
@anthonywritescode 22 days ago
and arguably any equivalent thing (say apache mod-wsgi) has to do exactly the same dance
@er63438 2 months ago
Hats off! Spectacular.
@wagneralberto5456 2 months ago
Amazing video, thank you, I learned so much.
@ericng8807 2 months ago
me wishing i was advanced enough to watch the video
@mrswats 2 months ago
Man, same
@Ash-qp2yw 2 months ago
Me wishing this was the top comment. :p
@Quarky_ 2 months ago
Amazing debugging adventure! I'm wondering how long it took you from identifying the bug to fixing and understanding it? Did you work on other things in between, or were you on this uninterrupted? EDIT: I guess what I'm interested in is, if you had a lot of uninterrupted time for this adventure, or if you had to find time here and there. Me personally, I find any deep debugging difficult if I get interrupted.
@anthonywritescode 2 months ago
I answered this in a comment below but most of it was in a 5 hour stream
@Quarky_ 2 months ago
@@anthonywritescode thanks, found that comment! Thanks for sharing this kind of work in your videos :-)
@InkFPS 2 months ago
Very impressive engineering. 👏
@TheJobCompany 2 months ago
Any particular reason why you named your functions with leading underscores in that bisect script? Surely lexical visibility shouldn't matter when you're quickly hacking together a temporary script... right?
@anthonywritescode 2 months ago
no benefit to cutting corners. I try and do things professionally even if it's just a hacky script (like there's no reason for type annotations there either, or functions even!)
@TheJobCompany 2 months ago
@@anthonywritescode love that, I'm guilty of doing the same thing. This is just the first time I've ever thought of every single function in a single-file script as part of its private api haha
@anthonywritescode 13 days ago
today it's a script but you never know how it may evolve in the future. no reason to cut corners only some of the time
@yungouda 2 months ago
To me the bug is actually in python. No need to swap thread states when oldts == newts. Early return instead.
@anthonywritescode 2 months ago
there should never be a normal case where you swap to yourself
@BohdanBorkivskyi 2 months ago
I've got a question about the final fix. So before the fix there was pyuwsgi_setup that was calling uwsgi_setup and two other functions pyuwsgi_init and pyuwsgi_run are calling pyuwsgi_setup meaning they were calling uwsgi_setup. Now that uwsgi_setup is moved out, behavior of pyuwsgi_init and pyuwsgi_run did not change - they still call both pyuwsgi_setup and uwsgi_setup. But calling pyuwsgi_setup does not result in calling uwsgi_setup anymore - is it ok?
@anthonywritescode 2 months ago
technically in the old code if you ran setup without anything else you'd create zombie children -- calling setup without run didn't make any sense
@ruroruro 2 months ago
I still don't quite understand why the new version of PyThreadState_Swap causes this issue. If it ends up being called with oldts == newts, shouldn't it just unlock and then relock the same GIL? Or am I missing something?
@anthonywritescode 2 months ago
the deadlock is in PyThreadState_Restore (when it restores an already locked thread state), not in swap. the self-swap is weird but not the problem
@ruroruro 2 months ago
@@anthonywritescode ah I see, I thought that you tried recompiling 3.12 with the NoGIL version of swap and that got rid of the deadlock, but instead segfaulted at a later point. I guess, I am still a bit confused about how PyThreadState_Swap is relevant to this bug. Or was it just a red herring all along?
@anthonywritescode 2 months ago
(this is in the video but I'll reiterate) the swap happens in the parent process after the first fork so it affects the child only after reload. it stashes a locked thread state (where it didn't before)
@ruroruro 2 months ago
@@anthonywritescode Ah, I think I finally get it. The docs for PyThreadState_Swap say that "The global interpreter lock must be held and is not released". Previously, uwsgi called it without holding the GIL, but that wasn't an issue, presumably because this only occurred during reloading (at which point the Python threads aren't running, so there is no possibility of any race conditions). In 3.12 the Swap logic changed so that it released and reacquired the GIL. I am assuming that attempting to release the GIL when it is already released is a no-op, so the sum total effect of this was that now the Swap would acquire the GIL, if it was not already held (of course, calling it without holding the GIL is UB according to the docs). Did I get that right? If so, I wonder if PyThreadState_Swap should have a sanity check/assert that would verify that the caller does indeed already hold the GIL.
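The two ideas in this thread, a swap that checks its precondition and a hang when something tries to lock state that is already locked, can be modelled with a plain non-reentrant lock. This is a toy analogy only: `ToyLock` and `checked_swap` are hypothetical names, and nothing here reflects how CPython actually implements the GIL or `PyThreadState_Swap`.

```python
import threading


class ToyLock:
    """Toy stand-in for a non-reentrant interpreter lock; not CPython's GIL."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._holder = None  # ident of the thread currently holding the lock

    def acquire(self, timeout: float = -1) -> bool:
        got = self._lock.acquire(timeout=timeout)
        if got:
            self._holder = threading.get_ident()
        return got

    def release(self) -> None:
        self._holder = None
        self._lock.release()

    def held_by_caller(self) -> bool:
        return self._holder == threading.get_ident()


def checked_swap(lock: ToyLock, current: str, new: str):
    """Hypothetical swap that enforces 'the lock must be held' up front."""
    if not lock.held_by_caller():
        raise RuntimeError("swap called without holding the lock")
    return new, current  # (state that is now current, previous state)


lock = ToyLock()

# Calling the swap without the lock is rejected instead of silently proceeding.
try:
    checked_swap(lock, "ts", "ts")
    swap_without_lock = "allowed"
except RuntimeError:
    swap_without_lock = "rejected"

# With the lock held, the swap works, even for a self-swap.
lock.acquire()
now_current, previous = checked_swap(lock, "ts", "ts")

# Trying to take the lock again from the same thread never succeeds; without
# the timeout this call would block forever, which is the shape of a deadlock.
relocked = lock.acquire(timeout=0.1)
lock.release()
```

The timeout on the second `acquire` exists only so the sketch terminates; a blocking acquire in its place would hang the thread.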
@con-f-use 2 months ago
"micro whiskey" 😁
@nexovec 2 months ago
How much time did you spend on this?
@anthonywritescode 2 months ago
the video? about 15 minutes of prep before a one-take. the whole fix? checking the vods it looks like 2.5 hours for the bisect and 5 hours for the actual fix, though that doesn't include the sprinkling of small bits of time over weeks trying to narrow down the actual cause
@nexovec 2 months ago
@@anthonywritescode It would probably take me 7 hours to set up the bisect 😆
@TigerWalts 2 months ago
Just started watching and noticed the video length. If it takes that long to explain then it must have been a doozy.
@king40342 2 months ago
Titillating write-up!
@FocusAccount-iv5xe 2 months ago
+
@Cyberbeni 1 month ago
Could've just jumped from 10:17 to 57:30 by starting with reading the documentation.
@anthonywritescode 1 month ago
eh not quite -- I don't think that line in the docs would have meant anything to me until I learned how all the other pieces fit together