ONE TERABYTE of RAM saved with a single line of code (advanced) anthony explains

  55,585 views

anthonywritescode


1 day ago

Comments: 78
@htol78 10 months ago
The thing I would really enjoy seeing is the troubleshooting process that led to this solution.
@anthonywritescode 10 months ago
you'll want to check out next week's video then :)
@avapsilver 10 months ago
I work at Datadog and it's so cool seeing you use it and visualize everything nicely!!
@Slangs 10 months ago
It always feels good to see our product being used in the wild, even by major companies. Great job guys, amazing product.
@myalpaca5 10 months ago
How do you locate the spot in the code where optimization is possible? Did you learn about gc.freeze() somewhere else first and then realize it could be used in the project? Or did you notice the high memory usage of the services, actively look for potential solutions, and then encounter gc.freeze()?
@anthonywritescode 10 months ago
it depends on the framework and how things are set up. usually you want it as late in the parent process before forking as possible. I've known about this particular function for a while (even made a video on it a year or so ago). I'm currently trying to upgrade python and was hunting for a memory leak and decided to try this out for fun (and profit). had some success with this and similar approaches at previous employers
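To make the pattern concrete, here is a minimal sketch of the idea being described: allocate everything in the parent, call gc.freeze(), then fork. The module names and the fake workload below are purely illustrative, not Sentry's or Celery's actual code.

    import gc
    import os
    import time

    def load_application():
        # stand-in for importing django, loading settings, warming caches, etc.
        return {"config": list(range(100_000))}

    def worker_loop(app):
        # stand-in for the task loop that each forked worker runs
        time.sleep(1)

    def main():
        app = load_application()   # allocate as much as possible in the parent
        gc.freeze()                 # move everything currently tracked into the
                                    # permanent generation so later collections
                                    # never touch (or rewrite) those objects
        for _ in range(4):          # prefork: children share the parent's pages
            if os.fork() == 0:      # until they write to them (copy-on-write)
                worker_loop(app)
                os._exit(0)
        for _ in range(4):
            os.wait()

    if __name__ == "__main__":
        main()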
@lucaalberigo6302 10 months ago
For me, locating a problem is usually a mix of debugging, experience (checking known bottlenecks for your application, for example: disk access, API interactions, parsing of big data sources, DB queries), and benchmarking; running operations over different data to evaluate response times. You follow the data step by step until you hit a performance drop in a specific function (rarely is your whole chain of calls equally slow in all parts). The whole optimization process usually goes like this: optimization is needed for a certain piece of code because it is too slow or resource-hungry; we analyze the code to try to understand the cause of the issue (e.g. an inefficient algorithm, too much memory used, slow operation because of too many API/database requests). We first try to simply make the code better. If that is not sufficient, we try to apply known but perhaps more complex optimization methods (if appropriate) like caching or optimizing external interactions. If we are still not satisfied, we try to find new solutions by studying existing libraries, checking whether we need new tools, or even restructuring part of the code or infrastructure. It is a set of skills you acquire through study (knowing the industry's way of doing something) and knowing the tools at your disposal by reading the documentation of your libraries; over time you build up a set of solutions, at least for many common problems.
@redcrafterlppa303 10 months ago
Isn't that why you generally avoid fork and use threads instead? All threads live in the same process, sharing the heap while each having its own stack.
@JohnZakaria 10 months ago
But Python can't achieve true parallelism when you use threads. Maybe the new subinterpreters might deliver a solution.
@redcrafterlppa303 10 months ago
@@JohnZakaria I would say that's a design flaw in the language. Just another reason to hate on python 😂
@JohnZakaria 10 months ago
Python was designed at a time when single-core CPUs were the norm. Yeah, it might be a problem now. Yes, they could release Python 4 and break everything to make that work, but that's painful for everyone.
@wernersmidt3298 10 months ago
@@JohnZakaria Wasn't there some news that they are going to remove the GIL?
@JohnZakaria 10 months ago
You're right, I forgot about PEP 703. I think it was more for library devs. The PEP by itself wouldn't speed up code. If I remember correctly it would slow down regular code.
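As a side note on the GIL discussion above, here is a small, self-contained illustration of why threads don't help CPU-bound Python code today (on a free-threaded build per PEP 703 the thread timing would look different; the numbers are not from the video).

    import time
    from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

    def burn(n):
        # CPU-bound busy work with no I/O, so the GIL is the limiting factor for threads
        total = 0
        for i in range(n):
            total += i * i
        return total

    def timed(executor_cls):
        start = time.perf_counter()
        with executor_cls(max_workers=4) as ex:
            list(ex.map(burn, [5_000_000] * 4))
        return time.perf_counter() - start

    if __name__ == "__main__":
        print("threads:  ", timed(ThreadPoolExecutor))    # roughly serial under the GIL
        print("processes:", timed(ProcessPoolExecutor))   # can use multiple cores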
@codeman99-dev 10 months ago
Talk about some great numbers to add to the resume!
@ember2081 10 months ago
you've got to be so proud of yourself jesus
@glichking6812 4 months ago
I know nothing about any code or programming but I keep getting this video and still have no idea what's being said or how the solution worked
@pieter5466 10 months ago
4:40 Oh this is cool, I really need to learn more about the C implementation underlying Python. Edit: now I wonder how a garbage collector handles circular references...
@lonterel4704 10 months ago
Generational algorithm
@Barteks2x 10 months ago
I don't know for sure how it's implemented in Python, but in general a tracing GC works not by deleting the stuff that needs to be deleted, but by finding everything that is still referenced and keeping it (traversing the object graph and keeping everything that is reachable); whatever is left over gets freed.
@throwaway3227 10 months ago
It's not the way Python does it, but Floyd’s Cycle Finding Algorithm is a pretty interesting way of finding circular references.
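To make the reachability point above concrete: reference counting alone cannot free two objects that point at each other, but Python's cycle collector can. A tiny demonstration of the observable behavior only, not of CPython's internal algorithm:

    import gc

    class Node:
        def __init__(self):
            self.other = None

    def make_cycle():
        a, b = Node(), Node()
        a.other = b
        b.other = a   # a and b now reference each other
        # after this function returns, nothing outside refers to a or b,
        # but their reference counts never reach zero because of the cycle

    gc.disable()         # pause automatic collection so the cycle lingers
    make_cycle()
    print(gc.collect())  # the cycle collector finds the unreachable pair; prints > 0
    gc.enable()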
@vinitkumar2923 10 months ago
Could we use this in any Django project that uses Celery, or is it only specific to Sentry?
@anthonywritescode 10 months ago
should be pretty universally useful, yeah
@jlowe_n 9 months ago
Hey Anthony - I just found your last few videos and they have been great - I've been using memray, cProfile, and pystack a lot over the last year and it's good to see how other folks are using them. One question on gc.freeze() --- I've tried to recreate the standard Python behavior with CoW and fork with a basic example (load a handful of modules, fork, do some minor calculations, force gc.collect). Examining the shared memory and unique set size on Debian, I don't seem to be able to recreate the issue in trivial cases.
@anthonywritescode 9 months ago
it's impossible to tell without seeing your setup
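For anyone who wants to poke at this themselves, here is a rough, Linux-only sketch of the kind of experiment being described (the paths and sizes are illustrative, /proc/self/smaps_rollup needs a reasonably recent kernel, and the effect can be small unless the parent holds a lot of gc-tracked objects):

    import gc
    import os
    import sys

    def private_dirty_kb():
        # memory in this process that is no longer shared with the parent
        with open("/proc/self/smaps_rollup") as f:
            for line in f:
                if line.startswith("Private_Dirty:"):
                    return int(line.split()[1])
        return -1

    # allocate plenty of gc-tracked objects (dicts/lists) in the parent
    data = [{"i": i, "payload": [i] * 10} for i in range(200_000)]

    if "--freeze" in sys.argv:
        gc.freeze()

    pid = os.fork()
    if pid == 0:
        before = private_dirty_kb()
        gc.collect()   # force the collector to walk tracked objects in the child
        after = private_dirty_kb()
        print(f"child private dirty: {before} kB -> {after} kB")
        os._exit(0)
    os.waitpid(pid, 0)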
@skreftc 10 months ago
This is a great video. Could you mention whether you saw a visible change in CPU usage and task latency? We implemented this at work and we did see a decrease in memory consumption, but CPU usage increased quite a bit, which also shows up as some tasks taking twice as long.
@anthonywritescode 10 months ago
our CPU didn't change noticeably, if anything it improved a tiny bit (which is what I expect)
@eduardmart1237 10 months ago
Can you make a guide on how to use Celery with Flask and Django? Especially when you create Celery workers and wait for them in Flask.
@anthonywritescode 10 months ago
personally I would not recommend using celery. the architectural decision to use it at work predates me and is almost too big to change at this point
@eduardmart1237 10 months ago
@@anthonywritescode what are the alternatives?
@anthonywritescode 10 months ago
any work queue really
@itay51998 10 months ago
I know some Python but not in this much depth; I can barely understand what you are showing in CPython. How would one learn this stuff?
@CouchPotator 10 months ago
That would be because the CPython stuff is C code, not Python. Most of that code is preprocessor directives (the lines beginning with a #), which, to simplify, are handled before the code is compiled. Mostly it's checking which compiler and system the code is going to be built with: __GNUC__ being GCC and __clang__ being the Clang C compiler, __STDC_VERSION__ is the version of the C language standard being used, and _MSC_VER is the version of Microsoft's Visual C compiler.
@brookskd87 10 months ago
Neat trick. Instead of using Celery prefork, why not use the solo worker, which is a single process, and let k8s scale the workers? This works well for our application and uses far fewer resources. The health probes and pod termination are tricky with long-running tasks, but possible by touching a file periodically. This way k8s handles hung tasks, and you scale up with more pods rather than more worker processes.
@anthonywritescode 10 months ago
in theory that's better. practically though there are memory leaks and significant (unused) overhead of just getting the django app initialized. so single worker would be pretty wasteful (that prefork had such an impact is kind of a testament to that). if each worker were a separate service that had very specific dependencies it would probably make sense? though that would involve tons of work since we have hundreds of different tasks
@spaghettiking653 6 months ago
If you disable the GC at this point before the fork, doesn't that make your program never free memory at any point after the fork? Do you ever re-enable the GC?
@anthonywritescode 6 months ago
gc freeze does not disable the gc
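In case the distinction is useful to others: gc.freeze() moves the objects that are currently tracked into a "permanent generation" that future collections skip; collection itself stays enabled, and objects created afterwards are still collected normally. A quick check (gc.freeze, gc.unfreeze, and gc.get_freeze_count have been in the standard library since Python 3.7):

    import gc

    x = [{"i": i} for i in range(1000)]   # some gc-tracked objects

    gc.freeze()                     # move everything currently tracked into the
                                    # permanent generation
    print(gc.isenabled())           # True: automatic collection still runs
    print(gc.get_freeze_count())    # how many objects collections now ignore

    y = [{"i": i} for i in range(1000)]   # objects created after the freeze
    gc.collect()                    # ...are still tracked and still collectable
    gc.unfreeze()                   # the frozen objects can be put back if needed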
@lonterel4704 10 months ago
I think you can also do this trick with gunicorn
@anthonywritescode 10 months ago
yep! or really any prefork framework
@sepgh2216 10 months ago
Exactly why I came to the comments. Wondering if anyone has tried this with Gunicorn and seen the results.
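For the gunicorn case, the analogous place is the master process, after the application has been imported (preload_app) and before workers are forked. A hedged sketch of a gunicorn.conf.py follows; the hook name comes from gunicorn's documented server hooks, but double-check the semantics against the version you run, and note this is not from the video:

    # gunicorn.conf.py (illustrative)
    import gc

    preload_app = True   # the app must be loaded in the master for the forked
    workers = 8          # workers to have anything to share via copy-on-write

    def when_ready(server):
        # runs in the master once startup is done and before workers are forked,
        # so objects allocated at import time get moved to the permanent generation
        gc.freeze()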
@australianman8566 7 months ago
How did he open Paint when he's on Ubuntu?
@__Brandon__ 10 months ago
great work
@ractheworld 10 months ago
What a good engineer! This is why some guys rake in more dough than others.
@Jorge86797 10 months ago
9:25 In my work I also noticed this: the allocator's block handling is tuned for small-object optimizations. However, I need to optimize for storing bigger objects: bytes and str objects of 5-10 MB each (to be precise, thousands of incoming and outgoing HTML responses), which, as we know, are immutable and require a single large contiguous block to store. As a result I end up in a strange situation where the process has, say, 50 MB of free RAM already allocated to it, but because there is no single free contiguous block of 5 MB, the process asks the OS to allocate more RAM, so I quickly run out of RAM while having a lot of free memory I can't use efficiently (all within a single process). Where or how can I get more detailed information about this, and in what direction should I look?
@anthonywritescode 10 months ago
try jemalloc perhaps?
@Jorge86797 10 months ago
@@anthonywritescode Thank you for the advice. I will try that.
@remboldt03 10 months ago
You know how to make programs more efficient.
I know how to use Paint more efficiently.
We are not the same.
@rkdeshdeepak4131 10 months ago
Hey, how do you use these Windows apps directly on your Linux desktop?
@drz1 10 months ago
VM
@rkdeshdeepak4131 10 months ago
@@drz1 I know that, but how does he make the individual apps appear directly on the Linux desktop? I have seen it multiple times, e.g. Paint in this video.
@kamilogorek 10 months ago
@@rkdeshdeepak4131 This is not a Linux desktop. It's Windows with a Linux VM in fullscreen mode, so he can simply tab out to other Windows apps.
@anthonywritescode 10 months ago
not even full screen either but yes -- I crop the obs scene to just the Linux vm
@shadowpenguin3482 10 months ago
Had to think a bit to understand. To put it in other words: he does not have a Windows VM in Linux, but a Linux VM in Windows. OBS is running on Windows and is cropped to the area of the Linux VM. When he moves a Windows window on top of the Linux VM window, it is not in the VM but on top of it.
@trainerprecious1218 10 months ago
I am sorry if I missed it, but what does "paging into those objects" mean?
@anthonywritescode 10 months ago
without going into too much detail memory is segmented into chunks which are called pages. when paged in they become resident (copied from the parent process)
@Asgallu 1 month ago
Great video
@smccrode 10 months ago
Hope you get a raise or a bonus for this! ;)
@Rachelebanham 9 months ago
dang python sucks at copy on write!!
@miguelborges7913 10 months ago
Is that an Ubuntu VM on Windows?
@sconnz 10 months ago
Jeez, what type of server has 6+ terabytes of RAM 😮
@anthonywritescode 10 months ago
not a single server, a kubernetes cluster
@sconnz 10 months ago
@@anthonywritescode Thanks, that makes sense.
@danieloberhoff1 10 months ago
hmm, tbh I would never run something this big in Python. Maybe Node.js instead? But maybe that has its own can of worms... still, the severe performance problems I keep running into with Python would strongly disincentivize investing that deeply in it on a high-performance server...
@robertfletcher8964 10 months ago
At this point you're looking at Rust, C++, or Go, all of which have their own cans of worms. Really though, I think this video proves that Python is currently doing the job at enormous scale, and it's being used by a lot of smart and very experienced people.
@joshix833 10 months ago
Node.js has big performance problems too. Something native like Rust would be better.
@protonjinx 10 months ago
this just reinforces my belief that garbage collection based memory management is evil
@anthonywritescode 10 months ago
a bit naive don't you think
@nezbrun872 9 months ago
@@squishy-tomato Projection much? Yeah, just throw hardware at the problem. Cloud vendors must love you.
@andrey6104 10 months ago
Didn't understand a damn thing, but the video is interesting. Thanks, Antokha.
@k1zmt 9 months ago
Well, what didn't you understand? They told the garbage collector not to track the references, so its structures stopped being copied into the child processes.
@Terrados1337 10 months ago
Imagine someone tries to learn Python and they start on their merry way, learning the basics, building their first hello world. And then you run in, Dumbledore-style, and ask them calmly: "HARRY! Did you waste a terabyte of RAM using garbage collection?!?!"
@djtomoy 10 months ago
Huh?
@JarosławPorada-y5c 10 months ago
Do you think that running gc.freeze after gc.collect would improve memory usage even more?

    def _create_worker_process(self, i):
        worker_before_create_process.send(sender=self)
        gc.collect()  # Issue #2927
        return super()._create_worker_process(i)

I put that signal just before collect, and that's why this came to mind.
@anthonywritescode 10 months ago
collect will likely make it worse because it will make more holes in arenas