I had to do a quick, superficial analysis of a very long piece of code. This video made it possible. Thanks a lot, pal.
@sheikhakbar2067 4 years ago
This channel needs a couple of million subscribers... I always come back to it to learn those marvellous tips and tricks!
@Drahagoon 4 years ago
Awesome video! Well explained, with a simple, clear and typical hands-on example illustration. Great work.
@MsSuyash1995 5 years ago
I came here to get a glimpse of how the cProfile module works, but I'm leaving impressed with your final solution... I loved how you combined the zip() function with sorting the list... And a great job illustrating why profilers are an important tool in a programmer's armamentarium...
@mstr_rprochowicz 5 years ago
It helped me a lot in tracking down expensive functions that were unnecessarily called 2 million times in a loop. Thanks for this useful tutorial!
@MrAmbarish710 4 years ago
Man your videos are really really helpful! Best explanation of cProfile, profiling and optimization in python. Please keep posting videos...
@legau2k 6 years ago
Awesome video. It was great watching you go step by step through the code optimization. Your solution for finding duplicates was also very clever and elegant. Worth a subscribe ^.^
@mervynwinn1852 5 years ago
i love the way you say "popping"
@eilonavizemer7755 3 years ago
Great video! here is another, perhaps easier, solution to make this code's complexity linear: 1. lowercase all the movies 2. convert your movie list into a set (sets in python avoid duplicates) 3. convert the set back into a list and return it.
@liorbm1 3 years ago
It would be nice to see the performance difference between his final code and your idea.
@vermarajat2596 2 years ago
i think conversion from set to list will take more time.
@cosminturtureanu692 2 years ago
The goal is to find the duplicates, not to remove them
@balmittal1770 2 years ago
Nice code optimization, especially the last one.
@0versun0 7 years ago
More, more, more videos. Your videos are very helpful! Keep going.
@abhishekpandey7096 6 years ago
Hey🌏🌏🏕️
@sailalmishra4860 5 years ago
Buddy this is Amazing, You should not be on such low subscriber count.. God bless
@simonbrecher878 5 years ago
Good video, but it is actually quadratic (polynomial), not exponential. Quadratic is n^2, polynomial is n^k where k is a constant, and exponential is k^n, with n the length of the input. You would not be able to get anywhere near 5000 items with an exponential algorithm.
@daviddvorak3278 3 years ago
The final solution is also not linear, but n*log(n), since python sort is not linear.
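To put rough numbers on the quadratic / n*log(n) / exponential distinction made in this thread, here is a minimal sketch. The function name `growth_row` and the chosen input sizes are illustrative assumptions, and the figures are idealized operation counts, not measurements of the code from the video.

```python
import math


def growth_row(n):
    """Return idealized operation counts for an input of n items."""
    return {
        "n": n,
        "n*log2(n)": round(n * math.log2(n)),  # typical cost of sorting
        "n**2": n ** 2,                        # loop inside a loop
        # 2**n is reported as a digit count because the number itself is huge
        "digits in 2**n": len(str(2 ** n)),
    }


for size in (10, 100, 5000):
    print(growth_row(size))
```

At 5000 items the quadratic count is 25 million, which a laptop handles in seconds, while the exponential count is a number with more than a thousand digits.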
@AyushMandowara_xx7 4 years ago
This helped me optimize my code by about 50-75%, depending on the contents of the file being scanned. Earlier a large scan took about a minute, while now it takes about 20 seconds at most. The average run time went down from 25 seconds to 10 seconds. All I did after analyzing was change my Pandas Series objects (generated from Google Spreadsheets) to tuples (lists would have the same effect, but my data never changes within a single run). Using cProfile I could see that the Pandas library was consuming loads of resources just to fetch values based on an index number. Thanks a ton!!
@prakharchaurasiya8107 4 years ago
Finally some optimization that is not too complex. Thanks.
@sm3801_smo 7 years ago
I initially subbed because of your Biological Psychology videos, but I didn't know you're into programming, very useful video!
7 years ago
+Samuel Muñoz thanks ! Yes, the Bio Psy lectures are something new for me. Most of the videos are about Python and/or OpenSesame
@vinitkumar2923 2 years ago
Great video and explanation. Thanks for sharing this.
@IsmaelRDeMelo 2 years ago
You earned my follow at 15:33
@tonyradice4166 1 year ago
Outstanding presentation!!!
@AutoXplorerYT 2 years ago
Well explained! Thanks for this video....
@Julien-hg8jh 4 years ago
15:30 autocorrection! Nice video BTW :D
@yildirimicen766 2 years ago
Hi Mr. Mathot, you are great, I love your Python sessions... :)
@marveltv5341 4 years ago
Careful... he is a hero 🙌
@maedehshahabi4744 2 years ago
Thank you sir for your clear explanation.
@hamol3d 3 years ago
Great Video! Thank you.
@razintailor 4 years ago
Great explanation. Lucid and fundamental. It is indeed helpful.
@ranelpadon8834 3 years ago
Good analysis and build up of improvements. Thanks!
@MultiRick15 2 years ago
Wow! Great explanation.
@marazDNG 2 years ago
Great video man!
@ke30_ 4 years ago
I love this so much
@farooqseeru948 6 years ago
Brilliant. Clear explanation.
@Grimlor 7 years ago
I've found this so useful! Thank you for this video. By analyzing my code and applying a little tweak, I've already managed to save 0.8 seconds of runtime. And I've only just started! :D
7 years ago
+Grimlor glad to hear it!
@DaanWaardenburg 4 years ago
Keep coming back here when my code starts running slow :P
@nopo_b3645 4 years ago
Yeah, I can imagine :-) First time here. You come back to remind yourself that, errr... yeah... why on earth does it take so long and frustrate me... how am I ever going to find out why the heck this code is slow? Is it me, or is it some crazy circumstance in the libraries that I use? While waiting for your code to finish you can actually study the problems and get them fixed. But I do think not by hand, just with profilers.
@15kasturi 1 year ago
I just subscribed after watching this video. Very informative, and nice goggles!
@danyalt8221 2 years ago
It Was Great! Thank You.
@hrithiksharma2047 4 years ago
Great tut bro! Thank you
@arjunkirpal9776 6 years ago
Thank you Sebastiaan! Would love more Asyncio videos!
@onlymusic2005 4 years ago
Real treasure... bunch of thanx
@mahesh_kok 5 years ago
This guy is crazy in coding , concepts and thinking...he brought down the execution time from 6 sec to .002 sec......this is insane ... tremendous work done bro...
@thecaveofthedead 6 years ago
Excellent tutorial. Thanks.
@migovas1483 5 years ago
This was great and clear, right to the point!!
@xanterx 4 years ago
Love your shades 🤘
@rgrapey 7 years ago
Clear and informative!
@shivan2418 4 years ago
In case anyone in the future reads this, I found that this method executes even faster than the method he ended up with:

from collections import Counter

def find_duplicate_words_counter(src='movies.txt'):
    # read_movies() is assumed to be the title-loading helper from the video
    movies = [movie.lower() for movie in read_movies(src)]
    # keep every title that occurs more than once
    return [movie for movie, count in Counter(movies).items() if count > 1]

@Alister222222 3 years ago
Was going to post this as well - converting the list into a Counter (e.g. a special dict type from the collections module) and running a comprehension to get back everything that had a count above 1 does seem to be the cleanest way to get to the solution, and I am pleased it is also the fastest!
@droit19 2 years ago
@@Alister222222 - I tried this and it was 0.13 seconds faster, or 44% faster, than the zip method.
@acho8387 4 years ago
very good video! thanks!
@pygemssoftware4254 2 years ago
Great work and explanation. I would like to email my eyes to you as token of my appreciation😃
@nutcrackeroverdrive 7 years ago
Thanx, Sebastiaan, very useful and helpful video.
@babuasian 6 years ago
Appreciate it. Really useful for most of the programs..
@ЕвгенийТитов-и9ю 4 years ago
This is super nice video, thank you sir
@botenbireu7875 7 years ago
Thank you a lot! very clear explanation!!!
@deividaspelakauskas9394 4 years ago
Underrated.
@haonanqiu4251 3 years ago
thanks a lot!
@jeremyalvaprathama4069 4 years ago
Awesome work! I just subscribed
@benedictcoltman1983 4 years ago
Superb! Thanks
@drewduncan5774 6 years ago
12:54 Quadratic, not exponential.
5 years ago
NO it's in O(n*ln(n)) because of the Sort()
@stephenaiesi6073 5 years ago
With Big O notation we are really only concerned with the term with the highest power. An algorithm on the order of O(3x² + 2x + 11) is usually reduced to O(3x²). I've seen books drop the coefficient as well, but that has a fairly large impact on the accuracy of the expression in my opinion. So in terms of Big O, an algorithm on the order of a quadratic equation is usually considered to be on the order of its highest term.

Let's compare two algorithms, one operating at O(3x² + 2x + 11) and one at O(3x²), and see how different they really are. Given the following functions:

f(x) = 3x² + 2x + 11
g(x) = 3x²

For a single input (n = 1): f(1) = 16 and g(1) = 3. The ratio between these two results is 5.33, which would suggest the two are not interchangeable.

Now let's scale it to 100 inputs (n = 100): f(100) = 30211 and g(100) = 30000. Now the ratio is 1.007. Not identical, but damn close, depending on the precision needed. In terms of making algorithms efficient on computers, 100 inputs is not considered much anyway.

Now let's scale it to 1,000,000 inputs: f(1,000,000) = 3000002000011 and g(1,000,000) = 3000000000000. That is a ratio of about 1.0000007.

The difference between these two, with or without the extra terms, is often negligible compared with algorithms on the order of a different power. Run the same experiment comparing x² and x³, with and without extra lower-order terms, and you can see that dropping the lower terms, though not exact, is definitely enough to compare the efficiency of algorithms.

So as the size of the inputs grows, particularly towards quantities where optimization is necessary, we are usually dealing with such vast amounts of data that including the lower terms in our assessment of an algorithm's efficiency does not necessarily provide extra insight.
ps: i'm fully aware this isn't the case in every domain, but it is for the most part how it is done and definitely applies to the kinds of problems in this video.
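The arithmetic in the comment above is easy to check. The sketch below uses the same f and g as the comment; the helper name `ratio` is an illustrative assumption.

```python
def f(x):
    """Full cost expression, including the lower-order terms."""
    return 3 * x**2 + 2 * x + 11


def g(x):
    """Leading term only."""
    return 3 * x**2


def ratio(n):
    """How much larger f is than g for an input of size n."""
    return f(n) / g(n)


for n in (1, 100, 1_000_000):
    print(f"n={n}: f={f(n)}, g={g(n)}, ratio={ratio(n):.7f}")
```

The ratio approaches 1 as n grows, which is the reason the lower-order terms are usually dropped.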
@_treed1 5 years ago
Lol these comments. It's a loop in a loop which is n * n so n^2 tadah
@__gavin__ 4 years ago
@@stephenaiesi6073 > I've seen books drop the coefficient as well but that has a fairly large impact on the accuracy of the expression in my opinion.

Big O notation has a formal mathematical definition. A function f(x) is said to be O(g(x)) if |f(x)| <= A|g(x)| for all x >= x_0, where A and x_0 are some constant values. Hence, when considering big O notation, it really doesn't matter if you drop the 3 or not. If f(x) = O(3x^2) then all we are saying is that there exists some x_0 such that for all x >= x_0, |f(x)| <= A|3x^2| for some constant A.
@calebmunuru3598 4 years ago
Stephen Aiesi Thanks mate. This is a really good explanation
@kpespinosa 5 years ago
great explanation! cheers
@blanky_nap 7 years ago
Great video!
@TheFilipo2 6 years ago
Thank you, this was super helpful!
6 years ago
Good to hear!
@siddharthindora7182 4 years ago
Great Video...Thanks for explanation :)
@Jack-4242 4 years ago
Thank you, this helped me so much :)
@mohammedgt8102 2 years ago
Awesome video.
@AmrXcellent 4 years ago
Good video, but if I understand correctly, the final change to the code does not account for a movie title that is duplicated more than once. So the first two iterations of the code are more complete in functionality. All in all a nice video; I learned something new watching it, so thank you for that.
4 years ago
That's correct: triple duplicates are not caught with this method. And thank you!
@deadman1999 2 years ago
Yes, I was thinking the same thing: the final code was so fast because it only checks each title against its first neighbor, on the assumption that each title has only one duplicate.
@anumsheraz4625 4 years ago
is there any tool to identify how much memory is consumed by the code ?
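One option from the standard library is `tracemalloc`, which tracks memory allocated by Python code. This is only a minimal sketch: `build_titles` is a hypothetical stand-in for whatever code you want to measure, and it is not taken from the video.

```python
import tracemalloc


def build_titles(n):
    """Hypothetical workload: build a list of n lowercase title strings."""
    return [f"Movie {i}".lower() for i in range(n)]


tracemalloc.start()                      # begin tracking Python allocations
titles = build_titles(100_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()                       # stop tracking and free the traces

print(f"current: {current / 1024:.0f} KiB, peak: {peak / 1024:.0f} KiB")
```

Note that `tracemalloc` only sees allocations made through Python's memory allocator, so memory used inside some C extensions may not show up.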
@АлексейТрофимов-ф5у 2 years ago
thanks!
@svalaboj 4 years ago
your video is very useful, thanks for the same.
@bunlonglay463 3 years ago
Hey, shouldn't your last solution, where you sort the movies list, be O(n log(n)) and not O(n) as you said? Sorting the movies list takes O(n log(n)) time. Also, when you use zip with slices of the movies list, copies of the list are created, which is also inefficient. Could someone maybe confirm what I said? Anyway, great video explaining the profiler.
@fuanka1724 6 years ago
Loved this. Optimization is really important to me. Thanks.
@dhananjaykansal8097 5 years ago
YOU ARE JUST AWESOMEEEEEE
@alvaromartin6301 5 years ago
Excellent content! New sub.
@sashkazayebashka 6 years ago
Great video. Thank you, man!
@parietal100 7 years ago
Thank you Sebastian
@mariusnorheim 6 years ago
Hi Sebastian, I tried running the profile decorator, but 1) I'm using python 2.7 and 2) I'm running it in atom, not jupyter, so I get an error message. Would be awesome if you could post the code for python 2.7 as well in the file
@mariusnorheim 6 years ago
Actually got it to work. It seems that you'll have to encode the Unicode strings to byte strings, and use io.BytesIO, instead of io.StringIO.
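For readers without the video's file at hand, a profiling decorator along these lines can be sketched with only the standard library. This is a Python 3 sketch and an assumption about the general shape, not the exact code from the video; `slow_sum` is a made-up function to profile. It runs as a plain script, so it does not depend on Jupyter.

```python
import cProfile
import io
import pstats
from functools import wraps


def profile(func):
    """Decorator that profiles one call of func and prints the top entries."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        profiler.enable()
        try:
            return func(*args, **kwargs)
        finally:
            profiler.disable()
            stream = io.StringIO()  # text buffer that receives the report
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats("cumulative").print_stats(10)
            print(stream.getvalue())
    return wrapper


@profile
def slow_sum(n):
    """Hypothetical workload: sum of squares below n."""
    return sum(i * i for i in range(n))


result = slow_sum(10_000)
```

The report is collected in an `io.StringIO` buffer, which is the part the comment above swapped for `io.BytesIO` on Python 2.7.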
@BullishBuddy 3 years ago
👍👍
@vyl6781 5 years ago
Saved my sanity.
@neelojp8460 7 years ago
thank you so much for your videos they are really very helpful! Do you have any own books about python ?
7 years ago
+post fix Thank you! No, I'm afraid that I do not have any Python books myself. But there are plenty of good free Python books out there, such as Byte of Python.
@neelojp8460 7 years ago
Thank you for your answer, dank je wel :-)... You should write one about the tricks which you show us here... And here is the link to Byte of Python for everyone else: www.gitbook.com/book/swaroopch/byte-of-python/details
@emasmach 5 years ago
Nice. Excellent.
@alishermatkurbanov9205 6 years ago
What if the list has more than 1 duplicate, e.g. [1, 3, 1, 4, 1, 4, 4, 5] -> sorted [1, 1, 1, 3, 4, 4, 4, 5] -> zipped something like this: [(1, 1), (1, 1), (1, 3), (3, 4), (4, 4), (4, 4), (4, 5)], so 1 and 4 will be added to duplicates twice. Shouldn't duplicates be a list of unique items?
6 years ago
That's correct. Triplicates will end up twice in the list of duplicates, which may not be what you want. An easy trick to get around that would be to use a set comprehension (kzbin.info/www/bejne/q4W4h2WbhLOkibM), rather than a list comprehension. Because sets by definition consist of unique items.
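The set-comprehension trick can be sketched as follows. The function name `find_duplicates` and the sample titles are illustrative assumptions; the sort-then-zip idea is the one discussed in this thread.

```python
def find_duplicates(titles):
    """Return the set of titles that occur more than once, ignoring case."""
    ordered = sorted(title.lower() for title in titles)
    # After sorting, equal titles sit next to each other, so comparing each
    # title with its right-hand neighbor is enough. Collecting into a set
    # means a title that occurs three or more times is still reported once.
    return {first for first, second in zip(ordered, ordered[1:]) if first == second}


movies = ["Alien", "alien", "ALIEN", "Heat", "Jaws", "jaws"]
print(sorted(find_duplicates(movies)))  # ['alien', 'jaws']
```

Because of the sort, this still runs in O(n log n) time rather than linear time.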
@luciano_remes 3 years ago
Your last solution runs in N log N time, but you could actually make it faster by just using a set of found movies. It would run in linear time and be way simpler:

found = set()
duplicates = []
for movie in movies:
    if movie not in found:
        found.add(movie)
    else:
        duplicates.append(movie)

@fantasdeck 2 years ago
I like how you edited your video to hide the little typo you made. But, cool tool. Will be using...
@adityakushwaha3654 4 years ago
But how do you know which code will be more efficient wrt present code ?
@graycybermonk3068 5 years ago
You will kill me. Really Awesome.
@norwegiandud 3 years ago
Helpful video, thanks! Just one 🐛 with the 007-method (or weird feature). If there are movies that are represented more than two times they appear as duplicates in the duplicates list. E.G. 'the phantom of the opera' appears five times in the TXT file, and four times in the list of duplicates. Now if this is a 🐛 or feature ... depends on who you ask.
@chunceywei8284 7 years ago
Thank you
@vanglequy7844 4 years ago
13:30 Who else paused the video to challenge themselves? But beware of the tendency to jump into redesigning the solution before profiling.
@Excess-qn7qh 4 years ago
Does the @profile annotation only work with Jupyter?
@Jure1234567 3 years ago
Can I do it with wxwidgets classes and multithreading?
@nikithar3628 6 years ago
Awesome
@IsmaelRDeMelo 2 years ago
"Well, you can see that our code it's taking about 0.00023 to execute. But if you're not satisfied with that..." lmao
@ikramu5719 5 years ago
Thank you for that explanation. Neat solution with the zip and slices too! ps The link for the movies file is now out of date though.
@deepak1725 6 years ago
Very Very nice
@yildirimicen766 2 years ago
Hi Mr. Mathot, how about the following with "combinations" (you can even omit "movies.sort()"):

from itertools import combinations

# find duplicates in list of movies
movies = ['abc', 'abc', 'xyz', 'ddddd', 'ddddd', 'star wars']
print([m for m, n in combinations(movies, 2) if m == n])

@fcoignmo 6 years ago
Where did you get the "movies.txt" file (link)? Thank you for the video, great work.
6 years ago
My reply is a bit late, but I got this data from here: osf.io/r73y9/
@pushpendrasingh1819 6 years ago
Bro, it would give duplicate results if the file contains a movie that appears more than twice. E.g. if movie.txt contains Hello Hello Hello, then your code will print hello at least twice.
@sailalmishra4860 5 years ago
Hey, hope you understand this is for demonstration purposes. It would be good to concentrate on the technique rather than the logic; you can probably go about refining the logic. A solution is to use a set and then extract the elements with more than 1 occurrence.
@Memfis0 5 years ago
You can use this instead of the zip method: `duplicates = [name for name, count in Counter(movies).items() if count > 1]`. It also doesn't require the list to be sorted. Remember to `from collections import Counter`.
@vaibhavjain1914 3 years ago
Bruh in this video you are teaching code optimization but looking at your choice of wearable I feel I am learning how to assassinate enemy but amazing video 😀
@7aygames35 3 years ago
The 22 people who disliked are those who were writing bad code and when it was pointed out to them, they just got angry
@naughtybuddha3942 4 years ago
Where is the movies.txt? Please provide it, thanks.
@ВикторДзеба 2 years ago
May you give us the movies.txt file please???
@DragonRazor9283 3 years ago
from 6 seconds to 0.007 seconds wow!
@xspager 6 years ago
Awesome explanation but when you removed the function you also changed the way you do the searching, you stopped looping over all the movies and used the "in" operator
@bhupendram5212 2 years ago
Thanks a lot, but the duplicate logic is wrong. It will fail if the movies list has 3 or more consecutive duplicate movies.