Binary Search Algorithm - Computerphile

  Рет қаралды 153,860

Computerphile

Computerphile

6 ай бұрын

Back to basics as Dr Mike Pound explains a simple but incredibly useful algorithm, binary search.
#algorithm #ComputerScience
/ computerphile
/ computer_phile
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharanblog.com
Thank you to Jane Street for their support of this channel. Learn more: www.janestreet.com

Пікірлер: 264
@craftmechanics6483
@craftmechanics6483 6 ай бұрын
I love teaching this to my students as a guessing game. I will think of a number from 1 to 100 and give them 8 guesses, but after each guess tell them if their guess was correct or lower/higher. After a few rounds of playing, most people figure out that the optimal strategy is guessing somewhere in the middle, and essentially discover the algorithm on their own. After that, we implement the gussing game on a computer. I find this a very fun way to learn about binary search.
@DavidLindes
@DavidLindes 6 ай бұрын
i'm imagining doing that in a classroom, and... yeah, sounds fun! Cool!
@Armageddon2k
@Armageddon2k 6 ай бұрын
I'm totally stealing this idea!
@mariolol8333
@mariolol8333 6 ай бұрын
imagine your teacher naming himself craftmechanics6483 with a minecraft pfp on youtube XD
@DarrenYung
@DarrenYung 6 ай бұрын
I've been doing this exact same lesson for years too! Once you've introduced variables, conditionals and looping then you have the basic building blocks of algorithms. Binary Searching with the guessing game is a nice easy way to introduce this fundamental algorithm.
@nashsok
@nashsok 5 ай бұрын
I audited MIT's 6.0001 (Introduction to Computer Science and Programming in Python) course through OCW and the professor did the exact same demo to introduce binary/bisection search!
@emrekantar5003
@emrekantar5003 6 ай бұрын
These men make me feel as if I am in a coffee house with my best friends talking
@queueeeee9000
@queueeeee9000 6 ай бұрын
Dr. Pound and Dr. Grimes from the numberphile videos are my absolute favorite!
@paulsutherland3813
@paulsutherland3813 6 ай бұрын
Dr Pound sounds and acts identically to someone who went to my secondary school in London, cheeky lad in IT class who aced all his assignments while getting his computer to run games we weren't supposed to have access to. So, yes.
@njd9828
@njd9828 6 ай бұрын
You spelt pub wrong.
@aaabucus3104
@aaabucus3104 6 ай бұрын
Anybody who sits around and talks like this has no friends in the first place and that's just science!
@queueeeee9000
@queueeeee9000 6 ай бұрын
@@njd9828 you spelt pubic wrong
@genelee7869
@genelee7869 6 ай бұрын
I'd love to see a video explaining the concepts behind how the different sorting algorithms work (Heap, Merge, Shell, etc)
@JanB1605
@JanB1605 6 ай бұрын
I'd love a video on hash maps!
@Loki-
@Loki- 6 ай бұрын
​@@JanB1605I loved my prof I had for computer science 2 which was lots of this. He was like this video, plus comedy, and simplified PowerPoint.
@jimwinchester339
@jimwinchester339 5 ай бұрын
They exist here on KZbin. I've seen them.
@user-zl4kk7wi5u
@user-zl4kk7wi5u 6 ай бұрын
Dr. Pound is excellent and entertaining as always. I found myself smiling and giggling throughout the whole video due to the way he handles his small mistakes and jokes around. Yet he also imparts wisdom!
@konstantinrebrov675
@konstantinrebrov675 6 ай бұрын
We need more videos explaining in low level details how various common algorithms work, such as file I/O, B trees, image compression, files representation (PDF, audio, video, archive, ELF), Huffman coding, hashing algorithms, md5sum, dynamic programming. I mean walk through the pseudocode line by line explaining what it does, and draw the memory diagrams, like what is happening to the data, how the bits or bytes are stored, loaded, moved around.
@ross951
@ross951 6 ай бұрын
Textbooks are your best bet for finding and understanding (easy to quickly reread sentences) that kind of material.
@konstantinrebrov675
@konstantinrebrov675 6 ай бұрын
@@ross951 Lectures are better because then we would be able to see exactly how all the bits and bytes are moved around in the memory.
@Aidiakapi
@Aidiakapi 6 ай бұрын
They've done videos on many of the topics you've mentioned.
@chessthecat
@chessthecat 6 ай бұрын
That's not what Computerphile is. Sounds like you should start your own channel.
@konstantinrebrov675
@konstantinrebrov675 6 ай бұрын
@@chessthecat Who are you to say?
@KurtisRader
@KurtisRader 6 ай бұрын
The point made at 14:30 is arguably the most important part of this video. Specifically, it is less important to know how to write, from scratch, a particular algorithm than it is to know that different algorithms have different tradeoffs. Knowing how to pick the best algorithm (and data structures) for a particular situation is more important than being able to implement an algorithm on a whiteboard. One of my interview questions explores whether an applicant understands that there are many sorting and searching algorithms and the nature of the data being manipulated, and other constraints, is important in deciding which algorithm to use.
@Gudsfralsare
@Gudsfralsare 6 ай бұрын
In these times when AI is all the hype, these are the videos we need. The fundamental algos from the 60's are here to stay and run the world.
@thatchessguy7072
@thatchessguy7072 5 ай бұрын
AI was used to help find an improvement on binary search recently. It’s covered in keynote talk of the 2023 CPP con.
@theblinkingbrownie4654
@theblinkingbrownie4654 5 ай бұрын
​@@thatchessguy7072wHAT?
@philroo1
@philroo1 5 ай бұрын
​@@thatchessguy7072 🤞 all watched over by machines of loving grace
@vatbub
@vatbub 6 ай бұрын
One thing to add: Once the data is sorted, and you need to add new data to it, binary search can actually tell you where you need to insert the new data in order to keep the data sorted. Therefore, the real issue is getting the data sorted in the first place (as mentioned in the video), but once it is sorted, binary search helps you find stuff fast and helps you keep your data sorted even when you modify it.
@NihongoWakannai
@NihongoWakannai 6 ай бұрын
Adding a new piece of data to the middle of a giant array would require changing a lot of memory though. Which is why we might use something like a binary tree instead.
@pacoalsal
@pacoalsal 6 ай бұрын
Depending on the problem the performance benefit of keeping the data adjacent in memory might offset the overhead of shifting it around.
@traywor1615
@traywor1615 6 ай бұрын
10:25 just wanna note here for anyone trying to implement this algorithm for very large arrays, be aware of the integer overflow. (l + r) // 2 will first calculate l + r and then try to divide it by two. however if your list is lets say 2/3rds of the size of your integer and if you then add l + r and l and r are quite large, then they will overflow. Then your m will be messed up and your binary search will be incorrect. And this is not a contrived scenario, there was this exact kind of bug in the java SDK, for 9 years! I was actually surprised that Mike didn't mentioned this :O
@jlappala
@jlappala 6 ай бұрын
We try to teach the l+(r-l)//2 -method to avoid this.
@solqr6521
@solqr6521 6 ай бұрын
It was in Python, where there wouldn’t be overflow
@johnvriezen4696
@johnvriezen4696 6 ай бұрын
@@solqr6521He should still write the code the l + (r-l)//2 way because code will inevitably be found and copied to a platform that has 32 bit integers (e.g. C) and then break.
@raphaelfranzen9623
@raphaelfranzen9623 6 ай бұрын
@@solqr6521 Why wouldn't there be overflow in Python ? An integer has only a finite size , so if you add more than it can hold , it should wrap around (overflow).
@krajsyboys
@krajsyboys 6 ай бұрын
@@raphaelfranzen9623 Integer overflows don't happen if you are using just python because they allocate arbitrary amount of data to the variables, so if a number should have overflowed, python allocates more momory to store the variable in
@Richardincancale
@Richardincancale 6 ай бұрын
One of my favourite books from almost 50 years ago - Sorting and Searching by Donald E Knuth - tells you all you need to know!
@GordonjSmith1
@GordonjSmith1 6 ай бұрын
This was great! It really highlighted the need for the development of 'logical' learning, and not just 'proceedual' or 'fact' learning. Actually a very 'key vlog', in all the excellent vlogs that have been made.
@kevinramadhan9871
@kevinramadhan9871 6 ай бұрын
I used to watch you all the time when i was in uni around 3-4 years ago. Just stumbled upon this video and glad you're still going strong! keep up the videos! it's always a treat to watch, even if i'm already familiar with the topic
@daveingerslev
@daveingerslev 5 ай бұрын
Finding theft (or damage) on CCTV footage is a fantastic real world use for binary search. As long as something visually changes (i.e. goes missing or gets broken), then it turns the task of watching the cause into a 5 min task instead of hours/unjustifiable.
@sambenjamin7515
@sambenjamin7515 6 ай бұрын
Always happy to see a Mike video! Great refresher. 😀
@SparkysWidgets
@SparkysWidgets 6 ай бұрын
I changed a linear search on a calibration step for a product on the production line to binary searching. This small change netted millions of minutes of production line weekly(test used to take minutes and we produced over 1million units a week) back to the CM. The main point being is it doesn't have to be a large data set it could be a small known dataset but that each takes time to search or in this case "test".
@agoatmannameddesire8856
@agoatmannameddesire8856 6 ай бұрын
More Dr Pound videos on search / sorting!
@chrisogonas
@chrisogonas 3 ай бұрын
Simply and elegantly illustrated! Thanks, folks 👍
@olafurfo89
@olafurfo89 6 ай бұрын
Currently doing my second year in computer science and you made me realize when I'd want to use BST in projects. Sort in morning, many lookups throughout the day. Makes so much sense but I hadn't realized it, Well done !!!
@doomanime61
@doomanime61 6 ай бұрын
I can sit and hear Dr Mike Pound talking all day
@michaelwilson5742
@michaelwilson5742 4 ай бұрын
Yeah, everyone on the corridor can sit and hear him talking all day
@logosimian
@logosimian 6 ай бұрын
I already knew everything in this video, but I still watched it.
@coolwarfare
@coolwarfare 6 ай бұрын
Thank you this channel is going to save me before GCSEs
@gerokatseros
@gerokatseros 6 ай бұрын
I really like the way you explain everything ... making it so simple
@artofcomputing-art
@artofcomputing-art 6 ай бұрын
Hell yes! Here comes another great Computerphile video 🎉
@thuokagiri5550
@thuokagiri5550 6 ай бұрын
prof Pound is the Richard Feynman of Computer science
@robblake8999
@robblake8999 6 ай бұрын
It's an excellent point to remember that sorting is O(n log n), wheras a linear search is O(n). Linear search could definitely prevail for a small number of lookups.
@IceMetalPunk
@IceMetalPunk 6 ай бұрын
That's *if* your data isn't already sorted, of course :)
@turdwarbler
@turdwarbler 5 ай бұрын
@@IceMetalPunkNo, for a small number of items, the exact number is hardware dependant a linear search will beat everything apart from a direct index access. linear searches on arrays are very cache friendly, as you walk the array, the CPU is already pre fetching 64 bytes ahead of you. We do this all the time for fast lookups of small tables.
@IceMetalPunk
@IceMetalPunk 5 ай бұрын
@@turdwarbler Cache lookups aren't considered in algorithmic runtime analysis.
@turdwarbler
@turdwarbler 5 ай бұрын
@@IceMetalPunk Well thats where theory and reality differ. In HF LL trading we use linear lookups all the time because they are faster period, for small arrays. Even without a cache they would still be faster, no set up time, no complicated logic, simples.
@sid6645
@sid6645 6 ай бұрын
Binary searches over some state space with a non trivial checking condition (like some monotonic equation) is also fascinatingly useful. Also if you rectify the state by changing basis or something. One of my favourite algorithms!
@Loki-
@Loki- 6 ай бұрын
14:15 My analogy for going through computer science is that we add a bunch of tools to a toolkit and learn when to use which tools.
@IceMetalPunk
@IceMetalPunk 6 ай бұрын
Wow, I can't believe there's never been a binary search Computerphile video before now! 🤯
@sandervdrie9182
@sandervdrie9182 6 ай бұрын
thanks Dr Mike and thanks the Computerphile team, for and good video would, have loved to have had a lesson from Dr Mike
@GilesBathgate
@GilesBathgate 6 ай бұрын
Don't forget you can also use binary search to implement a container that is sorted on insert. (although there are other approaches)
@tiarkrezar
@tiarkrezar 6 ай бұрын
Constructing a list that way is really inefficient. Even if the search is O(log(n)), each insertion will be O(n). At that point, just use some kind of a tree data structure instead.
@rmsgrey
@rmsgrey 6 ай бұрын
@@tiarkrezar It depends how your data is stored. For example, if you normally only store data in even-numbered positions, but insert into odd-numbered positions, then redo the entire list when you hit a collision, then insertion is only O(n^0.5). But, yeah, a naive array implementation of a sorted list will have problems with insertion speed.
@GilesBathgate
@GilesBathgate 6 ай бұрын
@@tiarkrezar I did say there are other approaches. Insertion sort, Self-balancing BSTs like AVL trees and Red-Black trees, etc...
@Chocoholic1337
@Chocoholic1337 6 ай бұрын
Love the algorithm explanation videos! Mooooore!!
@georgelza
@georgelza 6 ай бұрын
love these discussions.
@Lion_McLionhead
@Lion_McLionhead 6 ай бұрын
Pounder can make even this basic algorithm sound complicated.
@andrewharrison8436
@andrewharrison8436 6 ай бұрын
Built into COBOL as the SEARCH ALL command - often overlooked but fast if you know about it.
@atkelar
@atkelar 6 ай бұрын
I love to show these examples in my programming courses (day job); one of the arguments that sometimes is held against more complicated algorithms is that "yes, less steps but each step is significantly more complicated"; but then you do the math and compare 100M steps vs. 23 and see that they would have to be quite a bit more complicated to make the 23 step version perform worse than the linear search one. And I also encounter people who would solve a "find in 10 items" problem with such an algorithm... Cost/Benefit analysis gets quite complicated if you don't have an idea about the number of items involved, or the number of executions required.
@NihongoWakannai
@NihongoWakannai 6 ай бұрын
I think it's always important to learn that there is more to performance than just big O. With webdev being so popular, people focus a lot on scalability. But there are lots of situations with very predictable dataset sizes thus optimizing functions themselves becomes more important
@erichobbs4042
@erichobbs4042 6 ай бұрын
The point about knowing when to use an algorithm is the most important bit of knowledge to take away from this video. I've forgotten how to exactly write plenty of algorithms, but those things can always be looked up.
@felipeabraga
@felipeabraga 6 ай бұрын
Nice, was studying about it last monday... Thanks
@KeyboardSourceError
@KeyboardSourceError 6 ай бұрын
Thank you Dr. Pound!
@wybren
@wybren 6 ай бұрын
I love binary search. If look up times are really low binary search is as good as unbeatable because of computational simplicity.
@Salara2130
@Salara2130 5 ай бұрын
Well that was a nice refresher of the basics ^^
@allenkwan8310
@allenkwan8310 6 ай бұрын
This is the CS equivalent of asking a band to play the classics. Nice.
@-parrrate
@-parrrate 6 ай бұрын
for most problems, I prefer a strict L
@conorstewart2214
@conorstewart2214 4 ай бұрын
This is also how a type of analogue to digital converter (ADC) works, it is successive approximation (SAR). Interestingly when used for ADCs it corresponds directly to binary. The first step where you have a comparator with half the max voltage and the input voltage, the result from the comparison is the most significant bit (MSB) of the value, then when you perform the next comparison it becomes the next bit and that continues on until it reaches the closest that the ADC can get. This method relies on a way to create all the intermediate voltages for the comparison so SAR ADCs contain digital to analogue converters (DAC) too. Also due to the nature of analogue voltages, similar to real numbers, the search will only very rarely be able to end early, since a comparison between two analogue voltages or between two real numbers is very unlikely to result in them being equal and may be impossible if the comparator only output less than or greater than. This does mean that these ADCs and real numbers will pretty much always take the same number of steps to complete.
@LuigiElettrico
@LuigiElettrico 5 ай бұрын
Nice explanation, thanks.
@martijn3151
@martijn3151 6 ай бұрын
In theory, yes binary search is faster. In practice a linear search might actually be much much faster. The one thing that makes a big difference is cache. Computers read ahead memory. So it greatly helps if all entries are stored sequentially. If the binary search is implemented in a tree, the entries probably are all over the place in memory, not sequentially, causing cache miss after cache miss. This will severely impact the performance. Obviously when searching billions of entries, binary is most likely to win. But in a lot of real life situations, the lists are much smaller. Small enough to fit in a cache line. Making the search almost instant. So never just assume what you learn in computer theory is fact, use different setups for your problem and measure(!) to determine what solution works best.
@Quarky_
@Quarky_ 6 ай бұрын
Well, if someone thinks binary search is always faster, then they didn't really learn anything did they? The algorithm clearly gives you a speedup by halving the search space at every step, then obviously it is faster only for large numbers. One might even try to quantify this with: for n = 2^m, if m
@Pyrazahn
@Pyrazahn 6 ай бұрын
Another reason why linear search can be faster for small lists is that modern cpus heavily rely on pipelining and branch prediction. Basically for every if-else, the cpu guesses which of the two paths will be executed next and already executes the first instructions of that path in the background. If the guess was correct, then the results are already there for free. But if the guess was wrong, it is much more expensive to take the other path. In a linear search, if the cpu predicts "element not found, move to next item in list", then that will be correct in all but one case. But in a binary search, the expected results are 50% left and 50% right - so branch prediction fails 50% of the time instead of just once.
@martijn3151
@martijn3151 6 ай бұрын
@Quarkly_ No need to be pedantic here. Log n is always smaller than n. Also for small numbers. Using binary search on a list of 10 items, worst case is 3 steps. Worst case linear search is 10 steps. Yet, as I pointed out, linear might be much faster in that case even though theory says binary search is at least 3 times faster. So, at least to me, this isn’t trivial.
@AAjax
@AAjax 6 ай бұрын
Interesting! So what would the break-even size, between binary search and linear search? I had a look and the x64 cache line is typically 64 bytes, which wouldn't seem to allow for a very useful amount, but this is all new to me.
@krzychxyz8358
@krzychxyz8358 6 ай бұрын
Well, the break-even size depends on a lot of things. From your code efficiency, the programming language, the compiler you use all the way to the specifics of the data set (some data types use more computation power to compare than others). But i've never seen a case where that is above 60, so I guess it won't ever get big.
@nosuchthing8
@nosuchthing8 6 ай бұрын
The phone book analogy is SO powerful
@LiarsPendulum1
@LiarsPendulum1 6 ай бұрын
I am teaching myself programming and was literally practicing writing this kind of code yesterday, working on data sorting algorithms. So glad this is being written in Python. I started with C and this is a great window into how C maps on to Python.
@JuliusUnique
@JuliusUnique 6 ай бұрын
simple yet efficient way of searching, doing it myself that way since I basically can think
@hashtagPoundsign
@hashtagPoundsign 6 ай бұрын
There are 10 types of people, those who watch Binary Search videos, and those who don't.
@housellama
@housellama 6 ай бұрын
There are 10 types of people in the world: people who understand binary, those who don't, and those who know this joke is trinary.
@nosuchthing8
@nosuchthing8 6 ай бұрын
Clever
@LucasundAaron
@LucasundAaron 6 ай бұрын
in python i prefer using l=0 r=len(arr) now right is exclusive and points on invalid greater parts. the only +1 indexing will happen in the main loop now. you only check the end of valid array is 1 pos further with: „while l+1
@Lampe2020
@Lampe2020 6 ай бұрын
If the list is in order you can even make it simpler to determine whether you search further by opening the first and last box and if the target number isn't between those in value it's not going to be inbetween those boxes either.
@kuro4841
@kuro4841 6 ай бұрын
would it be an improvement to first check the first and last index to see if the given number is even in range before looking through the boxes? also would it be advisable to place the first lookup point based on where between highest and lowest it is? so if its closer to the lower bounds, then maybe starts dividing a little sooner like at 40% of the dataset? would this add too much performance overhead or would it possibly an improvement? i know that division is a comparatively costly operation. I would be super interested on someone elses input on this :)
@MuhammadRaza-yd6sg
@MuhammadRaza-yd6sg 6 ай бұрын
Been just thinking about this lately
@LupinoArts
@LupinoArts 6 ай бұрын
The box next to "16" could contain 16 again...
@gsestream
@gsestream 6 ай бұрын
make it faster by giving ranges for each of the boxes, thats then only one step to get to the right bix, which can be sub-divided to another range boxes, its better than log(n) of binary search, having ranges is better than just having the numbers in order, if you have different boxes for all different numbers, then the search is O(1), its bucket search, essential is that you pre-calculate the ranges and not during the search. works more like heap, which does the processing on the background and not during the searches/updates. if you have unique box for each different number, at a known index, which you can jump to, directly, then it works like radix sort for sorting. radix search for buckets might be super duper fast.
@Bianchi77
@Bianchi77 4 ай бұрын
Nice info, thanks :) 👍
@FalcoGer
@FalcoGer 5 ай бұрын
@7:00 l+r might be an integer overflow. Depending on your language that might be undefined.
@BrankoDimitrijevic021
@BrankoDimitrijevic021 6 ай бұрын
Interestingly enough, linear search is likely to be faster than binary search for a small array of integers, due to prefetching and branch prediction mechanisms of modern CPUs.
@cscscscss
@cscscscss 6 ай бұрын
Great video! I understood it very well.
@akiraincorrect4968
@akiraincorrect4968 6 ай бұрын
Did you watch it on 13x ?
@cscscscss
@cscscscss 6 ай бұрын
​@@akiraincorrect4968 Yes
@heliospace
@heliospace 6 ай бұрын
This is the first time an algorithm seemed useful
@MegaDonaldification
@MegaDonaldification 6 ай бұрын
First of all, thank you for agreeing that the mere fact one is enhancing their skills makes it a career move. Most employers are notorious for despising early beginnings.
@diagorasofmel0s
@diagorasofmel0s 6 ай бұрын
This was so informative and capturing, surprisingly my attention span gets a bit longer when prof Mike Pound speaks lol.
@CoreyOgburn
@CoreyOgburn 6 ай бұрын
Fun fact: if you're searching through a list of unknown size (maybe a stream of data) that's ordered then you can do a binary search by checking powers of 2 in a similar way. If the value is greater than m, m *= 2.
@bishboria
@bishboria 6 ай бұрын
I taught this to my 1st year students less than a week ago.
@JeffKaylin-ft5cx
@JeffKaylin-ft5cx 6 ай бұрын
If the search gets REALLY big, I assume the data will be read in a record of fixed length. Would checking the Last data in the record be useful? Like if you're in the library and you may have to switch rooms or floors, should you check the last book so you won't have to come back to that room later when you notice you were SO close?
@pandaqwanda
@pandaqwanda 6 ай бұрын
i love Computerphile :)
@cannaroe1213
@cannaroe1213 4 ай бұрын
If you use the values of l and r (call them L and R), then you can do a weighted binary search which usually performs a little bit better than one that defines m = (l+r)//2 So at 3:13 a weighted binary would say a = 17 , l = 4 (0-based), L = 16, r = 7, R = 9000 (unchecked in video so i made it up) Because a is very close to L and very far from R, you find m as usual but then bias it more in favour of L than R to the point it shakes out as "5" and so finds a immediately, and you don't have to make L = 10 :P
@Mode_na
@Mode_na 6 ай бұрын
Shouldn't you check the first and last item in a sorted list to make sure the number you're searching for is in range? It should be more efficient by potentially circumventing the need of running a search algorithm all together.
@f.f.s.d.o.a.7294
@f.f.s.d.o.a.7294 6 ай бұрын
Only if that is frequently the case.
@ligmoes
@ligmoes 6 ай бұрын
Indeed. when its out of range it shrinks in a linear fashion until the ranges swap and returns false but also works for when is in range and is not on the list. However, that is a matter of implementation. Here's same function with your solution: def binary_search(lst, a): l = 0 r = len(lst) - 1 if a > lst[r] or a < lst[l]: return print("Out of Range") while l lst[m]: l = m + 1 elif a < lst[m]: r = m - 1 else: return True return False
@neilmehra_
@neilmehra_ 6 ай бұрын
It doesn't really impact performance as log time complexity is minuscule.
@bryan69087
@bryan69087 6 ай бұрын
MORE MIKE POUND
@ANoBaka
@ANoBaka 6 ай бұрын
If you for some reason have knowledge of the distribution of the values in the collection, perhaps by calculating it while sorting it, would it be worth it to use the distribution to find the number faster by making more educated guesses?
@TwoTeaTee
@TwoTeaTee 6 ай бұрын
Calculating M can cause overflow. So, a better version could be -> M= FLOOR(L+ (R - L) / 2)
@mightytitan1719
@mightytitan1719 6 ай бұрын
I watched your video of steganography can you make that in detail to see how algorithm works
@lasersimonjohnson
@lasersimonjohnson 6 ай бұрын
Love a good Pound...ing 😁
@jeromethiel4323
@jeromethiel4323 6 ай бұрын
You don't need to sort all of the time. Before you put new data into the list, you search first, then insert the data where it belongs in the list. So you are using a binary search to insert an item into an already sorted list. Which is much more efficient than adding data, then resorting. Heck, i implemented this in basic on an Apple ][ back in the day, because sorting routines were just slow. Searching and copying to make a hole for the new data was faster.
@neilmehra_
@neilmehra_ 6 ай бұрын
You don't really want to be using a list if you have a large number of insertions, since what you're describing has a time complexity of O(q(n + logn)). A better approach is using a Red-Black Tree or Balanced BST, for O(log(n)) insertion and query.
@garcipat
@garcipat 5 ай бұрын
If you use a random function it probably will have numbers evenly distributed of order them by the numbers. that means you could also just number them from 1-1000000. it would take the same amount of time more or less i think.
@goguhu
@goguhu 6 ай бұрын
And obvious if you have a type of distribution (random, bell curve, ...) you could estimate the location based on the first/last value in the range to check ...
@mjbirdClavdivs
@mjbirdClavdivs 6 ай бұрын
I think you get the color on your hands from touching the paper before the ink has dried. How humid are the rooms you're in?
@justovision
@justovision 6 ай бұрын
I think you need to follow up with bubble sort. Recursion is maybe the most important thing to learn.
@GTCrais
@GTCrais 6 ай бұрын
Not only did you swap X for A, but you also called your list variable "lst", which in that font looks like "1st", which makes it even more confusing :D
@o-hogameplay185
@o-hogameplay185 5 ай бұрын
would it be a good thing to check the last element of the array, and see if it is bigger than the number we want to serach? because if it is smaller, than the list do not contain the number, so it can just return with false
@BeesAndSunshine
@BeesAndSunshine 6 ай бұрын
One way you could make a presorted list of random numbers is to use randint to increase an int at the index in an array equal to the generated number. Then once you have generated the quantity of integers you want you go through that array to set the values of a new array inserting x 0s where x is the value at index 0, x 1s where x is the value at index 1, etc. This would be slower than generating an unsorted list but is significantly faster than generating an unsorted list and then sorting it, because technically it's still an O(n) algorithm, and no sorting algorithm is O(n).
@Yewtewba
@Yewtewba 6 ай бұрын
Assuming int size 4 bytes, generating ints up to 2 billion, it looks like you're trying to create about a 7.5 gigabyte array. It's infeasible for large numbers. It would run on O(m+n), where m is the size of the largest element, whereas generating and sorting would most likely run on nlog(n), where n is the number of elements. It's important to recognize you're still essentially just sorting a random list, just with your own custom sorting algorithm. Edit: I didn't know it had a name, it's called "counting sort", and it's apparently commonly used as part of Radix Sort. Both of those might provide you with an interesting read.
@idonno87
@idonno87 6 ай бұрын
My take would be to create a list starting with a random number between 0 and, say 100, then add a number by adding a random number from 0 to 100 to the previous list. Runs in O(n), not great, but hey.
@Yewtewba
@Yewtewba 6 ай бұрын
@@idonno87 add numbers one at a time in the correct order? Should be O(nlogn), binary search to find the insertion point over n elements.
@Tumbolisu
@Tumbolisu 6 ай бұрын
No *comparison-based* sorting algorithm is O(n). Radix-Sort does not use comparisons and achieves O(n). It can't be used for every type of data, but it works perfectly on integers.
@user-vv3tc7yk7f
@user-vv3tc7yk7f 5 ай бұрын
Ideally, if a student was learning this, even tho it seems that iteration is the most obvious way to implement this, would u encorage them to make a recursive functin?
@Jet-Pack
@Jet-Pack 5 ай бұрын
If you don't just half the array but instead assume an even distribution and do a linear interpolation between the item left, right and the searched number... how much does it speed up the algorithm and what is that called? Interpolation search?
@mikefochtman7164
@mikefochtman7164 2 ай бұрын
Somewhat related, "How do you keep the list sorted?" Why, use a binary tree of course! Somewhat like picking up a hand of playing cards one at a time. Find the spot where it belongs using binary search, and insert the new item where it belongs. Yes, I know there are more sophisticated trees, (i.e. red-black, avt, etc...), but the crux of the matter is, if you insert items in order in the first place, then you never have to sort the list.
@j7ndominica051
@j7ndominica051 6 ай бұрын
A typical table will have entries in added in chronological order, and the primary key doesn't have a particular meaning. Does making presorted copies of the database for every column pay off in practical applications?
@astropgn
@astropgn 6 ай бұрын
If you try it, don't convert to list and then sort it, like it was done in this video. Instead, sort the array first using numpy (np.sort()). It is way faster because numpy has less overhead. It always knows the type of every element so it doesn't make type comparisons like python does in the background.
@pettere8429
@pettere8429 6 ай бұрын
Or just made the random numbers smaller and run a cumulative sum.
@user-lw7fj4pq9c
@user-lw7fj4pq9c 6 ай бұрын
So I built this with the explanation before I found out he had actual python code in there, and I did it slightly differently. Why use floor division when you can just do a binary shift on "m"? Wouldn't a binary shift be exponentially faster considering that computers aren't really optimized for division?
@dimitrys.6563
@dimitrys.6563 6 ай бұрын
What about « ponderating » the « middle » by adding two extra steps at the beginning, looking for boundaries values. With that known, instead of taking the middle for next « check » we could check closer from the left or right if the needed number is closest to the given boundary. Do you think it would gain probability to go faster ? If yes, how much ? Second idea : to build the random order list, could be as idea to build it sequentially ba pushing new item in list which would be « new pushed item is last one plus a random number between 1 and … let say… 3 ». You would get an ordered list without doubles and spaced from 1 to 3.
@TheKilledDeath
@TheKilledDeath 6 ай бұрын
The first idea would work as long as the numbers are uniformly distributed among the list. Which probably happens with how this list was produced (lots of random = linear distribution). But in any case where the list is not uniformly distributed it would get a lot slower. And lots of lists are not uniformly distributed. Take a look at Benford's Law for an example.
@willhendrix86
@willhendrix86 6 ай бұрын
As a business intelligence professional I can conclusively state how important by-section of data is to resolving problems. You discard half the data from the issue each iteration.
@aikumaDK
@aikumaDK 5 ай бұрын
O(log2(n)) I wonder how much this would change, on average, if you start by seeing if your number is smaller than the first entry or bigger than the last entry.
@emilemil1
@emilemil1 6 ай бұрын
Could have made a fake list that just returns a value y(x) when accessing index x. That way you're not memory bound since the values can be calculated as they are accessed.
@silaspoulson9935
@silaspoulson9935 6 ай бұрын
Interesting video, thanks mike! Surprised python doesn't mark variables as sorted
@traywor1615
@traywor1615 6 ай бұрын
I guess there would be lots of overhead and bookkeeping, since every operation on the list could destroy the sorted property.
@seapearl3175
@seapearl3175 26 күн бұрын
So maybe reading the r and do a rough percentage estimation might be even faster, right.
@peterboccia
@peterboccia 6 ай бұрын
Great video, but I sincerely appreciate when I studied it in C, more pure to me. But anyway it's the same. Great job
@redaragon88
@redaragon88 6 ай бұрын
13:06 ;-; bit painful to immediately convert this lovely numpy array to a list, to sort it immediately after, instead of using the (much faster) np.sort first haha
@davidgillies620
@davidgillies620 6 ай бұрын
Binary search is a notoriously hard algorithm to get right on the first attempt. The key insight is that one of l or r must change at every iteration, but there are subtleties to do with things like integer overflow, caching and so on that make it an extremely bad idea to write your own version. You should almost always use a library version of the code (these days that's true for most well-known algorithms).
@rmsgrey
@rmsgrey 6 ай бұрын
Yeah, you shouldn't write your own code to do something the CPU, OS, compiler, or an available library will do for you (unless you know what you're doing and have a compelling reason). At best, you create something marginally better optimised for your specific use case; at worst, you create a buggy mess that ends up fighting the system as both it and your code attempt to achieve the same things in different ways.
@EtzEchad
@EtzEchad 5 ай бұрын
I worked for a company where they had a binary search subroutine that they used for everything. The worst case was when they copied a column of values to a local array, SORTED it, and then called the binary search routine. So, thats O(N) for the copy, O(NlogN) for the sort, and O(logN) for the seach. A linear search would've been O(N) - the same as the copy! (The worst code I ever saw.)
@Hobbitstomper
@Hobbitstomper 6 ай бұрын
Phone books... something my 2 year old son will only hear about, from what he soon will refer to as, "older people".
@Rayer24
@Rayer24 6 ай бұрын
Before he showed the code I went and tried it out myself to see if I could do it and my code became roughly 100,000 times faster searching the number 888,456,123 in a list of 1 billion numbers ranging from 1 to 1 billion.
@FinepixF30
@FinepixF30 5 ай бұрын
This looks easy on a sorted array but what about unsorted data? sorting it first in memory takes up computation cycles not considered here.
Mike's Cube Code - Computerphile
15:15
Computerphile
Рет қаралды 119 М.
Garbage Collection (Mark & Sweep) - Computerphile
16:22
Computerphile
Рет қаралды 230 М.
Who enjoyed seeing the solar eclipse
00:13
Zach King
Рет қаралды 120 МЛН
The World's Fastest Cleaners
00:35
MrBeast
Рет қаралды 127 МЛН
4. python 2024: погружаемся в алгоритмы и сложности
59:50
Лекции одного дата-шрушера
Рет қаралды 21
Glitch Tokens - Computerphile
19:29
Computerphile
Рет қаралды 310 М.
How Binary Search Makes Computers Much, Much Faster
6:51
Tom Scott
Рет қаралды 1,4 МЛН
Power LED Attack - Computerphile
12:05
Computerphile
Рет қаралды 253 М.
Hacking Out of a Network - Computerphile
25:52
Computerphile
Рет қаралды 236 М.
C++ Programming: Binary Search Algorithm
14:53
ReelLearning
Рет қаралды 215 М.
Binary Tree Algorithms for Technical Interviews - Full Course
1:48:53
freeCodeCamp.org
Рет қаралды 680 М.
Square & Multiply Algorithm - Computerphile
17:35
Computerphile
Рет қаралды 271 М.
Cracking Enigma in 2021 - Computerphile
21:20
Computerphile
Рет қаралды 2,4 МЛН
Has Generative AI Already Peaked? - Computerphile
12:48
Computerphile
Рет қаралды 325 М.