Binary Search Algorithm - Computerphile

Рет қаралды 165,906

Күн бұрын

Пікірлер: 270

@craftmechanics6483 Жыл бұрын

I love teaching this to my students as a guessing game. I will think of a number from 1 to 100 and give them 8 guesses, but after each guess tell them if their guess was correct or lower/higher. After a few rounds of playing, most people figure out that the optimal strategy is guessing somewhere in the middle, and essentially discover the algorithm on their own. After that, we implement the gussing game on a computer. I find this a very fun way to learn about binary search.

@DavidLindes Жыл бұрын

i'm imagining doing that in a classroom, and... yeah, sounds fun! Cool!

@Armageddon2k Жыл бұрын

I'm totally stealing this idea!

@mariolol8333 Жыл бұрын

imagine your teacher naming himself craftmechanics6483 with a minecraft pfp on youtube XD

@DarrenYung Жыл бұрын

I've been doing this exact same lesson for years too! Once you've introduced variables, conditionals and looping then you have the basic building blocks of algorithms. Binary Searching with the guessing game is a nice easy way to introduce this fundamental algorithm.

@nashsok Жыл бұрын

I audited MIT's 6.0001 (Introduction to Computer Science and Programming in Python) course through OCW and the professor did the exact same demo to introduce binary/bisection search!

@emrekantar5003 Жыл бұрын

These men make me feel as if I am in a coffee house with my best friends talking

@queueeeee9000 Жыл бұрын

Dr. Pound and Dr. Grimes from the numberphile videos are my absolute favorite!

@justawatchin2 Жыл бұрын

Dr Pound sounds and acts identically to someone who went to my secondary school in London, cheeky lad in IT class who aced all his assignments while getting his computer to run games we weren't supposed to have access to. So, yes.

@njd9828 Жыл бұрын

You spelt pub wrong.

@aaabucus3104 Жыл бұрын

Anybody who sits around and talks like this has no friends in the first place and that's just science!

@queueeeee9000 Жыл бұрын

@@njd9828 you spelt pubic wrong

@user-zl4kk7wi5u Жыл бұрын

Dr. Pound is excellent and entertaining as always. I found myself smiling and giggling throughout the whole video due to the way he handles his small mistakes and jokes around. Yet he also imparts wisdom!

@KurtisRader Жыл бұрын

The point made at 14:30 is arguably the most important part of this video. Specifically, it is less important to know how to write, from scratch, a particular algorithm than it is to know that different algorithms have different tradeoffs. Knowing how to pick the best algorithm (and data structures) for a particular situation is more important than being able to implement an algorithm on a whiteboard. One of my interview questions explores whether an applicant understands that there are many sorting and searching algorithms and the nature of the data being manipulated, and other constraints, is important in deciding which algorithm to use.

@vatbub Жыл бұрын

One thing to add: Once the data is sorted, and you need to add new data to it, binary search can actually tell you where you need to insert the new data in order to keep the data sorted. Therefore, the real issue is getting the data sorted in the first place (as mentioned in the video), but once it is sorted, binary search helps you find stuff fast and helps you keep your data sorted even when you modify it.

@NihongoWakannai Жыл бұрын

Adding a new piece of data to the middle of a giant array would require changing a lot of memory though. Which is why we might use something like a binary tree instead.

@pacoalsal Жыл бұрын

Depending on the problem the performance benefit of keeping the data adjacent in memory might offset the overhead of shifting it around.

@genelee7869 Жыл бұрын

I'd love to see a video explaining the concepts behind how the different sorting algorithms work (Heap, Merge, Shell, etc)

@JanB1605 Жыл бұрын

I'd love a video on hash maps!

@Loki- Жыл бұрын

@@JanB1605I loved my prof I had for computer science 2 which was lots of this. He was like this video, plus comedy, and simplified PowerPoint.

@jimwinchester339 Жыл бұрын

They exist here on KZbin. I've seen them.

@Gudsfralsare Жыл бұрын

In these times when AI is all the hype, these are the videos we need. The fundamental algos from the 60's are here to stay and run the world.

@thatchessguy7072 Жыл бұрын

AI was used to help find an improvement on binary search recently. It’s covered in keynote talk of the 2023 CPP con.

@theblinkingbrownie4654 Жыл бұрын

@@thatchessguy7072wHAT?

@philroo1 Жыл бұрын

@@thatchessguy7072 🤞 all watched over by machines of loving grace

@konstantinrebrov675 Жыл бұрын

We need more videos explaining in low level details how various common algorithms work, such as file I/O, B trees, image compression, files representation (PDF, audio, video, archive, ELF), Huffman coding, hashing algorithms, md5sum, dynamic programming. I mean walk through the pseudocode line by line explaining what it does, and draw the memory diagrams, like what is happening to the data, how the bits or bytes are stored, loaded, moved around.

@ross951 Жыл бұрын

Textbooks are your best bet for finding and understanding (easy to quickly reread sentences) that kind of material.

@konstantinrebrov675 Жыл бұрын

@@ross951 Lectures are better because then we would be able to see exactly how all the bits and bytes are moved around in the memory.

@Aidiakapi Жыл бұрын

They've done videos on many of the topics you've mentioned.

@chessthecat Жыл бұрын

That's not what Computerphile is. Sounds like you should start your own channel.

@konstantinrebrov675 Жыл бұрын

@@chessthecat Who are you to say?

@traywor Жыл бұрын

10:25 just wanna note here for anyone trying to implement this algorithm for very large arrays, be aware of the integer overflow. (l + r) // 2 will first calculate l + r and then try to divide it by two. however if your list is lets say 2/3rds of the size of your integer and if you then add l + r and l and r are quite large, then they will overflow. Then your m will be messed up and your binary search will be incorrect. And this is not a contrived scenario, there was this exact kind of bug in the java SDK, for 9 years! I was actually surprised that Mike didn't mentioned this :O

@jlappala Жыл бұрын

We try to teach the l+(r-l)//2 -method to avoid this.

@solqr6521 Жыл бұрын

It was in Python, where there wouldn’t be overflow

@johnvriezen4696 Жыл бұрын

@@solqr6521He should still write the code the l + (r-l)//2 way because code will inevitably be found and copied to a platform that has 32 bit integers (e.g. C) and then break.

@raphaelfranzen9623 Жыл бұрын

@@solqr6521 Why wouldn't there be overflow in Python ? An integer has only a finite size , so if you add more than it can hold , it should wrap around (overflow).

@krajsyboys Жыл бұрын

@@raphaelfranzen9623 Integer overflows don't happen if you are using just python because they allocate arbitrary amount of data to the variables, so if a number should have overflowed, python allocates more momory to store the variable in

@daveingerslev Жыл бұрын

Finding theft (or damage) on CCTV footage is a fantastic real world use for binary search. As long as something visually changes (i.e. goes missing or gets broken), then it turns the task of watching the cause into a 5 min task instead of hours/unjustifiable.

@doomanime61 Жыл бұрын

I can sit and hear Dr Mike Pound talking all day

@michaelwilson5742 Жыл бұрын

Yeah, everyone on the corridor can sit and hear him talking all day

@GordonjSmith1 Жыл бұрын

This was great! It really highlighted the need for the development of 'logical' learning, and not just 'proceedual' or 'fact' learning. Actually a very 'key vlog', in all the excellent vlogs that have been made.

@Richardincancale Жыл бұрын

One of my favourite books from almost 50 years ago - Sorting and Searching by Donald E Knuth - tells you all you need to know!

@olafurfo89 Жыл бұрын

Currently doing my second year in computer science and you made me realize when I'd want to use BST in projects. Sort in morning, many lookups throughout the day. Makes so much sense but I hadn't realized it, Well done !!!

@SparkysWidgets Жыл бұрын

I changed a linear search on a calibration step for a product on the production line to binary searching. This small change netted millions of minutes of production line weekly(test used to take minutes and we produced over 1million units a week) back to the CM. The main point being is it doesn't have to be a large data set it could be a small known dataset but that each takes time to search or in this case "test".

@robblake8999 Жыл бұрын

It's an excellent point to remember that sorting is O(n log n), wheras a linear search is O(n). Linear search could definitely prevail for a small number of lookups.

@IceMetalPunk Жыл бұрын

That's *if* your data isn't already sorted, of course :)

@turdwarbler Жыл бұрын

@@IceMetalPunkNo, for a small number of items, the exact number is hardware dependant a linear search will beat everything apart from a direct index access. linear searches on arrays are very cache friendly, as you walk the array, the CPU is already pre fetching 64 bytes ahead of you. We do this all the time for fast lookups of small tables.

@IceMetalPunk Жыл бұрын

@@turdwarbler Cache lookups aren't considered in algorithmic runtime analysis.

@turdwarbler Жыл бұрын

@@IceMetalPunk Well thats where theory and reality differ. In HF LL trading we use linear lookups all the time because they are faster period, for small arrays. Even without a cache they would still be faster, no set up time, no complicated logic, simples.

@coolwarfare Жыл бұрын

Thank you this channel is going to save me before GCSEs

@kevinramadhan9871 Жыл бұрын

I used to watch you all the time when i was in uni around 3-4 years ago. Just stumbled upon this video and glad you're still going strong! keep up the videos! it's always a treat to watch, even if i'm already familiar with the topic

@chrisogonas 11 ай бұрын

Simply and elegantly illustrated! Thanks, folks 👍

@-parrrate Жыл бұрын

for most problems, I prefer a strict L

@Loki- Жыл бұрын

14:15 My analogy for going through computer science is that we add a bunch of tools to a toolkit and learn when to use which tools.

@agoatmannameddesire8856 Жыл бұрын

More Dr Pound videos on search / sorting!

@thuokagiri5550 Жыл бұрын

prof Pound is the Richard Feynman of Computer science

@erichobbs4042 Жыл бұрын

The point about knowing when to use an algorithm is the most important bit of knowledge to take away from this video. I've forgotten how to exactly write plenty of algorithms, but those things can always be looked up.

@sid6645 Жыл бұрын

Binary searches over some state space with a non trivial checking condition (like some monotonic equation) is also fascinatingly useful. Also if you rectify the state by changing basis or something. One of my favourite algorithms!

@GilesBathgate Жыл бұрын

Don't forget you can also use binary search to implement a container that is sorted on insert. (although there are other approaches)

@tiarkrezar Жыл бұрын

Constructing a list that way is really inefficient. Even if the search is O(log(n)), each insertion will be O(n). At that point, just use some kind of a tree data structure instead.

@rmsgrey Жыл бұрын

@@tiarkrezar It depends how your data is stored. For example, if you normally only store data in even-numbered positions, but insert into odd-numbered positions, then redo the entire list when you hit a collision, then insertion is only O(n^0.5). But, yeah, a naive array implementation of a sorted list will have problems with insertion speed.

@GilesBathgate Жыл бұрын

@@tiarkrezar I did say there are other approaches. Insertion sort, Self-balancing BSTs like AVL trees and Red-Black trees, etc...

@wybren Жыл бұрын

I love binary search. If look up times are really low binary search is as good as unbeatable because of computational simplicity.

@IceMetalPunk Жыл бұрын

Wow, I can't believe there's never been a binary search Computerphile video before now! 🤯

@Salara2130 Жыл бұрын

Well that was a nice refresher of the basics ^^

@jakoon_the_manic Жыл бұрын

Always happy to see a Mike video! Great refresher. 😀

@logosimian Жыл бұрын

I already knew everything in this video, but I still watched it.

@atkelar Жыл бұрын

I love to show these examples in my programming courses (day job); one of the arguments that sometimes is held against more complicated algorithms is that "yes, less steps but each step is significantly more complicated"; but then you do the math and compare 100M steps vs. 23 and see that they would have to be quite a bit more complicated to make the 23 step version perform worse than the linear search one. And I also encounter people who would solve a "find in 10 items" problem with such an algorithm... Cost/Benefit analysis gets quite complicated if you don't have an idea about the number of items involved, or the number of executions required.

@NihongoWakannai Жыл бұрын

I think it's always important to learn that there is more to performance than just big O. With webdev being so popular, people focus a lot on scalability. But there are lots of situations with very predictable dataset sizes thus optimizing functions themselves becomes more important

@PowerShellWizard 3 ай бұрын

I wish I had this guy as an instructor when I was going to college. I may have stood a chance with some of the data structures and algorithms classes I was taking at the time :)

@Chocoholic1337 Жыл бұрын

Love the algorithm explanation videos! Mooooore!!

@gerokatseros Жыл бұрын

I really like the way you explain everything ... making it so simple

@hashtagPoundsign Жыл бұрын

There are 10 types of people, those who watch Binary Search videos, and those who don't.

@housellama Жыл бұрын

There are 10 types of people in the world: people who understand binary, those who don't, and those who know this joke is trinary.

@nosuchthing8 Жыл бұрын

Clever

@Lion_McLionhead Жыл бұрын

Pounder can make even this basic algorithm sound complicated.

@andrewharrison8436 Жыл бұрын

Built into COBOL as the SEARCH ALL command - often overlooked but fast if you know about it.

@LuigiElettrico Жыл бұрын

Nice explanation, thanks.

@artofcomputing-art Жыл бұрын

Hell yes! Here comes another great Computerphile video 🎉

@allenkwan8310 Жыл бұрын

This is the CS equivalent of asking a band to play the classics. Nice.

@conorstewart2214 Жыл бұрын

This is also how a type of analogue to digital converter (ADC) works, it is successive approximation (SAR). Interestingly when used for ADCs it corresponds directly to binary. The first step where you have a comparator with half the max voltage and the input voltage, the result from the comparison is the most significant bit (MSB) of the value, then when you perform the next comparison it becomes the next bit and that continues on until it reaches the closest that the ADC can get. This method relies on a way to create all the intermediate voltages for the comparison so SAR ADCs contain digital to analogue converters (DAC) too. Also due to the nature of analogue voltages, similar to real numbers, the search will only very rarely be able to end early, since a comparison between two analogue voltages or between two real numbers is very unlikely to result in them being equal and may be impossible if the comparator only output less than or greater than. This does mean that these ADCs and real numbers will pretty much always take the same number of steps to complete.

@FalcoGer Жыл бұрын

@7:00 l+r might be an integer overflow. Depending on your language that might be undefined.

@georgelza Жыл бұрын

love these discussions.

@cannaroe1213 Жыл бұрын

If you use the values of l and r (call them L and R), then you can do a weighted binary search which usually performs a little bit better than one that defines m = (l+r)//2 So at 3:13 a weighted binary would say a = 17 , l = 4 (0-based), L = 16, r = 7, R = 9000 (unchecked in video so i made it up) Because a is very close to L and very far from R, you find m as usual but then bias it more in favour of L than R to the point it shakes out as "5" and so finds a immediately, and you don't have to make L = 10 :P

@mjbirdClavdivs Жыл бұрын

I think you get the color on your hands from touching the paper before the ink has dried. How humid are the rooms you're in?

@Lampe2020 Жыл бұрын

If the list is in order you can even make it simpler to determine whether you search further by opening the first and last box and if the target number isn't between those in value it's not going to be inbetween those boxes either.

@martijn3151 Жыл бұрын

In theory, yes binary search is faster. In practice a linear search might actually be much much faster. The one thing that makes a big difference is cache. Computers read ahead memory. So it greatly helps if all entries are stored sequentially. If the binary search is implemented in a tree, the entries probably are all over the place in memory, not sequentially, causing cache miss after cache miss. This will severely impact the performance. Obviously when searching billions of entries, binary is most likely to win. But in a lot of real life situations, the lists are much smaller. Small enough to fit in a cache line. Making the search almost instant. So never just assume what you learn in computer theory is fact, use different setups for your problem and measure(!) to determine what solution works best.

@Quarky_ Жыл бұрын

Well, if someone thinks binary search is always faster, then they didn't really learn anything did they? The algorithm clearly gives you a speedup by halving the search space at every step, then obviously it is faster only for large numbers. One might even try to quantify this with: for n = 2^m, if m

@Pyrazahn Жыл бұрын

Another reason why linear search can be faster for small lists is that modern cpus heavily rely on pipelining and branch prediction. Basically for every if-else, the cpu guesses which of the two paths will be executed next and already executes the first instructions of that path in the background. If the guess was correct, then the results are already there for free. But if the guess was wrong, it is much more expensive to take the other path. In a linear search, if the cpu predicts "element not found, move to next item in list", then that will be correct in all but one case. But in a binary search, the expected results are 50% left and 50% right - so branch prediction fails 50% of the time instead of just once.

@martijn3151 Жыл бұрын

@Quarkly_ No need to be pedantic here. Log n is always smaller than n. Also for small numbers. Using binary search on a list of 10 items, worst case is 3 steps. Worst case linear search is 10 steps. Yet, as I pointed out, linear might be much faster in that case even though theory says binary search is at least 3 times faster. So, at least to me, this isn’t trivial.

@AAjax Жыл бұрын

Interesting! So what would the break-even size, between binary search and linear search? I had a look and the x64 cache line is typically 64 bytes, which wouldn't seem to allow for a very useful amount, but this is all new to me.

@krzychxyz8358 Жыл бұрын

Well, the break-even size depends on a lot of things. From your code efficiency, the programming language, the compiler you use all the way to the specifics of the data set (some data types use more computation power to compare than others). But i've never seen a case where that is above 60, so I guess it won't ever get big.

@laser-sj Жыл бұрын

Love a good Pound...ing 😁

@patientson Жыл бұрын

First of all, thank you for agreeing that the mere fact one is enhancing their skills makes it a career move. Most employers are notorious for despising early beginnings.

@LucasundAaron Жыл бұрын

in python i prefer using l=0 r=len(arr) now right is exclusive and points on invalid greater parts. the only +1 indexing will happen in the main loop now. you only check the end of valid array is 1 pos further with: „while l+1

@LiarsPendulum1 Жыл бұрын

I am teaching myself programming and was literally practicing writing this kind of code yesterday, working on data sorting algorithms. So glad this is being written in Python. I started with C and this is a great window into how C maps on to Python.

@BrankoDimitrijevic021 Жыл бұрын

Interestingly enough, linear search is likely to be faster than binary search for a small array of integers, due to prefetching and branch prediction mechanisms of modern CPUs.

@heliospace Жыл бұрын

This is the first time an algorithm seemed useful

@CoreyOgburn Жыл бұрын

Fun fact: if you're searching through a list of unknown size (maybe a stream of data) that's ordered then you can do a binary search by checking powers of 2 in a similar way. If the value is greater than m, m *= 2.

@sandervdrie9182 Жыл бұрын

thanks Dr Mike and thanks the Computerphile team, for and good video would, have loved to have had a lesson from Dr Mike

@felipeabraga Жыл бұрын

Nice, was studying about it last monday... Thanks

@KeyboardSourceError Жыл бұрын

Thank you Dr. Pound!

@JuliusUnique Жыл бұрын

simple yet efficient way of searching, doing it myself that way since I basically can think

@nosuchthing8 Жыл бұрын

The phone book analogy is SO powerful

@Mode_na Жыл бұрын

Shouldn't you check the first and last item in a sorted list to make sure the number you're searching for is in range? It should be more efficient by potentially circumventing the need of running a search algorithm all together.

@f.f.s.d.o.a.7294 Жыл бұрын

Only if that is frequently the case.

@ligmoes Жыл бұрын

Indeed. when its out of range it shrinks in a linear fashion until the ranges swap and returns false but also works for when is in range and is not on the list. However, that is a matter of implementation. Here's same function with your solution: def binary_search(lst, a): l = 0 r = len(lst) - 1 if a > lst[r] or a < lst[l]: return print("Out of Range") while l lst[m]: l = m + 1 elif a < lst[m]: r = m - 1 else: return True return False

@neilmehra_ Жыл бұрын

It doesn't really impact performance as log time complexity is minuscule.

@diagorasofmel0s Жыл бұрын

This was so informative and capturing, surprisingly my attention span gets a bit longer when prof Mike Pound speaks lol.

@kuro4841 Жыл бұрын

would it be an improvement to first check the first and last index to see if the given number is even in range before looking through the boxes? also would it be advisable to place the first lookup point based on where between highest and lowest it is? so if its closer to the lower bounds, then maybe starts dividing a little sooner like at 40% of the dataset? would this add too much performance overhead or would it possibly an improvement? i know that division is a comparatively costly operation. I would be super interested on someone elses input on this :)

@cscscscss Жыл бұрын

Great video! I understood it very well.

@akiraincorrect4968 Жыл бұрын

Did you watch it on 13x ?

@cscscscss Жыл бұрын

@@akiraincorrect4968 Yes

@gsestream Жыл бұрын

make it faster by giving ranges for each of the boxes, thats then only one step to get to the right bix, which can be sub-divided to another range boxes, its better than log(n) of binary search, having ranges is better than just having the numbers in order, if you have different boxes for all different numbers, then the search is O(1), its bucket search, essential is that you pre-calculate the ranges and not during the search. works more like heap, which does the processing on the background and not during the searches/updates. if you have unique box for each different number, at a known index, which you can jump to, directly, then it works like radix sort for sorting. radix search for buckets might be super duper fast.

@bishboria Жыл бұрын

I taught this to my 1st year students less than a week ago.

@GTCrais Жыл бұрын

Not only did you swap X for A, but you also called your list variable "lst", which in that font looks like "1st", which makes it even more confusing :D

@jeromethiel4323 Жыл бұрын

You don't need to sort all of the time. Before you put new data into the list, you search first, then insert the data where it belongs in the list. So you are using a binary search to insert an item into an already sorted list. Which is much more efficient than adding data, then resorting. Heck, i implemented this in basic on an Apple ][ back in the day, because sorting routines were just slow. Searching and copying to make a hole for the new data was faster.

@neilmehra_ Жыл бұрын

You don't really want to be using a list if you have a large number of insertions, since what you're describing has a time complexity of O(q(n + logn)). A better approach is using a Red-Black Tree or Balanced BST, for O(log(n)) insertion and query.

@RixtronixLAB Жыл бұрын

Nice info, thanks :) 👍

@mightytitan1719 Жыл бұрын

I watched your video of steganography can you make that in detail to see how algorithm works

@sparthir Жыл бұрын

The sentence "our problem is double as easy" is not something I've ever heard. 😂

@LupinoArts Жыл бұрын

The box next to "16" could contain 16 again...

@fuckYTIDontWantToUseMyRealName 5 ай бұрын

But now you've found your 16. If you need another one, you use a different algo.

@astropgn Жыл бұрын

If you try it, don't convert to list and then sort it, like it was done in this video. Instead, sort the array first using numpy (np.sort()). It is way faster because numpy has less overhead. It always knows the type of every element so it doesn't make type comparisons like python does in the background.

@pettere8429 Жыл бұрын

Or just made the random numbers smaller and run a cumulative sum.

@o-hogameplay185 Жыл бұрын

would it be a good thing to check the last element of the array, and see if it is bigger than the number we want to serach? because if it is smaller, than the list do not contain the number, so it can just return with false

@beepbop6697 Жыл бұрын

Anyone who has ever looked up a name in a phone book, or used a dictionary, intuitively knows what a binary search is. Keeping driving by half until you find the page you are looking for.

@mikefochtman7164 10 ай бұрын

Somewhat related, "How do you keep the list sorted?" Why, use a binary tree of course! Somewhat like picking up a hand of playing cards one at a time. Find the spot where it belongs using binary search, and insert the new item where it belongs. Yes, I know there are more sophisticated trees, (i.e. red-black, avt, etc...), but the crux of the matter is, if you insert items in order in the first place, then you never have to sort the list.

@TwoTeaTee Жыл бұрын

Calculating M can cause overflow. So, a better version could be -> M= FLOOR(L+ (R - L) / 2)

@The_Oracle Жыл бұрын

What’s a phone book.

@paultapping9510 Жыл бұрын

Assuming this isn't an ironic question: Pre-mobile phone, when hardwired dial-phones running over copper was the only telecommunications infrastructure, the only way to search a number (without phoning the operator), was to look it up in a phonebook. Which was a quarterly-printed physical book of everyones phone number in the immediate area, organised by surname.

@ragnkja Жыл бұрын

Imagine your “contacts” app, but with orders of magnitude more names and phone numbers, and printed in a big book (on really thin paper to make the book slightly less enormous).

@bryan69087 Жыл бұрын

MORE MIKE POUND

@justovision Жыл бұрын

I think you need to follow up with bubble sort. Recursion is maybe the most important thing to learn.

@willhendrix86 Жыл бұрын

As a business intelligence professional I can conclusively state how important by-section of data is to resolving problems. You discard half the data from the issue each iteration.

@JeffKaylin-ft5cx Жыл бұрын

If the search gets REALLY big, I assume the data will be read in a record of fixed length. Would checking the Last data in the record be useful? Like if you're in the library and you may have to switch rooms or floors, should you check the last book so you won't have to come back to that room later when you notice you were SO close?

@Jet-Pack Жыл бұрын

If you don't just half the array but instead assume an even distribution and do a linear interpolation between the item left, right and the searched number... how much does it speed up the algorithm and what is that called? Interpolation search?

@bolonabolona 2 ай бұрын

One the use case of binary search is for finding a squre root or cube root.

@Atrix256 Жыл бұрын

Do interpolation search next!

@BeesAndSunshine Жыл бұрын

One way you could make a presorted list of random numbers is to use randint to increase an int at the index in an array equal to the generated number. Then once you have generated the quantity of integers you want you go through that array to set the values of a new array inserting x 0s where x is the value at index 0, x 1s where x is the value at index 1, etc. This would be slower than generating an unsorted list but is significantly faster than generating an unsorted list and then sorting it, because technically it's still an O(n) algorithm, and no sorting algorithm is O(n).

@Yewtewba Жыл бұрын

Assuming int size 4 bytes, generating ints up to 2 billion, it looks like you're trying to create about a 7.5 gigabyte array. It's infeasible for large numbers. It would run on O(m+n), where m is the size of the largest element, whereas generating and sorting would most likely run on nlog(n), where n is the number of elements. It's important to recognize you're still essentially just sorting a random list, just with your own custom sorting algorithm. Edit: I didn't know it had a name, it's called "counting sort", and it's apparently commonly used as part of Radix Sort. Both of those might provide you with an interesting read.

@idonno87 Жыл бұрын

My take would be to create a list starting with a random number between 0 and, say 100, then add a number by adding a random number from 0 to 100 to the previous list. Runs in O(n), not great, but hey.

@Yewtewba Жыл бұрын

@@idonno87 add numbers one at a time in the correct order? Should be O(nlogn), binary search to find the insertion point over n elements.

@Tumbolisu Жыл бұрын

No *comparison-based* sorting algorithm is O(n). Radix-Sort does not use comparisons and achieves O(n). It can't be used for every type of data, but it works perfectly on integers.

@NunyaBidness-v1g Жыл бұрын

So I built this with the explanation before I found out he had actual python code in there, and I did it slightly differently. Why use floor division when you can just do a binary shift on "m"? Wouldn't a binary shift be exponentially faster considering that computers aren't really optimized for division?

@FinepixF30 Жыл бұрын

This looks easy on a sorted array but what about unsorted data? sorting it first in memory takes up computation cycles not considered here.

@ANoBaka Жыл бұрын

If you for some reason have knowledge of the distribution of the values in the collection, perhaps by calculating it while sorting it, would it be worth it to use the distribution to find the number faster by making more educated guesses?

@alibabascience9406 Жыл бұрын

we need to sort array in the first yes?

@MuhammadRaza-yd6sg Жыл бұрын

Been just thinking about this lately

@j7ndominica051 Жыл бұрын

A typical table will have entries in added in chronological order, and the primary key doesn't have a particular meaning. Does making presorted copies of the database for every column pay off in practical applications?

@JeanNoelAvila Жыл бұрын

What about n-ary search and what is the optimal n?

@goguhu Жыл бұрын

And obvious if you have a type of distribution (random, bell curve, ...) you could estimate the location based on the first/last value in the range to check ...

@seapearl3175 9 ай бұрын

So maybe reading the r and do a rough percentage estimation might be even faster, right.

@Samit-r3h Жыл бұрын

Ideally, if a student was learning this, even tho it seems that iteration is the most obvious way to implement this, would u encorage them to make a recursive functin?

@garcipat Жыл бұрын

If you use a random function it probably will have numbers evenly distributed of order them by the numbers. that means you could also just number them from 1-1000000. it would take the same amount of time more or less i think.

@pierreabbat6157 Жыл бұрын

How fast is the algorithm in Julia, compared to Python?

@zxuiji Жыл бұрын

9:38, gotta say, 0 point to "r - l" (you put + for some reason, why would anyone look for MORE length than actually have?), it's the call is more likely to look like this: "++middle_index; return bsearch( base + middle_index, length - middle_index, ... );"

@redaragon88 Жыл бұрын

13:06 ;-; bit painful to immediately convert this lovely numpy array to a list, to sort it immediately after, instead of using the (much faster) np.sort first haha