Dijkstra's Hidden Prime Finding Algorithm

  Рет қаралды 167,040

b001

b001

Күн бұрын

Пікірлер: 309
@yassine-sa
@yassine-sa 9 ай бұрын
Dijkstra's algorithm feels like it took the division approach and was like hold on we can memorize intermediate results of the division to avoid doing it all over everytime
@Vaaaaadim
@Vaaaaadim 9 ай бұрын
To me it looks like the sieve of eratosthenes(I'll abbreviate as SoE) approach, but instead of maintaining a boolean array, we keep track of what is the next prime multiple we would mark, and update when we need to. Something that makes it a little less obvious is how for SoE, the video doesn't explain that we can start at the squares of the prime. When we mark the multiples of 7 for instance, we can start marking from 49 onwards, because 14,21,28,35,42 will already be marked by multiples of 2,3,5. If we think of SoE as running a loop for multiples of 2,3,5,7,etc then Dijkstra's alg keeps these loops in an intermediate state and does their next iteration when necessary.
@dg636yt
@dg636yt 9 ай бұрын
You guys both just proved that dijkstra successfully blended the two together @@Vaaaaadim
@Vaaaaadim
@Vaaaaadim 9 ай бұрын
@@dg636yt Would you not say that SoE already does the "memorize intermediate results of the division" on its own?
@dg636yt
@dg636yt 9 ай бұрын
@@Vaaaaadim it's more of a tabulation approach than a memoization approach
@KRYPTOS_K5
@KRYPTOS_K5 5 ай бұрын
3 times faster. It also shows Euler primes. Memory usage is minimal. You can play with in your Android. Really fast. If automatic compiled to native Java even faster. Good game. ! PRIMES ! Brasil. R.H.Hodara ! To US Mathcircle ! Prime? Euler Prime? ! RFO!Basic! Android Native Offline fn.def isEuler(x) y=x^2-x+41 IsEuler=y Fn.rtn IsEuler Fn.end fn.def IsPrime(n) IF n
@mschoenert
@mschoenert 9 ай бұрын
Nice video - talking about this algorithm that is not so well known. Note that your implementation of Dijkstra's algorithm is different from his implementation in an important way - that has consequences for both the performance and the memory footprint. You put the multiples into a heap - while Dijkstra puts them into an array. So all updates take logarithmic time in your implementation, while they take constant time in Dijkstra's. It is actually a little bit more complicated, because in Dijkstra's implementation it needs to scan the array until it finds a divisor. But that doesn't cost that much time, because for most of the candidates one of the earliest primes will divide them (half of the numbers are divisible by 2, one-third are divisible by 3, etc.) So overall your implementation is slower than Dijkstra's implementation. And you are also putting primes and multiples into that heap that don't need to be there. If you are trying to find all the primes up to N, then you only need to put primes P with P*P
@prayagpatel9204
@prayagpatel9204 9 ай бұрын
Great insights thanks
@Jack-oi8gn
@Jack-oi8gn 9 ай бұрын
this was an amazing read, thank you!
@rchas1023
@rchas1023 9 ай бұрын
I have an elementary proof (from Erdos ) that there is a prime between N and 2N, for N >= 2, and I think that 2N
@trueriver1950
@trueriver1950 9 ай бұрын
A trivial efficiency boost is got by ignoring all even numbers. To do this we need to do two things in startup: put 2 into the list to be output, but ignore it on all the trial divisions (or whatever method). Then start the prime search at 3, incrementing always by 2. This simple optimization saves testing half the numbers, and adds no complexity to the loop: all the added code is in startup. The next optimisation is to step in multiples of 6, for each N being a multiple of six we only have to test N-1 and N+1 because the other odd number must be a multiple of three. The easiest way to start this is actually to dummy in 2, 3 (neither of which is ever tested) starting with N=6, where both N-1 and N+1 are not tested as we never test for divisibility by 2 or 3. Another option would be to dummy in 2, 3, 5, 7 and start from N=12. In a long run this saves one more iteration, which is trivial compared to saving a constant fraction. My feeling is that this is less elegant than starting from 6, and not worth the tiny saving. This cuts another 1/3rd off the number of primary tests, at the cost of adding door to the loop, and that we have to dummy in the primes 2 and 3 which would otherwise be skipped. Likewise for each prime we know up front we can cut down a further fraction of the results to be tested, testing more primes in each iteration but fewer overall. I'll leave it to you to figure out which numbers to test starting from N=30 going up in 30s (there are more than two in each iteration). Stepsize 6 is ready to code and worth doing. Stepsize 30 harder, and stepsize ?? harder still: at this point another optimisation kicks in: and that's programmer time vs computer run time. If you are paying the programmer and paying for computer time it's then a cost based decision. If you own the computer and are doing it for fun then that optimisation depends where complexity stops being fun. When optimisation becomes tedious let the computer run whatever stage you've got working and tested at that point... OR you then write a program to create the program for the next stepsize What you don't do is ask ChatGPT to write the program 😉
@trueriver1950
@trueriver1950 9 ай бұрын
Pedantic correction: there is a prime between every number >1 and it's square 😊
@juanmacias5922
@juanmacias5922 9 ай бұрын
Dang, Dijkstra wouldn't take common consensus as the absolute, that's awesome, thanks for sharing, super interesting video!
@spaghettiking653
@spaghettiking653 8 ай бұрын
That's what makes the greats so great :)
@stanleydodds9
@stanleydodds9 9 ай бұрын
There are several different ways to optimise the sieve of eratosthenes that massively reduce its space complexity, while keeping its time complexity the same. Note that the main benefit of this is actually to improve caching, and hence speed - memory in general isn't a problem, but not staying in the same place in memory for a while causes lookups to be much slower. The most obvious such approach is the segmented sieve. You divide your full range 1 through n into segments of size on the order of sqrt(n), and just sieve the blocks one at a time (with a small optimisation). The first segment is sieved as normal, to get an initial list of primes. Then for all subsequent segments, we already have all the primes needed to completely sieve them. So we only have to mark off composites using the primes we have already found, up to the square root of the largest value in the block. After that, we do one pass on the block, returning all the values that are prime (and adding them to the list of primes if needed).
@davidwillmore
@davidwillmore 9 ай бұрын
That's what I did many years ago and it works very well. Id memory does become an issue, you can actually integrate this algorithm a third time, but at that point storage of the discovered primes becomes a significant task.
@reeeeedmil
@reeeeedmil 9 ай бұрын
i like the dijkstra's algorithm the most tbh
@dernett
@dernett 5 ай бұрын
You can further limit the memory usage of Dijkstra's approach by only storing a new prime `i` in the pool if `i*i
@user-pw5do6tu7i
@user-pw5do6tu7i 9 ай бұрын
can't wait to be the smartass who asks the interviewer to specify which dijkstras algorithm I should implement
@Grumpy6
@Grumpy6 9 ай бұрын
One of the most elegant and easy to understand videos dealing with prime numbers. Congratulations!👍
@brian554xx
@brian554xx 9 ай бұрын
For all of these methods, you can advance by two (rather than one) at each step starting at 3. From 3, you check 5, 7, 9, 11, etc., skipping over 4, 6, 8, 10, etc. without even checking them. It's a trivial change to the code and eliminates half of the numbers you would check. Also, for trial division, you don't need to compare each prime to the square root. Just find the square root and start there, working downward in the list of primes. This is fun. I miss this stuff!
@Khwerz
@Khwerz 9 ай бұрын
It's faster than sieve too, but this depends entirely on your CPU and languages array removal features. if its C its probably fast. but the only way sieve is faster is if you have a file FILLED with bools, 0/1, then you just seek to a position of the number and check if its 0 or 1.
@chao.m
@chao.m 9 ай бұрын
Actually, beyond 6, you only need to test the numbers immediately surrounding a multiple of 6, eg, 11, 13, 17, 19, 23, 25, 29, 31… That way, you skip 2/3 of the natural numbers because you have skipped all multiples of 2 and 3. But it will be difficult to incorporate that into SoE or Djikstra’s algorithm
@przemekkobel4874
@przemekkobel4874 9 ай бұрын
Wouldn't that break algorithm's requirement to update list of 'smallest multiples' on each tested number?
@kvetter
@kvetter 8 ай бұрын
I implemented this feature of skipping even candidates--it was very good optimization. On my computer it reduced the time to find all primes under a million from 8.2 seconds to 3.8 seconds. It does break the "smallest multiple" requirement but the fix is to just update smallest multiple entry by 2*prime instead 1*prime.
@levieux1137
@levieux1137 7 ай бұрын
@@przemekkobel4874 yes it does. It still works to skip even values though, because for each equal value there are exactly two updates, so instead you perform your updates by adding twice the base value. But beyond that for now I failed to make it produce all good values (some are missing or some are extra depending on my tests).
@zeus7914
@zeus7914 9 ай бұрын
very nice. i hadnt heard of that method before. thanks for bringing it to light
@ezion67
@ezion67 9 ай бұрын
Your version of the "Sieve" seems to use a whole byte to store a bool instead of just one bit. That is a first huge space saving just up for grabs. The second is to forget about all even numbers other than 2. This can half the space the sieve needs and for trial division half the time. But in the long run Dijkstra still wins.
@Bolpat
@Bolpat 9 ай бұрын
In the sieving algorithm, when setting multiples to false, one can start at the square of the number and then move on in increments of double number. E.g. for 5, one can skip 10, 15, and 20, and set 25, 35, and so on, to false. The first big skip works because the lower multiples are also multiples of smaller primes. The double step works because every second multiple is even and therefore not prime. Only care must be taken to special-case 2, which obviously can't use that.
@AK-vx4dy
@AK-vx4dy 9 ай бұрын
@14:42 Probably true but brave thesis, in today machines it depends, if algorithm is cache friendly it that is true but in other case "space requirement" becames "time requriment" because of slow access time to main memory
@firstnamelastname-oy7es
@firstnamelastname-oy7es 9 ай бұрын
Booleans usually take up a byte in computers due to the way memory is accessed at the byte level, even though they are mostly used to represent true or false values. There's no reason not to use bits instead of booleans if memory efficiency is the main concern of the algorithm. 30 bits fits into 4 bytes, so you would only need to use 4 booleans worth of space instead of 30.
@marlonmarcello
@marlonmarcello 9 ай бұрын
So glad KZbin recommended your channel, instant sub! Great content, clear and concise explanation and lovely visuals.
@raizdesamauma8607
@raizdesamauma8607 9 ай бұрын
I Didn't know about this algorithm altough in college I had already researched a bit about prime algos. Thank you for the great explanation, clear breakdown of the different algorithms and the benchmarks! Great video
@victorferreira5852
@victorferreira5852 8 ай бұрын
Very interesting. I myself am a quantum computing physicist currently writing a paper for a quantum algorithm that we found to handle such prime identification tasks. Even though im very familiar with algorithms like that, i didnt know about this one from Dijkstra, so thats a really nice video!
@indrajitpal63
@indrajitpal63 8 ай бұрын
This is brilliant ! Thanks for sharing 😃
@cthutu
@cthutu 9 ай бұрын
Great video and content. But may I leave some feedback about one of your visualisation. When you were showing the Boolean array for the Sieve method, you used green for true and red for false. As a colour blind person, I struggled to see the difference between which boxes were green and which were blue. I would have suggested, using green for true, but using the original purple colour for false (i.e. not "lit").
@tkwilson71
@tkwilson71 9 ай бұрын
This was an excellent explanation of the different algorithms. Great job, and thank you!
@ingiford175
@ingiford175 9 ай бұрын
Since square roots are hard to find, when I did this, I kept track of the squares of the various number 2 (4), 3 (9), and only started using them when they got to the number in the ().
@keypey8256
@keypey8256 9 ай бұрын
Square roots are on modern cpus as fast as division
@ingiford175
@ingiford175 9 ай бұрын
@@keypey8256 When I did this, some computers still had 8 inch floppy disks.
@Pluki1
@Pluki1 9 ай бұрын
@@keypey8256how do computers calculate square roots and division?
@sayedaulia
@sayedaulia 9 ай бұрын
ingford175's Algorithm
@iofish__
@iofish__ 9 ай бұрын
Really interesting video. I didn't know about Dijkstra's method. As an aside trial division can be made faster by only looking at numbers which are a multiple of 6 plus or minus 1 and then you can skip checking whether the numbers are even or divisible by 3. You also don't need to do real square root calculations, just integer square roots which are a lot cheaper. With the sieve method you could probably save space by using int128s and flipping the bits accordingly rather than storing every true or false as an 8 bit bool. But I like Dijkstra's method the most
@luigidabro
@luigidabro 9 ай бұрын
128bit processors don't exist jet, and when 64x processors deal with larger numbers than their word size, they do some very expensive conversions. So, an usize (32bit on 32x, 64bit on 64x) would probably be faster.
@iofish__
@iofish__ 9 ай бұрын
​@@luigidabro Ah okay I didn't know that. So an int64 would be fastest on a modern computer then
@luigidabro
@luigidabro 9 ай бұрын
@iofish__ better use an unsigned value, as lshift and rshift on signed values is language dependent.
@jalsol
@jalsol 9 ай бұрын
using bitset may be fast, but when combined with segmented sieve (it's the same sieve of Eratosthenes but you sieve one small segment at a time instead of the whole range at once), it turns out worse than just the segmented sieve CPU caching is also very important as well
@JimKnisely
@JimKnisely 9 ай бұрын
Similar to only checking odd numbers greater than 2, when updating the next multiple of an odd prime p, add 2p not p, since one can skip over the even multiples.
@andersama2215
@andersama2215 9 ай бұрын
The reason for using the square root is probably quicker to pick up if you consider how you're checking for primes, if you're testing for multiples (or the divisibility, however you'd like to put it) you'll be looking at two numbers A and B, where A == B or not. If we started a brute force search where A < B, the thing we should notice is that if we were to reach a case where A == B, then by definition the number we're testing can't be prime*. The second thing to notice is like you pointed out that once we continue our brute force search and flip over to where A > B, we're just observing the communicative property of multiplication at work, every A we'd be checking would have been a B we've already seen paired. So we only have to brute force up to a number which when multiplied is less than our original number.
@wooooooo_Oo
@wooooooo_Oo 8 ай бұрын
Great video. Just a note on the color's chosen during the Sieve of Eratosthenes' Boolean Array. I'm red/green colorblind and had a hard time seeing true and false values because the false value was not distinct enough against the background. Another option is the yellows for true wasn't bright enough for me to quickly identify.
@mschoenert
@mschoenert 9 ай бұрын
I believe that the sieve of Eratosthenes with a sliding window is a better middle ground between the straigthforward sieve and trial division. The sieve with a sliding window has a memory footprint that is comparable to Dijkstra's algorithm and a performance that is comparable to the straigthforward sieve. To be more concrete - Dijkstra's algorithm requires an array for the primes and an array of multiples. The sieve with a sliding window requires those and additionally the sliding window. But already with a sliding window of a few kBytes it achieves a performance that is about as fast as the straigthforward sieve (if not faster because it is more cache friendly). Don't get me wrong. I am a huge fan of Dijkstra. I believe that he is one of the most brilliant minds to ever work in computer science. And his "Notes on Structured Programming (EWD 249)" are a seminal paper that brought us structured programming. And this prime generating algorithm is *just* an example that he uses to show how to derive an algorithm in this structured way. He is so brilliant that even his examples are amazingly clever. But from an engineering perspective, I claim that Dijkstra's algorithm is never optimal. If you only need a few hundred primes and runtime is not so important, then trial division is super simple, has a minimal memory footprint, and runs fast enough. But if you need more primes and runtime is important, then the sieve with a sliding window is better than Dijkstra's: it has basically the same memory footprint and runs much faster. How much faster really depends on the number of primes that you want to generate, but for large enough N it can be more than an order of magnitude.
@holyshit922
@holyshit922 9 ай бұрын
When I was learning programming at high school we used this approach It is based on that divisors comes in pairs (for squares these numbers are repated) so it is not necessary to check numbers grater than square root of upper bound of interval
@jhonnywhite1256
@jhonnywhite1256 4 ай бұрын
Hey thx a lot for giving the code in your Patreon, really useful for my study
@cppexplorery
@cppexplorery 8 ай бұрын
This man really does cool things 💪
@dtar380
@dtar380 2 ай бұрын
Math explanation for the sqrt thing, think of any number, well, its square root is just exactly the number that sits in the middle of all multiplications possible to get itself, because its the result of multiplying that number twice, if you increment by the number it wont match the original same if you decrease it
@nexcode_ir
@nexcode_ir 7 ай бұрын
Woww, It was great. Thanks for sharing your precious knowledge
@SamDsk
@SamDsk 8 ай бұрын
When it comes to large primes numbers we use probabilistic approach to determine whether the given number is a prime. They are called probabilistic primality tests like Miller-Rabin algorithm.
@christophergame7977
@christophergame7977 9 ай бұрын
As a child I learnt to pronounce 'sieve' as, using the International Phonetic Alphabet, 'sıv', not as 'siv', nor as si:v. Other uses of the vowel: as in 'kip', not as in 'keep'; as in 'slip', not as in 'sleep'.
@AshishKadam
@AshishKadam 8 ай бұрын
Haven't seen such quality content in years, hats off to you sir! ❤😊
@Melds
@Melds 9 ай бұрын
Interesting topic! Hopefully some feedback is ok. Your video was good, but I think it was a bit wordy in parts. For example, almost six minutes were spent on trial division but there was a lot of repeated parts that could have been simplified and condensed. The square root part also seemed intuitive already from factor tables, so I thought it didn't need much time.
@TerjeMathisen
@TerjeMathisen 9 ай бұрын
If you are comparing algorithmic efficiency: space vs time, then I really think you should extend your sieve with a few very natural extensions: a) Making the array a bit array reduces the memory needed from 8 Mbit to 1 Mbit or 125 KB. b) Just skipping all the even numbers (just remember that 2 was in fact prime) reduces it by another 50% to 61.75 KB, at which point it fits easily in the CPU $L2 cache. c) If we also skip all multiples of 3, then there are only 2 candidate numbers to check in each group of 6, i.e. at this point we only need ~41 MB. d) One final step, removing all multiples of 5, means that we have exactly 8 remaining candidates in each group of 30 numbers, so memory usage is down to N*8/30 bits, which for N=1e6 means just 33 KB. Finally, when crossing out multiples of the current prime P, we only need to start at P*P, and then the following odd multiples, until we get to the (precalculated) sqrt(N) limit.
@mandrak87
@mandrak87 9 ай бұрын
Why not always skip even numbers larger than 2? No need to check them as they can never be prime (divisible by 2). Would save half the checks.
@durgaprasad814
@durgaprasad814 9 ай бұрын
Computer finds a number is even by finding modules of 2.
@mandrak87
@mandrak87 9 ай бұрын
@@durgaprasad814 True but you could use a boolean flag that flips back and forth to skip every other number.
@matthewp1988
@matthewp1988 9 ай бұрын
You still have to check the other multiples to increment them. For example 12 at 12:26
@durgaprasad814
@durgaprasad814 9 ай бұрын
@@mandrak87 yes that operation will be faster. However can we parallelis the division approach
@RyanLopez-z6q
@RyanLopez-z6q Ай бұрын
You can increment by 2 starting at 5
@GiladLeef
@GiladLeef 9 ай бұрын
I didn't know this algo and it seems like a great one. thanks for the video!
@MrJasbur1
@MrJasbur1 9 ай бұрын
Just a minor mention on trial division. We don’t have to explicitly calculate a square root. Instead, if we only go up to the largest prime number that can be squared to get a value still less than or equal to the value being trial divided, we know that’s the last prime number that has to be checked.
@MrJasbur1
@MrJasbur1 9 ай бұрын
Also, for the Sieve of Eratosthenes, there are tricks that use the fact that the perfect squares are just sums of consecutive odd numbers starting with 1. So, like 1=1^2, 1+3=4=2^2, 1+3+5=9=3^3… and the pattern has been proven to continue forever.
@MrJasbur1
@MrJasbur1 9 ай бұрын
Sorry, it’s that fact, plus the fact that you can start crossing out multiples starting with the square of the newly identified prime number in the sieve.
@LynxaaOG
@LynxaaOG 9 ай бұрын
In theory you could reduce the space complexity for the Sieve of Eratosthenes approach by using a bitset instead of booleans.
@Vaaaaadim
@Vaaaaadim 9 ай бұрын
In the course I first learned about the Sieve of Eratosthenes, we were told from the get-go to implement it using a bitvector. Sidenote, technically, the space complexity would be exactly the same. Using a bit instead of a boolean would certainly use 8x less space or more depending on the programming language, but Big O complexity doesn't care about constant factors. I wonder if using a compression algorithm would improve the sieve's space efficiency, I would guess yes.
@LynxaaOG
@LynxaaOG 9 ай бұрын
@@Vaaaaadim Good catch, the space complexity would be the same regardless, but the memory usage would be less with a bit vector. I mixed these two terms up. You could pack bits that are 1 into a structure, but at that point, that's more for storage concerns, not runtime memory performance.
@Vaaaaadim
@Vaaaaadim 9 ай бұрын
@@LynxaaOG What is the difference between storage concerns and runtime memory performance?
@LynxaaOG
@LynxaaOG 9 ай бұрын
@@Vaaaaadim I just commented that in response to you mentioning the possibility of using compression to improve the efficiency, which is typically only applicable if you were to store data on disk, while you could in theory compress the memory during runtime, you'd be adding additional performance costs to achieve that, which would hurt the algorithms performance.
@kmjohnny
@kmjohnny 8 ай бұрын
An elegant presentation of some brilliant algorithm ideas.
@dominiquecolin4716
@dominiquecolin4716 8 ай бұрын
that would be interesting to find a 'green' measure that would take into account the CO2 cost of the data center + time of calculation for each approach. I think we should go for such stuff, because time always seems the most important, since data storage is not perceived as an issue any longer. It was different from 1970 til 2000.
@greensalad_1205
@greensalad_1205 7 ай бұрын
0:29 Trial of Erastosthenes
@nk4g26
@nk4g26 7 ай бұрын
this algorithm is 1624x faster than Dijkstra's algorithm from numpy import arange, ndarray # i use arange because it faster than range and i want to return array def is_prime(start: int, stop: int) -> ndarray: if not start & 1 == 1: # In the binary system, every prime number ends with a 1, so I use a binary comparison operation (( & )) to check if the last digit is not 1. return arange(start+1, stop, 2) # The condition is true if the starting number is even; therefore, I add 1 to make it prime and then step by 2 to yield a prime number each time. return arange(start, stop, 2) for Example: is_prime(2, 10) [3 5 7 9] and this for even def is_even(start: int, stop: int) -> ndarray: if not start & 1 == 0: return arange(start+1, stop, 2) return arange(start, stop, 2)
@gr.4380
@gr.4380 9 ай бұрын
Here's the thing about space efficiency: even though most consumer apps can use more memory for the sake of performance, training AI is very space intensive, so the higher time requirement might be a better tradeoff than more memory used.
@BramCohen
@BramCohen 9 ай бұрын
In practice the memory requirements tend to be either/or. Either the requirements are below the available memory in which case who cares, or they're above it and your algorithm is swapping to disk and if its access patterns are bad enough it will finish long after you're dead.
@simplefahrenheit4318
@simplefahrenheit4318 9 ай бұрын
Does the sieve of erasthosthanes (in modern) use units place being 1 , 3, 7 and 9 to create the initial boolean array because above 5 those are the only places of prime numbers
@karreevn9085
@karreevn9085 9 ай бұрын
This kind of video is extremely useful Hope you'll have more videos about argorihm :))
@thevalarauka101
@thevalarauka101 9 ай бұрын
"so instead we need to increment our smallest multiple" and suddenly my brain just goes like, oh yeah, that makes sense, that's why this algorithm is better
@911Salvage
@911Salvage 5 ай бұрын
Dijkstra's algo is pretty easy to code. One of the very first assignments we got in data structures and algorithms course was to code Dijkstra's prime number algorithm in C or Java.
@KRYPTOS_K5
@KRYPTOS_K5 5 ай бұрын
! PRIMES ! Brasil. R.H.Hodara ! To US Mathcircle ! Prime? Euler Prime? ! RFO!Basic! Android Native Offline fn.def isEuler(x) y=x^2-x+41 IsEuler=y Fn.rtn IsEuler Fn.end fn.def IsPrime(n) IF n
@na50r24
@na50r24 5 ай бұрын
No idea if this is outdated but I'm prepping for a Datastructures and Algorithms exams and instead of using Modulo for the Sieve of Erast, you could instead do a nested loop use multiplication I'm more used to C, so I write it in C I don't think the Python one will be too different. I use 0 for False and 1 for True. C has a standard library for booleans but that's just a guy doing #define false 0 #define true 1 So yeah, I got used to writing it explicitly. for(int i = 2; i
@KRYPTOS_K5
@KRYPTOS_K5 5 ай бұрын
This is 3 times faster. ! PRIMES ! Brasil. R.H.Hodara ! To US Mathcircle ! Prime? Euler Prime? ! RFO!Basic! Android Native Offline fn.def isEuler(x) y=x^2-x+41 IsEuler=y Fn.rtn IsEuler Fn.end fn.def IsPrime(n) IF n
@Jetpans
@Jetpans 9 ай бұрын
Would be cool to see trial division with the optimization that all primes except "2" are of form 6n-1 or 6n+1. Because that seems to me like the only algorithm that could methodicaly calculate "the largest prime number".
@solqr6521
@solqr6521 9 ай бұрын
How exactly are you planning to use that as an optimization? The first steps of the trail division algo test module 2 and 3. If the number passed that point, then it must of the form 6n-1 or 6n+1.
@Jetpans
@Jetpans 9 ай бұрын
@@solqr6521 you can be checking ONLY the numbers of form 6n+-1. And skip all that are not this form. It is only a linear optimization but I think it is pretty big. For example, instead of checking ALL the numbers 40-60, you only need to check 41 43 47 49 53 55 and 59. That comes out to a third of total checks.
@aboodahmad8236
@aboodahmad8236 9 ай бұрын
You know it's simple number theory proplem? P≡k(mod6) As P is prime, k have prop(1,2,3,4,5,6) Can't be 2 or 4 or 6 because P will be divisible by 2 (p is not even) and in 3 case it's divisible by 3 (p is not divisible by 3) so only cases is (1,5) which is equivalent to (1,-1) This formula not have big impact on mathematics.
@programmertheory
@programmertheory 9 ай бұрын
I wonder. Were these algorithms optimized in any way? For example, the trial division method can be optimized for numbers greater than 3 by utilizing the fact all primes > 3 take the form of 6k +-1. Meaning assuming n > 3, we can put 2 and 3 in the primes list and then iterate over 5,7,11,13...n and not add the numbers in the list that are not prime, like 25, which is 6(4) + 1. This does improve the trial method by quite a bit.
@MusicEngineeer
@MusicEngineeer 9 ай бұрын
Very interesting. I didn't know this algorithm. What is the actual space- and time-complexity of the 3 algorithms in terms of Big-O?
@minamagdy4126
@minamagdy4126 9 ай бұрын
For an upper limit of n and number of primes p, the space is O(p) over the length of the output (itself O(p)), while I believe the time complexity to be O(n log(p)) with a modified min-heap (the modification is to allow getting all nodes of a value simultaneously or sequeentially, as all nodes of a value are to be raised upon coming to said value. The value will be the multiple of the prime in the tuple).
@ader7836
@ader7836 9 ай бұрын
Amazing video, great explanations!
@stephenshipley1066
@stephenshipley1066 9 ай бұрын
Very interesting but why does trhe sieve take u p so much space if it's a Boolean array? If this is composed of bits it would surely be smaller for the 1-30 example. I suspect that this is calculated using an integer array so most of the bits are wasted. Yes, bitwise operation would take longer but the Sieve of Eratosthenes has time to spare.
@lavneetjanagal
@lavneetjanagal 9 ай бұрын
Awesome video. Thank you.
@rursus8354
@rursus8354 9 ай бұрын
I think using Python as a test is misleading about the advantage. I get it that Dijkstra doesn't use division nor multiplication, which is a vast improvement on the age old computers that Dijkstra used. Iff my observation is true - it might be wrong - then his improvement will really show up when the algorithm is implemented in C *_as is customary_* when programming algorithms and integrating them into Python. Python is actually a very bad language for everything except fast prototyping. It is exceedingly slow and inefficient, but nevertheless it is used for numeric calculation since it is so easy and well-documented to integrate foreign C code.
@zsoltory4176
@zsoltory4176 8 ай бұрын
A much shorter proof is the followin: If a natural number n is NOT prime, it must be divisible into two factors n=f1*f2. If both factors f1>sqareroot(n) & f2>sqareroot(n) we get the contratiction n=squareroot(n)*squareroot(n) < f1*f2 =n. So if n is not prime, its smallest divisor must be smaller than squareroot(n). If there is no such divisor, n must be prime. (sorry for my lousy Engish).
@andypyne
@andypyne 9 ай бұрын
I like the Miller Rabin Probabilistic Primality test. That's super quick, with the caveat that it's not calculating Primes, but is very accurate at finding candidates - but for any desktop method if finding Primes, Miller Rabin is faster and as accurate. Miller Rabin breaks down at numbers higher than a desktop computer can easily calculate anyway
@b001
@b001 9 ай бұрын
Interesting, I’ve never heard of it. I’ll look into it. Thanks for sharing!
@kookookookoo4244
@kookookookoo4244 9 ай бұрын
3:39 a bit confused 56 /2 = 28 :)
@b001
@b001 9 ай бұрын
Oops, blooper! Good catch 😂
@DeathSugar
@DeathSugar 9 ай бұрын
There's more space efficient way to find primes similar to Dijkstra's algorithm called Pritchard sieve. It should use less memory than Dijkstra and has close to Eratosthenes's sieve performance.
@janhornbllhansen4903
@janhornbllhansen4903 8 ай бұрын
Nice presentation, but discussion seems a little off. I'd be interested to know when these algorithms break due to time/space requirements. Trial division will run and output primes until it runs out of memory, while sieve won't even start if you set limit sufficiently high.
@satviksrinivas8764
@satviksrinivas8764 9 ай бұрын
Great video, thank you!
@YouHaventSeenMeRight
@YouHaventSeenMeRight 9 ай бұрын
Since the pool also maintains a list of found primes (together with the multiples), why do we need an additional primes list?
@HarryC4615
@HarryC4615 9 ай бұрын
Hi b001! Your videos are really well made and clear. I wondered what software/tools do you use to create your videos and the animations. Keep up the good work.
@CrapE_DM
@CrapE_DM 9 ай бұрын
This is the first time I've seen anyone apply the square root stopping point to the Sieve. Good to see it
@forgotabhi
@forgotabhi 9 ай бұрын
I feel like I have found a goldmine! Loving your videos. Also what theme are you using?
@MI08SK
@MI08SK 8 ай бұрын
It is possible to optimise the sieve of aritosta... implementation by storing booleans in one bit each. reducing memory by 8x
@suhedheglobdefne6739
@suhedheglobdefne6739 9 ай бұрын
I believe memory size measures are wrong, or memory was wasted in test of sieve of Eratosthenes. Million bits is around 122 Kbytes or 0.120 Mbytes. The size of used memory of Trial division is more than four times of that. My theory that python can't make byte array, only array of numbers and mask it as byte array. Calculations: Size difference between tests is 8 Mbytes. Suppose that it is size of "byte array". Since length of array is million, size of one element is 8 bytes. Exactly 64 bit integer.
@InforSpirit
@InforSpirit 9 ай бұрын
Yes, reason is because Python True and False are memory pointer reference and that pointer number is saved in list. Python. numpy.ones(n, dtype="bool") is better way in space and time.
@sushismitcher225
@sushismitcher225 9 ай бұрын
This is really cool!! Thank you for your very good videos, you always make really cool videos.
@beepbop6697
@beepbop6697 9 ай бұрын
All of these algorithms can be improved by just incrementing by two to skip all the even numbers. 2 being the only special case prime number that is even (all other primes are always odd).
@Idan_Nesimov
@Idan_Nesimov 9 ай бұрын
Amazing video, well done
@zyklos229
@zyklos229 9 ай бұрын
That's the sieve, but with minimum look-ahead, which saves space.
@bennettzug
@bennettzug 8 ай бұрын
14:20 I was curious and the time complexity of the sieve of eratosthenes is O(n log log n), which i think is the first algorithm i've seen like that lol
@CheezePie
@CheezePie 9 ай бұрын
This is so informative! I'll try to implement Dijkatra's approach in Go.. keep making such awesome content ❤
@swedishpsychopath8795
@swedishpsychopath8795 9 ай бұрын
Thank you Norway for inspireing Dijkstra to do this.
@b43xoit
@b43xoit 9 ай бұрын
How?
@AlfW
@AlfW Ай бұрын
9:45 Actually, the Sieve of Eratosthenes isn't even using multiplication, it is only using addition.
@platypi_otbs
@platypi_otbs 9 ай бұрын
Dijkstra has been my favorite mathematician since 1994. Just brilliant.
@_fudgepop01
@_fudgepop01 9 ай бұрын
I think when it comes to embedded systems the dijkstra’s prime algorithm is the best one When space AND time are both important, with the relative limitations possible within embedded hardware, it would be the best with both of its trade offs …that said, at that point you might as well just calculate the list of primes externally or with intermediate steps depending on how far you need to go and just encode that as a lookup table lol
@sergejnekrasov7688
@sergejnekrasov7688 9 ай бұрын
For any human resource machine friends: Dijkstra's method becomes slower if your ALU has no mutliplication hardware :(
@uselesscommon7761
@uselesscommon7761 9 ай бұрын
Okay woah this is the type of content that makes me sub immediately
@asandax6
@asandax6 9 ай бұрын
Trial division is used as a prime test for random numbers. So if you are not generating primes and only need to know if the number is prime or not Trial division is more faster and efficient than prime sieve.
@Axman6
@Axman6 9 ай бұрын
Did your implementation of Dijkstra's (prime) algorithm use a min-priority queue to find the minimum multiple? I'd be shocked if Dijkstra weren't using one, given Dijkstra's Algorithm well known use of min-priority queues, and that might improve the runtime of your algorithm, with a little memory overhead.
@mschoenert
@mschoenert 9 ай бұрын
Dijkstra's original algorithm didn't use a min-priority queue / heap. Using just a vector instead of the min prio heap is faster, because the O(log) cost of updating the multiples are more expensive than what you save by not scanning for a divisor. And that is because scanning for a divisor is actually cheap - because for most candidates one of the early primes will already divide.
@sentinelaenow4576
@sentinelaenow4576 9 ай бұрын
I've found Miller-Rabin primality test to be extremely efficient, even for very large numbers, like 20 digit numbers.
@CallOfCutie69
@CallOfCutie69 8 ай бұрын
This is the moment Walt became Heisenberg Prime.
@pl_asma
@pl_asma 9 ай бұрын
your videos are extremely well made, you should be getting at least 200k views on these videos i also prefer the sieve of eratosthenes, really fast and who cares abt space tbf
@manifestasisanubari
@manifestasisanubari 9 ай бұрын
I agree that his videos are very well made and deserve more views. Just commenting this to help his video ❤
@mattsadventureswithart5764
@mattsadventureswithart5764 9 ай бұрын
I don't think anyone cares about space any more, but in the days of most people having access to nothing better than 8bit micros, it was v important.
@vectorlua8081
@vectorlua8081 9 ай бұрын
When your checking all primes up to 4 billion it matters, a lot.
@mschoenert
@mschoenert 9 ай бұрын
@@vectorlua8081 - quite so. As others have remarked, this can be solved by using a segmented sieve (aka sieve with sliding window). My implementation computes the ~200'000'000 primes up to 4 billion in roughly eight seconds on my (not very fast) desktop.
@vectorlua8081
@vectorlua8081 9 ай бұрын
@@mschoenert Nice, what language are you using? The reason I remarked on the size requirement is that not everyone has 32 GB or even 24 GB of RAM, eventually the computer would start using the swap memory and then things start getting quite slow.
@anon_y_mousse
@anon_y_mousse 9 ай бұрын
I wonder if we could speed up Dijkstra's algorithm by using a multiway hash table where the primes and the multiples are both used as keys that link to each other. Could index the multiple table by a combination of the prime and its multiple to avoid conflicts where two primes have the same multiple. My usual method of using a table size that's a power of two would fit neatly here because we want to avoid division.
@dtar380
@dtar380 2 ай бұрын
May we get the code for the Dijkstra's algo, mine is using 16 times more space than SoE, speed wise is alr, its in between SoE and TD, but im racking my brain with the memory
@tododiaissobicho
@tododiaissobicho 5 ай бұрын
Djikstra was a legend, that's a really cool algorithm
@snoopy1alpha
@snoopy1alpha 9 ай бұрын
Thanks for the video. That algorithm is great. In my opinion it is superior to both more "classic" solutions because it allows for bigger integer representations in other languages than Python (I don't know much about Python, but I would guess that it uses something like BigInteger for numbers exceeding "long"). In a language like Java or Kotlin your array size is limited to (2^31) - 1 (int). This is a very big number (a bit more than 2.1 billion), but you cannot go beyond that limit with the sieve approach (or you will consume a lot more memory for that). Arrays with "long" indexes are not supported (up to (2^63) - 1). Dijkstra's approach does not use arrays (or at least I would not implement it with an array). But even if you would use an array it would require less entries and you could explore the long range. I will try to code this in Kotlin without looking at your Python code (as an exercise). I'm not sure how far I can explore with it, but it will be an interesting challenge. When using the sieve with Java or Kotlin, you could use a BitSet instead of a boolean array. You are still limited to int indexes, but the space requirement is far less, because it internally maps the bits to single bits inside of long numbers inside of an array-like structure. If you explore the whole int range, it would require around 260MB instead of the > 8G you used for 1 billion in your Python program. Maybe Python has a similar data structure? It would be worth checking.
@snoopy1alpha
@snoopy1alpha 9 ай бұрын
Small update from my side (I didn't even get to the Dijkstra prime variant yet): I first implemented the classical two algorithms in Kotlin. For some reason your trial division algorithm in Python is way more efficient than mine in Kotlin (space-wise and especially time-wise). In fact, I actually had more space problems with my trial division algorithm than I had with my sieve algorithm. The trial division algorithm requires you to store your result in a data structure that makes them easily accessible. I first tried a list but that was a really bad decision. In the JVM (Kotlin runs on it), lists of primitive types require you to box them as objects which heavily increases the memory consumption of every single number. So I decided to use an array of primitive integers. Arrays in the JVM require to have a fixed length. But I cannot know the exact length of the array. I could calculate an approximation using the prime number theorem. However ... the formula x/ln(x) gives you a lower bound, but I require an upper bound. The "other formula" Li(x) is beyond my ability to code in Kotlin (or any other language ... actually I cannot calculate it at all because I lack the math knowledge required to do so). So I decided to use an array of length "limit" (which is the upper bound of the range you check for primes). In doing that, I waste a lot of memory. For some reason my trial division variant is way slower than yours. Your example (limit 1M) took 3 seconds, mine took almost 9. At the limit of 5M yours took 25 seconds, mine took almost 3 minutes. I haven't checked your Python code yet, but I certainly will do so when I'm done with the third algorithm. Most likely I'm doing something wrong. Now to the sieve algorithm. I tried my BitSet-approach and it was way more successful than I initially thought. For Int.MAX_VALUE as limit (2^31 -1 ... or about 2.1 billion) my algorithm took 18.2 seconds. I cannot check the size of my data structure as easily in the JVM than you can in Python. However, I limited the maximum JVM size to 260 MB and it worked. If I go down to 256, an exception is thrown when initializing the BitSet. When trying to run the trial division algorithm with that max heap size, it runs into the exception immediately. With some shenanigans, I managed to use the BitSet approach with the trial division algorithm. However, this algorithm requires to iterate over the primes found so far. The BitSet is slower for this purpose (compared to iterating over an array of primes found so far), resulting in an even slower runtime for trial division. I am still planning on implementing the Dijkstra variant, but I don't think I can outperform the sieve algorithm with BitSet as data structure. I have to thank you again for this challenge. You really got me hooked :D
@snoopy1alpha
@snoopy1alpha 9 ай бұрын
Second update from my side: I figured out my performance issue with the trial division algorithm. I messed up the square root condition for terminating the intermediate steps. I always tested all primes found so far instead of stopping at the square root of the current number. After fixing that my algorithm outperforms yours time-wise (I might have a better CPU and/or Python might be slower in general than Kotlin@JVM). I also optimized the trial division by checking only the odd numbers and leaving out the even ones entirely. In doing so, the trial division variant does pretty well (e.g., limit of 10M in 13 seconds). I also solved the space problem I had by actually figuring out how to approximate the pi-Function (exact number of primes up to a limit) with the logarithmic integral function (Li(x)). This worked out exceptionally well. Now I'm using up to around 460 MB for a limit of 2.1 billion (Int.MAX_VALUE or 2^31-1) instead of 8 GB. My BitSet variant of the sieve approach outperforms the trial division both in time and in space complexity. As mentioned before, it uses about 260 MB for its data structure. Now the elephant in the room: My implementation of the dijkstra prime algorithm. My first try was very underwhelming. It was way slower than my highly optimized trial division, while using up 3 times the space. I used two arrays, one for the primes found and another one for the multiples. The second array hat to be long numbers instead of int, doubling the size of the array. I certainly did something wrong. I analyzed what I did and figured out that retrieving all pairs of prime and current multiple of the minimal multiple is the expensive operation. My approach was linear in complexity resulting in this fiasco. I tried to optimize the linear search (finding a clever termination condition), but failed. Then I decided to try a better data structure for the "pool". I did two things. First: I don't need all primes in the pool, only those smaller than the sqrt of the limit. This significantly reduces the size of the pool data structure, independently of which one I choose. Even the multiples didn't have to be in the long range anymore. For tracking the primes, I am now using the same data structure as for the trial division approach (with the same way of predetermining the array size). For the pool I created a class containing a prime and its current multiple. Instead of an array, I put the objects of this class into a priority queue (internally it is a heap) that is automatically sorted by the value of multiples. When adding a new prime I put it into the pool if the prime is smaller than the square root of the limit. When updating a minimal multiple, I remove all minimals from the priority queue, increase their multiple value and put them back into the priority queue. When implementing it like that, I still was slightly disappointed at first. It was still outperformed by the trial division for fairly huge limit values. When I tried one billion as a limit the Dijkstra variant was finally significantly faster than the trial division. Either my trial division optimizations have been "too good" or the priority queue approach still has potential for improving. As conclusion for my findings I want to summarize my three implementations by showing you how they perform when trying to solve the limit of 2.1 billion. - Trial division requires 460 MB of space and around 21 minutes of runtime. - Sieve of Eratosthenes requires 260 MB of space and around 18 seconds of runtime. - Dijkstra prime requires 460 MB (primes) + 115 KB (priority queue) of space and around 8 minutes of runtime. It only uses slightly more memory than trial division but outperforms it for bigger limits by a lot. All three have one "expensive" operation or "flaw" that should be avoided. - Trial division suffers from an almost quadratic complexity and is still pretty slow, even when optimized - The BitSet structure, I used for sieve, has a speed disadvantage when retrieving all primes. You do not iterate over the primes but over all numbers and evaluate the associated bits for deciding which numbers are prime. That was the reason why it is inferior for being used in the trial division algorithm (see first update comment). - The Dijkstra approach can only be as good as the strategy used for maintaining the minimal multiple values. In a nutshell: the space-optimized variant of the sieve of Eratosthenes outperforms both the trial division approach and the Dijkstra approach both in space and time complexity and in my opinion may truthfully be called inferior to both... even if this counters the core message of this wonderful video. The Dijkstra approach is interesting but sadly of limited use. I guess now I will have a look at your Python code 😀 Addition: I just looked at your Python code. You also used a heap/priority queue for the pool. You could optimize the heap size by stopping adding primes after the primes exceed the sqrt of your limit. Did you try higher limits? I wonder if your Dijkstra implementation is still close to sieve if you exceed 1 billion as limit (Spoiler: It isn't. For 1 billion, sieve took around 80 seconds and Dijkstra exceeded my patience 😀)
@AK-vx4dy
@AK-vx4dy 9 ай бұрын
@13:55 I have some doubts, Dijkstra method should be much faster, slower than erastones (but i won't bet, it is not cache friendly) have simiilar operations, don't have suqare root and division, so speed should be in simmilar ballpark. I don't know python much so i guess problem lies in heappoll poping and pushing
@rutwikroutu
@rutwikroutu 9 ай бұрын
Which microphone do you use ?
@hulakdar
@hulakdar 9 ай бұрын
the size requirement of sieve for 5 million should be around 625kb, not 8mb, if you use a bitmap, and not array of bools, so this is probably some python inefficiency
@eclipse6859
@eclipse6859 9 ай бұрын
I recently had to take a coding test for a potential internship. I had a question that asked me about factors of any given number. I couldn't figure out how to do it so I just i++ my way up. Unfortunately there was a runtime limit of 4 seconds, which failed the attempts for big numbers, and I never got the question. This would've been very helpful a couple weeks ago.
@Schnorzel1337
@Schnorzel1337 9 ай бұрын
Factorization is a different problem tho and can not be done quickly. Best approach is start with your number N and start with your first prime p1 and divine N until p1 stops dividing N. Repeat for prime p2,p3 until N becomes one. So 3260 will result in: [2] : 1630 [2,2]: 815 [2,2,5]: 163 [2,2,5,163] : 1 See the gap between 5 and 163? That is why prime factorization is slow. Now for any prime N you would generate all primes from 2 -> sqrt(N) checking all of them while none divide N.
@MaloHombre84
@MaloHombre84 9 ай бұрын
In Dijkstras approach, why didn't you increment the multiples of 2 and 3 after adding 11 to the pool?
@PatrickPoet
@PatrickPoet 8 ай бұрын
fun video, thanks, I love stuff like this. your pronunciation of sieve isn't the British or American pronunciation, where is it from?
@deatho0ne587
@deatho0ne587 9 ай бұрын
Sqrt is normally slower than division also, which was part of Dijkstra's version also.
@ItsJoeyG
@ItsJoeyG 9 ай бұрын
I love watching your videos then implementing them into code (without first looking at your example of course) much more fun than leetcode imo since you do a great job at explaining each algorithm. Well earned sub. Thank you :D
The Algorithm Behind Spell Checkers
13:02
b001
Рет қаралды 418 М.
Увеличили моцареллу для @Lorenzo.bagnati
00:48
Кушать Хочу
Рет қаралды 8 МЛН
What type of pedestrian are you?😄 #tiktok #elsarca
00:28
Elsa Arca
Рет қаралды 32 МЛН
Players push long pins through a cardboard box attempting to pop the balloon!
00:31
How on Earth does ^.?$|^(..+?)\1+$ produce primes?
18:37
Stand-up Maths
Рет қаралды 414 М.
New divisibility rule! (30,000 of them)
26:51
Stand-up Maths
Рет қаралды 286 М.
5 Useful F-String Tricks In Python
10:02
Indently
Рет қаралды 329 М.
How to Win Snake: The UNKILLABLE Snake AI
17:05
AlphaPhoenix
Рет қаралды 2,2 МЛН
An Exact Formula for the Primes: Willans' Formula
14:47
Eric Rowland
Рет қаралды 1,4 МЛН
The hidden beauty of the A* algorithm
19:22
Polylog
Рет қаралды 898 М.
THIS is Why List Comprehension is SO Efficient!
5:25
b001
Рет қаралды 173 М.
The Most Useful Curve in Mathematics [Logarithms]
23:43
Welch Labs
Рет қаралды 351 М.
The Boundary of Computation
12:59
Mutual Information
Рет қаралды 1 МЛН
Увеличили моцареллу для @Lorenzo.bagnati
00:48
Кушать Хочу
Рет қаралды 8 МЛН