Are Lists Evil? Creator Of C++ | Prime Reacts

109,018 views

ThePrimeTime

1 day ago

Comments: 350
@michaelthompson7217 1 year ago
Bjarne Stroustrup is one of the best engineers to emulate. He always comes across as a guy focused on solving problems, not on having pointless opinions.
@youtubeenjoyer1743 1 year ago
very subtle
@georgeokello8620 1 year ago
And somehow he has contributed to C++ being the most bloated language, by providing solutions to problems that C++ itself created.
@anon1963 1 year ago
@@georgeokello8620 C++ is not bloated; you include what you want to use. It's that simple.
@shlokbhakta2893 1 year ago
@@georgeokello8620 just dont use it then lol
@101Mant 1 year ago
@@shlokbhakta2893 you don't always get a choice of what you work on. I did C++ for several years. It's one of those interesting cases where lots of individual decisions make sense but the final result has problems. You end up with lots of features you don't want, and without things other languages have had until years later.
@polyglotusamericanus4163 1 year ago
Big O notation is based on limits in calculus. In theory O(1) is always faster than O(n) but only as n approaches infinity. This is why built in sort functions in well implemented programming languages will often select an algorithm based on the number of elements you pass in.
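For illustration, a minimal C++ sketch of that size-based switch. The cutoff of 16 and the function name are made up, but real library sorts (e.g. introsort) fall back to insertion sort below a similarly small limit, because its tiny constants beat the asymptotics there:

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Hypothetical hybrid sort: threshold of 16 is illustrative only.
template <typename It>
void hybrid_sort(It first, It last) {
    if (last - first <= 16) {
        for (It i = first; i != last; ++i)                 // insertion sort
            std::rotate(std::upper_bound(first, i, *i), i, i + 1);
    } else {
        std::sort(first, last);                            // O(n log n) path
    }
}

int main() {
    std::vector<int> v = {5, 3, 9, 1, 4};
    hybrid_sort(v.begin(), v.end());
    for (int x : v) std::printf("%d ", x);                 // 1 3 4 5 9
}
```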
@simonhrabec9973 1 year ago
Thank you for stating what the O notation actually is.
@chepossofare 1 year ago
This. For small arrays (I mean, like 10 elements) an insertion sort (hell, even a bubble sort) can be WAAAY faster than a qsort().
@isodoubIet 1 year ago
"In theory O(1) is always faster than O(n) but only as n approaches infinity." That's not right either.
@bigpest 1 year ago
@@isodoubIet How so? I’m guessing we’re gonna say constant-time with a large constant can be more expensive than linear-time with a tiny constant?
@isodoubIet 1 year ago
​@@bigpest That particular statement is inaccurate in a couple ways. First, everything that's O(1) is also O(n), or O(n²) etc, just the bound isn't tight. That means O(1) is not necessarily faster than O(n) for any n. Maybe that's pedantic, so let's switch to tight bounds throughout (I'll use Θ instead of O, for clarity). Secondly, we use the limit notation to indicate the asymptotic behavior as n "approaches infinity" but it's important to understand that what this actually means is that there's some _finite_ value at which Θ(1) is faster than Θ(n). We're not dealing with some weird mathematical statement that's only valid exactly on the limit, like 0.9999... = 1. Given a _large enough_ n, Θ(1) will always beat Θ(n).
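A toy benchmark makes the "large enough n" point concrete. This sketch compares a Θ(n) linear scan against an (amortized) Θ(1) hash-set lookup; for small n the scan's tiny constant usually wins, and the crossover point it prints is entirely machine-dependent:

```cpp
#include <algorithm>
#include <chrono>
#include <cstdio>
#include <unordered_set>
#include <vector>

// Toy crossover demo: treat the numbers as a sketch, not a measurement.
int main() {
    for (int n : {8, 64, 4096, 1 << 20}) {
        std::vector<int> v(n);
        for (int i = 0; i < n; ++i) v[i] = i;
        std::unordered_set<int> s(v.begin(), v.end());

        long long found = 0;
        auto t0 = std::chrono::steady_clock::now();
        for (int k = 0; k < 1000; ++k)                      // Θ(n) scans
            found += std::find(v.begin(), v.end(), n - 1) != v.end();
        auto t1 = std::chrono::steady_clock::now();
        for (int k = 0; k < 1000; ++k)                      // Θ(1) lookups
            found += s.count(n - 1);
        auto t2 = std::chrono::steady_clock::now();

        auto us = [](auto a, auto b) {
            return std::chrono::duration_cast<std::chrono::microseconds>(b - a).count();
        };
        std::printf("n=%7d scan=%6lldus hash=%6lldus (found=%lld)\n",
                    n, (long long)us(t0, t1), (long long)us(t1, t2), found);
    }
}
```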
@anonymouscommentator 1 year ago
I'm surprised to see how many people miss the fact that big O notation is about scalability of an algorithm and not absolute performance
@khatdubell 1 year ago
This. I said in my own comment that using big O for speed is like measuring the number of lanes on a road to tell you how fast the cars driving on it are going.
@stilldreamy5181 1 year ago
In very high level languages, all bets are off. Just use whatever data structure is most convenient until you encounter performance issues. Then measure to find out which part of the code is the culprit. Then repeat the measure, tweak, measure cycle.
@ゾカリクゾ 1 year ago
fax
@chris52000 1 year ago
Exactly! I’d rather use a set to get the unique elements in a small collection even if slower as it’s more expressive. If it turns into a bottleneck then optimize.
@williamdrum9899 1 year ago
I agree. When I do anything in more modern architectures I don't even think about clock cycles. On 8 bit micros though? Time to whip out the opcode chart
@zoeherriot 1 year ago
It's cache misses that cause the slowdown. The list will perform badly because it's generating cache misses all over the place. The array is guaranteed to be contiguous memory, so you don't have to go to main memory as frequently. This is the meaning of compact and predictable access of memory.
@OMGclueless 1 year ago
It's not always the cache misses. There are also pipeline stalls from dependent loads: the next address to fetch in the list is in the data you are currently fetching so even with speculative execution those loads have to be entirely serial. Cache misses are indeed overwhelmingly likely to be the culprit for performance issues and hence it's a good rule of thumb to optimize for them first, but they aren't the only effect.
@zoeherriot 1 year ago
@@OMGclueless that's a great point. Thanks for the clarification.
@EmreHepsag 1 year ago
@@zoeherriot cache misses take place for arrays, too. However, there are prefetchers that guess the next element and prefetch it to make it a cache hit. Also, when there is a cache miss, the cache fetches a whole cache line of data, usually 64 B. So it brings in several elements of the array, which again reduces the number of cache misses.
@zoeherriot 1 year ago
@@EmreHepsag yeah, I get that. My comment was because Bjarne and Scott Meyers both did talks where they explicitly covered this topic, and the point was: almost always choose arrays over linked lists. I wasn't trying to get into the weeds, I just assumed everyone had seen those talks.
@Spartan322 1 year ago
To be fair, this also presumes that you're iterating the list; if you aren't, lists aren't as bad. But in most cases, when you make a container of elements that isn't something like a map/dictionary, you're usually iterating it, hence why lists tend not to be a great container to resort to.
@Zekian 1 year ago
One does not add constants to big O notation, because it is not intended to indicate performance, which varies between processors and data. Instead, big O notation is an indication of how the performance scales with regard to input, agnostic to the processor and data. It's possible for an O(n*log(n)) algorithm to perform significantly worse than an O(n^2) counterpart on small inputs and significantly better on larger inputs. When optimizing, benchmark for your specific data and platform. The reason why linked lists generally don't perform well is cache misses and the speed of main memory vs CPU cache. So while it might be a single step in the algorithm, the CPU will be idle for thousands of cycles as it waits for main memory to return the value. Predictable memory lookup (such as with a vector/array) often leads to the CPU prefetching the data before it actually needs to be read.
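A minimal benchmark of exactly this effect. The timings it prints vary by machine, but the vector traversal typically wins by a wide margin because the prefetcher can stream contiguous memory, while each list node load depends on the previous one:

```cpp
#include <chrono>
#include <cstdio>
#include <list>
#include <numeric>
#include <vector>

int main() {
    constexpr int N = 1'000'000;
    std::vector<int> vec(N, 1);   // contiguous storage
    std::list<int> lst(N, 1);     // one heap node per element

    auto time_sum = [](auto const& c, const char* name) {
        auto t0 = std::chrono::steady_clock::now();
        long long sum = std::accumulate(c.begin(), c.end(), 0LL);
        auto t1 = std::chrono::steady_clock::now();
        std::printf("%s: sum=%lld in %lld us\n", name, sum,
            (long long)std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count());
    };
    time_sum(vec, "vector");
    time_sum(lst, "list");
}
```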
@MrAbrazildo 1 year ago
11:30, during my measurements of "if-block vs switch vs array (fixed-size) vs bitfields", across different configurations of these data structures, I found that on average this is their increasing order of speed. But I also found that some switches are slower than ifs, and others beat the "lesser" array configurations, just as the faster array configurations beat some "slower" bitfields.
@ea_naseer 1 year ago
compiler magic sir
@MrAbrazildo 1 year ago
@@ea_naseer Compilers differ too. GCC's performance is more unpredictable than Clang's. I think that's due to it making more use of jumps.
@RunTheTape 1 year ago
It also depends on the architecture. For example, if you're using slow-ass RAM, such as SPI on embedded devices, you'll get wildly different results with the same code as opposed to running it on a PC or an IBM POWER RISC. Always fish for some results early. Benchmark for your needed dataset size.
@MrDarkoiV 1 year ago
Bjarne is the epitome of "pragmatism trumps all". That's probably why C++ is how it is, yet still one of the most widely used languages.
@Average-Lizard 1 year ago
Interesting. I was just doing tests like this in C# yesterday morning... I found if you're inserting multiple 100s of thousands or more elements near or at the front, it's worth converting a list to a linked list for the insertions, and then back. This gets better as the original size of the list and number of insertions increases. I'm not sure I've ever seen a scenario in production for this kind of thing... but if it comes up I'll be ready for the rare time a LL will be worth it! lol (and that's only if the memory isn't a concern for the app at that point)
@ErazerPT 1 year ago
Not sure about the current state, but List used to be just a cute wrapper around T[]. If it still is, your scenario would make perfect sense, as the container array has to be allocated > copied > disposed every time you exceed its size with the insertions. And obviously, an insert (or removal) near/at the beginning needs A LOT more element shifting than the same operation at the end. Should be easy to test. Allocate a List, then start inserting items one by one at the end, timing each insertion. It should be (near) constant until you hit the limit, then there will be an "outlier" before it goes back to the same (near) constant on the next insertion, with the "outliers" appearing at regular intervals because of the allocate > copy cycle. But C# is a really nasty thing when it comes to optimization. For example, [y*stride+x], [y][x] and [y,x] should be pretty close, but they are anything but. And if you pick the best one (for the moment), you might get surprised if you later change your CLR and it becomes worse than before. Or to put it simply, if performance is critical, C# is not it; write a dll/so in C++/C/something predictable with the performance-critical parts in it. If you can live with sub-optimal, by all means, let it all live in C# (for example, [][] is not always the best, but it is never the worst ;) ).
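The experiment described above translates directly to C++, where std::vector grows the same way (the growth factor and exact capacities are implementation-defined). A sketch that makes the reallocation "outliers" visible:

```cpp
#include <chrono>
#include <cstdio>
#include <vector>

// Append one element at a time and report each capacity jump: every jump
// is an allocate -> copy/move -> free cycle, the "outlier" in the timings.
int main() {
    std::vector<int> v;
    std::size_t last_cap = v.capacity();
    for (int i = 0; i < 1'000'000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_cap) {
            std::printf("grew at size %d: capacity %zu -> %zu\n",
                        i + 1, last_cap, v.capacity());
            last_cap = v.capacity();
        }
    }
}
```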
@Average-Lizard 1 year ago
Ya, if performance was truly critical, we'd probably want some sort of optimized/custom initializing and resizing logic instead of what C# provides with its array wrapper for List, like you said. But as I mentioned, I haven't even needed to stray from List in any project for performance reasons, as the C# time hasn't come close to network times. C# actually just released a bunch of performance upgrades with .NET 8 today! Still not the language you choose for hardcore performance, but there are some nice improvements for JIT and AOT compilation. This was just a fun, quick experiment (similar to what The Primeagen was writing) to see if I should have been considering LinkedList instead of List in my previous projects. @@ErazerPT
@ErazerPT 1 year ago
@@Average-Lizard Can totally understand you. On some project we had to interface with a Sharepoint server that lived in a network that can only be described as either dysfunctional, schizo or both. Add both things and you get a slug on Valium. We gave up on optimization because we could at best shave some dozens of milis out of seconds on the external calls...
@pokefreak2112 1 year ago
If a usecase were heavy on front insertions you could probably just go with a double ended array. Or just an array where you store the items in reverse by convention, so front insertions get added to the back
@arthurmoore9488 1 year ago
@@ErazerPT Step 1 in my plan to optimize C# applications I deal with. Use async. I've seen C# written purely synchronously more often than you'd think. Heck, I've written more synchronous C# than I'm proud of. When the upper level is synchronous, it's just easier to stay that way. Slower, but easier.
@williamlvea 1 year ago
One thing to keep in mind: for many things, code readability and maintainability, while subjective, can be more important than small performance gains. If a Map makes your code crystal clear, use it.
@sharpfang 1 year ago
My first question when approaching this stuff is "How big is the element, and how many elements are there?" I can usually rewrite a function that inserts/removes n elements into/from a vector of length m from O(m*n) (move the tail after every insertion/removal) to O(m+n), bundling all the operations into one pass through the vector (move the tail by m, then move subsequent elements by less as elements are encountered). I like to use lists to store big almost-singletons: stuff you typically expect to have one of, but in rare cases you'll want more. Say, a logging service with a connection to the logging server. In 99.9% of cases you'll have one logging server, one connection, period. Rarely you'll have 2. Almost never 3. The object itself is quite heavy, containing a sizeable, fixed-size buffer for the logs. I don't want a vector to pre-allocate space for several of these. I absolutely don't allow moving them on resize, 'cause other stuff already has pointers to the existing instance. But traversing the 1-3 element list to visit all the elements takes a blink of an eye compared to the time spent performing the object's actual job, and no RAM is wasted.
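For bulk deletions, that single-pass O(m+n) rewrite is essentially the erase-remove idiom; a minimal sketch (the predicate is just an example):

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Each surviving element is moved at most once, instead of shifting the
// whole tail once per erased element.
int main() {
    std::vector<int> v = {1, 2, 3, 4, 5, 6, 7, 8};
    v.erase(std::remove_if(v.begin(), v.end(),
                           [](int x) { return x % 2 != 0; }),  // drop odds
            v.end());
    for (int x : v) std::printf("%d ", x);                     // 2 4 6 8
}
```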
@Vortex-qb2se 2 months ago
There are people out there who argue with the creator of C++ about performance? Wow.
@JorgetePanete 1 year ago
What about smartass compilers? Let's say the compiler knows I'm doing 5 deletions; instead of deleting + compacting 5 times, could it try to do all the deletions at once? Same goes for other crazy operations on vectors that it can see are optimizable.
@alexaneals8194 1 year ago
That's why you always test. There may be compiler options that will speed up the performance, but they may also slow down the performance. Same goes with tuning queries in the database. Just throwing indexes at something does not always work.
@bonsairobo 1 year ago
Merge sort also has a good use case: external sorting. If your data set doesn't fit in memory, merge sort is a totally valid strategy.
@phill6859 1 year ago
I used merge sorting on linked lists in C and it beat using arrays. Insertion sorting doesn't scale no matter how you store the data. Binary searching your linked list will beat linear searching; he seems to overlook that. I didn't need the data sorted all the time. There are always tradeoffs.
@Kycilak 1 year ago
@@phill6859 How does binary search on linked list function? You have to traverse the list to get to the checked element. Is the "check element" function that demanding?
@egor.okhterov 1 year ago
@Kycilak maybe he meant a skip list, but anyway that guy seems delusional and needs to be taken to the infirmary
@phill6859 1 year ago
@@Kycilak yes. Doing a strcmp on each item takes time and if your linked list pointers are allocated from a block they will end up in the cache. If you linearly compare all the keys then you will thrash the cache. This was 25 years ago, caches were smaller. I had plenty of people telling me I was wrong at the time. It worked and it was faster, so I just ignored them.
@phill6859 1 year ago
@@egor.okhterov not a skip list; going to an index in a linked list is faster than strcmp-ing linearly because of cache usage. 50% of the time you will be going in the opposite direction that you just travelled, so that is in the cache too. You can cache more pointers than you can data.
@voskresenie- 3 months ago
It's not that O(1) is really O(C*1) - it's that O(f(n)) means the number of operations is less than or equal to C*f(n) for some constant C and sufficiently large n. For example, 5n^3 + 3n^2 + n + 18 < C*n^3 when n >= 1000 and C is 6, since 6(1000)^3 > 5(1000)^3 + 3(1000)^2 + 1000 + 18, and likewise for all n > 1000. It's the same concept as a limit going to infinity: only the largest term matters once n gets large enough. idk if this is actually what O stands for, but in my head I always say "order of f(n)", which is what it is.
@arekusei9580 2 months ago
it is actually what big-O stands for. But in computer science we are used to the fact that n approaches infinity, because n is like some number of 'basic operations', so algorithms are always scaling to infinity. But to be clear, 'n' in f(n) might be some other value, even negative, or even a vector of values
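For reference, this is the standard formal definition both comments are describing:

```latex
% Big-O, stated formally (n taken to infinity, as the comments describe):
% f is O(g) iff some constant multiple of g eventually dominates f.
\[
f(n) \in O\bigl(g(n)\bigr)
\;\Longleftrightarrow\;
\exists\, C > 0,\ \exists\, n_0 \;:\; |f(n)| \le C\,g(n) \ \ \text{for all } n \ge n_0.
\]
% Example from the thread: 5n^3 + 3n^2 + n + 18 <= 6n^3 for all n >= 4,
% so 5n^3 + 3n^2 + n + 18 is in O(n^3).
```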
@mfc1190 1 year ago
A more telling example: if you write a function that always returns an array of size 1000000, that function has O(1) time and space complexity. Another function, one that always returns an array of size n where n is the input, will be an O(n) function, but will be faster up until n = 1000000.
@vasiliigulevich9202 1 year ago
Both are O(1)
@cedricvanhaverbeke5594 1 year ago
Yeah, but that doesn't mean the first one won't be faster. @@vasiliigulevich9202
@SVVV97 1 year ago
​@@vasiliigulevich9202depends on the precise array and implementation details. We can't tell either way a-priori
@vasiliigulevich9202 1 year ago
@@SVVV97 no. Allocation/initialization of an array of primitives: O(1); an array of objects: O(N). Easy as that, can't be misread.
@SVVV97 1 year ago
@@vasiliigulevich9202 No, that's just not true and a gross oversimplification. An array of uninitialized memory is *probably* (modulo weird allocators) O(1). A zeroed array might be O(1) or O(N) depending on your allocator, OS, hardware, etc. Generating an array of random numbers is almost surely O(N). Generally speaking, the time complexity can be arbitrarily high, because computing any one element of the array can be arbitrarily expensive, and with that we can also bump the space complexity arbitrarily high.
@fuseblower8128 1 year ago
I feel that abstracting memory away has created more problems than it has solved 😅
@youtubeenjoyer1743 1 year ago
it all started with new and delete
@orterves 1 year ago
All software development is about tradeoffs. If the problems caused by abstracting memory away are net less expensive than the solutions that the abstraction provides - then it is a good abstraction.
@gagagero 1 year ago
@@youtubeenjoyer1743 GCs famously didn't exist before Javascript, right?
@VojtěchJavora 1 year ago
I mean, to be fair, the memory is already abstracted for you by the hardware.
@youtubeenjoyer1743 1 year ago
​@@gagagero It's not about gc or javascript.
@everything-narrative 8 months ago
I had an application once that processed some data, correlating two lists of data by using one list as keys and the other list as a lookup table; the lists were two different representations of the same dataset. Changing that lookup list to a hash table provided a speedup of over 16000x in the biggest dataset we had of over 50k entries. Point is: time your application before you optimize; you don't know when the constants trump the asymptotics.
@rezabani9904 4 months ago
When removing an element from a doubly linked list, you also have to update the next and previous links (pointers) in the neighboring nodes as well.
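A minimal sketch of that pointer surgery, with a hypothetical hand-rolled Node type (std::list::erase does the equivalent internally, plus deallocation):

```cpp
struct Node {
    Node* prev = nullptr;
    Node* next = nullptr;
};

void unlink(Node* n) {
    if (n->prev) n->prev->next = n->next;  // fix neighbor's forward link
    if (n->next) n->next->prev = n->prev;  // fix neighbor's backward link
    n->prev = n->next = nullptr;           // detach the node itself
}

int main() {
    Node a, b, c;                          // a <-> b <-> c
    a.next = &b; b.prev = &a; b.next = &c; c.prev = &b;
    unlink(&b);                            // now a <-> c
    return (a.next == &c && c.prev == &a) ? 0 : 1;
}
```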
@shinobuoshino5066 1 year ago
Here's an even faster vector-based semi-compact "list": don't move elements over; keep another vector of indexes to record where the gaps are, and only fill in the gaps on a specific call to do so, such as after you're done inserting/removing. That will be indisputably faster than removing one element and shifting the entire block of data each time. You can even never move elements at all: simply maintain a second vector that holds the correct permutation of the first's indices, and it will still be faster than a list, because when it matters you can swap elements according to the permutation before doing any heavy work. There are practically zero real-world use cases for linked lists.
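A rough sketch of the tombstone variant of this idea (illustrative and untuned; the type and method names are made up):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// erase() only records a tombstone; compact() closes all gaps in one pass,
// so each surviving element moves once no matter how many were erased.
template <typename T>
struct LazyVector {
    std::vector<T> data;
    std::vector<char> dead;  // tombstones marking erased slots

    void push_back(T v) { data.push_back(std::move(v)); dead.push_back(0); }
    void erase(std::size_t i) { dead[i] = 1; }

    void compact() {
        std::size_t w = 0;
        for (std::size_t r = 0; r < data.size(); ++r)
            if (!dead[r]) {
                if (w != r) data[w] = std::move(data[r]);
                ++w;
            }
        data.resize(w);
        dead.assign(w, 0);
    }
};

int main() {
    LazyVector<int> lv;
    for (int i = 0; i < 8; ++i) lv.push_back(i);
    lv.erase(1); lv.erase(4);  // cheap: no shifting yet
    lv.compact();              // one pass: {0, 2, 3, 5, 6, 7}
    return (int)lv.data.size();
}
```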
@pokefreak2112 1 year ago
I believe we should use fixed size arrays by default, it's hands down the best data structure most of the time.
@Spartan322 1 year ago
The majority of the time, a vector (dynamic array) with a pool, SBO, and/or static buffer is just as good as a fixed-size array, unless you have some reason to avoid heap allocations.
@arthurmoore9488 1 year ago
@@Spartan322 Pros and cons. If I ever expect the number of elements to change, I'm almost always going to go with std::vector over std::array. Assuming we're talking about member variables, the memory layout between the two is pretty different, even if both end up on the heap. Most of the time it doesn't matter, but that extra pointer dereference can cause cache misses. It's also a massive difference in how copy, move, and serialize/deserialize operate, with the fun part being that moves are almost always much faster with std::vector.
@Spartan322 1 year ago
@@arthurmoore9488 Usually the case where I'll consider array over vector is when doing constexpr/consteval stuff, as Apple's version of Clang still doesn't like allocations in constexpr/consteval even in C++20. Template metaprogramming can get pretty gnarly with everything we have in C++20 (which isn't entirely supported by all the compilers). Of course, you need to never let allocations escape the constant-eval environment regardless, so you have to return fixed-size objects, but most of the time I just resort to vector if I need a contiguous container, unless I need some type of stack advantage. In my experience it's rarely been useful to deal with arrays aside from those cases.
@pokefreak2112 1 year ago
@@Spartan322 Directly taking a pointer to elements in an array can be very useful. Explicitly setting a hard limit on something makes stress testing trivial. Fixed-length arrays are more portable; not every platform has malloc (e.g. wasm).
@Spartan322 1 year ago
@@pokefreak2112 Pretty sure Emscripten has dlmalloc and emmalloc, so how would wasm not support malloc? I guess you could set it to run without malloc, but that's not really the same as saying it does not support it. malloc is literally standard C; if it can't use malloc, it's not standard C regardless. That's like saying well-defined behavior does not need to follow the standard, in which case you're simply not using C, let alone C++. I can't think of a single platform that doesn't support malloc yet can claim to support C.
@guylaingreer4190 2 months ago
The biggest reason to use std::list over std::vector is the difference in iterator invalidation. Iterators may be invalidated by inserting into or removing from a std::vector, but iterators into a std::list remain valid unless the element your iterator points to was removed.
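A small demonstration of the difference (whether the vector actually reallocates depends on its capacity, hence the hedged comment):

```cpp
#include <iterator>
#include <list>
#include <vector>

int main() {
    std::vector<int> v = {1, 2, 3};
    auto vit = v.begin() + 1;
    v.push_back(4);        // may reallocate: vit is now potentially invalid
    // *vit;               // undefined behavior if a reallocation happened
    (void)vit;

    std::list<int> l = {1, 2, 3};
    auto lit = std::next(l.begin());
    l.push_back(4);        // node-based: lit still points at the 2
    l.erase(l.begin());    // still fine; only erasing *lit would invalidate it
    return *lit;           // ok: 2
}
```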
@Leonhart_93 11 days ago
That's just a C++-only thing. In general, the best use linked lists have is allocating a list of elements of different sizes and types, filled with the equivalent of C void pointers that later get a defined size and type.
@Lukeisun7 1 year ago
Genuinely asking: for the problem in the beginning, why would you intuitively reach for the linked list? It says that you remove elements at random positions; in the LL implementation you'd have to traverse until you get to that position, whereas you have direct access in the vector/array implementation.
@jaredteaches894 1 year ago
It's because when you remove an element from a vector you have to shift every later element over by one. Here's the original array: std::vector<int> v = {0, 1, 4, 6, 8}; and if you want to remove 4, you have to shift over 6 and 8 so that the new vector looks like {0, 1, 6, 8}. If not for shifting, the vector would look like {0, 1, GARBAGE, 6, 8}. The reason you don't need to create a new vector is that the end iterator is simply moved back past the last element.
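That shift is exactly what std::vector::erase does under the hood:

```cpp
#include <vector>

int main() {
    std::vector<int> v = {0, 1, 4, 6, 8};
    v.erase(v.begin() + 2);  // removes 4; 6 and 8 shift left -> {0, 1, 6, 8}
    return (int)v.size();    // 4
}
```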
@NTSHOni 1 year ago
Because linked lists are better designed for deleting elements in the middle of the list (you just relink the previous to the next and the next to the previous), whereas a vector has to move the following elements, since they must stay contiguous in memory.
@Lukeisun7 1 year ago
@@jaredteaches894 Thanks for the reply. So essentially it's just a tradeoff between taking the time to search the linked list to find your target and shifting the memory? Is it more intuitive because it's more ergonomic from a developer POV?
@augusto8632 1 year ago
Sometimes you already have the element you want to remove, so you don't need to traverse
@jaredteaches894 1 year ago
@@Lukeisun7 It's "intuitive" because it's often taught that insertions and deletions are efficient in a linked list, which is true; they are efficient. So when you want a data structure where you plan on doing a lot of insertions and deletions, you will naturally choose a linked list. The author is suggesting that you need to measure the cost of operations. The article ends by emphasizing the importance of locality, both spatial and temporal. The general gist of caching is that the most common case will be efficient: things closer to X will probably be used again, so hardware exploits that. This is not necessarily true in a linked list. Nodes of a list don't have to be near each other in memory. In other words, the addresses of nodes might not be cached (within the TLB), and thus extra memory references will be incurred.
@arrowtongue 1 year ago
I was literally watching your 4 month old video about this mere minutes ago, since I wanted to know more about the use case of Linked Lists, what're the chances of a new upload 2 hours ago!
@TheEVEInspiration 1 year ago
12:00 Doesn't mean much without telling us which method turned out to be faster. For me it is obvious the array is; it's just a tiny bit of contiguous memory that can be processed at full processor speed without memory latency messing things up. If the list turned out to be faster, now that would really surprise me, and I would certainly expect it to be because of some data skew making the comparison invalid.
@larjasoul 5 months ago
this is why unit testing, benchmarking, and pure functions are so powerful. you can pump out dog sh** code, and easily fix/optimize things later, in my experience.
@KatarinaClaes-q1p 1 month ago
wait a sec, I just realised that the insertion time I measured for my arraylist data structure library in C is actually faster than a linked list, probably because a linked list gets a ton of cache misses. I made my arraylist in C kind of a ring buffer, so it tricks you into thinking that inserting at the front actually inserts at the front, when it actually inserts at the back, and because of some math this is O(1). Inserting in the middle means just going straight to the element and then moving elements either toward the front or the back, depending on which path is shorter; that's O(n), but my insertion can always calculate the fastest path. The only thing that would take more time is a resize, which is O(n), but tbh this can easily be prevented with my arraylist too, by just having a big enough capacity for the job from the start. But then again, idk if the resizing is that bad honestly.
@dapperpanguin994 1 year ago
I get that array-based lists are compact, but how is adding/removing from their wrapper counterparts, such as vector or ArrayList/List, faster? Just because they are contiguous and are accessed via an offset? I assume this means that accessing an element of the linked list through its reference/pointer member is slower overall.
@Hazanko83 1 year ago
Linked lists force memory lookups for every node, whereas a vector has all of the info in contiguous memory. That might be obvious, and most people also know that memory lookups are extremely slow... But often a vector's elements can fit within a CPU register, which is blazingly fast, and the CPU is also blazingly fast at manipulating data within its own registers. std::vector also has optimizations for certain removals, so many operations can be performed with a single "walk" of the array, where that can be much harder with a linked list. Linked lists do have use cases, but I think they're relatively rare, and it's usually a trade-off for a gain somewhere else; they seem closer to an educational tool than anything else lol.
@egor.okhterov 1 year ago
I don't like words like "fast" and "slow" in this context. "Blazingly fast" and "extremely slow" makes me cringe even harder. I could accept "faster" and "slower" though :)
@Hazanko83 1 year ago
@@egor.okhterov The difference is large enough to use "blazingly" and "extremely" lol. The number I find from a Google search is that CPU registers are generally 10-100 times faster than RAM, but it's kind of an ambiguous value.
@egor.okhterov 1 year ago
@@Hazanko83 the context is king. What matters is the "observable" difference. For example, whether I sort a file in 1ms or 100ms, I will not observe the difference, so that's basically the same speed for me. If I am a web page user and a page loads faster than I blink, all speeds faster than I can move my eyelids are equivalently blazingly fast 😀
@Hazanko83 1 year ago
@@egor.okhterov That kind of logic only works for one-off situations. A 1ms vs 100ms difference could mean thousands or millions of dollars in server costs for a website. Or in a game it could mean running at 1fps vs 100fps, if every function is running 100x slower than it could be. The context is real-world applications vs someone's toy project.
@user-hk3ej4hk7m 1 year ago
It'd be interesting to see how the performance changes if you're using arena allocation for the linked list, or some sort of data structure that better represents the problem at hand, that'll cost you development time though and there's no guarantees that'll be faster.
@Hazanko83 1 year ago
it's called a std::vector or a C-style array lol. The whole point of a linked list is that you have exit/begin pointers to "neighboring" nodes BECAUSE you don't know where they are in memory; if you're expecting data to be allocated within the same area of the "arena" or to be contiguous, then you'd just use an offset on the object pointer to get to previous/next nodes (which is literally a raw array). std::vector essentially is its own "arena allocator" in the sense that it manages the allocations and "garbage collection" for you, instead of calling into your own custom memory allocator. For example, the struct below and its function work and are "valid C++", although I would not recommend designing anything like this lol... struct SillyLinky { SillyLinky* get_next() { return this + 1; } };
@youtubeenjoyer1743 1 year ago
You still have to store pointers and that overhead adds up, your cache lines will have less data, and you will have more cache misses. Each insertion requires allocation, writing node data at the advancing arena pointer. Depending on the problem, your linked list nodes may become randomly spread out on the arena, and a simple traversal will cause unpredictable memory access patterns.
@user-hk3ej4hk7m 1 year ago
@@youtubeenjoyer1743 We're talking about small memory optimizations, so the size of the array might not be a concern. Making the arena larger, while still keeping it small enough to fit in cache, could be possible. Also, I'm not talking about a linked list on top of an arena in the "use this allocator instead" C++ way; I'm talking about maybe a vector with index pointers. You'd be jumping around within the vector instead of the entire heap, so that probably changes things.
@user-hk3ej4hk7m 1 year ago
@@Hazanko83 You took the worst possible interpretation of what I said lol. I'm not proposing an arena allocator in the C++ way; I meant it in the sense of treating the vector like a linked list, storing pairs of (element, index of next element). This way you still keep the constant-time access advantage. Would still need to test it though.
@Hazanko83 1 year ago
@@user-hk3ej4hk7m Vector IS a constant-time, random-access container though?... Which is as good as you can get... You can do exactly what I showed with that struct thing with it inside of a vector, without needing a pointer member or anything initialized (it's just a bad idea). When you use operator[] on a vector, it's simply pointer arithmetic on a pointer. After having data contiguous in memory and increasing cache locality, clever algorithms are really all that's left... not more indirection.
@Ben_EH-Heyeh 1 year ago
Arrays are only faster when you have cache. Singly linked lists are faster when you don't have cache. Lisp, a programming language of linked lists, functions best with a 36-bit word size; a 32-bit word size is not optimal for Lisp. Lisp, LISt Processing, treats everything as data. Functions are themselves lists of data. Because Lisp code can be processed as data, you can write Lisp code that writes Lisp code, something Bjarne Stroustrup can't do in C++. Christian Schafmeister, a software engineer and chemical engineer who designs 3D molecular modeling software, switched from C++ to Lisp. He spent 10 years on his project; after hitting a development wall in C++ he found Lisp, and he wishes he had started his project in Lisp.
@shinobuoshino5066 1 year ago
Good thing there's no computer in the wild that doesn't have cache AND has a lot of memory. Microcontrollers have no cache, but they don't have enough memory to accommodate each element wasting 2+ bytes on pointers.
@Ben_EH-Heyeh 1 year ago
@@shinobuoshino5066 uLisp then. ;-D
@broyojo 1 year ago
I love that prime will never highlight the end and beginning characters of a text section
@invinciblemode 1 year ago
12:05
@rex_melynas 1 year ago
There is also the fact that CPUs nowadays have special instructions to work with arrays, strings, and other higher-level concepts, while they can't really do much to improve linked lists. That doesn't mean lists are never useful, but I do prefer arrays for contiguous lists of data. Trees, though, which are just multidimensional lists, are so nice for many other things.
@egor.okhterov 1 year ago
I'm probably discovering America for you, but you could store a tree in an array. Look up Fenwick trees or heap trees.
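A minimal sketch of the array-as-tree idea, using binary-heap index arithmetic (Fenwick trees use a different, but also array-based, indexing scheme):

```cpp
#include <cstddef>

// Implicit binary tree stored in a plain array: no child pointers, just
// index math over contiguous storage, which is why it's cache-friendly.
constexpr std::size_t parent(std::size_t i)      { return (i - 1) / 2; }
constexpr std::size_t left_child(std::size_t i)  { return 2 * i + 1; }
constexpr std::size_t right_child(std::size_t i) { return 2 * i + 2; }

static_assert(parent(left_child(5)) == 5 && parent(right_child(5)) == 5);

int main() { return 0; }
```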
@seasong7655 1 year ago
9:40 Does he know about parallel merge sort?
@anon-fz2bo 1 year ago
5:39 identifier 'count' is undefined? Unless it's global?
@MichaelM-hj2mx 1 year ago
This was great, do more like these: a good coding article, showing some code, etc., and making us think along with the video!
@williamdrum9899 1 year ago
I'm not really experienced enough to speak on the matter but I think it depends a bit on hardware. If you're doing embedded development for a single architecture, and the machine's concept of allocation is "Here's 64k of RAM, that's all you get" I'd rather just use arrays
@peachezprogramming 1 year ago
A C++ developer definitely has the tools to think like this. I use Java which handles the memory allocation for me.
@CHURCHISAWESUM 2 months ago
You are my favorite tech slop ❤
@Luclecool123 1 year ago
The devil lies in constants
@rafaeldeconde8148 1 year ago
Is there a link to the video he mentioned in the beginning? About inserting random numbers into a sorted sequence?
@pashok1641 1 year ago
0:35 what video is he talking about?
@krux02 1 year ago
this one: kzbin.info/www/bejne/mae9cqV3ntWCoNE
@ithinkimhipster502 1 year ago
kzbin.info/www/bejne/mae9cqV3ntWCoNEsi=piPX_a1S3wOqpwFn
@mike200017 9 months ago
The whole O(N)-actually-being-O(C*N) thing is interesting and important, but when it comes to cache locality it's a different issue, and much worse. The average cost of a memory access is the cost of a cache miss times the probability of getting a cache miss. The probability of getting a cache miss on a random memory location grows linearly with N (memory). Also, the cost of individual cache misses increases with N, because the data has to be fetched from even further away. In my own measurements, years ago, the overall cost of cache misses for random jumps in memory of size N was around O(N^1.5), meaning that something like a linked-list traversal ends up being around O(N^2.5). Ultimately, the problem is that big-O analysis assumes that all memory accesses, whenever and wherever, always cost O(1). The real world is not like that. There is a whole field of research in CS about trying to model the cost of memory accesses and incorporating that into the overall big-O analysis, but as you can imagine, it gets really complicated.
@andreykot4507 1 year ago
What if the objects are not strings? Just big enough objects; in that case, using a vector may not be so effective.
1 year ago
Hash code generation is what makes sets slower than arrays with small numbers of elements.
@khatdubell 1 year ago
You can do an in-place merge sort.
@nzeu725 1 year ago
I just made a linked list in c today. Guess i'll make a vector tomorrow
@623-x7b 1 year ago
It's probably because of memory locality, cache misses, something like that. [Edit: would love to see more C++ on this channel] Advent of code maybe
@codeman99-dev 1 year ago
The velocity comment wasn't too far off. Big O is an indicator of acceleration. A wimpy engine (high constant) will always lose because shedding ounces won't affect the acceleration.
@egor.okhterov 1 year ago
No, it's not about acceleration. O(n^2) is a set of functions. For example, a function f(n): 3n + 7 belongs to that set 😂
@BosonCollider 1 year ago
The actual lesson everyone should draw is that B-trees are the god-tier data structure, but sadly they're not emphasized enough. The debate is not necessarily just between lists and vectors; B-trees with a node size close to the page size are often better than either. In Python I like to use the SortedContainers library for this. (Yes, my applications are often disk-bound.)
@shinobuoshino5066 1 year ago
B-trees are only useful in databases as far as I know. Why would I use a B-tree for memory-only data when a sorted vector will outperform it in binary search anyway?
@RandomGeometryDashStuff 1 year ago
09:47 but quicksort doesn't preserve order of equal elements
@egor.okhterov 1 year ago
Why would you care?
@dmitrychernivetsky5876 8 months ago
1:49 Whaaaat? No, you don't care about the other side. You take the pointer to the n+1 element and adjust the pointer of the n-1 element to point to the n+1 element, right? And let the garbage collector sort it out.
@Akronymus_ 1 year ago
Specifically with the array and deleting an entry and moving the rest up, can't that be done with a simple memmove?
@SVVV97 1 year ago
Yes, but what's your point? memmove can be quite expensive
@Akronymus_ 1 year ago
@@SVVV97 my point is that you can do the whole "remove the gap created by deleting an element" with a single memmove call. Yes, it will likely be more expensive than just adjusting a few pointers and free()ing some memory. But with so much cheaper traversal due to cache locality and such, you are better off with the memmove in a lot of cases, as long as the moved memory isn't too large. Amortizing is quite powerful.
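A sketch of that single-call gap closing on a plain C array (memmove rather than memcpy because the source and destination regions overlap; it's a library call, not a syscall):

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

int main() {
    int a[] = {0, 1, 4, 6, 8};
    std::size_t n = 5, i = 2;  // delete the element at index 2 (the 4)
    std::memmove(&a[i], &a[i + 1], (n - i - 1) * sizeof a[0]);
    --n;                       // logical size shrinks; a is now {0, 1, 6, 8, ...}
    for (std::size_t k = 0; k < n; ++k) std::printf("%d ", a[k]);
}
```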
@slotos 1 year ago
Yeah, we program in heap-only languages with no worries, until we stumble on a problem where we need to build, use, and discard a highly efficient lookup structure, and suddenly memory access and method dispatch are costing you precious microseconds on operations that you need to run hundreds of per web request. I really yearned for a C struct and a vector in Ruby a month ago. Still do.
@egor.okhterov 1 year ago
Why do you care about hundreds of microseconds for a web page? :)
@orterves 5 months ago
As always the correct answer is to measure first
@goodgolymissmoly 1 year ago
One other thing that rarely gets brought up is that higher-level languages with runtimes usually have highly optimized native implementations of data structures. So if you're in something like Java, C#, or PHP, it's basically impossible to write a faster data structure than what is implemented natively without writing a language extension. That's because those native implementations are usually binaries written in something lower level like C or C++. Those implementations have better access to low-level functionality and memory management than anything you can write in these higher-level languages. So use what's already available to you! Of course, this is a different story if you're working with C or Rust.
@youtubeenjoyer1743 1 year ago
No, it’s precisely the opposite. In these higher level languages data structures always store pointers to tables of pointers to data and functions, instead of just storing the data. All this indirection “levels the playing field” by trashing your cache no matter what your program is doing. When you remove the benefit of contiguous memory and cpu cache, suddenly all asymptotic notations become useful.
@goodgolymissmoly 1 year ago
@@youtubeenjoyer1743 I stand corrected.
@SirMedvind 11 months ago
0:01 Yes
@TehPwnerer 1 year ago
Use a linked list and keep a separate binary search tree list, get the best of both worlds
@jonbondMPG 1 year ago
If in doubt, vector it out. Until such time as you can make an informed decision.
@joshuasanders4302 2 months ago
"because we've been taught this our whole life" yes, because professors teaching software engineering have no idea what they are doing, couldn't program themselves out of a paper bag, and their previous profession was "student". Cache locality is huge, linked lists are *almost* always the devil, but always measure regardless.
@HrHaakon 1 year ago
If your collection is already sorted, bubble sort will spank quicksort when it comes to performance. So if your array has 1-2 things out of place and a size of $HUGE, it might very well beat qs.
@williamdrum9899 1 year ago
Problem is, you can't tell how close to sorted an array is until after you've sorted it
@HrHaakon 1 year ago
@@williamdrum9899 True, but real data in the wild is often "almost sorted" (for a given variety of almost), so if we want to be boring there's a reason why merge-variants like Timsort are used in libraries.
@KoltPenny 1 year ago
We all know that a fully connected graph is the superior data structure.
@heavymetalmixer91 1 year ago
You mentioned that everything in JS is on the heap, but what about Python?
@batatanna 1 year ago
It's on the heap heap
@heavymetalmixer91 1 year ago
@@batatanna Wut
@batatanna 1 year ago
@@heavymetalmixer91 hooray
@mariansalam 1 year ago
Gotem
@YoTengoUnLCD 1 year ago
You can safely assume most of these interpreted languages do (almost) everything in the heap.
@anarchymatt 1 year ago
Finger trees are kind of dope
@SHONNER 1 year ago
I remember the original video. It's an eye-opener.
@onground330 1 year ago
i can't find it, what's the link?
@SHONNER 1 year ago
@@onground330 I wouldn't be able to find it either. 8 months ago is when I saw it.
@egor.okhterov 1 year ago
Eyeppenheimer
@mytechnotalent 1 year ago
I feel heepy.
@egor.okhterov 1 year ago
I feel unhappy
@vinylSummer 11 months ago
I feel eepy
@sahilahmed3066 1 year ago
First KZbin live and now a Frontend Masters workshop... Man, you are so dedicated in the Matrix 🥺🫂✨
@ludoviclagouardette7020 1 year ago
I think that, even better than vector, deque would probably shine for that problem, as deletion and insertion towards the start would be slightly faster yet still very predictable.
@shinobuoshino5066 1 year ago
Except in most cases, std::vector beats std::deque.
@noxagonal 10 months ago
Tried linked lists and vectors in C++. The linked list was always slower... ALWAYS. Iterating is slower, adding is slower; removing was sometimes faster, but considering you have to iterate to find what you want to remove... it ended up being slower. This was with containers of over a million elements. Cache misses are MASSIVE time sinks: in theory your CPU can do around 4000 floating-point multiplications (assuming AVX-512; 1000 with SSE2; don't quote me on exact numbers though) in the time it takes to recover from a level 3 cache miss. Just something to think about. And you get a cache miss per element. On the other hand, copying is really fast.
@egor.okhterov 1 year ago
Try sorting a 1TB file without merge sort
@SHONNER 1 year ago
So should I be using MongoDB? Or something else.
@egor.okhterov 1 year ago
Yes
@vinylSummer 11 months ago
Yes
@VojtěchJavora 1 year ago
I often disagree with Stroustrup, but he is a very smart person and this article shows that.
@eggmeister6641 1 year ago
As someone who's been programming in C++ for 8 years, what's a list?
@mmj-video-logs 1 year ago
std::list is a doubly linked list implementation in the standard template library
@CrIMeFiBeR 1 year ago
In Dutch there is the saying: "Meten is weten" = "Measuring is knowing"
@rolandcsaszar5479 1 year ago
In German, there is the proverb "Wer misst, misst Mist": who measures, measures bullshit.
@Raspredval1337 1 year ago
Always measure the perf. One time I thought it would be clever to replace a hash map with contiguous sorted memory and use binary search. And for a large input size (like 1M elements) the binary search on presorted memory was more than 1000 times slower than the hash map 🤯
@iMagUdspEllr 8 months ago
A hash map has O(1) access. The hash map should always beat O(log(n)). Why would an array beat a hash map?
@Raspredval1337 8 months ago
@@iMagUdspEllr it depends on the implementation of the hash map. There are 2 kinds of hash maps: sorted and unsorted (bucket-based). Sorted maps are implemented with red-black trees (with log(n) access complexity) and work best with small (
@skilz8098 1 year ago
All of the various algorithms that can be applied to the various tables and data structures in many cases have overlapping complexity curves, in both execution time and memory footprint; more aptly put, there are intersections between those curves.

When it comes to big O notation, the first thing to remember is that it usually represents the worst-case scenario and, as mentioned in the video, the constants or coefficients are dropped from the representation. Big O is not precise; it speaks relatively, approximately, generally. So, generally speaking, you have constant, log, linear, log-linear, quadratic, cubic, polynomial and exponential: O(1), O(log n), O(n), O(n log n), O(n^2), O(n^3), O(n^k), O(2^n) respectively, and each algorithm is categorized by the rate of change of its runtime in relation to the growth of its input, always by worst case.

There are times when one algorithm on a specific data set would be assumed to run faster than another because one is commonly known to be, say, O(log n) and the other O(n). Yet in some circumstances, depending on other factors (looping structure, branching and the branch predictor, data size, or the size of the data structure within the container causing a bunch of cache misses), the O(n) algorithm can outperform the O(log n) one. There, the O(log n) algorithm is for some reason running at its absolute worst, while the O(n) algorithm in the same case might be optimized by the compiler, OS, drivers, assemblers, etc., hit no issues with the branch predictor, rarely miss cache, maybe even be internally vectorized through SIMD... and could end up running at O(log n) or better, almost approaching constant time. So basing your code on big O notation alone isn't always the most suitable approach. Knowing how the various algorithms work on the various containers, data structures, and data sets internally, and knowing what to expect from them in different situations, is more important.

For example, if you know your container's size is never going to change, then a simple raw array will suffice, though one has to be careful with invalid indexing and invalid pointer arithmetic. If you know the container's length or size is going to change periodically but not all too often, then a vector, a list, or a set would suffice, depending on how you intend to use or access it. If you're going to need direct element access via indexing, then a vector is more appropriate. If you need faster insertion and deletion rates, then perhaps a list. If you know each element is going to be unique, then a set. If you have a specific preferred ordering, then perhaps a queue or deque, such as a priority queue; if you need a FIFO, FILO or LIFO, LILO type of structure, then maybe a stack or one of its variants. The type of data, the size of the data, the number of elements within the container, and the type of algorithms that are going to be performed on the data itself, on the data structure, and even on its container are just some of the factors in choosing the appropriate container/algorithm combo.

There might be times when one section of code doesn't need to be super fast; in fact, you're running it in its own thread, in parallel with another thread that is crunching through a huge loop, and you want them to finish at approximately the same time. It's okay if the smaller thread isn't fast, but you don't want it to end too early and wait too long for the threads to be joined... so maybe you choose a slower algorithm instead, because you want both sections of code to complete in, say, approximately 50ms. So there are many factors as to why, when, where and how you might use said components. Big O might be a helpful guide for comparing various algorithms/containers, but it gives you absolutely nothing in terms of how and why you might use one over another. Many of those choices are situation-dependent, or simply programmer preference.

I've been listening to Bjarne Stroustrup on and off for nearly 20 years now, and one of my favorite videos was when he was in a discussion with Jason Turner, a young and brilliant modern C++ developer, and a few others, some of whom are even on the committee or board for drafting the language standard.
@kahnfatman 2 months ago
A new version of "GOTO considered toxic"?
@naive_algorithm 1 year ago
If C++ introduces ownership and borrowing, will it be great again?
@isodoubIet 1 year ago
It never stopped.
@youtubeenjoyer1743 1 year ago
C++ will never be great. You have to start over and never add any bad language features like two flavours of pointers, two colours of struct, methods, private/public/protected, inheritance, constexpr, RAII, exceptions, and a ton of undefined behavior. Also the standard library is in a very sad state, but it’s the cruft that c++ will have to carry until the end.
@isodoubIet 1 year ago
@@youtubeenjoyer1743If you want to complain about C++ you should probably not give a list of some of its awesome features.
@shanep2514 1 year ago
The difference between contiguous memory and linked memory should be explained like this: if you have a library with 30 rooms and ask people to find and move books per request, is it easier if the books are in 3 adjacent rooms that need to be accessed and moved? Or if the books (each with a little note showing which room has the next book) are spread across any of the 30 rooms, which may be on opposite ends of the library? (It's a little hyperbolic, but it should make sense why linked memory isn't always better.) Of course, libraries prefer categorized shelves of contiguous memory.
@trapexit 1 year ago
I think the fact people who know O notation don't understand this point is ultimately because they really don't understand O notation.
@egor.okhterov 1 year ago
You really understand what O(n) is when you also understand what o(n) and Θ(n) are.
@evadeflow 1 year ago
Interesting. This article makes me appreciate the pragmatism of Lua’s design even more. It doesn’t just “use vectors (dynamic arrays) by default”, it uses them (tables) for _everything._ 🤔
@ea_naseer 1 year ago
yes there's a name for that... I've forgotten what it was
@BoardGameMaker4108 1 year ago
Tables are probably maps, which are much slower than vectors (unless they use some kind of wizardry underneath).
@testtest-qm7cj 1 year ago
@@BoardGameMaker4108 A Lua "table" is a hybrid of a dynamic array and a dictionary/map/lookup-table/whatever-you-call-it, under the hood. That is why separate ipairs() and pairs() exist in the first place. If you use a Lua table with integer indices, it uses the underlying dynamic array; with keys, the dictionary.
@MSheepdog 1 year ago
Lua does some interesting things to make tables that work as arrays as well as hash maps. You start to see some of the issues with the approach if you've ever tried to use ipairs on a table that wasn't only written to using table.insert(). I actually think it'd be better if Lua had arrays and tables as separate data structures instead of the one type of table that you can use incorrectly.
@evadeflow 1 year ago
@@MSheepdog, I really like the _brutishly_ simple design of Lua. It’s _so_ easy to embed, which is really what it’s meant for. I know people write entire _apps_ in Lua, but… if you’re mostly exposing entry points into another language and writing some glue code to customize policy & business logic, I think the “everything’s a table” design is reasonable. I’m aware there are pitfalls of ‘mixing paradigms’ with tables (i.e., treating the same table as both an array _and_ a dictionary)-but, the way I’m using Lua-I haven’t personally been bitten by them.
@khatdubell 1 year ago
"For a big enough 1 and a small enough n, O(1) is slower than O(n)" Clever, but this is wrong. Because Big O is not a measure of speed. This is akin to me measuring the number of lanes on a road to tell you how fast the cars can driving on it are going.
@BurninVinyl 1 year ago
2:35 Every time Prime goes technical I feel like a computer with some missing drivers.
@Minolrx 1 year ago
Java uses merge sort and quicksort for its Arrays implementation
@joejoesoft 1 year ago
Java uses neither. See: dual-pivot quicksort (which uses parts of insertion sort and quicksort).
@Minolrx 1 year ago
@@joejoesoft dual or triple pivot is still quicksort. It also uses merge sort, but this is not the channel to discuss specifics. 🤗
@Minolrx 1 year ago
@@joejoesoft and for the record, the term is Timsort. Go check the specifics if you wish to.
@joejoesoft 1 year ago
@@Minolrx It's been a while since I've looked at DualPivotQuicksort.java, but it now indeed does use more than just Insertion and Merge sort. A lot has changed since 2009. I stand corrected. I didn't look before the dual pivot change, but I vaguely remember it was bubble sort and single pivot quicksort at some point.
@egor.okhterov 1 year ago
There are actually 3 types of asymptotic notations: 1. Big O(n) 2. Small o(n) 3. Theta Θ(n) Each of them is a set/class of functions {f1, f2, f3, ...} that are all indistinguishable from each other when we look at them through a certain lens.
@Exilum 1 year ago
I mean, velocity is the wrong word, as it just means speed, but acceleration could work.
@egor.okhterov 1 year ago
Velocity is a vector. Acceleration is a vector. Both are wrong analogies when thinking about O(n)
@Exilum 1 year ago
@@egor.okhterov As mathematical objects, sure, as concepts, not necessarily. O(n) really just tells you how fast the execution time grows, not what its base speed is. So in that sense, it's as good an analogy for O(n) as speed is for execution time. We use fast, speedy, quick, and other adjectives for programs, despite them not physically moving and having no measurable speed. But they figuratively move from A to B at a set speed.
@egor.okhterov
@egor.okhterov Жыл бұрын
@@Exilum Now I better understand this perspective, thank you for the clarification. This is not how I think about it, though. I think of an estimate of the number of atomic operations instead of thinking about time. Speed, velocity and acceleration are all tightly coupled and based on the concept of time. Time has nothing to do with O(n) notation; it is rather a story about the problem's configuration space. Sure, in the end, when we actually start to walk across that constructed space, time manifests itself in the real world as the result, but the time could change even if the configuration space stays the same.
@Exilum
@Exilum Жыл бұрын
@@egor.okhterov Yeah, I am not saying it is the best analogy, and that's not what my original comment was meant to convey either. I pretty much just wanted to point out what the chatter probably meant when they said "velocity" in chat. That's pretty much why I said "could work": it's not perfect, but pretty good for teaching purposes. While it doesn't convey the structure, it conveys the relationship between the O notation and the "result" it "approximates" (a lot of quotes there).
@eksortso
@eksortso Жыл бұрын
There do exist fast in-place merge sorts. Batcher's odd-even merge sort comes to mind. But I would still test this with real data before using it, of course. Radix sort is still the go-to for ordering billions of values.
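For anyone curious what that go-to looks like, here is a minimal LSD radix sort sketch in C++ (the function name and the choice of 32-bit keys with one byte per pass are illustrative assumptions, not a reference implementation):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// LSD radix sort for 32-bit unsigned keys: 4 stable counting-sort
// passes, one byte per pass. Not in-place: it ping-pongs between the
// input and a scratch buffer of equal size.
void radix_sort_u32(std::vector<uint32_t>& v) {
    std::vector<uint32_t> buf(v.size());
    for (int shift = 0; shift < 32; shift += 8) {
        std::array<std::size_t, 256> count{};        // histogram of this byte
        for (uint32_t x : v) ++count[(x >> shift) & 0xFF];
        std::size_t total = 0;                       // exclusive prefix sum ->
        for (auto& c : count) {                      // starting offset per bucket
            std::size_t old = c; c = total; total += old;
        }
        for (uint32_t x : v)                         // stable scatter into buckets
            buf[count[(x >> shift) & 0xFF]++] = x;
        v.swap(buf);                                 // even pass count: ends in v
    }
}
```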
@AJewFR0
@AJewFR0 Жыл бұрын
The number of people in the comments who misunderstand undergrad-level CS fundamentals is concerning.
@egor.okhterov
@egor.okhterov Жыл бұрын
You could form a few hypotheses from this observation that could make you money
@williamdrum9899
@williamdrum9899 Жыл бұрын
Well that's one of the downsides of abstraction: people don't know how the computer actually works
@kennythegamer1
@kennythegamer1 Жыл бұрын
The difference that you're trying to explain is between the instruction function, such as f(n) = 5n^2 + 2n + 12, which encodes the time the algorithm will run given some input (length of the collection being the obvious factor and the default example most times), and the asymptotic classes of functions it belongs to.

Here, the obvious answer is that f is in O(n^2), but that also means it's in O(2n^2), O((n^2)/2), O(en^5), and O(n^7 + 16n^3), because all that Big O means is that the function is bounded above by the function in the O multiplied by some constant k or greater (technically, this holds for all n > m, for some constant m dependent upon that coefficient).

All that is to say that Big O does have its use, in that moving an algorithm from O(n^5) to O(n log(n)) time complexity will make the algorithm faster over the majority of the domain, but it is still crucial to know what part of the domain one's input space covers, whether one uses Big O, Big Omega, and Big Theta or the instruction function itself. The advice for the rest of what appears in the video is essentially: measure unknowns and don't concretize too early.
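A short worked instance of that bound, with the constants k and m chosen only to illustrate (my numbers, not from the comment):

```latex
f(n) = 5n^2 + 2n + 12 \;\le\; 5n^2 + 2n^2 \;=\; 7n^2 \quad \text{for all } n \ge 3,
```

so f is in O(n^2) with k = 7 and m = 3; and since f(n) >= 5n^2, f is also in Omega(n^2), hence Theta(n^2).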
@firefuegomakes
@firefuegomakes Жыл бұрын
By who 😂😂😂?
@Crow-EH
@Crow-EH Жыл бұрын
Please note that "List vs Vector" is not really a valid debate in most "modern" languages, since the default/preferred list implementation will usually be backed by an array and pretty close to a C++ Vector. like Java's ArrayList. Though the "only reach for a linked list/hash set/tree set/any specialized structure when you know you need it" advice is valid in almost any language. And please don't use Java's Vector class, it's deprecated, slow, and should have been called ArrayList anyway.
@YoTengoUnLCD
@YoTengoUnLCD Жыл бұрын
The debate is valid in any programming language, you just need to know that “list” means linked list, and “vector” means dynamic array (well, vector by itself was clear enough).
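For concreteness, a sketch of the classic experiment behind this debate (my reconstruction in C++, not Stroustrup's actual code): insert N random ints while keeping the sequence sorted. The list avoids shifting but pays a cache miss per hop while *finding* the position; the vector shifts elements, but as one contiguous, prefetch-friendly move. Time each loop separately to compare.

```cpp
#include <algorithm>
#include <list>
#include <random>
#include <vector>

int main() {
    constexpr int N = 100'000;                 // illustrative size
    std::mt19937 rng{42};
    std::uniform_int_distribution<int> dist(0, 1'000'000);

    std::vector<int> v;
    std::list<int> l;
    for (int i = 0; i < N; ++i) {
        const int x = dist(rng);

        // vector: binary search for the spot, then O(n) contiguous shift
        v.insert(std::upper_bound(v.begin(), v.end(), x), x);

        // list: O(n) pointer chase to find the spot, then O(1) splice
        auto it = l.begin();
        while (it != l.end() && *it <= x) ++it;
        l.insert(it, x);
    }
}
```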
@Crow-EH
@Crow-EH Жыл бұрын
@@YoTengoUnLCD True. I'm just targeting my fellow Javaist degenerates that'd try to use Vector thinking all lists are linked lists. :D
@divad1196
@divad1196 Жыл бұрын
There is a C++ talk about library implementation. The standard sort uses a threshold and picks a different sorting method depending on the number of elements.
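A hedged sketch of that dispatch pattern (the threshold of 16 and the fallback to std::sort are illustrative assumptions; real libraries tune this per type and platform):

```cpp
#include <algorithm>
#include <iterator>

// Small ranges: insertion sort via the rotate/upper_bound idiom, which
// is fast on nearly-fitting-in-cache data. Larger ranges: delegate to
// the library's introsort. Requires random-access iterators.
template <typename It>
void hybrid_sort(It first, It last) {
    constexpr auto kThreshold = 16;            // illustrative cutoff
    if (std::distance(first, last) <= kThreshold) {
        for (It i = first; i != last; ++i)     // [first, i) stays sorted
            std::rotate(std::upper_bound(first, i, *i), i, std::next(i));
    } else {
        std::sort(first, last);
    }
}
```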
@ruroruro
@ruroruro Жыл бұрын
Stroustrup: This example is NOT about how big-O can be deceiving. Big-O-magen: So, anyway, let me spend the next 10 minutes explaining how big-O notation can be deceiving.
@RicardoSilva-hk2er
@RicardoSilva-hk2er Жыл бұрын
Lists are clearly the work of the devil
@egor.okhterov
@egor.okhterov Жыл бұрын
No
@az8560
@az8560 Жыл бұрын
I'm afraid there are some dumb people who will use this video to justify writing the most ridiculous N^(100-theirIQ) algorithm and argue ferociously that they are virtuous.
@timseguine2
@timseguine2 Жыл бұрын
I only disagree with Bjarne slightly. My advice is to use whichever container makes your code the most readable by default unless you have a good reason not to. And then even after that you should probably use an adaptor (which gets optimized away by the compiler) so that the code uses the readable use paradigm, but the faster algorithm. For example if a map makes logically the most sense, use a map or hashmap. If you later find out through benchmarks that a contiguous, sorted storage is better for performance, then use an adapter like flat_map so you don't have to compromise the clarity of representation of your business logic to take advantage of a faster algorithm. If you have to implement it yourself, follow the YAGNI principle and only implement the methods that you need, but with almost certainty someone has already implemented what you want in a library already.
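A minimal sketch of that adaptor idea (PriceTable and lookup() are hypothetical names; std::flat_map requires a C++23 library, or use boost::container::flat_map): business logic codes against the alias, so swapping the backing container later is a one-line change.

```cpp
#include <map>
#include <string>
// #include <flat_map>                      // C++23 std::flat_map
// #include <boost/container/flat_map.hpp>  // or the Boost equivalent

using PriceTable = std::map<std::string, double>;          // node-based default
// using PriceTable = std::flat_map<std::string, double>; // contiguous, cache-friendly

double lookup(const PriceTable& table, const std::string& key) {
    const auto it = table.find(key);
    return it != table.end() ? it->second : 0.0;           // 0.0: assumed default
}
```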
@joejoesoft
@joejoesoft Жыл бұрын
The assumption I always see with this argument is that it's running for a single user/client. The entire argument relies on a specific context, and I suspect it will fail when scaled to concurrent users/clients. Being memory greedy just means you'll hit the point where you stop getting cache hits sooner. I'd say front end should be fine with the "default to vector" idea. Backend should think twice, and embedded should probably ignore it. Moreover, everyone should be rethinking using C++ at all.
@egor.okhterov
@egor.okhterov Жыл бұрын
Prove it
@joejoesoft
@joejoesoft Жыл бұрын
@@egor.okhterov I stated I suspect it, therefore I have proven that I suspect it.
@thadtheman3751
@thadtheman3751 Жыл бұрын
This whole video is silly. Anyone who understands what is happening in a computer knows that cache misses are much slower than moving even large chunks of memory. All Bjarne is doing is pointing out that lists cause a lot of cache misses. As for what collection to use in a program: abstract the collection so you can change it later.
@theglobalsouth9680
@theglobalsouth9680 Ай бұрын
chromokopia