How IEnumerable can kill your performance in C#

109,703 views

Nick Chapsas

1 year ago

The first 100 of you can use code SCHOOL2022 for 20% off courses and bundles at dometrain.com
Become a Patron and get source code access: / nickchapsas
Hello everybody, I'm Nick, and in this video I will show you how IEnumerable can harm your application's performance. I will explain why it happens, what you can do about it, and how to deal with it in future scenarios.
Don't forget to comment, like and subscribe :)
Social Media:
Follow me on GitHub: bit.ly/ChapsasGitHub
Follow me on Twitter: bit.ly/ChapsasTwitter
Connect on LinkedIn: bit.ly/ChapsasLinkedIn
Keep coding merch: keepcoding.shop
#csharp #dotnet

Comments: 234
@TheBreaded 1 year ago
I swear this is one of those things ReSharper has taught me with its warnings. I rarely see it now because I know better. Great explanation of multiple enumerations.
@MicrosoftLifecam1 1 year ago
Agreed - Before using Rider I had no idea lol
@ivaniliev93 1 year ago
Me too
@TonoNamnum 1 year ago
Visual Studio also gives you the same warning.
@tonyschoborg 1 year ago
@TonoNamnum there must be a setting for that. We just recently upgraded to Rider from VS and it has caught a few places where we missed it. I had never seen the warning until we upgraded.
@crack8160 1 year ago
Haha, same. When I installed ReSharper I became aware of this situation.
@marcotroster8247 1 year ago
This enumeration style is called a co-routine, for those who didn't know. You basically have a function on hold that can give you the next element right when you need it 😄 Actually this is a crazy efficient way to represent e.g. endless streams like indices from 1 to n. For n=int.MaxValue, a materialized list would hold (2^31-1) * 4 bytes. Your PC would simply explode if you called ToList() on it because it's 8GB of data. But a co-routine like Enumerable.Range() can do that with just 2 int variables, i.e. 8 bytes. It really makes a huge difference, as you can keep this little chunk of 8 bytes in the faster cache levels of your CPU and crank on it like crazy. One ToList() too few or too many can make your program run 2 hours instead of 1ms 😅😅😅
@fred.flintstone4099 1 year ago
I don't think that is called a "co-routine", I think it is called an "iterator". So it is like a class that implements an interface that has a next() method, so the foreach loop calls the next method every time it loops.
@marcotroster8247 1 year ago
@fred.flintstone4099 Historically speaking, when coroutines were invented, all machines were single-core. Real hardware multiprocessing wasn't even a thing until the mid 2000s. So the resulting programs back then were fetching from something really similar to what we call an iterator nowadays. It's all about the illusion of concurrency by decoupling the consumer from the producer. And honestly, most time in compute is still spent waiting: waiting for operations to write results back, waiting for registers to be loaded from cache, waiting for jumps because of control flow, waiting for cache synchronization between processor cores, etc. Our code in C# is really just an illusion of what's actually happening. But feel free to tell me where I'm wrong. Maybe I can learn something profound.
@fred.flintstone4099 1 year ago
@marcotroster8247 No, I think you're right. I think an iterator that waits might also be called a "generator". In C# you can use the "yield" keyword for iterators. There is also IAsyncEnumerable.
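To make the iterator/generator point in this thread concrete, here is a minimal sketch. `CountUpTo` is a hypothetical name invented for illustration; `Enumerable.Range` is the real LINQ method mentioned above. It shows a `yield`-based iterator being consumed without materializing anything, and a `foreach` unrolled by hand into `MoveNext()`/`Current` calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical iterator: the body only runs while a consumer pulls values.
IEnumerable<int> CountUpTo(int n)
{
    for (int i = 1; i <= n; i++)
        yield return i; // execution pauses here until the next MoveNext()
}

// No list is built: only the loop counter and the bound are kept alive.
long sum = 0;
foreach (int value in CountUpTo(1_000_000))
    sum += value;
Console.WriteLine(sum);

// foreach is roughly sugar for driving the enumerator by hand.
using IEnumerator<int> e = Enumerable.Range(1, int.MaxValue).GetEnumerator();
e.MoveNext();
int first = e.Current;
e.MoveNext();
int second = e.Current;
Console.WriteLine($"{first}, {second}");
```

Calling `ToList()` on the `Enumerable.Range(1, int.MaxValue)` sequence instead would try to allocate every element up front, which is the memory blow-up described in the comment.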
@marcusmajarra 1 year ago
A common recurring problem among programmers is not knowing how the code they're using works. At the very least, they should understand what the API commits to doing. Deferred enumeration of IEnumerable is a great feature in C#, but if you're using any API that exposes an IEnumerable object, you should always assume that you need to enumerate at some point, unless your objective is to merely chain subsequent operations to perform on the object. In fact, if you never actually enumerate the sequence, it will never actually execute, and this is also an easy trap to fall into. So my best advice would be to write your API according to what client code should expect. If you're returning a finite object, rather than returning IEnumerable, you should return IReadOnlyCollection or IReadOnlyList (or any read-only interface). That way, client code knows that enumeration has already been performed. If you return IEnumerable, client code should assume that enumeration will be required, and even the implementation should probably avoid enumerating to a terminal operation. JetBrains.Annotations also has the [NoEnumeration] attribute that you can assign to an IEnumerable method parameter to indicate that your method isn't performing a terminal operation over the parameter.
@youcantsee3867 1 year ago
What do you mean by 'you should always assume that you need to enumerate at some point, unless your objective is to merely chain subsequent operations to perform on the object' ? I don't understand the part ' you need to enumerate at some point'. Could you explain it for me?
@marcusmajarra 1 year ago
@youcantsee3867 it means that if you're dealing with an API that provides an IEnumerable object to you, you should assume that no actual query operation has happened yet behind the scenes. It is only when you enumerate that the query is actually executed. For example, if the API operation is fronting a database call, no query is run against the database until you first enumerate over the results. This is different from working with a list or an array, which has already been materialized with contents. The enumerable object has no contents until you enumerate.
@youcantsee3867 1 year ago
@marcusmajarra Thanks for your reply. I have one more question: so the word 'enumerate' means calling some method like 'Count()' or 'ToList()', or using foreach. Am I right?
@marcusmajarra 1 year ago
@youcantsee3867 essentially. If you're not digging into the results, you're not enumerating.
@Tekner436 1 year ago
@youcantsee3867 A good example: in C# you call a function that returns an IEnumerable, e.g. IEnumerable<Customer> list = GetList(); You would think that doing a foreach (var item in list) twice would use the same results from GetList(), but the IEnumerable interface doesn't actually grab any data until it is 'enumerated' in a foreach. That means each foreach over list will execute all the actions GetList() did to build the IEnumerable result. You could do something like IEnumerable<Customer> query = customers.Where(c => c.Active); List<Customer> result1 = query.ToList(); List<Customer> result2 = query.ToList(); It's possible that result1 differs from result2. If the customers data is backed by a database, each call of ToList (or any foreach statement) on the IEnumerable will build the query, execute it, and return a new result.
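The `result1`/`result2` scenario from this comment can be reproduced without a database. The sketch below is only an illustration under assumptions: an in-memory list of tuples stands in for the `customers` source, and the `predicateCalls` counter is added purely to make the repeated work visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// In-memory stand-in for a customer source (assumed for illustration).
var customers = new List<(string Name, bool Active)>
{
    ("Ann", true),
    ("Bob", false),
};

int predicateCalls = 0;
IEnumerable<(string Name, bool Active)> query =
    customers.Where(c => { predicateCalls++; return c.Active; });

List<(string Name, bool Active)> result1 = query.ToList(); // runs the filter
customers.Add(("Cid", true));                              // the source changes
List<(string Name, bool Active)> result2 = query.ToList(); // runs the filter again

Console.WriteLine($"{result1.Count} vs {result2.Count}, predicate ran {predicateCalls} times");
```

The query object itself never changes; each `ToList()` is a fresh pass over whatever the source contains at that moment.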
@mastermati773 1 year ago
When I started to think about it more deeply, this system is actually very, very good. If we have some enumerable thing A given to a consumer B, how could B assume that it has enough memory to hold all elements of A? Answer: it cannot, and thus it protects itself with this solution of multiple enumerations. If the file read in this video were gigantic (let's assume millions of lines), then multiple enumeration IS desired! The solution is just to use IReadOnlyList, which has enough space saved prior to the enumeration.
@rafaelm.2056 1 year ago
I've been programming with C# for about 15 years and there are parts of it that still mystify me. Your example of obtaining a count via an IEnumerable reminded me of how I ran into a similar situation on my own. In my case I was loading over 100k records. EF was new to me and I couldn't understand why my app was taking a performance hit until I discovered the difference between IEnumerable and IQueryable. From then on it forced me to take into consideration the overall purpose of the program and how to use IEnumerable properly. You are very well versed in the programming language, more than me after working with C# for so long. On a side note, back when I was learning programming in 1991, I asked a senior developer of our mainframe why people are sloppy with their code. He told me that it would only get worse, because as computers get faster they will compensate for bad coding practices and the end result will be lazy programmers. I came from learning to program in a mainframe environment where every byte counted. We ran accounts payable and payroll for 300 employees. All of it was done on a 72 megabyte hard drive.
@kenbrady119 1 year ago
I seem to remember the LINQ documentation explicitly stating that Enumerables are lazy-evaluated. It is a feature, one that all developers should be cognizant of so that they can force one-time evaluation when appropriate.
@jamesmussett 1 year ago
The biggest problem I have with LINQ in general, rather than IEnumerables, is the heap allocation that takes place when evaluating queries with ToList() and the like in memory-sensitive hot paths. In almost every other scenario it's absolutely fine, but it makes my life hell when I have to do rate calculations on 100-500 messages/s. It would be good to see you cover MemoryPool and ArrayPool at some point; those types have truly saved my bacon!
@nickchapsas 1 year ago
I have a video about object pooling coming, probably around October or November. It's a really interesting topic
@jamesmussett 1 year ago
@nickchapsas Perfect, I look forward to it! =)
@sealsharp 1 year ago
@nickchapsas Sweet!
@asteinerd 1 year ago
Great illustration of how/why this happens. Something I can send to my peers that get confused as to why their code is hitting an API twice when running around with IEnumerable or IQueryable.
@asdasddas100 1 year ago
I feel like you can explain this in 1 minute if they're already experienced programmers, but if they're new this would be helpful
@Max-mx5yc 1 year ago
IEnumerable - The fast-food restaurant of programming
@michaellombardi3638 1 year ago
You have no idea how much this helped me today! I was looking at a problem where counting an IEnumerable with zero elements in it resulted in a significant delay and I thought I was going crazy! I had no idea that IEnumerable would be lazily evaluated. Thanks for the help! :)
@SmoothSkySailin 1 year ago
Great video! I always feel good about myself when I know exactly what the problem is and what your solution is going to be at the start of the video... It doesn't happen often, but when it does, I give myself a gold star :-) Thanks for posting such good content!
@mariorobben794 1 year ago
My personal choice is to return an I…Collection, so that the consumer knows that the “inner” code isn’t deferred. Of course, there are situations where an IEnumerable is better, for instance when implementing repositories. But such repositories are mostly consumed from other application specific services.
@megaFINZ 6 months ago
It's fine if result is supposed to be mutable because ICollection exposes things like Add and Remove. Otherwise you'd want to return something read-only: IReadOnlyCollection or ImmutableList or similar.
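As a rough sketch of the return-type advice in this thread: the method below materializes once and exposes the result through a read-only interface. `GetCustomerNames` and the sample CSV lines are hypothetical, made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical API: the return type tells callers the data is already in memory.
IReadOnlyList<string> GetCustomerNames(IEnumerable<string> csvLines) =>
    csvLines.Select(line => line.Split(',')[0]).ToList();

IReadOnlyList<string> names = GetCustomerNames(new[] { "Nick,Chapsas", "Ann,Lee" });

Console.WriteLine(names.Count); // a property read, nothing is re-enumerated
Console.WriteLine(names[1]);
// names.Add("x"); // would not compile: the read-only interface exposes no mutators
```

Returning `ICollection<T>` instead would additionally expose `Add` and `Remove`, which is the trade-off raised in the reply above.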
@emmanueladebiyi2109 1 year ago
Great stuff Nick. Your impact on my programming has been tremendous!
@stephajn 1 year ago
This is something I knew about and have been working to pass on to others as well. Thanks for making this video. I will share this with them in the future!
@nocgod 1 year ago
It really is quite clean. There is even a warning (at least in Visual Studio), CA1851: Possible multiple enumerations of IEnumerable collection. It just requires the developer to read the warning and handle it (or the senior developers to elevate it from a warning to a compilation error).
@superior5129 1 year ago
A bigger problem with methods that return IEnumerable is when they take parameters like a Stream or any IDisposable.
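One way this IDisposable problem can show up is sketched below. It assumes a `StringReader` as the disposable resource, and `ReadLinesLazily` is a hypothetical helper. Because the iterator body is deferred, the first read happens only after the `using` block has already disposed the reader.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical lazy reader: nothing is read until the result is enumerated.
IEnumerable<string> ReadLinesLazily(TextReader reader)
{
    for (var line = reader.ReadLine(); line != null; line = reader.ReadLine())
        yield return line;
}

IEnumerable<string> lines;
using (var reader = new StringReader("a\nb"))
{
    lines = ReadLinesLazily(reader); // no line has been read yet
}

bool failed = false;
try
{
    lines.ToList(); // the iterator body starts running here, after Dispose
}
catch (ObjectDisposedException)
{
    failed = true;
}
Console.WriteLine(failed);
```

Materializing inside the `using` block, or letting the iterator own and dispose the resource itself, avoids the problem.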
@joost00719 1 year ago
I learned this the hard way too, but it was a very important and interesting lesson to learn. Glad you made a video on this because it is a very important feature in .NET that can make or break your application.
@marna_li 1 year ago
Great point! There was a riddle posted not long ago showing what would happen if you essentially call Task.Run in a LINQ query. LINQ is deferred execution of code, so you have to be careful. The author of the riddle told me about a nasty bug regarding this: logging stuff in a query. Always make sure that you evaluate the query, or else you might run code multiple times.
@Arekadiusz 1 year ago
Whoa, for the past year I was sometimes getting "Possible multiple enumerations" warnings and never knew what they meant :V Thank you!
@krftsman 1 year ago
I was just giving my developers a lesson on this exact topic last week. I wish I could have just pointed them at this video! Thanks so much!
@DanStpTech 1 year ago
Yes, I knew, and it caused me a lot of trouble. Thank you for your explanation, always appreciated.
@LordXaosa 1 year ago
Materialization is not always a good option either. What if your file is 200GB in size? Or what if there is a pseudo-infinite enumerable, like network data or a database cursor? You can't always convert to a list because of memory. So yes, watch your code and only do what you understand. yield return is not bad if you know what you are doing.
@henrikfarn5590 1 year ago
I agree! Understanding your code is the mantra - at one point in time IEnumerable was THE way to do it in my company. For large payloads IEnumerable is great but applying it everywhere is an antipattern
@battarro 1 year ago
Then treat it as a stream. On the scenario he gives, if the file is 200GB ReadAllLines will create a 200GB memory array of strings, so ReadAllLines is not the appropriate method to read such a large file, you have to stream it in.
@billy65bob 1 year ago
@battarro it would actually be closer to 400GB: the file is likely ASCII/UTF-8, whereas the in-memory representation of a .NET string is UTF-16, which uses 16-bit code units.
@DanielLiuzzi 1 year ago
@battarro ReadAllLines won't do streaming but _ReadLines_ will
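A small sketch of the difference discussed here, using a temporary file created only for the demonstration. `File.ReadAllLines` returns a fully populated `string[]`, while `File.ReadLines` returns a lazy `IEnumerable<string>` that reads as it is enumerated.

```csharp
using System;
using System.IO;
using System.Linq;

// Throwaway sample file (assumed for illustration).
string path = Path.GetTempFileName();
File.WriteAllLines(path, Enumerable.Range(1, 1000).Select(i => $"customer{i}"));

string[] all = File.ReadAllLines(path);              // whole file held in memory
string firstStreamed = File.ReadLines(path).First(); // stops after the first line
int streamedCount = File.ReadLines(path).Count();    // streams through, keeps nothing

Console.WriteLine($"{all.Length} {firstStreamed} {streamedCount}");
File.Delete(path);
```

Note that each call to `File.ReadLines(path)` opens and reads the file again when enumerated, so the multiple-enumeration cost from the video still applies to it.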
@epiphaner 1 year ago
This was exactly the solution I was hoping to see because that is what I have been using for years :) As for the return type, I always return as specific as possible while accepting as generic as possible. Worked well for me so far!
@MiroslavFrank 1 year ago
IReadOnlyCollection
@andytroo 1 year ago
The IEnumerable approach is the only sensible one in some situations, when there are too many items to fit in memory. I find myself using a 'batchBy(int n)' approach: it turns an IEnumerable<T> into an IEnumerable<List<T>> so that you can work on a smaller list, and if things are too big, you can take them in byte-sized chunks. It does mean something like 'count' (or other things that require global knowledge) can only be accumulated and discovered at the end of the list.
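The batching idea could look roughly like the sketch below. `BatchBy` is a hypothetical helper named after the comment, not the commenter's actual code; on .NET 6 and later the built-in `Enumerable.Chunk` covers the same need.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: groups a lazy sequence into lists of at most `size` items.
IEnumerable<List<T>> BatchBy<T>(IEnumerable<T> source, int size)
{
    var batch = new List<T>(size);
    foreach (T item in source)
    {
        batch.Add(item);
        if (batch.Count == size)
        {
            yield return batch;
            batch = new List<T>(size); // start a fresh list for the next batch
        }
    }
    if (batch.Count > 0)
        yield return batch; // final, possibly partial batch
}

int total = 0;
var batchSizes = new List<int>();
foreach (List<int> chunk in BatchBy(Enumerable.Range(1, 10), 4))
{
    batchSizes.Add(chunk.Count);
    total += chunk.Count; // the overall count is only known once the source is exhausted
}
Console.WriteLine($"{string.Join(",", batchSizes)} total={total}");
```

Only one batch is alive at a time, so memory use is bounded by the batch size rather than by the length of the source.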
@stefanvestergaard 1 year ago
You could also address how yield returns are dangerous in that the source list can be changed between enumerations, e.g. items removed between the .Count() and the output.
@markinman3119 1 year ago
Totally didn't know about it. Thanks Nick.
@leahnjr 1 year ago
I did not know/understand this, but now I totally do. Thanks!
@jongeduard 1 year ago
Basically it's very simple: LINQ is a pipeline (and chained yield-returning function calls are as well). It's a series of enumerators chained together like a single expression, and it will not run until you run a loop on it to perform actual work. A function like Count() is a terminating operation, because it does not return an IEnumerable itself but a computed, numeric result, meaning it has to run a loop on the preceding expression. A self-written foreach loop is basically another terminating operation. ToList and ToArray are as well: they create a new collection in memory and run a loop to fill it with data. This means that ToList and ToArray come with the disadvantage of extra memory allocations, while not using them and repeating loops on the expression comes with a time and CPU usage penalty, as shown in the video.
@billy65bob 1 year ago
Count() is actually smart: if the underlying type is an ICollection, it will return ICollection.Count instead of evaluating the IEnumerable.
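A hedged way to observe this shortcut, assuming .NET 6 or later: `Enumerable.TryGetNonEnumeratedCount` reports whether a count is available without walking the sequence. The `Generate` iterator and the `itemsProduced` counter are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int itemsProduced = 0;

// Hypothetical iterator that records how many items it has produced.
IEnumerable<int> Generate()
{
    for (int i = 0; i < 3; i++)
    {
        itemsProduced++;
        yield return i;
    }
}

int lazyCount = Generate().Count();   // has to run the iterator to count
List<int> list = Generate().ToList(); // runs the iterator a second time
IEnumerable<int> asEnumerable = list;

// For a List<T> the count is available without enumerating.
bool cheapForList = asEnumerable.TryGetNonEnumeratedCount(out int listCount);
// For a plain iterator it is not, and the iterator body is not executed here.
bool cheapForLazy = Generate().TryGetNonEnumeratedCount(out _);

Console.WriteLine($"{lazyCount} {listCount} {cheapForList} {cheapForLazy} {itemsProduced}");
```

This is why the multiple-enumeration warning can appear even when the runtime type happens to be a list and no real re-enumeration cost is paid for `Count()`.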
@OwenShartle 1 year ago
A key phrase, which is maybe more of a LINQ term, that I was also hoping to hear was "deferred execution". Great topic to be privy to, and great video, Nick!
@Max_Jacoby 1 year ago
The most surprising fact is the number of people who don't know that. If you have ever used the "yield" keyword, it's kind of obvious that IEnumerable<T> MyMethod() returns Ts one by one and doesn't store the whole thing in memory, hence the second call to this method will calculate the Ts one by one again. I can see the confusion though: if you always get an IEnumerable from a third party and have never used the "yield" keyword yourself, then yes, it's not obvious.
@andreast.1373 1 year ago
I've seen that warning before and had no idea what it meant. That was a great explanation, thanks for the video!
@harag9 1 year ago
Is that warning just a JetBrains warning, or does it appear in VS2022 now?
@andreast.1373 1 year ago
@harag9, to be honest, I'm not sure, but I believe it's only a JetBrains warning.
@harag9 1 year ago
@andreast.1373 Thought it might be, never seen the warning in VS. Cheers.
@MrSaydo17 1 year ago
If I hadn't already been using R# for the last 4 years I wouldn't have ever known about this. Great explanation!
@stoino1848 1 year ago
I knew about that and have also fallen into its pitfalls. Since then I am cautious when I get an IEnumerable and check my call stack to see if it is used (i.e. enumerated) multiple times. But I also remember having read in the official C# best practice guide to use IEnumerable as a return type and parameter. (I did not look it up again.)
@akeemaweda1716 1 year ago
I didn't know about this before and will be more careful about the usage going forward. Thanks Nick
@tonyschoborg 1 year ago
Funny you should come out with this video. We recently upgraded to using Rider and have caught a few times we missed this when using Visual Studio. Thanks for the content as usual!
@harag9 1 year ago
Thanks for sharing this. I didn't know about it, but I personally never use IEnumerable. However, when looking at colleagues' code during code review, I can now point this issue out to them when I spot it. Cheers.
@Victor_Marius 1 year ago
Possible issue. If they're not using the count method on IEnumerable there's no problem. But should point out the multiple resource access if they intentionally iterate multiple times.
@carldaniel6510 1 year ago
I rolled my own "CachedEnumerable" which lazy-caches the results of an enumeration - it's a wrapper over IEnumerable which tests the underlying enumerable (e.g. is it IList) and skips the cache for enumerables that are already cached/array-based. Using it gives me the best of both worlds - lazy enumeration and automatic caching.
@nanvlad 1 year ago
By introducing a caching layer you lose the actual source data, so if the file/db is changed between your enumerations, you have to implement your own cache updating to give the latest set of items to consumers.
@carldaniel6510 1 year ago
@nanvlad Yep. So I don't use it there.
@urbanelemental3308 1 year ago
BTW, there's a CSV competition article that covers all the CSV parsers for .NET and since the last time I looked the Sylvan.Data.Csv library was the winner and shockingly fast even when using types.
@billy65bob 1 year ago
Oh, that's neat. I'm still using TextFieldParser from the VisualBasic namespace, since it's the best one (and the only one) built into .NET itself.
@smwnl9072 1 year ago
The beauty of IEnumerable is lazy/deferred execution. It is a trap (per this video's message) if you don't have a grasp of what it is. Lazy/deferred execution, I believe, was borrowed from the functional paradigm. The idea is that you have a set of logic or an algorithm which won't be executed/evaluated unless there is explicit intention.
In C# LINQ, you express the 'intention' by calling operators like .First(), .ToList(), .Count(), .Any(), etc. Examples of lazy LINQ operators are .Where(), .Select(), .OrderBy(), etc. These return an IEnumerable<T>.
Lazy/deferred execution shines when composing/chaining functions and when you intend to use your functions in the middle of a "pipeline". Hence the above 3 are often used in a query chain/pipe. Pertaining to collections, lazy evaluation passes only 1 item at a time to each node/operator in the chain/pipe. With eager evaluation, the whole collection is evaluated and passed down. If there were conditions for 'early breaks', the latter wouldn't benefit, as the collection has been prematurely evaluated.
E.g. a lazy pipe/chain:
products
    .Where(p => p.InStock())       // each product 'in stock' will flow down...
    .Where(p => p.Price < 3.14)    // ...but only 1 at a time and not the full list, because 'Where' is lazy
    .Select(p => p.ToShippable())  // concatenated lazy chains act and behave as one (Select is also lazy)
I often combine multiple individual lazy operators to solve complex problems with very little concern for a performance penalty. Shifting the order of the operators around is also quite easy, as they are somewhat standalone.
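The "one item at a time" behavior described in this comment can be made visible by logging from inside the lambdas. This is just a sketch: the integer `prices` array and the `log` list are assumptions added for the demonstration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();
int[] prices = { 1, 5, 2 };

// Building the pipeline executes neither lambda.
IEnumerable<string> pipeline = prices
    .Where(p => { log.Add($"filter {p}"); return p < 3; })
    .Select(p => { log.Add($"map {p}"); return $"ship {p}"; });

int entriesBeforeEnumeration = log.Count;

// Enumerating pulls each element through the whole chain before the next one starts.
List<string> shipped = pipeline.ToList();

Console.WriteLine(entriesBeforeEnumeration);
Console.WriteLine(string.Join(" | ", log));
```

The log interleaves filter and map entries per element rather than showing all filters first, which is what makes early exits such as `First()` or `Take(n)` cheap.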
@blazjerebic8097 1 year ago
Great video. Thank you for the information.
@Kazyek 9 months ago
Also, for this specific example, or any time you want to return some kind of finite collection which you want to be able to enumerate and get a count from, you can return an ICollection. The advantage of returning an interface type is being able to change the underlying collection if needed without impacting the usage of the method. For example, if you eventually have work to do on each entity and want to parallelize it, then you might want to use a ConcurrentBag instead of a List, but both would satisfy the ICollection signature, so no refactoring is needed for consumers of the original function.
@levkirichuk 7 months ago
Great and very important point Nick
@KingOfBlades27 1 year ago
This multiple enumerations warning came up for me as well in ReSharper, which is the sole reason I am aware of this behavior. A really good thing to teach people.
@flybyw 1 year ago
When you switch to .Select(), the file is only read once while each line is selected twice. You could then just append .ToList() to the .Select() to return a list of Customers without splitting each line twice.
@paulovictordesouza1720 1 year ago
Oh boy, this one hit me hard. Some time ago I had to import a 15 billion line csv into a database, and IEnumerable gave me the impression that it would help me, but it actually didn't because of the multiple iterations. The only solution that occurred to me at the time was to slowly add some values to a list and then import, otherwise it would throw a memory exception. Without this approach, the entire process would take almost 3 hours to complete. After some modifications and making it more "listy", it ended up taking just a few minutes.
@josephizang6187 1 year ago
I didn't understand this problem this way. I am usually conscious of it when using EF, but when just working with IEnumerables I tend to get sloppy with how I handle this. THANK YOU Nick🙃
@xavier.xiques 1 year ago
Very useful video Nick, thanks again
@FunWithBits 1 year ago
Yay, I was able to find the issue before Nick pointed it out. =) Though I only looked for it because Nick said there was a problem. I probably would not have caught it in real life.
@dolaudz3285 1 year ago
Just came across this warning a few days ago for the first time in Rider. At first glance, it might seem like something insignificant, but in my case this saved a few seconds of execution (in scale) for some flow.
@figloalds 1 month ago
I heavily use co-routines in my applications, especially for large operations. I can read 28k lines from the database, turn them into objects, serialize them to JSON and send them over to clients without loading 28k things into memory, then making 28k objects, then making a 28k-item JSON array. It saves a lot of RAM and avoids high-gen GC.
@the_wilferine 1 year ago
Awesome video as always! It’s worth noting however that the implementation of GetCustomers using Select behaves subtly differently to yield return. The call to GetCustomers itself is deferred until enumeration when using yield return whereas it’s called only once when using the Select, when it is assigned to the customers variable. Still absolutely a performance issue as the iteration over the lines still happens twice but the file is only loaded into memory once in the Select example.
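The subtle difference described in this comment can be checked with a counter. In the sketch below, `LoadLines` is a hypothetical stand-in for the file read, and `WithYield`/`WithSelect` are illustrative names for the two styles of `GetCustomers`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int loads = 0;

// Stand-in for File.ReadAllLines (assumed for illustration); counts how often it runs.
string[] LoadLines()
{
    loads++;
    return new[] { "Nick,Chapsas", "Ann,Lee" };
}

IEnumerable<string> WithYield()
{
    foreach (string line in LoadLines()) // the whole body re-runs on every enumeration
        yield return line.Split(',')[0];
}

IEnumerable<string> WithSelect() =>
    LoadLines().Select(line => line.Split(',')[0]); // LoadLines runs once, at call time

IEnumerable<string> a = WithYield();
int loadsAfterYieldCall = loads; // nothing has been loaded yet
_ = a.Count();
_ = a.ToList();
int loadsAfterYield = loads;     // loaded once per enumeration

loads = 0;
IEnumerable<string> b = WithSelect();
int loadsAfterSelectCall = loads; // already loaded by the call itself
_ = b.Count();
_ = b.ToList();
int loadsAfterSelect = loads;     // unchanged; only the Split projection may repeat

Console.WriteLine($"{loadsAfterYieldCall} {loadsAfterYield} {loadsAfterSelectCall} {loadsAfterSelect}");
```

Either way the projection work can still repeat per enumeration, so materializing with `ToList()` remains the fix when the result is consumed more than once.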
@alexanderkvenvolden4067 1 year ago
I wonder if it's worth writing my own wrapper IEnumerable implementation that takes in an IEnumerable and caches the values as it enumerates. Then add an extension method to make it Linqy, like "CacheItems", now I'd have a drop-in way to make any IEnumerable safe to multiple enumerate, while retaining the performance of not needing to convert to a list right away.
@Victor_Marius 1 year ago
And caching would mean saving to a list? Well, that is the same as calling ToList() when you are creating the IEnumerable, but more complicated 😅. What you could do is create an enumerable type that stores and updates the size in a single private int member while enumerating the first time, plus a count method that returns that size. But this would return a size of 0 for enumerables that have not been enumerated yet. Or just use an out argument (in this video it was reading the lines from a text file, so just set the number of lines into that out argument or an external variable). But I would avoid enumerating more than once, or even using count, for types that are not supposed to have a known size before enumerating.
@alexanderkvenvolden4067 1 year ago
@Victor_Marius Those are good points. It would end up doing the same thing as ToList. However, it would preserve most of the benefits of using an enumerable over a collection type. You wouldn't need to fully enumerate the type before using it (like you would with ToList). This would improve performance for expensive enumerations, as well as in the case of a partially complete enumeration.
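A caching wrapper along the lines discussed in this thread might be sketched as follows. `CacheItems` borrows the name suggested above, but the implementation is an assumption written for illustration, not anyone's actual library code. It is not thread-safe, and it never refreshes if the source changes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: pulls from the source at most once, replays cached items afterwards.
IEnumerable<T> CacheItems<T>(IEnumerable<T> source)
{
    var cache = new List<T>();
    var enumerator = new Lazy<IEnumerator<T>>(source.GetEnumerator);
    bool done = false;

    IEnumerable<T> Iterate()
    {
        for (int i = 0; ; i++)
        {
            if (i >= cache.Count)
            {
                // Past the cached prefix: pull one more item from the shared source enumerator.
                if (done) yield break;
                if (!enumerator.Value.MoveNext())
                {
                    done = true;
                    enumerator.Value.Dispose();
                    yield break;
                }
                cache.Add(enumerator.Value.Current);
            }
            yield return cache[i];
        }
    }

    return Iterate();
}

int produced = 0;
IEnumerable<int> Expensive()
{
    for (int i = 1; i <= 5; i++) { produced++; yield return i; }
}

IEnumerable<int> cached = CacheItems(Expensive());
List<int> firstTwo = cached.Take(2).ToList(); // pulls only what was requested
int producedAfterPartial = produced;
List<int> everything = cached.ToList();       // replays the cached prefix, pulls the rest
List<int> again = cached.ToList();            // served entirely from the cache

Console.WriteLine($"{producedAfterPartial} {produced} {everything.Count} {again.Count}");
```

Unlike an up-front `ToList()`, a partial enumeration only pays for the items actually consumed, which is the benefit argued for in the last reply.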
@masonwheeler6536 1 year ago
8:45: "Know that the warning might be there but there might not be multiple enumeration in every single one of those occasions." Enumeration is the process of _going over the elements of the enumerable,_ not of creating it. When you have a List, LINQ's Count() can call the Count property directly and not have to enumerate the list to count it. But if you had multiple foreach loops or LINQ queries, that would indeed be multiple enumeration even if it's of a List or an array. As the video says, the warning is confusing. The problem isn't multiple enumeration; it's multiple _generation,_ which can be a problem for more reasons than just the performance hit. If the generation of the enumerable is non-deterministic for whatever reason, (maybe you have a call to a random number generator in there, or you're querying a database twice and someone else INSERTs something into it in between your two calls,) you can end up enumerating two different sequences of values when you intuitively thought you'd be enumerating the same values twice, which can cause bugs in your code.
@dusrdev 1 year ago
Hey Nick, I have recently switched to Rider and I am curious, which theme are you using?
@Spartan322 1 year ago
Makes me think it would be nice to have an enumerable type that constructs the IEnumerable once, with minimal overhead, when the function is called. While using ToList is clear, it would be nice to designate this from the function definition without having to produce a list or other container explicitly. Lazy loading and enumerable reconstruction are deceptive when you're used to how containers work, especially when this behavior is built into the language.
@HazeTupac 1 year ago
Thank you for the tip, quite interesting. One question: do your courses come with a certificate on completion?
@Marfig 1 year ago
The general advice for any caller of iterators that return IEnumerable, is to not mix cursor calls like ForEach, with aggregate functions like Count if both results are in scope of each other, unless the first call casts the result to a collection or array and the second call uses that cast result instead. Not doing that is not just a matter of performance; that's even potentially the least of our worries. The problem is instead that most likely we just introduced a potentially hard-to-find bug if the source data can be changed by a third party between both calls. But if both calls are not related and they are out of scope of each other, do not cast. That's a potentially expensive operation in itself.
@ChronoWrinkle 1 year ago
It never occurred to me that a method returning an enumerable behaves the same way, but indeed it does. Nice stuff, ty!
@abdellatifnafil 1 year ago
thanks man u r the best!
@chazshrawder8151 1 year ago
I learned this the hard way playing with Entity Framework when it first came out years and years ago. It was not a fun or quick learning experience! Unlike this video, which was both fun and quick 👍
@EverRusting 1 year ago
I love that JetBrains catches possible multiple enumerations, BUT OH MY GOD: if you don't enumerate the same IReadOnlyList multiple times, it will NAG you endlessly to change the parameter to IEnumerable, which is annoying because your parameter already conforms to IReadOnlyList. Then it will nag you again to change it back when you enumerate one more time.
@ltklaus5591 1 year ago
I've switched to using IReadOnlyCollection or IReadOnlyList in most cases. The only time I use IEnumerable is when I don't need/want all items to be in memory at the same time, or if there could be a reason to only enumerate some of the items. If I had a CSV with 1,000,000 customer names and I wanted to know how many Nicks there are, I could read the file line by line, check if the name is Nick, increment a count, and move to the next line without storing all the names. Or if I wanted to get the addresses of the first 5 Nicks in the file, I could enumerate until I find the 5 Nicks, and then stop enumerating.
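Both scenarios from this comment can be sketched with a lazy source. `ReadNames` is a hypothetical stand-in for streaming the CSV (every third entry is "Nick" purely for the demo), and `linesRead` is added to show how far the enumeration actually went.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int linesRead = 0;

// Stand-in for streaming a large CSV line by line (assumed for illustration).
IEnumerable<string> ReadNames()
{
    for (int i = 0; i < 1_000_000; i++)
    {
        linesRead++;
        yield return i % 3 == 0 ? "Nick" : "Other";
    }
}

// Stops pulling from the source as soon as five matches have been found.
List<string> firstFive = ReadNames().Where(n => n == "Nick").Take(5).ToList();
int readForFirstFive = linesRead;

// Streams through every line but stores none of them.
linesRead = 0;
int nickCount = ReadNames().Count(n => n == "Nick");

Console.WriteLine($"{firstFive.Count} found after {readForFirstFive} lines; {nickCount} Nicks in {linesRead} lines");
```

With an eagerly materialized list, the first query would have paid for all one million entries before filtering.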
@aaron4th2001 5 months ago
I questioned the system when I used a Where clause on a list, and then when I modified a value in that list, the item suddenly got added/removed based on the Where clause, even though I never updated the IEnumerable collection to reflect the changes. After some trial and error I debugged it and wrote out a before and after count, and my value change was somehow magically reflected in the collection. I had an inkling of how this enumeration worked, and after watching this video I've now discovered that Count and iterating through it re-execute the Where clause every time.
@knowledgeforfun838 1 year ago
I had to learn it the hard way when checking for a performance issue in production.
@crifox16 1 year ago
So basically, yield returning an IEnumerable is great when you work with transient data that doesn't get reused? Just checking that I understood it right.
@keithrobertson7579 1 year ago
I would worry about using ToList() in the general case where the result set can be large. Imagine you're connected to a database querying millions of records. Unless you restrict the query so that you KNOW the data set will be small-ish, you shouldn't use ToList(). Also, the example here uses Count. This is an issue when applied to a custom iterator which LINQ can't incorporate; but if you're just using Where, etc. on expressions which LINQ can process into the query, it should become a SELECT COUNT(*) query, which doesn't walk all the records. Worth mentioning that Count+Walk is not necessarily bad on its own; the issue is that it's being applied on a custom iterator. One should always step through code like this in the debugger to make sure it's working as expected, and take into account the possibility of a large data set.
@co2boi 1 year ago
Good stuff. Curious, at the end of the video you said "I probably wouldn't return IEnumerable, I would probably return the Type". Can you expand on that? Also, in your example you are getting a count. Wouldn't it be better to use ICollection instead?
@borisw1166 1 year ago
I think what he means is an API design question: what is your intent for how the API should be used? IEnumerables look like you can filter them and defer execution of the code behind them, while arrays or read-only lists express that there is no point in filtering, because the "heavy" code is always executed whether you filter or not. In his example filtering makes no "sense", since the file is read line by line and the objects are created and returned anyway. But IEnumerable lets you think that you could filter on it and that it would actually make a difference. Having an array or read-only list makes it very clear: the file is read anyway.
@siposz 1 year ago
If a function actually returns a List, then its return type should be List, not IEnumerable. That way the caller knows exactly what they get back and can consume it in the optimal way. If you see an IEnumerable return type, you don't know what happens when you call Count() on it. Anything could happen, for example a database call lasting 5 seconds, or it could throw a FileNotFoundException. A List is easier to deal with: if it's not null, Count() will be fine.
@robwalker4653 2 months ago
If you are creating a list internally, why not just return a list instead of IEnumerable? What are the advantages of returning IEnumerable?
@guiorgy 1 year ago
I knew of this, though don't remember since when or why. Maybe when I was trying to work with a database once in the past 🤔
@kawthooleidevelopers 1 year ago
Hi Nick, I just finished your minimal API course. What is the most optimal way to connect to and work with Cosmos DB? Is Dapper going to work with Cosmos DB? A lot of sample code uses DbContext, and the Microsoft documentation just shows how to call the database and container directly. Is that the best way to work with it? Maybe you've done a video on it and I just can't find it. Appreciate you sharing with us.
@nickchapsas 1 year ago
I would simply use the Cosmos DB SDK directly. It's pretty good.
@kawthooleidevelopers 1 year ago
@@nickchapsas thank you, brother. I will give that a go. Appreciate your help.
@za290 1 year ago
Thanks for this video. I don't use IEnumerable, and after this video I still won't :) but now I know why.
@user-tk2jy8xr8b 1 year ago
Strangely there's no OOB class or ext method to do this better. ToList would create a list that reallocates as it grows. What would be cool is a linked list of exponentially growing blocks - takes O(n) memory and time to build, but more efficient than just a linked list. And a common rule is: "iterate multiple times - use IReadOnlyCollection"
@tussle2k 1 year ago
Yay, new video 👍
@CeleChaudary 1 year ago
Thanks 👍
@phizc 1 year ago
I've worked on IEnumerable/IEnumerator classes the last couple of days for getting the path to hard links (NTFS) of a specific file. It works, but dang is it annoying. I tried to only make an IEnumerator, but that's not good enough for "foreach". It really wants to call that shiny GetEnumerator method on IEnumerable. After watching the video, I've decided to just get all the links at once and return an array. The NT kernel methods are set up as enumerating (GetFirst/GetNext), but there's only ever going to be less than 1024 links (hard coded in Windows), and realistically, less than 10. Also, for my purposes, and probably everyone else's, it doesn't make sense to just get the first few, or not turn it to a List/array anyway. There's even a winapi method to get the count, before trying to enumerate them.
@lpmynhardt 1 year ago
Hi Nick, love your channel. Could you explain why the following is 20x slower on my PC? I have seen it mentioned here and there but never seen a good explanation.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

var list = Enumerable.Range(0, 100_000_000).ToArray();
IEnumerable<int> enumerable = list;

// Enumerate through the IEnumerable<int> interface.
Stopwatch sw = Stopwatch.StartNew();
foreach (var i in enumerable) { var d = i + 1; }
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

// Enumerate the array directly.
sw.Restart();
foreach (var i in list) { var d = i + 1; }
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);

The IEnumerable is much slower than enumerating the array. If I change the order around so the array is enumerated first and the IEnumerable second, the result is the same (the array is still much faster).
@ryan-heath 1 year ago
The foreach over an IEnumerable is implemented using the iterator interface (the MoveNext and Current members). The foreach over an array is implemented like a for loop, with no method calls involved.
@lpmynhardt 1 year ago
@@ryan-heath Thanks, that makes sense
@phizc 1 year ago
@@ryan-heath except the compiler, or at least the JITc would know that it is a list, just presented as an IEnumerable. For the JIT to optimize it that way it might have to upgrade to Tier 1 compilation though, and it'll only do that if it's explicitly told to, or encountered the method 30+ times.
@Moosa_Says 1 year ago
Hey nick, shouldn't we just use List as return type for collections every time? and IEnumerable only in cases where we are sure that we need it ? or there are disadvantages of using List everywhere? would love to hear your thoughts. Thanks :)
@phizc 1 year ago
The only problem would be if you don't want the user to change the items in the list (use IReadOnlyList or IReadOnlyCollection then), or for interfaces or abstract implementations, though in that case I think you can still return the List, even if the interface says IEnumerable. Basically, the only place I would ever have an IEnumerable return or out parameter is in an interface that might need it to be that way. Of course, if you *are* enumerating something and it doesn't make sense to return a list, do return an IEnumerable. E.g. an "infinite" list. Example: infinite Fibonacci sequence

IEnumerable<long> Fibonacci()
{
    long prev = 0;
    long curr = 1;
    while (true)
    {
        yield return curr;
        // Advance the pair (prev, curr) to (curr, prev + curr).
        var p = prev;
        prev = curr;
        curr = p + curr;
    }
}

Of course it's not "infinite". Since it grows by a factor of 1.618, with long it'll overflow in less than 100 steps. Consider BigInteger for a more painful experience 😁. Or enumerating every integer.
@Moosa_Says 1 year ago
@@phizc Thank you for sharing your opinion. So, I think I'm saying right that use IEnumerable only in particular cases while Lists actually have more use cases when considering real application case scenarios. I asked this question cuz I've seen a lot of the people always returning IEnumerable and then doing .ToList(); to use it. maybe they do it to maintain some level of abstraction ..?!!
@jackoberto01 1 year ago
I think it's up to the person using the code. Like Nick mentioned you might want to add in a Where clause or other instructions before enumerating. Deferred execution is a good feature of C# if you know how to use it. You can also avoid iterating the whole collection in case you use methods like First, Any or similar methods. In any case where you only need one iteration a IEnumerable works fine, if you need multiple iterations you can use ToList or ToArray first so for me an IEnumerable is best as it's flexible
@Moosa_Says 1 year ago
@@jackoberto01 Yeah i think it depends on the case...but again i don't think you'll be able to use IEnumerable more than List as List cases are more in my experience.
@TheMonk72 1 year ago
@@Moosa_Says I deal with files that are large enough that they just don't fit in memory often enough that it's not worth writing different code just for that case. But it doesn't matter if the file fits in memory, if I don't need to access the data by index and can process it sequentially, I have no reason to load it all at once.
@rpp1502 1 year ago
Great explanation, We need Zero to Hero: Microservices in C#
@carducci000 1 year ago
I do actually know of this, and typically do take this into account; I'd be lying if I said I catch myself [or others] every single time :). It's one of those things you miss if you're working fast
@quantum_net219 1 year ago
This video made me subscribe 😁
@wiktormaek9973 1 year ago
Very similar topic to IQueryable and materializing a query too soon. The first time you load a whole big table, you'll learn. I've seen it in production with user profiles: it worked great until we got a lot of Asian customers, who can have some really common and short first/last names. The search-by-phrase query was effectively loading everything, because it was mapped in such a way that materialization happened at some point and only then was Take/Skip applied.
@phizc 1 year ago
Similar. Common pitfall is to define the variable as an IEnumerable instead of using _var_. Been watching too much NDC videos lately 😄
@vertxxyz 1 year ago
I feel like you should also show the rider-specific "why are you showing me this warning" link they usually build into the alt-enter menu for warnings like these (when they exist)
@nickchapsas 1 year ago
This is such a good idea for a video or a short actually
@TkrZ 1 year ago
loving the Barking joke at the start 😂
@nickchapsas 1 year ago
At least one person got it 🥲
@CrapE_DM 1 year ago
Interesting. In the languages I work with, doing something like this simply fails because the iterable can only be iterated over once, so you'll find out quickly that you need to cache the results to use them twice.
@cyril113 1 year ago
@Adam M Java also
@Crozz22 1 year ago
This happens in C# for `IEnumerator`. However `IEnumerable` is really just a factory of `IEnumerator`s
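A small sketch of the "factory" point, assuming a plain `List<int>` as the source (the variable names are made up for illustration). A single `IEnumerator` is single-pass, but each call to `GetEnumerator()` on the `IEnumerable` hands out a fresh one, which is why a second `foreach` starts over instead of failing.

```csharp
using System;
using System.Collections.Generic;

IEnumerable<int> numbers = new List<int> { 1, 2, 3 };

// First enumerator: walk it to the end.
using IEnumerator<int> first = numbers.GetEnumerator();
int firstPass = 0;
while (first.MoveNext()) firstPass++;

// An exhausted enumerator stays exhausted.
bool movedAgain = first.MoveNext();

// Asking the IEnumerable again produces a new, independent enumerator.
using IEnumerator<int> second = numbers.GetEnumerator();
int secondPass = 0;
while (second.MoveNext()) secondPass++;

Console.WriteLine($"{firstPass} {movedAgain} {secondPass}"); // 3 False 3
```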
@MofoMan2000 1 year ago
This misunderstanding comes from the fact that enumerating with "yield return" statements effectively turns the method into a coroutine. Once the end of the method is hit, it is considered enumerated and local variables are deallocated. Then you start enumerating it again and it has to redo everything.
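A rough illustration of the re-execution described above, assuming a hypothetical `ReadNames` iterator and a `bodyRuns` counter that exists only to make the effect visible. Calling the method runs none of its body; each full enumeration (here `Count()` and then `foreach`) runs it from the top again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int bodyRuns = 0;

// Hypothetical iterator: the counter records each time the body starts running.
IEnumerable<string> ReadNames()
{
    bodyRuns++;
    yield return "Nick";
    yield return "Chaz";
}

IEnumerable<string> names = ReadNames();
int runsAfterCall = bodyRuns;      // still 0: nothing has executed yet

int count = names.Count();         // first enumeration
foreach (var name in names) { }    // second enumeration, body runs again

Console.WriteLine($"{runsAfterCall} {count} {bodyRuns}"); // 0 2 2
```

In a real method the body might read a file or query a database, so each extra enumeration repeats that work.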
@doneckreddahl 1 year ago
Can anybody tell me what Nick is using to show stuff like "x: "Nick Chapsas, 29"" and "splitline: string[2]" when he debugs? It seems to show the count as he debugs as well.
@nickchapsas 1 year ago
It’s just part of the Rider debugger
@lifeisgameplayit 1 year ago
I haven't watched the vid yet. Nick "Epic" Chapsas is a legend.
@goremukin1 1 year ago
I know about this feature and always take it into account. But I know too many developers who don't know about it. It's easier to list those who do. Most often I see the multiple enumeration warning on projects where people use Visual Studio. I think it's partly Microsoft's fault that they still don't warn people about possible multiple enumeration, so people don't care.
@teckyify 7 months ago
The yield in the example does not have the same behavior as return list.Select()
@ILICH1980 1 year ago
good to know, did not know before
@akumaquik 1 year ago
I've known this for a while and I always thought it was a problem with .Count(). I have creatively coded around .Count() in many projects.
@EPK_AI_DUBS 1 year ago
What happens if I want to return an empty list? You previously said it was better to use Enumerable.Empty, but I cannot do that if the method returns directly a List, right?
@phizc 1 year ago
You can return an IList, ICollection, or a read-only version of those interfaces. Then you can return Array.Empty<T>(). That one also doesn't allocate, or at least, only once.
@billy65bob 1 year ago
If you must return a List, a count of 0 will wrap an Array.Empty; Not great, but the overhead isn't too bad. Using IReadOnlyCollection or similar so you can use Array.Empty directly is preferable though.
@MirrorBoySkr 1 year ago
Which is better to use in such cases: ToList() or ToArray()?
@nickchapsas 1 year ago
It depends on what you wanna do with the result
@MirrorBoySkr 1 year ago
@@nickchapsas I just want to enumerate. So it seems to me that ToArray() is more suitable. But I see that most people around me use ToList().
@sergiuszzalewski1947 1 year ago
The rule is simple: return precise types, and accept abstracted types. If you return a List, then your method's return type should be List, not IEnumerable. That way consumers know exactly what the actual type is, and if they want to, they can narrow it to an interface implicitly.
@andreikhotko5206 1 year ago
That's right, I follow the same approach. Just one note: there are also interfaces like IReadOnlyCollection, IReadOnlyList, IList, which I prefer to use for returning type.
@qm3ster 1 year ago
Why does no one have this problem with `Iterator`/`IntoIterator` in Rust? 🤔
@LCTesla 10 months ago
me putting .ToList() everywhere to prevent enumerables from behaving unpredictably: there. now I should be safe. Garbage collector: **whining in agony**
@TheOmokage 1 year ago
Is it called lazy initialization? Can I keep this feature when I create my own class that implements IEnumerable?
@phizc 1 year ago
If you do it in the GetEnumerator method, yes. The IEnumerable interface only has that method IIRC, and it gets called by foreach, and the LINQ methods (eventually).
@TheOmokage 1 year ago
@@phizc what exactly must I do in the GetEnumerator method to get this feature?
@rvladkent 1 year ago
I usually use IReadOnlyCollection or IReadOnlyList, unless I really need lazy evaluation.
@EvaldasNaujikas 1 year ago
Great video, but I think it is also important to mention that calling ToList() should be done only if underlying implementation is not enumerated. For example, in your example when your IEnumerable was returning List (instead of yield), a call to ToList() would copy the same list, which increases memory usage. And for new developers, they could start thinking after the video that ToList() should always be done if they are using a method that returns IEnumerable.
@nickchapsas 1 year ago
There are checks in ToList to prevent the extra allocation so you won’t increase the memory
@EvaldasNaujikas 1 year ago
@@nickchapsas but why then rider shows additional allocation of System.Int32[] and a new array in memory? And that additional +1 only happens AFTER ToList(). See the image here: snipboard.io/XkNj4A.jpg
@EvaldasNaujikas 1 year ago
And it even does the same if I use List as return type for GetNumbers. After ToList - a new array is allocated in memory.
@EvaldasNaujikas 1 year ago
And just for fun, I added four ToList calls one after another. dotMemory still sees the allocation snipboard.io/e2NrTl.jpg
@stempy100 1 year ago
@@nickchapsas incorrect. .ToList() will create a new list.
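A short sketch that checks the point debated in this thread, assuming a hypothetical `GetNumbers` method whose declared type is `IEnumerable<int>` but whose actual object is a `List<int>`. `ToList()` returns a different list instance even then; the `as` cast at the end is one possible way to skip the copy, not something shown in the video.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical method: declared as IEnumerable<int>, backed by a List<int>.
IEnumerable<int> GetNumbers() => new List<int> { 1, 2, 3 };

IEnumerable<int> numbers = GetNumbers();

// ToList() builds a new list and copies the elements into it.
List<int> copy = numbers.ToList();
bool sameInstance = ReferenceEquals(numbers, copy);

// Changing the copy leaves the original untouched.
copy.Add(4);
int originalCount = numbers.Count();

// Reuse the existing list when there is one, copy only otherwise.
List<int> reused = numbers as List<int> ?? numbers.ToList();
bool reusedSame = ReferenceEquals(numbers, reused);

Console.WriteLine($"{sameInstance} {originalCount} {reusedSame}"); // False 3 True
```

The cast avoids the allocation but also hands the caller the producer's own list, so any mutation is shared. Whether that trade-off is acceptable depends on who owns the list.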