0:30 "Do a generator expression first and then turn it into a tuple": Don't! It's quite slow. Use a list comprehension and turn that into a tuple instead. A list comprehension turned into a tuple takes about half the time of a generator expression turned into a tuple. Oddly so.
@lilDaveist 2 years ago
@christopherbennett1631 But you tested a generator expression against a list comprehension, no? What I mean is x = tuple(i for i in range(1_000_000) if i%2 == 0) vs. x = tuple([i for i in range(1_000_000) if i%2 == 0]). Quite a difference. If you need a tuple (because it has to be hashable) or have hundreds of thousands of them (coordinates, RGB values, etc.), it just makes sense to build the tuple from a list comprehension instead of passing a generator expression to tuple().
@lilDaveist 2 years ago
@christopherbennett1631 My machine:

%timeit tuple([i for i in range(1_000_000) if i%2==0])
71.7 ms ± 331 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit tuple(i for i in range(1_000_000) if i%2==0)
78.1 ms ± 1.28 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit *(i for i in range(1_000_000) if i%2==0),
81.6 ms ± 905 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit tuple((i for i in range(1_000_000) if i%2==0))
78.1 ms ± 820 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

It's consistently 10+% faster. Every time.
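For anyone without IPython who wants to check this, here is a rough sketch of the same comparison using the standard timeit module. The function names and the smaller N are my own choices, not from the comment above, and the numbers will vary by machine and Python version.

```python
import timeit

# Assumption: N is smaller than the 1_000_000 used in the thread, only to keep the run short.
N = 100_000


def via_list():
    # Build a list first, then copy it into a tuple.
    return tuple([i for i in range(N) if i % 2 == 0])


def via_generator():
    # Feed a generator expression straight into tuple().
    return tuple(i for i in range(N) if i % 2 == 0)


def via_unpacking():
    # Star-unpack the generator inside a tuple display.
    return (*(i for i in range(N) if i % 2 == 0),)


def main():
    calls = 10
    for fn in (via_list, via_generator, via_unpacking):
        # Take the best of several repeats to reduce noise from other processes.
        best = min(timeit.repeat(fn, number=calls, repeat=5)) / calls
        print(f"{fn.__name__:<14} {best * 1e3:8.2f} ms per call")


if __name__ == "__main__":
    main()
```

All three variants produce the same tuple, so any difference in the printed times comes only from how the tuple is built.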
@lilDaveist 2 years ago
@christopherbennett1631 My program saved about 30 seconds of processing time from it (about 20 million operations on the order of [i**i for i in range(1000)]).
@lilDaveist 2 years ago
@christopherbennett1631 It wasn't a large data set per se. We had to visualize routes from a GTFS data set and keep track of coordinates, stops and trips. To avoid drawing the same routes/trips again, we used a route stack, so the entries needed to be hashable. For the final step (actually drawing the routes) we needed to pair up the stops: ABCD becomes AB BC CD. That's what we used the list comprehension turned into a tuple for. Public transit with buses, underground etc. had about 20 million unique trips. What Python version are you on? The timeit measurements were pretty much the same (10%+) for everyone involved. Edit: We used Pandas in about 80% of the project.
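A minimal sketch of what that pairing step might look like. The names pair_stops and should_draw are hypothetical, and a plain set stands in for the "route stack"; the actual project code isn't shown in this thread.

```python
def pair_stops(stops):
    # ABCD -> (A, B), (B, C), (C, D), built as a list comprehension
    # and then converted to a tuple so the result is hashable.
    return tuple([(stops[i], stops[i + 1]) for i in range(len(stops) - 1)])


def should_draw(stops, seen):
    # 'seen' is an assumed set of already drawn routes (stand-in for the route stack).
    key = pair_stops(stops)
    if key in seen:
        return False
    seen.add(key)
    return True


if __name__ == "__main__":
    drawn = set()
    print(pair_stops(["A", "B", "C", "D"]))
    print(should_draw(["A", "B", "C", "D"], drawn))  # first time: draw it
    print(should_draw(["A", "B", "C", "D"], drawn))  # second time: skip it
```

A list of pairs could not go into the set, which is why the tuple conversion is needed at all.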
@Carberra 2 years ago
Tip: for the pairing up of the stops, you can use itertools.pairwise.
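A short sketch of that tip, assuming the stops arrive as any iterable. itertools.pairwise was added in Python 3.10, so the fallback below (based on itertools.tee) is only there for older versions; pair_stops_pairwise is a made-up name for illustration.

```python
try:
    from itertools import pairwise  # available from Python 3.10
except ImportError:
    from itertools import tee

    def pairwise(iterable):
        # Fallback for older Pythons: two iterators, the second advanced by one.
        a, b = tee(iterable)
        next(b, None)
        return zip(a, b)


def pair_stops_pairwise(stops):
    # ABCD -> (A, B), (B, C), (C, D); tuple() makes the result hashable.
    return tuple(pairwise(stops))


if __name__ == "__main__":
    print(pair_stops_pairwise("ABCD"))
```

This gives the same pairs as indexing by hand, without needing len() or indexing on the input.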