Accessing values from data frames, data tables, tibbles, matrices, and vectors (CC278)

957 views

Riffomonas Project

days ago

Comments: 18
@eric13hill 6 months ago
Pat, I wanted to thank you for your selfless service that you do to help others. I really have benefitted from your kindness.
@Riffomonas 6 months ago
My pleasure and many thanks for your generous comment!
@PhilippusCesena 6 months ago
This is why we love your videos: to learn different approaches and get into the right mode and mindset. I had the pleasure of watching a lot of your videos and used R for about two years, which is not very long. Having changed jobs, I use it much less today. However, simple things take me too long in Excel, so in the end I import the data into R and continue as I always have. Thanks to your videos I don't lose too much dexterity, which unfortunately is quickly lost by standing still.
@Riffomonas 6 months ago
Thank you so much!🤓
@spacelem 6 months ago
Fascinating! I saw the note about how the weirdness goes away with bigger N, but I was surprised by how bad the results were there. All I can think is that there's a huge overhead for the actual getting of indices, relative to using the indices to extract the data. I don't care how much more performant "x == n1 | x == n2 | ..." is, I'm not giving up "x %in% c(n1, n2, ...)"!
@Riffomonas 6 months ago
Yeah, remember it's all about context and application. I use %in% all the time for analyses where speed doesn't matter. 99% of the time it takes longer to save a TIFF than to filter rows from a data frame 🤓
@djangoworldwide7925 6 months ago
It might be interesting to try the single-vector function with map or a for loop and run through the desired kmers. You might find that iterating over a single vector read, with parallelization, is the most performant.
@Riffomonas 6 months ago
I tried map/sapply in an earlier episode to build a vector; it was pretty slow relative to other options.
@sounkoumahamanetoure4607 6 months ago
What is the effect of the JIT on these comparisons?
@Riffomonas 6 months ago
Not sure what you mean by JIT?
@AKBARESFAHANI 6 months ago
Why not use Arrow to read data out of memory?
@Riffomonas 6 months ago
I haven't tried arrow, but in the next episode (Thursday, 2024-05-02) I'll try duckdb with duckplyr - it's pretty slick
@Riffomonas 6 months ago
Just tried arrow - it's about 3x slower than duckdb with the filter function on a table with 1e7 rows and 3 columns. Check back on Thursday afternoon and I'll post the updated timings with arrow included. Thanks for asking!
@AKBARESFAHANI 6 months ago
@Riffomonas Try saving your data out as Parquet using partitions for better performance.
@AKBARESFAHANI 6 months ago
@Riffomonas And I really enjoy your videos.