R package reviews | dlookr | diagnose, explore and repair your data quick!

  8,659 views

yuzaR Data Science

A day ago

In this video, we'll learn how to quickly diagnose, explore and fix problems in your data.
We'll take a deep look at missing values and outliers and will be able to impute them with fancy machine learning techniques. And of course, we'll do much more.
If you only want the code (or want to support me), consider joining the channel (join button below any of the videos), because I provide the code upon members' request.
Music by Vincent Rubinetti
Download the music on Bandcamp:
vincerubinetti...
Stream the music on Spotify:
open.spotify.c...
Enjoy! 🥳

Comments: 94
@wayarberry 3 years ago
Outstanding. Tremendous functionality accessible with relatively few, simple commands. Top notch graphics as well. Keep up the good work!
@yuzaR-Data-Science 3 years ago
Many thanks for the positive feedback, William! That's why I love R: it makes life easier 😊 I'm really happy that it is useful to more people than just me.
@muhammedhadedy4570 1 year ago
Wow, what an amazing tutorial. I've been using R for 5 years, and I've never used the dlookr package before. Your explanation was simple, focused and directed to the point, just as usual. Thank you so much for your great videos. I really appreciate your work. 😊😊😊😊😊😊.
@yuzaR-Data-Science 1 year ago
Thank you very much for the nice feedback! I am glad you liked it. Since then I have produced reviews of even more useful packages; "gtsummary" is one of the best. You might like it too.
@martinvidalon 3 months ago
That's the wonderful thing about the R universe: you never know whether you'll run into a dinosaur.
@sandiprijal9521 3 years ago
Why didn't I come across your channel until now? This is phenomenal.
@yuzaR-Data-Science 3 years ago
Thanks a ton, Sandip! That means a lot! My channel is still very young and I hope I'll produce more useful content. Cheers!
@mkklindhardt 1 month ago
I love your videos and screencasts! They are very educational and still very high-level. Could you please make a video on performing meta-analysis in R using, for example, metafor or other packages? Especially covering the choices one has to make concerning types of meta-analysis: e.g. fixed-effects, random-effects or Bayesian meta-analysis of different types of response variables, modulators and outcomes. It could also be cool if you covered the considerations of multivariate and multilevel meta-analysis as well as composite outcome meta-analysis, such as the concept of "borrowing of strength" (BoS) in meta-analysis when dealing with multiple outcomes or studies with small sample sizes.
@yuzaR-Data-Science 1 month ago
Thanks for such lovely feedback! The meta-analysis videos are definitely on my list, and they will be done. But the list still has a few topics before them, like all kinds of models. So please don't expect them too soon, but stay tuned: they are important for my work and will be produced in the future!
@imanol2506 6 months ago
This one (especially this one!) is marvelous, and so is the rest of the series of explanatory videos. Congrats!
@yuzaR-Data-Science 6 months ago
Glad you like them! Since they are already a bit old, you might also find the more recent videos useful. Thanks for watching!
@haraldurkarlsson1147 3 years ago
This is great stuff!
@yuzaR-Data-Science 3 years ago
Thanks for the feedback, Haraldur! You might also like the Deeper Exploratory Analysis video. It's long, but very dense, with lots of nice functions, both from dlookr and from similar packages.
@buraktiras93 2 years ago
An overall thanks for all the videos uploaded to this channel!
@yuzaR-Data-Science 2 years ago
Thanks a lot 🙏 glad it’s useful!
@angezoclanclounon1751 3 years ago
Thanks a lot Dr. Yury. Nice and helpful video.
@yuzaR-Data-Science 3 years ago
Thanks, Ange! Glad it was helpful!
@dasrotrad 1 year ago
What a great presentation. I love this package. Thank you for introducing it and describing it in such an easy-to-follow presentation.
@yuzaR-Data-Science 1 year ago
Thanks, Robert! I also love dlookr. I have actually only done reviews of packages I enjoy and use every day. So you might find my other package reviews useful too. I think gtsummary is one of the most capable.
@bj3768 1 year ago
I really hope your channel will grow in the future. Your videos are very helpful to me.
@yuzaR-Data-Science 1 year ago
I am very glad to hear that! Thanks! And thanks for watching, it's the best support!
@syhusada1130 2 years ago
I've been using DataExplorer and dlookr more in my learning journey thanks to you, sir.
@yuzaR-Data-Science 2 years ago
Glad to hear that! More exciting packages will follow! I just do it slowly because of my day job. Thanks for your support and cheers 😊
@landoska 3 years ago
Great! I love your videos. Please cover mixed models 🙏🤓
@yuzaR-Data-Science 3 years ago
Thanks a lot, landoska! Noted! Funnily enough, I do lots of mixed models in my job as a medical statistician.
@hendrikpehlke4973 3 months ago
Wow! Thank you! So much important information. I have to watch this video several times. But one question: in which order would you use the packages "janitor" and "dlookr"? It would be interesting to teach people how to load and handle a "dirty" Excel table, fix some Excel problems (e.g. dates stored as numbers, or entries like "no data" in numerical columns) and, once those problems are fixed, use "dlookr" to diagnose, explore and repair the data.
@yuzaR-Data-Science 2 months ago
Thanks a lot for your nice comment, Hendrik! I would use janitor first and dlookr on top. I guess you have already seen the janitor video on my channel. If not, feel free ;) I also have a video on tidy data, where I show a dirty table, but there is not much R programming in it. Thanks for watching!
@wasafisafi612 1 month ago
Thank you so much for the video
@yuzaR-Data-Science 1 month ago
Glad you enjoyed it! Thanks for watching!
@haraldurkarlsson1147 3 years ago
Yury, I noticed that your code for imputing outliers in the diamonds data repeats and is thus prime for a for loop, apply or map function (this is roughly 14 minutes into the video). I did not try to get too fancy so I wrote a short for loop to iterate over the methods. The function generates the plots one after the other. I thought I might share this with you and your viewers. Here is my rather crude code: imp_na_method
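A loop of the kind described in this comment might look roughly like the sketch below. It is not the commenter's code (which is cut off above). It assumes dlookr's `imputate_outlier()` with its documented methods "capping", "mean", "median" and "mode", and `carat` is an arbitrary choice of variable, not necessarily the one used in the video.

```r
# Hypothetical sketch: iterate over dlookr's outlier-imputation methods
# and draw one before/after plot per method.
outlier_methods <- c("capping", "mean", "median", "mode")

# Only run if the packages are installed; diamonds ships with ggplot2.
if (requireNamespace("dlookr", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  diamonds <- ggplot2::diamonds
  for (m in outlier_methods) {
    # carat is an assumed example column
    imputed <- dlookr::imputate_outlier(diamonds, carat, method = m)
    # explicit print() so each plot is drawn inside the loop
    print(plot(imputed))
  }
}
```

The same pattern could be written with `lapply()` or `purrr::map()` if the plots should be collected in a list instead of drawn one after the other.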
@yuzaR-Data-Science 2 years ago
Thanks for the for loop, mate! I have to say, nowadays I mostly use missRanger, because it offers very fancy multiple imputation. I also always check the imputed values, and they have never disappointed so far.
@yuzaR-Data-Science 2 years ago
And thanks a lot for the nice feedback! I am glad the videos are useful :)
@statlab_stat.solution 1 year ago
Just awesome 👍
@yuzaR-Data-Science 1 year ago
Thanks a lot 😊 I love dlookr too! You might also like the gtsummary review, if you have not seen my video on it already. Thanks for watching!
@SadatQuayiumApu 3 years ago
Thank you very much for the videos! These are life changing indeed.
@yuzaR-Data-Science 3 years ago
You are very welcome, Sadat! :)
@fatthi2025 1 year ago
Many thanks for your help
@yuzaR-Data-Science 1 year ago
You are welcome. Thanks for watching!
@davidjackson7675 3 years ago
This is an awesome package.
@yuzaR-Data-Science 3 years ago
Thanks David! I hoped this would be helpful not only to me!
@angvl8793 2 years ago
Great video and great analysis! Thank you very much! I also like package("recipes") and package("vtreat").
@yuzaR-Data-Science 2 years ago
You are welcome. Thanks for the tips, I'll check them out
@martinvidalon 3 months ago
Excellent
@yuzaR-Data-Science 3 months ago
Thanks Martin!
@kyalosevens1268 1 year ago
Thanks for this one
@yuzaR-Data-Science 1 year ago
You are welcome!
@francismason3796 2 years ago
Fantastic video and code examples.
@yuzaR-Data-Science 2 years ago
Glad you liked it!
@haraldurkarlsson1147 3 years ago
Yury, I have duplicated your code and it basically reproduces, with a few exceptions. For instance, the correlation plot is not a matrix with ellipses but rather a colored chart with the r values. I guess as the package gets updated we will see some variations. Still good stuff - thanks. P.S. I ran my code in a normal R session with a script rather than in R Markdown.
@yuzaR-Data-Science 3 years ago
Sure, usually they get better. It's just amazing that this is open source :)
@SergioUribe 3 years ago
Great package, kudos, and keep up the work!
@yuzaR-Data-Science 3 years ago
Thanks a lot, Sergio, for such nice feedback! I will continue! Cheers!
@yaoliao3517 3 years ago
Really, really love your tutorial!!
@yuzaR-Data-Science 3 years ago
Thanks a lot Yao! Your feedback is very encouraging!
@RUJedi 5 months ago
Amazing video. This will be a great package for my EDA work. Many thanks. Is everything OK with your website in the video description? I keep getting a 404 Site Not Found error page. Same result when I try similar links in a few of your other videos.
@yuzaR-Data-Science 5 months ago
Thanks! I did a follow-up with many more packages, "deep exploratory analysis". My blog was shut down because they want me to pay for the increasing traffic. I refuse to pay for doing something good for the world with open source software. So it might take me some time to find an alternative. But that's not a problem, because the blog is actually the script for the videos, word for word. And KZbin is still free.
@RUJedi 5 months ago
@@yuzaR-Data-Science Makes total sense. Have you considered moving your blog over to GitHub Pages, which is free and should play nicely with your scripts, whether as blog posts or as code on blog pages?
@yuzaR-Data-Science 5 months ago
I actually tried to move it to GitHub Pages myself, but something was off and I could not get it online. I am not an IT guy, so I am waiting for a friend to have a look at it and maybe help me solve it. But as you can imagine, both my mate and I have normal everyday jobs and lives, so the blog is often not the priority. Anyway, thanks for your support!
@GreenManXY 2 years ago
How do you get the results in the same window as your code? And the ability to preview graphs?
@yuzaR-Data-Science 2 years ago
By using an R Markdown document instead of an R script.
@chacmool2581 1 year ago
8:19 plot_normality() Is there a way of displaying QQ plots for the log and sqrt transformations too? Can we request other, fancier transforms such as Box-Cox or Yeo-Johnson? Separate but related question: what package looks at a predictor and returns the transformation that best normalizes its distribution?
@yuzaR-Data-Science 1 year ago
Hey, sure, I'd use ggpubr::ggqqplot(log(mtcars$mpg)) and ggpubr::ggqqplot(sqrt(mtcars$mpg)).
@yuzaR-Data-Science 1 year ago
And yes, sure, any of these transformations is possible. Just type "?plot_normality" and look inside. Here is an example: mtcars %>% plot_normality(mpg, right = "Box-Cox")
@yuzaR-Data-Science 1 year ago
These are the transformations possible with plot_normality(): "log", "sqrt", "log+1", "log+a", "1/x", "x^2", "x^3", "Box-Cox", "Yeo-Johnson".
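For readers without ggpubr installed, the same QQ-plot check can be sketched in base R. This is only an illustration, not code from the video: it uses `qqnorm()` and `qqline()` from the stats package on `mtcars$mpg`, and the names `transforms` and `transformed` are made up for the example.

```r
# Base-R sketch: normal QQ plots for the raw, log- and sqrt-transformed
# mpg column of the built-in mtcars data.
transforms <- list(raw = identity, log = log, sqrt = sqrt)
x <- mtcars$mpg

# Apply each transformation to the same vector
transformed <- lapply(transforms, function(f) f(x))

# Draw the three QQ plots side by side, then restore the plot settings
op <- par(mfrow = c(1, 3))
for (name in names(transformed)) {
  qqnorm(transformed[[name]], main = paste("Normal Q-Q:", name))
  qqline(transformed[[name]])
}
par(op)
```

Points that follow the reference line more closely suggest a distribution closer to normal. Other transformations can be added to the `transforms` list in the same way.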
@yuzaR-Data-Science 1 year ago
To your last question: I am not aware of any package for that, but there might be one. However, I am not a big fan of transforming the data, because you kill interpretability. The log transform is the most harmless in my opinion. I think it's better to use a model that correctly fits your distribution.
@chacmool2581 1 year ago
@@yuzaR-Data-Science Thanks a lot. Looking forward to a vid on decision trees and random forests with R packages... 🧁
@CanDoSo_org 1 year ago
I got an error: "Error in html_paged_target_numerical(reportData, targetVariable, base_family = base_family) : object 'index' not found" when I ran airquality %>% eda_paged_report( target="Temp", output_format = "html" )
@yuzaR-Data-Science 1 year ago
I don't know what's wrong with this dataset ... maybe it's because Temp is a numeric variable. It does not work on my computer either. However, I tried the iris dataset, and it worked flawlessly: iris %>% eda_paged_report( target="Species", output_format = "html" ) If you can't figure it out yourself, report a bug on the package's GitHub page.
@CanDoSo_org 1 year ago
@@yuzaR-Data-Science Thanks, Yuza.
@yuzaR-Data-Science 1 year ago
Happy to help
@M.Nagah89 3 months ago
I have a question, please: why did we use "temp" as a predictor to impute missing values in the Ozone variable?
@yuzaR-Data-Science 3 months ago
Simply as an example of a predictor.
@M.Nagah89 3 months ago
@@yuzaR-Data-Science I'm sorry, I don't get it!
@yuzaR-Data-Science 3 months ago
Sorry, what exactly don't you get?
@M.Nagah89 3 months ago
@@yuzaR-Data-Science Do we have to supply a predictor to impute missing values in a variable?
@cesarpina4115 4 months ago
Amazing video!! Thanks a lot!!!! But the code link is broken D:
@yuzaR-Data-Science 4 months ago
Thanks for the feedback! Sorry for that, man! Netlify shut down my blog since they want me to pay for the increased traffic. I refuse to pay for doing something useful for the world (without earning anything), especially since R is open source. But I want to reopen it as soon as I find an alternative to Netlify. It'll take some time though, because I am not an IT guy. FYI: my blog is actually the script for the video, word for word, code for code. Thanks for understanding! But if you want access sooner, consider joining the channel and becoming a member. For members I provide the code immediately. Cheers!
@chacmool2581 1 year ago
Is dlookr the best package for outlier diagnostics and correction?
@yuzaR-Data-Science 1 year ago
What we call "the best" depends on a lot ... but I think it's certainly useful enough to stop looking for other packages :)
@chacmool2581 1 year ago
@@yuzaR-Data-Science What's the bestest, most prettiest decision-tree visualization package? Rpart and the like all look pretty industrial.
@yuzaR-Data-Science 1 year ago
@@chacmool2581 Don't know, because I don't use decision trees.
@kennethgottfredsen767 1 year ago
Doesn't work with Quarto, sadly.
@yuzaR-Data-Science 1 year ago
Oh, good to know! I didn’t try it with Quarto.
@RUJedi 5 months ago
What parts (still) do not work with Quarto?