2024 updated single-cell guide - Part 1: RNA preprocessing and quality control

5,939 views

Sanbomics

1 day ago

This is a comprehensive tutorial on the most up-to-date recommendations for single-cell sequencing. This is part 1 of a multi-part series. Here I download a dataset, remove background RNA, perform quality control, and remove low-quality cells.
Part 2 will cover dimension reduction and cell annotation. We will eventually get to in-depth analysis and scATAC analysis.
Notebook:
github.com/mousepixels/sanbom...
Paper/dataset:
www.cell.com/cancer-cell/full...
Reference:
www.sc-best-practices.org/pre...
0:00 Intro
0:27 Setup
12:08 Cellbender
18:20 QC
28:05 preprocessing
39:42 Conclusions

Comments: 66
@007ZK 1 month ago
Amazing series idea. I hope they keep coming.
@sanbomics 1 month ago
Hopefully next week!
@0996Winglet-mq4on 1 month ago
Really appreciate your videos 🎉❤ Cannot wait to see a spatial omics tutorial in the future 😊
@sanbomics 1 month ago
Right now I am eagerly awaiting some interesting datasets from newer, higher-resolution technology than Visium.
@lly6115 1 month ago
Good to see you back😊 and thank you for your update
@sanbomics 1 month ago
Yeah sorry I have been busy! Shouldn't be as long between the next few videos.
@ykoy1577 1 month ago
I was waiting for your video. Your videos are so helpful for a beginner like me. Thank you so much for sharing your knowledge and experience.
@MrJordi94 1 month ago
You truly are an inspiration for RNA-seq! Love your videos and your communication skills. Hope to see the rest of the 2024 tutorial soon :D
@sanbomics 1 month ago
Thank you
@piroDYMSUS 1 month ago
Amazing work, hope we will see the second part soon.
@sanbomics 1 month ago
Trying to release in the next week or two!
@supakornpongpakdee1544 1 month ago
Thank you very much for creating this tutorial! Looking forward to the next lessons!😊❤
@caspase888 1 month ago
I look forward to your videos. Your grasp on the subject and the ability to teach are amazing. Thanks a lot 👍🏻
@sanbomics 1 month ago
Thank you! :)
@dardas15 1 month ago
This is fantastic and really helps people with a limited bioinformatics background analyze data independently. Thanks so much for making these videos, I've been using them with Python ever since you shared a few years ago!
@yaseminsucu416 1 month ago
You rock! Thank you for doing this, looking forward to following this series!
@jackmineeechen4380 1 month ago
I started with the video comparing different integration methods. That one really helped me! I eventually chose Scanorama for my dataset, which worked out. Looking forward to this series! I appreciate your videos!
@jonathanback5731 1 month ago
Your work is fantastic, great content!
@DuqueVJ 1 month ago
Amazing! Thanks very much for the tutorial, I'm learning a lot!
@babyfriedrice4878 1 month ago
i love sanbomics so much!!!!!!!!!!!!!!!!!!!
@sanbomics 1 month ago
I love you too!
@avp300 1 month ago
This is brilliant! Can't wait for part two!! The ridge plots look awesome! Thank you Mark! :-)
@sanbomics 1 month ago
Tomorrow hopefully!
@jianhuacao7180 1 month ago
welcome back, bro. Your channel is better than before.
@sanbomics 1 month ago
Thanks! I am trying to continually improve the quality and make videos people are actually interested in.
@alexeyryzhenkov7579 1 month ago
Thank you for your work!
@sanbomics 27 days ago
Thank you so much!!! Really appreciate it! :)
@moonmoun2983 1 month ago
Waiting impatiently for the next part
@sanbomics 1 month ago
Wait no further! :)
@laloulymounia9266 1 month ago
Thx for the update !
@taoufikbensellak9274 1 month ago
I just started your sc guide and I really enjoy it. Just for some clarifications about the tools, I use mamba (conda) with python 3.8 and a lower version of pandas (
@sanbomics 10 days ago
I'll be doing DE using a different approach this time, which should give people fewer issues. Diffxpy can be a struggle, so I don't really use it anymore.
@brunovinagre427 1 month ago
Grateful, Mark!!
@user-ne7vm7fb3y 1 month ago
You were great.
@gerolduntergasser4000 1 month ago
cool good job😁
@MinnnWang-uv8bn 1 month ago
🎉🎉🎉thanks!
@moonmoun2983 1 month ago
I would like to thank you immensely because yours is one of the few bioinfo channels I can follow along with. I have a question about a result I obtained by following the full scRNA-seq walkthrough you posted a year ago. I tried applying the code to a before-and-after chemotherapy treatment dataset. Everything worked perfectly until I got to the DEG analysis part with heat maps. With the top 25 upregulated and downregulated genes and the filtering code, it didn't yield more than 12 DEGs, so I had to relax the filtering and kept genes with significant fold change above 0.05. I ended up with more differentially expressed genes; however, in both cases my heat map was devoid of pattern, and both the condition and control looked mostly downregulated. Should I conclude that there are no DEGs or expression signatures in the cancer samples before and after treatment? I ask because the original paper I took my data from didn't do a DEG analysis for the whole dataset but selected 4 patients out of 12 to create a DEG heatmap with fewer than 10 genes. Thank you, I'd highly appreciate your insight on my results.
@sanbomics 1 month ago
It's really hard to say without knowing more and actually getting a feel for the data. You can try a pseudobulk approach and see if you have any DEGs. I have a video on that, but will also be covering it soon in the new tutorial series.
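For readers unfamiliar with the pseudobulk idea mentioned in this reply: raw counts from all cells belonging to the same sample are summed into one profile per sample, and differential expression is then tested across samples rather than across cells. The following is only a minimal sketch of the aggregation step in plain Python. The function name, cell names, gene names, and sample labels are hypothetical, and it is not the code from the video or notebook.

```python
from collections import defaultdict


def pseudobulk(cell_counts, cell_to_sample):
    """Sum raw per-cell counts into one count profile per sample."""
    totals = defaultdict(lambda: defaultdict(int))
    for cell, gene_counts in cell_counts.items():
        sample = cell_to_sample[cell]
        for gene, count in gene_counts.items():
            totals[sample][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {sample: dict(genes) for sample, genes in totals.items()}


# Hypothetical toy data: three cells from two samples.
cell_counts = {
    "cell1": {"GeneA": 3, "GeneB": 0},
    "cell2": {"GeneA": 1, "GeneB": 4},
    "cell3": {"GeneA": 2, "GeneB": 5},
}
cell_to_sample = {
    "cell1": "pre_treatment",
    "cell2": "pre_treatment",
    "cell3": "post_treatment",
}

bulk = pseudobulk(cell_counts, cell_to_sample)
print(bulk)
```

In practice the aggregation is usually done per sample and per cell type, and the summed counts are then passed to a bulk differential expression tool.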
@kristifourie8427 1 month ago
best page ever
@sanbomics 1 month ago
Thank you :)
@islemgammoudi842 1 month ago
Thanks for the Videos. Currently, I'm embarking on the journey of analyzing single-cell RNA sequencing (scRNA-seq) data combined with CITE-seq data. However, I'm facing challenges related to duplicate discrimination and assigning sub-samples via hashtags. Given your expertise in this area, I was hoping you could provide some guidance and advice on how to navigate these challenges effectively.
@mehdiraouine2979 1 month ago
Amazing work as always! On a side note, if I were to download FASTQ data from GEO and the paper doesn't specify whether the adapters were removed, how should I check in Python whether they were removed?
@sanbomics 27 days ago
I wouldn't use Python for this, only because there are several command-line tools, like cutadapt, that are much faster and can do the same thing.
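As the reply says, a dedicated tool such as cutadapt is the better choice for actual trimming. For a quick look at whether adapters still appear to be present, a rough check could be sketched as below: count the fraction of reads whose sequence contains an adapter string. The adapter sequence used here (a commonly seen Illumina adapter prefix) is an assumption, since the thread does not name one, and the function name and toy reads are hypothetical. The sketch also assumes standard four-line FASTQ records.

```python
def adapter_fraction(fastq_lines, adapter):
    """Return the fraction of reads whose sequence contains `adapter`."""
    total = 0
    hits = 0
    for i, line in enumerate(fastq_lines):
        # In a four-line FASTQ record the sequence is the second line.
        if i % 4 == 1:
            total += 1
            if adapter in line:
                hits += 1
    return hits / total if total else 0.0


# Assumed adapter prefix; replace with the adapter for your library prep.
ADAPTER = "AGATCGGAAGAGC"

# Hypothetical toy FASTQ with two reads, one of which carries the adapter.
toy_fastq = [
    "@read1",
    "ACGTACGTAGATCGGAAGAGC",
    "+",
    "I" * 21,
    "@read2",
    "ACGTACGTACGT",
    "+",
    "I" * 12,
]

fraction = adapter_fraction(toy_fastq, ADAPTER)
print(fraction)
```

For a real file, an open file handle can be passed instead of the list, for example one returned by `gzip.open(path, "rt")` for a compressed FASTQ. A fraction near zero suggests the adapters were already trimmed.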
@555gong9 1 month ago
Thank you for such a great video. Which is better for removing doublets, doubletdetection or the previous SCVI method?
@sanbomics 1 month ago
I haven't done or seen a comparison between the two. The best would probably be to run both and see how they overlap. All I can say is that doubletdetection is easier and faster.
@555gong9 1 month ago
Thank you for your advice, I will try it next, thank you very much, my superhero.
@fsh9134 8 days ago
Thanks for making very useful videos. I was wondering if you would like to make a video on single-cell analysis using Julius AI, a data analysis AI.
@goddyhong 1 month ago
Thx for sharing! If I use a filtered matrix for analysis, do I still need to remove the background RNA? Since I don't have a 4090 🤣
@sanbomics 1 month ago
If you have a filtered matrix you can't remove background RNA. But if it's just a time thing, you can use your CPUs with SoupX. I have another video on that. If you only have filtered counts, you are stuck with what you have!
@CaveCrack 1 month ago
Thanks for the great video and series. I have a question at around 36:40 on how to interpret the graph. If the experiment had loaded say 14000 cells it appears that around 8000 would be recovered which I assume we would interpret as the number called by cellranger... For 14000 cells loaded the multiplet rate appears to be 6%, 6% of 14000 being 840 expected multiplets. However, all the blue recovery dots are aligned around 4.5%. 4.5% of 8000 would be only 360 expected multiplets. The document from which the graph is extracted says "Generally an increased number of cells per sample will increase the doublet rate". I've not been able to find clarification. Thank you
@CaveCrack 1 month ago
Also, I am wondering if your low number of detected doublets at 1e-16 was due to the previous QC step where you exclude cells with the highest log1p_total_counts and log1p_n_genes_by_counts, as these could filter a lot of doublets.
@sanbomics 1 month ago
I think in this case just ignore the blue line. The more cells you load, the higher the multiplet rate and the more total multiplets you will have.
@sanbomics 1 month ago
Exactly, it's hard to say exactly what percent the multiplets are because of the first step. I think I mention it in the video briefly... or at least I thought it.
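The arithmetic in the question above can be reproduced with a short sketch. It only multiplies a cell count by a multiplet rate, using the figures quoted in the comment (6% of 14000 loaded cells versus 4.5% of 8000 recovered cells). The function name is hypothetical, and which of the two readings of the graph is correct is not settled in this thread.

```python
def expected_multiplets(n_cells, multiplet_rate):
    """Expected number of multiplets for a cell count and a rate (as a fraction)."""
    return round(n_cells * multiplet_rate)


# Figures quoted in the comment above.
from_loaded = expected_multiplets(14000, 0.06)     # rate applied to loaded cells
from_recovered = expected_multiplets(8000, 0.045)  # rate applied to recovered cells

print(from_loaded, from_recovered)
```

The gap between the two numbers is the discrepancy the commenter is asking about. As the replies note, earlier QC filtering also removes some multiplets, so the count found by a doublet caller can fall below either estimate.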
@abellopez8017 1 month ago
Hello! Thanks for the Video, I will begin my PhD in Bioinformatics in August, what computer do you have?
@sanbomics 1 month ago
Well... at home I have 32 vCPUs, 128 GB RAM, and an RTX 4090. At work I have 64 CPUs, 256 GB RAM, and an RTX 4090. Sometimes I have to use AWS when I need more than that. Depending on what you plan to do, it can vary a lot.
@mehdiraouine2979 1 month ago
Another question: if you were to choose between the scVI model for detecting doublets and this clf doubletdetection method, which one is more straightforward? I feel like this method needs some tinkering depending on the specific dataset.
@sanbomics 27 days ago
The best method would be to use multiple methods. They will all give you slightly different results but hopefully have significant overlap. The reason I used doubletdetection here is because it is fast/simple and I already have multiple video tutorials on SOLO (scVI). It's hard to say which is more accurate. Changing parameters in scVI/SOLO will likely change the results a lot too, just like what happened here.
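One simple way to look at the overlap suggested in this reply is to compare the sets of barcodes each method flags as doublets. The sketch below uses plain Python sets and a Jaccard index. The barcodes and the function name are hypothetical, and the two input sets stand in for whatever calls doubletdetection and SOLO produce on a real dataset.

```python
def doublet_overlap(calls_a, calls_b):
    """Summarize agreement between two sets of barcodes called as doublets."""
    a, b = set(calls_a), set(calls_b)
    shared = a & b
    union = a | b
    return {
        "shared": sorted(shared),
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        # Jaccard index: shared calls divided by all calls from either method.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical doublet calls from two methods.
doubletdetection_calls = {"AAAC-1", "AAAG-1", "AACT-1"}
solo_calls = {"AAAG-1", "AACT-1", "AATG-1"}

summary = doublet_overlap(doubletdetection_calls, solo_calls)
print(summary)
```

Whether to remove the union or only the shared calls is a judgment call that this thread does not settle.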
@AP-vo7gp 1 month ago
Sir, I have a count matrix and want to generate an annotation matrix from it, then do batch correction and then DGA. Please help with the process, as I am not getting suitable results.
@sanbomics 10 days ago
Hi, it is hard for me to help without knowing more specifics and what issue you are having.
@AP-vo7gp 9 days ago
@sanbomics Thanks a lot, sir, I was able to do it :)
@pinchos90 1 month ago
Are you still going to develop workflows for R, or are you sticking with Python?
@sanbomics 27 days ago
I prefer Python, but even this tutorial series will have some R in it because it is unavoidable. So I will have more R videos in the future.
@ghujka 1 month ago
Have a beer on me bro🍺
@sanbomics 27 days ago
Thank you!!! I can do that ;)
@charlieintampa6769 1 month ago
F%(k. Seems super useful but you could have been speaking any random language and I would have understood about the same.