I think it's important that you explain why we assume a 7.5% doublet rate in our data. I know it has something to do with the number of droplets captured. But how do we determine the number of droplets captured (in order to infer the estimated % of real doublets)? Thank you!
@kitdordkhar4964 2 years ago
This was very useful. It was different from our analyst strategy. Small request, instead of terminal bash, it would be helpful if you can route through save folders and files [setwd> ]. Thanks!
@Bioinformagician 2 years ago
Thank you for the suggestion, I am more comfortable in maneuvering through the folders via terminal. However, I shall try to do it via R next time :)
@hyebinhan6473 2 years ago
THANK YOU!!! This was a life saver. Quick question: I plan to use Tabula Muris Senis, the mega mouse single-cell dataset, and I was able to maneuver through selecting the ages/organs I wanted to use. BUT I believe they have datasets per mouse and per organ... if that's the case, do I still have to run DoubletFinder on each mouse, or do you think I can use the selected age/organ, with the assumption that the preparation process was similar enough that batch effects would likely be minimal? I have 15 mice from Tabula Muris I plan to use and an additional 15 mice I have to filter 🥲
@Bioinformagician 2 years ago
I suggest you first process your data with all 15 mice at once, as a merged object, and visualize it. Look for batch effects. If you don't find any, then you can run DoubletFinder on the merged object. If you do find batch effects in your data, then you will have to run DoubletFinder on each mouse individually.
@xiaosajackxu4242 2 years ago
Amazing job! Can you paste your code for how you subset and recluster singlets after finishing DoubletFinder? Or can you confirm if you did exactly the same as the following steps? Thanks! singlet
@Bioinformagician 2 years ago
Yes, I would run the steps you ran to recluster my cells after removing doublets from my data. Thank you for the suggestions for video topics, I have them in my pipeline :)
@mostafaismail4253 2 years ago
Please, we need a video on applying NMF (non-negative matrix factorization) to scRNA-seq for finding expression programs
@Bioinformagician 2 years ago
I'll consider making a video on this soon :) Thanks for the suggestion.
@sonaaritra 11 months ago
Hello Khusbu, I'm working with a publicly available dataset, GSE193688, where they have provided individual .h5 files for every sample. I'm trying to run DoubletFinder on it, but since you mentioned that running it on merged samples is not preferable, should I run it on each one separately? I have a total of 18 files for individual biopsy samples. Is there any faster method?
@marionaisern6420 1 year ago
I don't understand why, in a dataset of 15000 real cells, a pN of 0.25 would represent the integration of 5000 artificial doublets... If anyone can answer my question... Thank you!!!
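One way to see where the 5000 comes from: pN is the fraction of the combined data (real cells plus artificial doublets) that is artificial, not a fraction of the real cells alone. A minimal base R sketch, assuming that definition; the function name n_artificial is made up for illustration:

```r
# pN = artificial / (real + artificial), so artificial = real / (1 - pN) - real.
n_artificial <- function(n_real, pN) {
  round(n_real / (1 - pN) - n_real)
}

n_art <- n_artificial(15000, 0.25)

# Sanity check: artificial doublets as a fraction of the combined data.
frac <- n_art / (15000 + n_art)
```

With 15000 real cells, 5000 artificial doublets bring the total to 20000, and 5000 / 20000 = 0.25.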
@문홍만-y2t 1 year ago
22:25 I want to remove the rows labelled as doublets in the DF.classification column of the metadata table. What command can I use to do that? I need to remove the doublets before integrating all samples.
@giovaniclaresta2356 1 year ago
Hi, thank you for the very detailed tutorial!! May I know how I can get the cell identity from demuxlet data after I get all the singlets? Thank you
@熊飞-b5k 2 years ago
Thank you for this video, but the question is whether the search for and removal of doublets should be carried out before data merging and QC. In your previous video on data integration, you merged 7 samples. Does that mean that we need to clean the data 7 times before merging? Hoping for your reply.
@熊飞-b5k 2 years ago
What I mean is, when we need to integrate several datasets, before which step should we perform the detection of doublets? Before merging the datasets? If the detection of doublets should be done before the merge() function, is it necessary to perform QC and the standard pre-processing workflow for each dataset separately?
@Bioinformagician 2 years ago
Yes, it is recommended to perform doublet removal and QC for each dataset individually before integrating datasets. It can, however, be run on merged data. The standard workflow steps just help identify and remove clusters of cells with low UMI counts or high mitochondrial %. These low-quality cells must be filtered out before running a doublet prediction algorithm and before integrating and moving ahead with further downstream analysis.
@EdDone-q6g 9 months ago
Thanks for this workflow and for sharing the code. I have one issue when I run your code at the second-to-last step:
> DimPlot(pbmc.seurat.filtered, reduction = 'umap', group.by = "DF.classifications_0.25_0.21_691")
Error in `[.data.frame`(data, , group) : undefined columns selected
In addition: Warning message:
The following requested variables were not found: DF.classifications_0.25_0.21_691
Could you please help me check it? Thanks.
@Carolina_pt 1 year ago
Thank you so much for this tutorial, it's very informative. I was wondering if you knew how to find the expected number of doublets for ICELL8 sequencing data? Thank you in advance
@veerachon2281 2 years ago
Could you please explain how to choose this value, or whether it is a commonly expected one? -> Assuming 7.5% doublet formation rate
@Bioinformagician 2 years ago
10X user guides provide expected multiplet rate for different protocols. Here I have used the table on page 18 from the Chromium Next GEM Single Cell 3ʹ Reagent Kits v3.1 user guide (support.10xgenomics.com/single-cell-gene-expression/library-prep/doc/user-guide-chromium-single-cell-3-reagent-kits-user-guide-v31-chemistry) to get the doublet formation rate.
@youvikasingh7955 1 year ago
@Bioinformagician But what if I had 10000 cells as input and approx. 1100 recovered cells? 🤔 Thanks, really helpful channel 😍
@jessicacastillo8535 11 months ago
@youvikasingh7955 How did you solve that issue? Thanks!
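For the question about recovered cell counts, a rough base R sketch. It assumes the commonly quoted approximation of about 0.8% multiplets per 1,000 recovered cells for 10x Chromium 3' v3.1 runs, which only approximates the table in the user guide linked above. Basing the estimate on recovered cells rather than loaded cells is also an assumption here, and the function name and default rate are illustrative; substitute the value for your own protocol.

```r
# Assumed rule of thumb: ~0.8% multiplets per 1,000 recovered cells.
expected_doublets <- function(n_recovered, rate_per_1000 = 0.008) {
  rate <- rate_per_1000 * n_recovered / 1000
  list(rate = rate, nExp = round(rate * n_recovered))
}

est <- expected_doublets(1100)
est$rate  # estimated doublet fraction for this sample
est$nExp  # expected number of doublets, usable as the nExp argument
```

Under this approximation the rate scales with the number of cells, so a sample with far fewer recovered cells would be given a much lower rate than 7.5%.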
@kalpanidesilva3062 1 year ago
Thank you very much. Can you please do a tutorial on how to use the DropletUtils library?
@chadhighfill4578 2 years ago
How would you filter out the doublets?
@tomasmontserrat704 2 years ago
I think you can use subset(): pbmc.seurat.filtered
@Bioinformagician 2 years ago
That's right! You can use subset() to filter out doublets.
@chadhighfill4578 2 years ago
@Bioinformagician How do you do this when DF.classification_SOME VALUE is always changing? i.e. how do you filter out the doublets in a dynamic way?
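One possible way to handle the changing column name is to look it up by its prefix instead of typing it out. The sketch below uses a toy data frame as a stand-in for the Seurat metadata table (seu@meta.data), with made-up values, and assumes exactly one DF.classifications column is present; the object name seu is hypothetical.

```r
# Toy stand-in for a Seurat object's metadata; all values are made up.
meta <- data.frame(
  nCount_RNA = c(1200, 5400, 900, 3100),
  pANN_0.25_0.21_691 = c(0.1, 0.8, 0.2, 0.7),
  DF.classifications_0.25_0.21_691 = c("Singlet", "Doublet", "Singlet", "Doublet"),
  row.names = c("cell1", "cell2", "cell3", "cell4")
)

# Find the classification column whatever pN, pK and nExp suffix it carries.
df_col <- grep("^DF\\.classifications", colnames(meta), value = TRUE)
stopifnot(length(df_col) == 1)  # assumption: DoubletFinder was run only once

# Names of the cells classified as singlets.
singlet_cells <- rownames(meta)[meta[[df_col]] == "Singlet"]

# With Seurat loaded, these names could then be passed to subset(), e.g.
# seu_singlets <- subset(seu, cells = singlet_cells)
```

The same lookup also helps when a plotting call fails with "undefined columns selected": printing df_col shows the exact column name that exists in your own metadata.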
@SavannahVictoria-d8i 11 months ago
Thank you for your tutorial. Could you please tell me if the paper tells us how to mark doublets in the raw data?
@Surajcxscsingh 1 year ago
So we are only setting aside heterotypic doublets, not homotypic ones
@blackmatti86 2 years ago
Can I still run DoubletFinder on an 'SCTransform normalised' sample? If yes, is it as simple as setting 'sct = TRUE' in 'sweep.res.list_pbmc
@Bioinformagician 1 year ago
DoubletFinder can be used on a Seurat object that was normalized with SCTransform during the pre-processing steps. And yes, it is as simple as setting sct = TRUE.
@anaarsenijevic3207 1 year ago
Hello, Thanks for the great tutorial! I have one question, maybe I missed it, but - why do you use the nsclc data when calculating the pK value (starting from line 47) rather than pbmc that you used in the steps before that? Thank you!
@RupakDeySarkar 1 year ago
@anaarsenijevic3207, she did use the pbmc Seurat object in line 47. Only the name of the list she created contains nsclc; you can name it anything you want.
@kendy17 2 years ago
You're awesome, keep up the amazing work!
@Bioinformagician 2 years ago
Thank you :)
@ravimore5786 1 year ago
Thank you very much for this workflow. It's really helpful for understanding the process and steps involved in DoubletFinder. I appreciate your efforts to educate researchers through this activity.
@Ob-xt4ej 1 year ago
Thank you for the tutorial. I ran the pK identification code and got pK = 0.2. The number of doublets is the same, but the shape of the graph is different. I wonder if I can move on to the next step or if I need to fix this issue. Thank you!
@Bioinformagician 1 year ago
Did you use Strategies for pK optimization? Did you find your optimum pK to be 0.2?
@kimiaslk9348 7 months ago
You are amazing, thank you so much
@parmenideskim9739 2 years ago
A really great video!!! Thank you very much !!!
@tulikabhardwaj484 2 years ago
Waiting for your metagenomics and metatranscriptomics one.
@Bioinformagician 2 years ago
I will surely consider making a video on this in the near future :)
@tushardhyani3931 2 years ago
Thank you for this video !!
@pariaalipour61 2 years ago
Thank you so much for this helpful video. I have a question. At the last step, where we detect doublets and remove them, how can we go back to the first step to do integration? Not sure how to transfer the needed assay to the data.
@Bioinformagician 2 years ago
You should use the "integrated" assay (if you used the CCA method to integrate), and move forward with the steps just as you would process data in the 'RNA' slot of the Seurat object.
@pariaalipour61 2 years ago
@Bioinformagician When I run DoubletFinder, the integration still needs to be done. I mean, after removing doublets from every individual sample, what approach do I need to take? Should I move forward with the subsetted samples and integrate? Thanks
@kanahia7460 2 years ago
I do really enjoy your channel 🤠 I am doing same analysis and it is very kind of you that you share your approach and code! Many thanks 👍
@Bioinformagician 2 years ago
I am glad to hear my videos have been helpful! Thank you for your kind words :)
@tulikabhardwaj484 2 years ago
Thanks thanks thanks a lot
@blackmatti86 2 years ago
What do you do when running 'bcmvn_pbmc
@Bioinformagician 1 year ago
I am unable to answer why you get NULL at find.pK step as I cannot recreate this error.
@rahmaqadeer9178 1 year ago
Did you sort this out? I also get the same 'null' when I run this, although my data is stored in this variable when I print it
@blackmatti86 1 year ago
@rahmaqadeer9178 No, didn’t manage to fix this
@beatriceplougastel-douglas1861 1 year ago
I am also getting ' bcmvn_nsclc % select(pK)' my numeric value for the pK is 20
@NBAasDOGG 1 year ago
@rahmaqadeer9178 The problem is that ParamSweep cannot find your normalized RNA counts. Here’s how to fix it: Instead of using "NormalizeData(sobj, normalization.method = "LogNormalize", scale.factor = 10000)" Do the following: "sobj
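On the pK value of 20 reported a few replies up: one possible cause, assuming the pK column in the find.pK() output is stored as a factor, is that as.numeric() applied to a factor returns the position of the level rather than the value it displays. A small base R sketch with a mock table and made-up numbers:

```r
# Mock of a find.pK()-style result; the values are made up for illustration.
bcmvn <- data.frame(
  pK = factor(c("0.005", "0.01", "0.21", "0.3")),
  BCmetric = c(10, 50, 900, 200)
)

# pK at the highest BCmetric.
best <- bcmvn$pK[which.max(bcmvn$BCmetric)]

wrong_pK <- as.numeric(best)                # level index, not the pK value
right_pK <- as.numeric(as.character(best))  # the pK value itself
```

Converting through as.character() first gives the pK value that appears in the table, which is the number to carry into the next step.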