Bioinformatics - Understanding FastQC/MultiQC (Timestamps)

  12,439 views

Alex Soupir


Comments: 7
@drumpdump1995 2 years ago
Nice explanation. I found the overrepresented sequences module really useful for confirming hashtag or CITE-seq antibody binding and sequencing quality.
@gmochales 4 years ago
Thanks for the video! In your view, what pattern would you expect in the sequence duplication levels for ddRADseq? I have two big peaks at >10 and >100. Thanks!
@alexsoupir 4 years ago
Hey, Gabriel. I am not sure; I actually hadn't heard of ddRADseq until you mentioned it. One thing I have read for DNA sequencing is that duplication levels can increase if the genome has a lot of repeats in it, just as highly up-regulated genes cause duplication warnings in transcriptome data. You can try running the rest of your analysis and see whether anything strange comes from the higher duplication levels. Another cause might be PCR amplification after the library has been prepped: depending on how many cycles you run, there is a chance that too many cycles leave you with PCR bias. I'm not sure how your library was prepped, but that could be something to think about, too. Sorry for not knowing the exact answer. I'd have to read more about ddRADseq, but it sounds like the dd part makes the selection rather narrow, so some duplication would be expected: the reads come only from genomic regions that are flanked by the restriction enzyme sites AND fall in a specific size range, rather than from a library prepared 100% randomly from genomic DNA. But, like I mentioned, that is just a guess.
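To make the ">10" and ">100" peaks concrete, here is a minimal Python sketch of how a duplication-level profile can be tallied from a list of read sequences. It is a simplified illustration, not FastQC's actual procedure: it counts exact-match reads over the whole input, and the function names (`bin_label`, `duplication_profile`), the four bins, and the toy reads are all assumptions made for this example.

```python
from collections import Counter


def bin_label(copies):
    """Map a copy count to a coarse duplication-level bin (bins are illustrative)."""
    if copies >= 100:
        return "100+"
    if copies >= 10:
        return "10-99"
    if copies >= 2:
        return "2-9"
    return "1"


def duplication_profile(reads):
    """Return the percentage of all reads that fall in each duplication bin."""
    profile = {"1": 0.0, "2-9": 0.0, "10-99": 0.0, "100+": 0.0}
    total = len(reads)
    if total == 0:
        return profile
    counts = Counter(reads)  # copies of each distinct sequence (exact match)
    for copies in counts.values():
        # Every copy of a sequence contributes to the bin of its copy count.
        profile[bin_label(copies)] += 100.0 * copies / total
    return profile


# Toy data (hypothetical): one sequence dominates, as a narrow set of
# restriction-site-flanked loci might in a reduced-representation library.
reads = ["ACGT"] * 120 + ["TTGA"] * 15 + ["GGCC"] * 3 + ["CATG", "TGCA"]
profile = duplication_profile(reads)
for label, percent in profile.items():
    print(f"{label:>6}: {percent:5.1f}% of reads")
```

In this toy input most reads land in the highest bin because a single sequence accounts for 120 of the 140 reads; a library drawn from a small, fixed set of loci would be expected to shift the profile toward the high bins in a similar way.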
@victorrorisang479 2 years ago
Can I trim my raw reads twice? Meaning, I trim the raw reads, then take the results and trim them again?
@arpitabhatt989 1 year ago
I have done 16S metagenomic sequencing and the sequence duplication percentage is 90%. Is that fine?
@alexsoupir 1 year ago
Without more information, allow me to put forth some thoughts:
1) rRNA sequences are highly conserved, whether bacterial or animal. Depending on what is being explored with such sequences, and on the source, it's plausible that the sequences will overlap a great deal.
2) Duplication may mean a few different things, remember. Using unique molecular identifiers (UMIs) could help narrow down the source of the duplication: is it from a high abundance of a particular bacterial species? Is it from PCR bias? Is it from some unaccounted-for source? It really depends on the parameters under which the data were generated.
3) Looking at single-isolate sequencing results: if the data being explored is from a single source, then the duplication *should* be high. If I were trying to sequence a pure source yet seeing low duplication, I would be concerned. If you sequenced a short region of your genome (say, the 18S rDNA sequence), there should be high coverage, and the deeper the sequencing, the more the duplication will increase (there are only so many locations for unique molecules to originate from, so as you go deeper, even with different molecules, the probability of seeing the same start and end increases).
4) Inclusion of adapter sequences not known to the software. If you are working with raw data and custom adapters that FastQC is unaware of, I could imagine it flagging them as duplicated sequences when reads run through into the adapter (a technical issue). This is less likely, both because custom adapters are rare (a niche sequencing company) and because the tech running the machine would have had to let a sample with short inserts be sequenced with the wrong kit.
Summary: duplication is not always unexpected. You have to understand several things to assess whether it is expected or technical. I would reach out to the sequencing service provider to explore the cause further.
acs
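The depth argument in point 3 can be sketched numerically. The Python snippet below assumes, purely for illustration, that every read is drawn uniformly at random from a fixed number of distinct fragments (distinct start/end combinations); real libraries are not uniform, and this is not how FastQC estimates duplication. The function name `expected_duplicate_fraction` and the fragment and read counts are hypothetical.

```python
def expected_duplicate_fraction(n_reads, n_distinct_fragments):
    """Expected share of reads that repeat an already-seen fragment.

    Assumes each read is sampled uniformly, with replacement, from
    n_distinct_fragments possibilities (a simplifying assumption).
    """
    if n_reads <= 0 or n_distinct_fragments <= 0:
        raise ValueError("both arguments must be positive")
    # Probability that a given fragment is never sampled in n_reads draws.
    p_missed = (1.0 - 1.0 / n_distinct_fragments) ** n_reads
    expected_unique = n_distinct_fragments * (1.0 - p_missed)
    return 1.0 - expected_unique / n_reads


# Hypothetical short amplicon with only 1,000 distinct fragments.
for depth in (100, 1_000, 10_000, 100_000):
    fraction = expected_duplicate_fraction(depth, 1_000)
    print(f"{depth:>7} reads: {100 * fraction:5.1f}% expected duplicates")
```

Under this assumption the expected duplicate fraction climbs toward 100% once the read count far exceeds the number of distinct fragments, which is the situation described for a short, deeply sequenced region.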