Bioinformatics - Gene Ontology (GO) Enrichment Analysis

  26,838 views

Alex Soupir


Today we are going to do some gene ontology enrichment analysis and look at which GO terms are enriched by the presence of p53 in our mice that were irradiated. If you find this helpful, please consider liking the video and subscribing!
We are using the same project file that we set up a few videos ago to analyze our data!!
GO overview:
geneontology.or...
Book Chapter about p53:
www.sciencedir...
These videos are intended to be learning material for others as well as myself. I like helping others and hope these videos spur conversation.
The image at the bottom left of the thumbnail is modified from AllGenetics.EU.
Please consider contributing to my Patreon where I may do merch, gather ideas for future content, and have further discussions:
/ alexsoupir

Comments: 31
@josiknuff5900 2 years ago
Many thanks for your videos! One question: is there a way to get the gene IDs / Entrez IDs belonging to an enriched GO term? I get RNA binding as an enriched GO term, and I would like to access the specific gene entries responsible for this enrichment.
@HunterDriguez 3 years ago
I've been trying to figure out what each column in the GO enrichment output is. Is the "Size" column the total number of genes associated with a particular term that are expressed in the tissue? Is the "ExpCount" column the number of genes related to a particular term that you would expect to get by chance in your smaller gene set? Maybe the "Count" column is the actual number of observed genes related to a term that you got in the smaller gene set?
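Editor's sketch for the question above, assuming the enrichment output follows the usual hypergeometric over-representation model: "Size" is the number of universe genes annotated to the term, "Count" is how many of the tested genes carry it, and "ExpCount" is the overlap expected by chance. The function name `go_term_columns` and all numbers are hypothetical; this is plain Python for illustration, not the code the video's R package runs.

```python
from math import comb


def go_term_columns(universe_size, term_size, selected_size, count):
    """Return (ExpCount, p-value) for one GO term under a hypergeometric model.

    universe_size: genes in the background (universe)
    term_size:     universe genes annotated to the term ("Size")
    selected_size: genes in the tested list (e.g. up-regulated genes)
    count:         tested genes annotated to the term ("Count")
    """
    # Expected overlap if the tested genes were drawn at random from the universe.
    exp_count = selected_size * term_size / universe_size

    # Upper tail P(X >= count): chance of seeing at least `count` term genes.
    ways_total = comb(universe_size, selected_size)
    ways_tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, selected_size - k)
        for k in range(count, min(term_size, selected_size) + 1)
    )
    return exp_count, ways_tail / ways_total


# Hypothetical numbers, chosen only for illustration.
exp_count, p_value = go_term_columns(
    universe_size=10000, term_size=200, selected_size=500, count=25
)
print(f"ExpCount = {exp_count:.1f}, Count = 25, p = {p_value:.3g}")
```

Under these assumptions, a term looks enriched when "Count" is well above "ExpCount", and the p-value measures how surprising that gap is.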
@sanjaisrao484 2 years ago
Thanks
@arjunkhadka4466 3 years ago
Hi Alex, I do not have Entrez IDs because I am not working with a model organism. The output from DESeq2 has only baseMean, log2FoldChange, lfcSE, stat, pvalue, and padj. Let me know if you can help me.
@alexsoupir 3 years ago
Hey Arjun. Unfortunately, in that case I don't know if I can help. There is a way using BioMart to sort of make your own, as previously mentioned, but it doesn't always have all the information. You might be able to read what others have done for non-model organisms if Bioconductor doesn't have your organism at all. Maybe there's a way to convert between organisms?
@soumyarao8006 2 years ago
Hi Alex, I have mouse Ensembl gene IDs to start with, many of which lack Entrez IDs. Moreover, selectGenesUP and selectGenesDown are NULL/empty even after changing padj to 0.1 and sig_lfc to 2 or 1. What should I do?
@kyungwonmin7217 1 year ago
Hello. Is it possible to do GO analysis with non-model species? I have DESeq data with a functional annotation file that contains RefSeq or GO IDs. In your video an org database was used, but my data cannot be matched to it.
@alexsoupir 8 months ago
There are ways to build custom 'org.*' data objects, but it has been a long time since I have done this. I believe it involves the biomaRt or AnnotationDbi package. I would start there and do further searching.
@jimdurant1192 3 years ago
Very helpful! Thank you for doing these!
@alexsoupir 3 years ago
Sure thing! If you have recommendations or requests, let me know and I'll see what I can do haha
@jimdurant1192 3 years ago
@@alexsoupir Wow, thanks. The big thing is that it has been about three decades since I took biology. Just getting an overall grasp of the big picture of transcriptomics has been challenging.
@jimdurant1192 3 years ago
@@alexsoupir I work in chemical risk assessment and am trying to learn toxicogenomics, so your example with p53 was useful. Any other examples along the lines of screening for chemical carcinogens would be useful as well.
@tinacole1450 3 years ago
Is there a way to find the GO term that has the pathway info after you have gathered the counts data?
@akmmahmudulhuque9846 3 years ago
Thanks a lot, Alex Soupir. These are by far the most detailed videos on RNA-seq data analysis. I will recommend this channel to all the beginners at my current institution (GUGC, Korea). Could you please suggest something similar for SNP data analysis as well?
@alexsoupir 3 years ago
Hey, thanks. I figured it would be a good way to help people get through an analysis as well as get feedback from other people about possible ways they analyze the data. Science is collaborative and I try to bring that to learning too.
@alexsoupir 3 years ago
Forgot to answer your question. I will ask another postdoc on Monday what he uses. He does SNP work with transcription factors, so I think he would be a good person to ask.
@akmmahmudulhuque9846 3 years ago
@@alexsoupir Thank you so much for your cooperation. Looking forward to hearing from you.
@alexsoupir 3 years ago
Howdy! I asked my coworker today and he said you can use samtools mpileup. I don't know enough about it to answer more. The samtools manual says that if you're looking to do variant calling you should use bcftools mpileup. I'll have to do more looking because now I'm interested! It would be fun to explore some cancer sequencing data for SNPs. Sorry I'm not able to be more helpful. Possibly in the future I'll try to do this!!
@akmmahmudulhuque9846 3 years ago
@@alexsoupir Hello, Alex! Thank you so much. I will try samtools and bcftools. I will also be looking out for your video tutorials, as they provide hands-on training.
@fseiva 1 year ago
Congrats on your videos! Could you please tell me if there is a way to save the results, for example "upBP", as a data frame?
@gillesirfsh458 2 years ago
Very informative, thanks. One question though: what would be the approach in R when dealing with non-model organisms, i.e. the ones that are not well annotated?
@alexsoupir 2 years ago
Hello, Gilles. There is a package that lets you build annotations for other organisms, though I don't know if all of them are in the database. I believe it's a combination of AnnotationDbi and biomaRt, but this would need to be explored further. Hope this is helpful!
@wmarei 3 years ago
When we run the hyperGTest, why do we set universeGeneIds to the unique genes of res_05 and not to all genes in res? I understand that we are trying to run the hypergeometric calculation for our subset of genes (up or down) against the total genes detected (without any significance or cutoff values). Or am I missing something?
@alexsoupir 3 years ago
That's just how we were taught, but it's a good question. I think it has to do with how we are subsetting the data. We only want to look at the genes that are significant (hence the initial p-value subsetting), and then we want to find the ones that are up or down, which we can see from the log fold change (positive vs. negative). With the log fold change we are also setting a threshold: for example, an LFC of 1 means we want them to be at least doubled. Does this answer your question?
@vpeska 3 years ago
@@alexsoupir Hi Alex, very helpful videos. I am a bit confused by the p-values. I understand the padj subset. What I do not understand is the cutoff in uParams and downParams. What does this p-value stand for? When I set it to 0.01 I get an empty result. When p = 1 I get about 18 genes and their GO terms, with p-values of 0.04-0.95.
@felipebatalini 3 years ago
@@alexsoupir Thanks for sharing this. The only caveat is that I think the right approach is setting the "universe" to all the genes expressed in the experiment, so in a typical RNA-seq experiment we would use all expressed genes. I would double-check this.
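Editor's sketch for the universe discussion in this thread: a toy comparison, assuming a hypergeometric over-representation test, of how the same up-regulated gene list scores against two different universes. The gene IDs, set sizes, and the helper `enrichment_p` are all hypothetical and only meant to show the direction of the effect, not to reproduce the video's results.

```python
from math import comb


def enrichment_p(universe, selected, term_genes):
    """Return (Count, Size, p-value) for one term against a given universe."""
    universe = set(universe)
    selected = set(selected) & universe   # tested genes must be in the universe
    term = set(term_genes) & universe     # "Size" depends on the universe too
    count = len(selected & term)
    n_universe, n_term, n_selected = len(universe), len(term), len(selected)
    # Upper tail of the hypergeometric distribution: P(X >= count).
    tail = sum(
        comb(n_term, k) * comb(n_universe - n_term, n_selected - k)
        for k in range(count, min(n_term, n_selected) + 1)
    )
    return count, n_term, tail / comb(n_universe, n_selected)


# Hypothetical toy data: 1000 expressed genes, 100 pass the padj filter,
# 40 of those are up-regulated, and 30 genes carry the GO term of interest.
expressed = [f"g{i}" for i in range(1000)]
significant = expressed[:100]
up = expressed[:40]
term_genes = expressed[:10] + expressed[100:120]

count_all, size_all, p_all = enrichment_p(expressed, up, term_genes)
count_sig, size_sig, p_sig = enrichment_p(significant, up, term_genes)

print(f"universe = all expressed: Size = {size_all}, Count = {count_all}, p = {p_all:.3g}")
print(f"universe = significant:   Size = {size_sig}, Count = {count_sig}, p = {p_sig:.3g}")
```

In this toy setup the restricted universe shrinks "Size" and makes the same overlap look less surprising, which is one reason the choice of background deserves the double-check suggested above.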
@genticswithkazan 3 years ago
Very clear and helpful. Thank you for your effort. I have two questions: 1. What is the statistical method in this analysis? 2. Can you kindly suggest any visualization tool/package for this analysis? Thank you in advance.
@alexsoupir 3 years ago
Howdy! So for DESeq2, I believe it is using the Wald test on each gene to test for a difference (the Wald statistic is the estimated log fold change divided by its standard error, compared against a normal distribution). DESeq2 is actually a great package, and if you're in for a good read you could read its vignette, which explains everything (likely more than you're interested in). Something interesting I found is that it doesn't perform a normalization past the median ratio method, because if you're just going to be comparing individual genes, each sample was mapped to the same gene region and is therefore, in a sense, already baselined. However, due to the differences in library sizes (actually sequencing throughput), the 'library' needs to be normalized, and that is what the median ratio method does. For visualization, the next video walks through how to visualize, but the `pathview` and `gage` packages are useful and actually make some really nice plots. This can be found here: kzbin.info/www/bejne/iX6ld2d6rrecq9E
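Editor's sketch of the median ratio idea mentioned in the reply above, written in plain Python for illustration. It assumes the commonly described procedure (per-gene geometric mean across samples, then the per-sample median of count-to-mean ratios); the function name `size_factors` and the count matrix are hypothetical, and this is not DESeq2's own code.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of rows (one per gene), each a list of per-sample raw counts.
    """
    n_samples = len(counts[0])
    # Keep only genes counted in every sample, so the log geometric mean is finite.
    log_rows = [[math.log(c) for c in row] for row in counts if all(c > 0 for c in row)]
    log_geo_means = [sum(row) / n_samples for row in log_rows]

    factors = []
    for j in range(n_samples):
        # Log ratio of this sample's count to the gene's geometric mean.
        log_ratios = [row[j] - g for row, g in zip(log_rows, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Hypothetical counts: sample 2 was sequenced twice as deeply as sample 1.
counts = [
    [10, 20],
    [100, 200],
    [5, 10],
    [0, 3],  # dropped: zero count in one sample
]
factors = size_factors(counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in counts]
print(factors)
```

Dividing each sample's counts by its size factor puts the samples on a common scale, so a gene's counts can be compared across samples despite different sequencing depth.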
@jyoti9426 3 years ago
Me always trying to say 'Alright, everybody!' in the beginning at the same time as he does. XD