How to manipulate gene expression data from NCBI GEO in R using dplyr | Bioinformatics for beginners

55,709 views

Bioinformagician

Comments: 99
@danielajbq 2 years ago
You're an ANGEL for making these. I am doing my MS in bioinformatics right now and this is genuinely better than some of my courses. Thank you!!
@MichealIdedia 5 months ago
Hello, are you done with your MSc now?
@mayank9986 1 year ago
I am new to programming. I was looking for help to analyse RNAseq data and your video just came as a blessing. Thank you a ton.
@amitrupani9898 2 years ago
Thank you for this very helpful video! I have recently moved from a clinical genetics laboratory to a research laboratory where pipelines are written in R and extensively leverage the capabilities of the dplyr library. So, I needed a tutorial to help me understand its basic functioning. This helped. Keep up the good work you are doing through this channel. Cheers!!
@Bioinformagician 2 years ago
I am really glad this helped you get a basic understanding of the dplyr package. Thank you for your kind words; they encourage me to do more of this! ☺️
@sanjaisrao484 2 years ago
Excellent explanation. Thanks for teaching the basics of R. It was extremely helpful; please continue to make more videos.
@eylulozerbil8548 1 year ago
This tutorial encouraged me to continue my R learning process by showing me how I can manipulate this kind of data in the simplest way! Thank you, Bioinformagician :)
@mahshidpooladvand8502 3 months ago
This was the best tutorial I could possibly find online!!! You are incredibly smart! Thanks!
@zlj8435 2 years ago
Thank you for this wonderful course! I am a year 1 PhD student and it really helps me a lot!
@Radslom 1 year ago
This video was extremely helpful for me. I am currently learning how to use R and GEO2, and this video helped to clarify it. Thank you and keep up the great work!
@muyyy9000 11 months ago
Thank you so much for making content like this. It's extremely helpful for beginners like me trying to analyze gene expression data in RStudio.
@jessicus 1 month ago
THANK YOU SO MUCH!!! I'm doing undergrad cancer research right now and I've been looking for a way so that I can analyze an expression matrix/transcriptomic data in R.
@Grzegorz-f1b 6 months ago
Thank you, my new teacher. I actually work on biogenetics in IT and C++, and this video helps me very much ❤️🙏👌
@aishaa812 4 months ago
Thank you. It's extremely helpful for me since I am a beginner in RStudio and I am trying to apply data analysis in RStudio.
@Bunga-p5i 1 year ago
Thank you for the great tutorial! Just to let you know, I had to install these packages first to run your script:
install.packages("dplyr")
install.packages("tidyverse")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("GEOquery")
@syedmansoorjan2671 2 years ago
Amazing, I don't have the words. Please try to share more. I found this very helpful.
@Ojaswini-Pathak 2 years ago
Very well made video and your understanding of the subject is tremendous!
@claudiocesarmontenegrojuni5141 1 year ago
You're an amazing teacher! Thank you so much for this outstanding content.
@mocabeentrill 1 year ago
Thank you. You're really good at what you do. I did this in base R and oh my word, it looks grotesque!
@mikewafula9470 1 year ago
Thanks so much for this great video. You have made it easy for me to explore gene data analysis with R. Keep sharing such content. Cheers!!
@hemanthchenga5671 2 years ago
Thanks for explaining the code in detail and please make more videos
@mohammeddabbour2254 2 years ago
Wonderful explanation. Thank you so much for making this tutorial. Just a side note: when both dplyr (from tidyverse) and plyr are loaded and you want to use a certain function, it is better to specify the package the function comes from when calling it (such as dplyr::rename()). Otherwise, R may pick up the function of the same name from the plyr package and return an error. Happy coding!
@Bioinformagician 2 years ago
Correct, thanks for pointing it out. I have taken care of that in the videos following this one :)
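A minimal sketch of the explicit-namespace call mentioned above. The toy table and its column names are invented for illustration; only dplyr::rename() comes from the thread. As far as I know, plyr's rename() expects a different argument form, which is why the unqualified call can fail when plyr masks dplyr.

```r
# Toy expression table; gene names, values and column names are made up.
expr <- data.frame(gene = c("BRCA1", "TP53"), fpkm = c(12.5, 48.1))

# The dplyr:: prefix guarantees the dplyr version of rename() is used,
# even if another loaded package also defines a function called rename().
# dplyr syntax is new_name = old_name.
renamed <- dplyr::rename(expr, FPKM = fpkm)

print(names(renamed))
```

The same prefix works for other commonly masked functions such as select(), filter() and summarise().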
@余长 1 year ago
Very helpful and you are very patient. It seems that you know exactly what my questions are.
@jammerkd 2 years ago
Excellent videos and you are a fantastic teacher
@rajanirao6011 2 years ago
These videos are so good!!! Good practise to learn R. Thank you!
@Bioinformagician 2 years ago
I am glad you found this helpful! :)
@cerenuzun5989 2 years ago
It was very helpful and it would be great if you continue these tutorials. Thank you so much!!
@Bioinformagician 2 years ago
I am glad you find my videos helpful! :)
@setarehsohail5422 2 years ago
Amazing!! You are a professional teacher!! Thanks!
@seungwonkim8359 1 year ago
Really helpful! Thank you very much. I hope you continue this marvelous work for a long time, since I am working on bulk/single-cell RNA-seq these days.
@Saed7630 1 year ago
Clean, clear and informative!
@ayobamiogunsola6139 1 year ago
Thank you for making this video. It has been helpful.
@karthibiotech426 2 years ago
Wow, it's very helpful. I am practicing with another dataset using your same protocol. Thanks a lot!
@BISMILLAH7334 2 years ago
Excellent! Thank you for the tutorial. Looking forward to many more such useful tutorials.
@Aishwarya-p4w 1 month ago
I have knowledge of basic R and Python packages and I love these tutorials, as I wanted to start with NGS analysis and never knew where or how to begin. I do have a question: if I use a different dataset, or the same data, and run the different pipelines used in this playlist, can I upload it to Git and have a mini project of my own? Is that okay to do? Or do I need to modify it for it to count as a project?
@sayeman9577 1 year ago
Thanks! Very helpful
@xelaldaero9339 1 year ago
Thank you! Your videos are very useful!
@o1kun 2 years ago
Your video really helped me!! Really appreciate it😊
@jithus89 9 months ago
> gse = GEOquery::getGEO(GEO = 'GSE183947', GSEMatrix = TRUE)
Error in open.connection(x, "rb") : Problem with the SSL CA cert (path? access rights?)
Why this error?
@MohammadNasirAbdullah 9 months ago
Thank you so much, it really helps me 😊😊😊😊😊😊😊😊
@alaminafendy6071 11 months ago
Thank you so much. Nicely explained.
@lisahuang850 2 years ago
Really nice video! I was wondering if you could demonstrate how to convert raw counts to TPM or FPKM values in R, as my GSE dataset provides raw counts. Thanks!
@Bioinformagician 2 years ago
Thanks for the suggestion. Will plan a video covering this!
@moulytasnuva1860 2 years ago
@Bioinformagician Is there any way to find a threshold value from FPKM to compare the early and late stages of cancer?
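For the raw-count question earlier in this thread, here is a rough base R sketch of the standard FPKM and TPM formulas. It is not from the video: the count matrix, the gene lengths and the helper names counts_to_fpkm() and counts_to_tpm() are all invented for illustration, and real gene lengths would have to come from the annotation used for quantification.

```r
# Toy raw-count matrix: rows are genes, columns are samples (values invented).
counts <- matrix(
  c(100, 200,
    300, 600,
    600, 200),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("sample1", "sample2"))
)

# Gene lengths in base pairs (assumed values, one per row of `counts`).
gene_length_bp <- c(geneA = 1000, geneB = 2000, geneC = 3000)

# FPKM: counts per kilobase of gene per million counted reads in the sample.
counts_to_fpkm <- function(counts, length_bp) {
  per_million <- colSums(counts) / 1e6   # library size of each sample, in millions
  per_kb <- length_bp / 1e3              # gene length in kilobases
  sweep(counts / per_kb, 2, per_million, "/")
}

# TPM: length-normalise first, then scale each sample to sum to one million.
counts_to_tpm <- function(counts, length_bp) {
  rate <- counts / (length_bp / 1e3)
  sweep(rate, 2, colSums(rate), "/") * 1e6
}

fpkm <- counts_to_fpkm(counts, gene_length_bp)
tpm <- counts_to_tpm(counts, gene_length_bp)
print(round(tpm, 1))
```

Each TPM column sums to one million, which makes TPM values easier to compare across samples than FPKM.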
@gustavoantoniobrugesmorale1881 2 years ago
You are excellent. Thank you!!!
@mikewafula9470 1 year ago
Thanks again for the video. I have managed to download the gene expression data (GSE 216497). How do I get its corresponding metadata?
@melinaguillon2449 4 months ago
Hi! I can't install GEOquery, I get this error message: Warning in install.packages : package ‘GEOquery’ is not available for this version of R
@uditgupta8961 1 month ago
Exactly!!
@QAKS1264 2 years ago
@arcturusdig1673 1 year ago
I can't understand most of the things you do. I need to go to other tutorial videos to understand every single step. If you want your viewers, especially beginners, to understand, then please make your explanations more lucid and easy.
@tushardhyani3931 2 years ago
Thank you for this video!!
@juliangrandvallet5359 2 years ago
Amazing!!!! Now how can I plot a heatmap out of this data?
@gaurangagarwal3817 2 months ago
Hey! Could you help me find the differential gene expression levels from a Gene Expression Omnibus dataset using the R limma package?
@Ijazalijin 2 years ago
How can I activate the GEOquery package?
@Bioinformagician 2 years ago
Run library(GEOquery) at the beginning of the script.
@harshjasani8637 2 years ago
Hello, thank you for the amazing video and tutorials. I could not load the GEOquery library. Any ideas what could be the reason?
@Bioinformagician 1 year ago
You probably need to install it first before loading it.
@andyderek3021 2 years ago
Thank you for this well-explained video. If I want to do survival analysis based on gene expression data with, let's say, GSE183947, how can I get the clinical data from GEO?
@Bioinformagician 2 years ago
If it is not provided with the metadata, you might have to reach out to the authors.
@IslamSafwat-- 6 months ago
GREAT! Many thanks :)
@awa8061 2 years ago
Can you suggest any Python package for gene expression analysis?
@Bioinformagician 2 years ago
Unfortunately, I do not have any recommendations for Python packages. I only use R for gene expression analysis.
@faizu0076 1 year ago
I couldn't find the getGEO protein query in this; no package seems to support this name. Please solve the problem.
@chinspostdoc 1 year ago
Hi, I have some questions. Please help me resolve or understand them.
1) What if the GEO study only gives us a raw file containing either text files or .CEL files? How do I read the data from that?
2) Suppose a GEO study contains many samples of different tissues. How do I make two groups comprising only the samples I am interested in? For example, I want to compare expression data from healthy and COVID patients, but the GEO study also contains samples of cell lines treated with a certain chemical along with tissues of healthy and COVID patients. How can I make two groups named healthy and COVID and assign the samples to those groups accordingly?
3) If the GEO raw file contains count text files for each sample, how can we use them for differential expression analysis?
Your kind reply would be much appreciated.
@Ojaswini-Pathak 2 years ago
Hi, I tried installing the GEOquery package and got the error "package GEOquery is not available for this version of R". Could you please help?
@naveenyethirajula1279 1 year ago
Please tell me how to install it.
@hiraalmas9042 1 year ago
I am facing the same issue.
@1980yadalam 2 years ago
Very good video, thanks.
@sharadjaiswal1705 1 year ago
Ma'am, how do I write the R script used in this video?
@muneeramashkoor7919 2 years ago
Hello, your videos are very informative. I am trying to look at the gene expression of my gene of interest. The supplementary data in GEO is in the form of a .fpkm_tracking file. How can I go about solving/looking at the expression using these files? Thank you!
@Bioinformagician 2 years ago
If there are no raw counts provided, you can create them yourself. You can fetch the RNA-Seq reads associated with the GEO dataset from SRA. Once you have the reads, you can align and quantify them to get counts.
@aheedan9957 2 years ago
Hi, nice one, but I did not understand the part about the pData and phenodata functions.
@yahyayozbatiran 2 years ago
Hello, how can I plot the expression of a specific gene across cancer subtypes from TCGA? For example, I want to plot MSH2 gene expression in Colon Mucinous versus Colon Adenocarcinoma.
@terryadams2652 2 years ago
@Bioinformagician, I apologize for my question (please), but, as a Biologist, I am now learning Python. I really don't want to spend what little time I have learning another language (R). So, to get these results, is it possible to just use Python instead of R? Thank you very much, my dear.
@Bioinformagician 2 years ago
You can perform the equivalent R operations in Python. I believe it is the pandas package in Python that will allow you to do all your data wrangling.
@kajalpanchal8239 2 years ago
Thank you, Khushbu!
@sanjaisrao484 2 years ago
Thanks
@aytacoksuzoglu2975 1 year ago
Why did we put -> . ?
@imvasco 2 years ago
What about GEO data that's not CSV but TXT?
@Bioinformagician 2 years ago
Sometimes gene expression data is also available as a .txt file on GEO. You can read in a .txt file much as you would read a .csv file in R. Please make sure the .txt file contains gene expression data. Usually, the 'data processing' section for each sample provides details on what the .txt file contains and how it was processed.
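As a small illustration of reading a .txt file the same way as a .csv, here is a base R sketch. It assumes the file is tab-delimited with a header row, which is common but not guaranteed for GEO supplementary files; the file contents below are invented so the example runs on its own.

```r
# Write a small tab-delimited file so the example is self-contained;
# gene names and values are made up.
txt_path <- tempfile(fileext = ".txt")
writeLines(
  c("gene\tsample1\tsample2",
    "BRCA1\t12.5\t10.1",
    "TP53\t48.1\t51.3"),
  txt_path
)

# read.delim() is read.table() with defaults suited to tab-separated files
# (header = TRUE, sep = "\t"); read.csv() is the comma-separated counterpart.
expr <- read.delim(txt_path, stringsAsFactors = FALSE)

str(expr)
```

If the file uses another separator, read.table(txt_path, header = TRUE, sep = " ") or a similar call with the right sep value should work instead.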
@irodasay3448 2 years ago
Thank you for the tutorial. I have a question about converting a GSE to an ExpressionSet. I used your vignette and tried to do the same for GSE181462. First, I got the GSE by: gse
@Bioinformagician 2 years ago
Try changing GSEMatrix = FALSE
@mohamedalfaki4268 2 years ago
Hi, and thanks for this very nice tutorial. I get this error when I try to reshape the data:
Error in `stop_formula()`:
! Formula shorthand must be wrapped in `where()`.
# Bad
data %>% select(~gene)
# Good
data %>% select(where(~gene))
@Bioinformagician 2 years ago
Can you give me a little context of what you are trying to do? I am having a hard time recreating this error. Thanks!
@markrenton6981 11 months ago
Can someone please explain what the two ".." are at the start of her file path when reading in the data file?
@Bioinformagician 11 months ago
The "../" is the Linux notation to move up a directory level in the file system hierarchy. For instance, if you're in the directory "/home/user/documents/" and you use "../", you'll move up to the "/home/user/" directory.
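To see the ".." behaviour from inside R, a minimal sketch follows. The directory names "project" and "scripts" are hypothetical and created under a temporary folder purely for the demonstration.

```r
# Build a throwaway directory tree: <tmp>/project/scripts
project_dir <- file.path(tempdir(), "project")
scripts_dir <- file.path(project_dir, "scripts")
dir.create(scripts_dir, recursive = TRUE, showWarnings = FALSE)

# ".." refers to the parent of the directory it follows,
# so <tmp>/project/scripts/.. resolves to <tmp>/project.
parent <- normalizePath(file.path(scripts_dir, ".."))

print(parent)
```

In a script, a path like "../data/file.csv" is therefore resolved relative to the parent of the current working directory, which getwd() reports.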
@zeynepdurkaya883 1 year ago
I can't run the command to call the data; the chapter at 6.14 isn't clear enough.
@sanjaisrao484 2 years ago
Ma'am, some datasets don't have sample names in the GEOquery metadata. Please help, I am stuck here.
@Bioinformagician 2 years ago
Are you using the same dataset used in the video?
@gargiagravanshi355 5 months ago
Hello ma'am! I really need your help. I'm stuck with a project and my mentor is very toxic. Please let me know how I can contact you.
@Bioinformagician 5 months ago
My contact details can be found in the video description :)
@SamipSapkota-zg8hy 4 months ago
The value of strain samples and cell.type becomes NULL.
@killa14108 2 years ago
Hi, what happens when there are NAs in the gene expression data? The accession number is GSE70947; it's a breast cancer dataset with 296 total samples and 62976 features (genes). I followed what you did and queried the data directly using GEOquery from Bioconductor. I am stuck now trying to figure out how to deal with the NAs and would appreciate your help. Thank you!
@Bioinformagician 2 years ago
I would quantify the NAs for each gene across all samples and filter out genes that have NAs in more than half of the samples. I usually prefer to replace NAs with 0.
@killa14108 2 years ago
@Bioinformagician Thank you very much! Do you also have any recommended methods for feature (gene) selection when creating a classification model to predict cancer/normal samples?
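The NA handling described in the reply above (count NAs per gene, drop genes missing in more than half of the samples, set the remaining NAs to 0) could look roughly like this in base R. The matrix is a made-up toy example, not GSE70947, and replacing NAs with 0 is the preference stated in the reply rather than a universal rule.

```r
# Toy expression matrix: rows are genes, columns are samples (values invented).
expr <- matrix(
  c(5.1, NA,  4.8, 5.0,
    NA,  NA,  NA,  2.2,
    7.3, 7.1, NA,  6.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("sample", 1:4))
)

# Count missing values per gene across all samples.
na_per_gene <- rowSums(is.na(expr))

# Keep genes that are missing in at most half of the samples.
filtered <- expr[na_per_gene <= ncol(expr) / 2, , drop = FALSE]

# Replace the remaining NAs with 0, as suggested in the reply.
filtered[is.na(filtered)] <- 0

print(filtered)
```

Here geneB is missing in three of four samples and is dropped, while geneA and geneC are kept with their single NA set to 0.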
@vahidgorganli8895 1 year ago
🙂👍
@muhammadrafiq7645 2 years ago
Great video. Can you please share your email? I need some help.
@hamadalbasri9058 1 year ago
Great video, but why not translate?!