Bioinformatics Project from Scratch - Drug Discovery Part 1 (Data Collection and Pre-Processing)

  Views: 143,629

Data Professor

4 years ago

Do you want to collect your very own novel and original biological dataset for use in your data science project? In this video, I will show you how to download and pre-process biological activity data from the ChEMBL database for use in computational drug discovery. The dataset is composed of compounds (molecules) that have been biologically tested for their activity toward a target organism or protein of interest. This video is Part 1 of a multi-part series on a bioinformatics project.
🌟 Buy me a coffee: www.buymeacoffee.com/dataprof...
⭕Code: ✅ github.com/dataprofessor/code...
⭕ Playlist:
Check out our other videos in the following playlists.
✅ Data Science 101: bit.ly/dataprofessor-ds101
✅ Data Science YouTuber Podcast: bit.ly/datascience-youtuber-p...
✅ Data Science Virtual Internship: bit.ly/dataprofessor-internship
✅ Bioinformatics: bit.ly/dataprofessor-bioinform...
✅ Data Science Toolbox: bit.ly/dataprofessor-datascie...
✅ Streamlit (Web App in Python): bit.ly/dataprofessor-streamlit
✅ Shiny (Web App in R): bit.ly/dataprofessor-shiny
✅ Google Colab Tips and Tricks: bit.ly/dataprofessor-google-c...
✅ Pandas Tips and Tricks: bit.ly/dataprofessor-pandas
✅ Python Data Science Project: bit.ly/dataprofessor-python-ds
✅ R Data Science Project: bit.ly/dataprofessor-r-ds
⭕ Subscribe:
If you're new here, it would mean the world to me if you would consider subscribing to this channel.
✅ Subscribe: kzbin.info...
⭕ Recommended Tools:
Kite is a FREE AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite and I love it!
✅ Check out Kite: www.kite.com/get-kite/?...
⭕ Recommended Books:
✅ Hands-On Machine Learning with Scikit-Learn : amzn.to/3hTKuTt
✅ Data Science from Scratch : amzn.to/3fO0JiZ
✅ Python Data Science Handbook : amzn.to/37Tvf8n
✅ R for Data Science : amzn.to/2YCPcgW
✅ Artificial Intelligence: The Insights You Need from Harvard Business Review: amzn.to/33jTdcv
✅ AI Superpowers: China, Silicon Valley, and the New World Order: amzn.to/3nghGrd
⭕ Stock photos, graphics and videos used on this channel:
✅ 1.envato.market/c/2346717/628...
⭕ Follow us:
✅ Medium: bit.ly/chanin-medium
✅ FaceBook: / dataprofessor
✅ Website: dataprofessor.org/ (Under construction)
✅ Twitter: / thedataprof
✅ Instagram: / data.professor
✅ LinkedIn: / chanin-nantasenamat
✅ GitHub 1: github.com/dataprofessor/
✅ GitHub 2: github.com/chaninlab/
⭕ Disclaimer:
Recommended books and tools are affiliate links that give me a portion of sales at no cost to you, which will contribute to the improvement of this channel's content.
#dataprofessor #bioinformatics #drugdiscovery #drugdesign #chembl #cheminformatics #bioinformaticsproject #bioinformaticproject #drug #drugs #molecule #molecules #machinelearning #lecture #dataprofessor #bigdata #QSAR #QSPR #machinelearning #datascienceproject #randomforest #decisiontree #svm #neuralnet #neuralnetwork #supportvectormachine #python #learnpython #pythonprogramming #datascience #datamining #bigdata #datascienceworkshop #dataminingworkshop #dataminingtutorial #datasciencetutorial #ai #artificialintelligence #tutorial #dataanalytics #dataanalysis #machinelearningmodel

Comments: 269
@DataProfessor 3 years ago
👉 Watch this video next (How to learn data science in 2021): kzbin.info/www/bejne/pYOZaGOKrdybbpo
Support this channel 👇👇👇
🌟 Buy me a coffee: www.buymeacoffee.com/dataprofessor
🌟 Download Kite for FREE: www.kite.com/get-kite/?
👉 Subscribe to this YouTube channel: kzbin.info
👉 Join the Data Professor newsletter: newsletter.dataprofessor.org
👉 Blogs on Medium: medium.dataprofessor.org/
@aadimnepal6497 2 years ago
Hi Professor. Where can I find the documentation for the ChEMBL library?
@kapilkumar-jn4il 2 years ago
Dear sir, Indian cards do not support automatic payments. Please set up a prepaid plan; I am having trouble paying.
@suthishmababu2107 2 months ago
Do you have documentation for this project?
@DataProfessor 4 years ago
Thanks to the discussion with Shweta in this comment section. Back in the day, 7 years ago, we manually compiled the bioactivity data of more than 2000 compounds from hundreds of research articles. The whole process took 6 months, and then we spent a few more months manually curating the data and double-checking it again and again for consistency. Fast forward to today, and we can do the same thing in less than 10 minutes, as shown in this video. I am thankful for the generosity of data providers in making these APIs available, as well as for libraries such as pandas (imagine handling hundreds of Excel files and curating them manually) and scikit-learn (imagine optimizing learning parameters manually on 50 computers through the GUI of data mining software such as Weka). Coding is indeed a real superpower. If you are wondering whether to learn coding, my recommendation is yes! It will be one of the best decisions for your career and hobby 😃
@shwetaredkar734 4 years ago
Very true. On behalf of all your subscribers, thank you for all your efforts and for bringing such amazing tutorials to all of us.
@DataProfessor 4 years ago
@shwetaredkar734 I'm flattered. Thanks for the kind words. Glad the tutorials were helpful.
@kapilkumar-jn4il 2 years ago
@DataProfessor Need help, sir.
@KenJee_ds 4 years ago
Can't wait for part 2! I know your subscribers have been asking for this series!
@DataProfessor 4 years ago
Thanks Ken! Inspired by subscribers and by Ken Jee's Data Science from Scratch series 😃
@KenJee_ds 4 years ago
@DataProfessor Haha, I am flattered! Keep up the great work!
@queenofunderland 2 years ago
Me too. Yay, thanks for the video, Prof. I learned something from my bioinformatics studies, but you definitely give me more experience!
@soukisama05 3 years ago
Dear Data Professor, I cannot even express how grateful I am for your content and dedication to your subscribers! I come from a biological background and I am new to the bioinformatics world. You give me motivation and great advice to continue studying in this incredible field. Great work. Greetings from Brazil!
@DataProfessor 3 years ago
Wow, thank you! Glad the content is helpful for your journey into bioinformatics 😃
@saraalm9567 1 year ago
This is incredible. Thank you so much for sharing your knowledge and experience with us!
@raedhanoon2791 4 years ago
This was a useful, informative video! It was straightforward and very interesting to learn about. You've inspired me to pursue a bioinformatics program, and I plan on starting my own project using the same tools demonstrated. Thanks, Data Professor!
@DataProfessor 4 years ago
Glad it was helpful!
@gauravbhattacharjee5737 4 years ago
Professor, this is an indispensable resource! You are the best!
@DataProfessor 4 years ago
Happy to help! Thanks for the kind words!
@mr.harambae 3 years ago
Yess! Finally! The niche teach professor!
@Yoursleepassistant 3 years ago
What an awesome explanation. I finally managed to find a channel that walks me through step by step. I sincerely thank you.
@DataProfessor 3 years ago
Happy to hear that, you're welcome!
@FarisIzzaturRahman 3 years ago
I watched this in May 2021 and am so excited about this amazing project video. What great content, thanks Prof!
@DataProfessor 3 years ago
Glad you enjoyed it!
@shwetaredkar734 4 years ago
This video is a treasure that I have found. Probably the first video ever on ChEMBL data collection. I wish this video had been out when my paper was under review last year. Luckily, I could resolve the reviewer's query on ChEMBL.
@DataProfessor 4 years ago
Shweta, thanks for the kind words! I'm flattered, glad it was helpful! 😃
@shwetaredkar734 4 years ago
@DataProfessor Seriously, I faced a lot of issues resolving that comment. Maybe I was not familiar with ChEMBL. But surely this video will help many others. Thanks for making this.
@DataProfessor 4 years ago
@shwetaredkar734 Thanks for sharing. And thanks again for the kind words 😃
@manabendraborah8654 1 year ago
Sir, thank you so much for simply sharing your knowledge.
@salikmalik7631 4 years ago
Great video, Data Professor. Waiting for part 2 :)
@flowstateofmind2380 10 months ago
I came here from the Data rockie channel. I would really love to encourage you to write a book on data science for bioinformatics. I'll be waiting to buy it 👍
@CostanzoPadovano 3 years ago
Perfect lesson, thank you Professor! Greetings from Italy!
@DataProfessor 3 years ago
Thanks for watching and glad it is helpful 😊
@ElijahErureh 1 year ago
I want to express my sincere appreciation for how you made data science so simple and interesting for me.
@DataProfessor 1 year ago
Thanks for the kind words, this means a lot!
@khaifea8829 4 years ago
As a Bioinformatics MSc student, I found this so interesting.
@DataProfessor 4 years ago
Thanks Muhammad for watching and for the kind comment! I am currently editing the video for Part 0 (Bioinformatics 101), and Part 2 will be filmed soon. 😃
@user-pm6hz9pb1e 3 years ago
Hi everyone, can anyone please tell me how to find past master's theses in bioinformatics on drug discovery?
@tiamat1628 5 months ago
Brilliant stuff, thank you very much for sharing your knowledge and wisdom.
@danieltoo2008 1 year ago
Wow!!! Now I can learn DS and bioinformatics from a Thai professional. Thanks Prof for helping me get out of my sandbox prison mindset. 😅😅😅
@anjelidubey1095 7 months ago
What a great video! This helps so much in getting started building my portfolio.
@onkarkumbhar1610 3 years ago
This is the one and only condensed resource, thanks!
@DataProfessor 3 years ago
Thank you for watching and for the support!
@TravisKPHall 1 year ago
Dear Professor, this was an amazing tutorial and lecture in one, thank you so much. I am currently trying to create my own independent model and have two questions. Is the classification of compounds as active, inactive or intermediate based on the IC50 value that you used universally accepted for all molecules when making a QSAR model? What is the justification for using these parameters, active 10000?
@alvinmodales6809 3 years ago
You are so amazing, Data Professor. Thanks for always sharing your expertise and knowledge.
@DataProfessor 3 years ago
You are very welcome
@avoniadevile3035 2 years ago
I am new to your channel. I was interested to learn about drug discovery, and I recently became interested in bioinformatics. Sadly, I think I chose the wrong course: I am a biomedical science student in my second year, and now I can see that I should have gone the bioinformatics way! It was a very straightforward lesson, Professor! Thank you so much.
@DataProfessor 2 years ago
Hi, thanks for the comment. It's okay, I was also a biomed student back in the day. You'll get solid domain knowledge of the biomedical sciences. You can also do bioinformatics on the side.
@marcofestu 4 years ago
Eager to see part 2, Professor 😁
@luiseduardogoncalves2228 3 years ago
Thank you Professor for this tutorial! I am very new to bioinformatics and I would like to confirm one thing. For example, when I search ChEMBL for a protein like Phospholipase A2 and then select targets, all of the results displayed are Phospholipase A2 targets, right? Secondly, if I choose single protein, this means that the displayed protein is targeted by Phospholipase A2. Is this correct?
@alizanaz5963 11 months ago
Hi! I have been following your tutorials for the past few days. I have a question. Is it necessary to test models on the same dataset they were trained with, or can a different test dataset be used for evaluation?
@ashwins5180 4 years ago
Thanks a lot, sir. Please upload part 2; I am eagerly waiting for your next video on bioinformatics projects in ML 😍😍😍
@DataProfessor 4 years ago
Thanks Ashwin, it was fun making this video. I will also release a Part 0 video (an introductory video on the biology background) 😃
@yinyang6058 3 years ago
Thank you! This is soooooo helpful to me!!!
@nidhibharani1886 3 years ago
Great video with a concise explanation 👍 Is there a similar way to download sequencing data? Please make a video on that topic.
@poojavani223 2 years ago
This video is really awesome, thank you so much.
@ramkumarrs1170 3 years ago
Wow! Super step-by-step explanation of the process!
@DataProfessor 3 years ago
Glad it was helpful! 😃
@anjuzoldyck9266 6 months ago
Thank you so much, Professor, you are a gem 💎
@aliqaitoon9965 4 years ago
Great lecture. Thanks, Professor. Please, we need the second part soon 🙏😍
@DataProfessor 4 years ago
Ali, glad you found the video helpful. I'm also thrilled to make the next part, please stay tuned 😃
@aliqaitoon9965 4 years ago
@DataProfessor I will be.
@adnaneaouidate3934 4 years ago
Thank you for the outstanding tutorials. I have a small problem: I prefer to work in Jupyter Notebook, but when I reach this step I get an error:
targets = pd.DataFrame.from_dict(target_query)
targets
TypeError: data argument can't be an iterator
I don't face this problem on Colab.
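One possible workaround for the TypeError above, sketched under the assumption that the query result is a lazy iterable of dict records: materialize it with list() before building the DataFrame. The function `fake_target_query` and its IDs are hypothetical stand-ins for the real ChEMBL query object, used only so that the snippet runs offline.

```python
import pandas as pd


def fake_target_query():
    """Hypothetical stand-in for a lazy query result: yields one dict per record."""
    yield {"target_chembl_id": "TARGET_A", "target_type": "SINGLE PROTEIN"}
    yield {"target_chembl_id": "TARGET_B", "target_type": "ORGANISM"}


# Materialize the iterator first; some pandas versions refuse raw iterators.
records = list(fake_target_query())
targets = pd.DataFrame.from_records(records)
print(targets)
```

With the real client, the same idea would be to wrap the query object in list() before handing it to pandas; whether that is needed may depend on the installed pandas version.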
@aashishkatyal 3 years ago
Awesome tutorial. A must-watch for beginners.
@DataProfessor 3 years ago
Glad you think so! Thank you for watching!
@delilahjones6496 1 month ago
I would really appreciate it if you explained your lines of code and how they work. Otherwise, I am not learning why I'm typing what I'm typing, or why it's necessary, etc. I just watched another series by a different guy who thoroughly explained every line of code he wrote in R and I walked away with a much better understanding of why each line of code is being typed or why certain arguments were used.
@DataProfessor 1 month ago
Thanks for the feedback 😊
@narendra_nn 2 years ago
Hello. In the recent version of the ChEMBL database (v29) there is a SARS-CoV-2 target with ChEMBL ID CHEMBL4523582. In this dataset the standard type feature contains Inhibition (~98% of the data) and IC50 (~2% of the data) as classes. I would like to know the difference between them, or can I treat both as the same standard type?
@sebastiancastro4126 4 years ago
Hello, I would like to ask a question. When we obtain the dataframe with the biological activity data for the selected target, some numeric columns have dtype "object", which prevents statistics from being computed. How can I change the dtype of these columns? Thank you very much!
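A minimal sketch of one way to handle the dtype question above, assuming the numeric column arrives as strings: pandas' `pd.to_numeric` can convert it in place. The rows and IDs below are made up for illustration; `standard_value` is the column name used elsewhere in this thread.

```python
import pandas as pd

# Made-up rows; numeric fields stored as text end up with dtype "object".
df = pd.DataFrame({
    "molecule_chembl_id": ["MOL_1", "MOL_2", "MOL_3"],
    "standard_value": ["750.0", "12000", None],
})

# errors="coerce" turns unparseable or missing entries into NaN instead of raising.
df["standard_value"] = pd.to_numeric(df["standard_value"], errors="coerce")

print(df.dtypes)
print(df["standard_value"].describe())
```

After the conversion, summary statistics such as describe() work on the column; rows that became NaN can then be dropped or inspected separately.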
@kamalikabhattacharjee336 2 years ago
This is so good for self-learners like me; thanks for creating this kind of bioinformatics content. Apart from acetylcholinesterase, what other targets could I build a similar project for? I tried human BRCA1, but it was not a great fit for reproducing the same project. Please suggest a few options.
@khushboodutta30 1 year ago
I'm also interested in learning to build a project for the HER2 / BRCA1 / Bcr-Abl genes.
@DataProfessor 1 year ago
Members of the Cytochrome P450 and kinase protein families are also worth exploring.
@alchemista947 3 years ago
Hello Professor, your content is very interesting. I just want to ask: can we predict the active sites of enzymes using Python?
@josefranklinct 4 years ago
That's fantastic, thanks so much Professor.
@DataProfessor 4 years ago
Thanks for watching! Glad you liked it!
@arkadeepbanerjee2347 4 years ago
Looking forward to part 2.
@DataProfessor 4 years ago
Arkadeep, thanks for the comment! Part 2 is in the making.
@Its_InduB 2 years ago
Hi. Can you suggest a tutorial for gRNA and target sequence design beyond the online tools available, and for working with the resulting data afterwards?
@positive51 3 years ago
Thanks for what you do!
@DataProfessor 3 years ago
A pleasure, thanks for watching!
@halizahasniaputri8869 3 years ago
I'm doing my final project for graduation. I thought of various approaches and couldn't find a way forward until I finally found your video. I'm very grateful that you made a video like this; it really helped me. May I ask a few things? How do you combine two chemical molecules to get a new molecular structure and then test it with bioinformatics? I am very confused about that, please help me. Can I talk about this with you? Thank you very much.
@ias....burningdreamofmylif5516 1 year ago
Have you done your project using this playlist?
@eduardourbano4596 4 years ago
Very interesting. Great job!
@DataProfessor 4 years ago
Thanks for watching 😃
@eyupbilgi3191 3 years ago
Thank you for this amazing introduction. Is it possible to use the same approach to collect data from article repositories like Elsevier, ScienceDirect, Scopus, etc. with certain keywords and endpoints?
@DataProfessor 3 years ago
A similar concept applies, but instead of the ChEMBL API you can look into using the APIs of these article repositories.
@sakichan2640 2 years ago
Thank you Professor! Please do more bioinformatics videos.
@DataProfessor 2 years ago
Sure thing! More to come
@varshak4824 4 years ago
Excellent and awesome. I will share these tutorials with my friends. Sir, I have a request. Could you please make a tutorial on how to collect data from BioLip, KEGG, PDBSUM, etc. and how to use various packages and tools such as Rcpi, Protr, PyPDI, Biopython, etc. to generate descriptors? Just waiting for the second part. Many thanks!!!
@DataProfessor 4 years ago
Thanks Varsha for the suggestions. A video on how to use BioPython to do a protein BLAST is currently in the making. More on molecular descriptors is also in the pipeline for forthcoming videos of the Bioinformatics Project series. 😃
@shwetaredkar734 4 years ago
@DataProfessor That's great news. 👍
@utkar1 4 years ago
Hey dataprof! Really excited about this amazing project. I followed the same steps you showed while creating the dataset. However, my findings were a little different. When looking for missing values, I found that the dataset has:
- 6 completely empty variables
- 3 features with 50% to 99% missing values
- 3 features with less than 50% missing values
So for the empty features, I can drop them straight away. How about the features with 50% to 99% missing values? Should I drop these as well? Simply applying dropna() leaves me with an empty dataset. Let me know your thoughts. Thanks a lot :)
@DataProfessor 3 years ago
Hi Utkarsh, great question! The data obtained in Part 1 is not yet ready for machine learning model building. It is still raw data that needs to undergo molecular descriptor calculation, which is covered in the subsequent parts. As for missing values, once you have computed the molecular descriptors, yes, you can remove those with significantly sparse data (normally we use >80% as the threshold, but you are welcome to tune it). Using dropna() gives you empty data because of the sparseness of the data, which leads us to use a low variance threshold instead (the inverse of the % missing values you mentioned, which are actually not missing values but rather have 0 as the value). Hope this helps. 😃
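The exchange above explains why a bare dropna() can empty the table: one fully empty column is enough to make every row incomplete. A rough sketch of an alternative, assuming a hypothetical cutoff `MAX_MISSING` and made-up data, is to drop sparse columns first and only then drop rows that lack the key value. This is an illustration, not the notebook's actual code.

```python
import numpy as np
import pandas as pd

# Made-up table with one usable column, one sparse column and one empty column.
df = pd.DataFrame({
    "molecule_chembl_id": ["MOL_1", "MOL_2", "MOL_3", "MOL_4", "MOL_5"],
    "standard_value": [750.0, 12000.0, np.nan, 310.0, 95.0],  # 20% missing
    "mostly_empty": [np.nan, np.nan, np.nan, np.nan, 1.0],    # 80% missing
    "all_empty": [np.nan] * 5,                                # 100% missing
})

MAX_MISSING = 0.5  # hypothetical cutoff; tune it for your dataset

missing_fraction = df.isna().mean()  # per-column share of NaN values
keep = missing_fraction[missing_fraction <= MAX_MISSING].index

# Keep the dense columns, then drop only rows that lack the key value.
cleaned = df[keep].dropna(subset=["standard_value"])
print(cleaned)
```

Calling df.dropna() directly on this table would return zero rows, because `all_empty` is NaN everywhere.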
@musicalrea5433 3 years ago
Great video! I have a question: when we have Ki as the standard_type instead of IC50, what are the Ki intervals for labeling compounds as active, inactive and intermediate?
@DataProfessor 3 years ago
Hi, Ki happens to have the same bioactivity range, so you can use the same threshold values.
@musicalrea5433 3 years ago
@DataProfessor Great! Thank you so much, Data Professor!!
@user-bt2kf5ls2h 1 year ago
Hello! I am working on a project on drug-target interaction prediction using ML. I wanted to know if I can use the data for my project in the same way you showed.
@pravalikas.p8520 2 years ago
Hi sir, thank you so much for this explanation.
@saipavanchary0521 3 years ago
Sir, subscribed just now!!
@DataProfessor 3 years ago
Thanks, welcome aboard 😄
@andyderek3021 2 years ago
Interesting tutorial! I have a question: can someone explain to me why we opt for "SINGLE PROTEIN" and not "Organism"?
@Kelly-ks5nk 3 months ago
Are there any prerequisites for doing this project? For example, are we supposed to know some ML beforehand?
@devarakondahimaja8423 3 years ago
I want to write an if statement so that, when the ID number in row(1) of a CSV file equals an ID number in my text file, the sequence is printed (that is, the information for that particular ID). How do I write this?
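One way the ID lookup asked about above could be sketched, using only the standard library. The layout is an assumption: the ID sits in the first column of each CSV row, the sequence in the second, and the text file holds one ID per line. The in-memory files and IDs are made up so that the snippet runs on its own.

```python
import csv
import io

# Stand-ins for the two files; with real files use open("ids.txt") and open("data.csv").
ids_file = io.StringIO("ID002\nID003\n")
csv_file = io.StringIO("ID001,ATGC\nID002,GGCA\nID003,TTAG\n")

# Load the wanted IDs into a set for fast membership tests.
wanted = {line.strip() for line in ids_file if line.strip()}

matches = []
for row in csv.reader(csv_file):
    # The "if" check: does the ID in the first column appear in the text file?
    if row and row[0] in wanted:
        matches.append(row)
        print(row[0], row[1])
```

Reading the IDs into a set first avoids re-reading the text file for every CSV row.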
@priyankachauhan9513 3 years ago
Hello sir, I am not able to run the GitHub tools for calculating the network proximity of bladder and colorectal cancer, so please 🙏 make a video on this.
@JellaMaruti 4 months ago
Hello Dr Chanin, what if the units for IC50 are different? I am trying to create a project, but the data imported from ChEMBL contains different units (uM and nM). How do I solve this problem? I went through one of your comments, but could not find the code in the notebook that you mentioned. Thank you for the help!
@DataProfessor 3 months ago
Hi, you'll need to convert the units to the same scale by converting either uM (micromolar, 10^-6 M) to nM (nanomolar, 10^-9 M) or vice versa.
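The conversion described in the reply above can be sketched as follows. Since 1 uM equals 1000 nM, each value is multiplied by a per-unit factor. The column names `standard_value` and `standard_units`, the output column `standard_value_nM` and the sample rows are assumptions for illustration, and only these two units are handled.

```python
import pandas as pd

# Factors that express each unit in nM; only these two units are covered here.
TO_NM = {"nM": 1.0, "uM": 1000.0}

df = pd.DataFrame({
    "molecule_chembl_id": ["MOL_1", "MOL_2", "MOL_3"],
    "standard_value": [750.0, 1.5, 12.0],
    "standard_units": ["nM", "uM", "uM"],
})

factor = df["standard_units"].map(TO_NM)  # NaN for any unit missing from TO_NM
df["standard_value_nM"] = df["standard_value"] * factor
print(df)
```

Rows with an unrecognized unit end up as NaN rather than being silently mis-scaled. Mass-based units such as ug/mL cannot be handled with a fixed factor, because that conversion depends on each compound's molecular weight.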
@mohtashimnizamani 2 years ago
I'll be studying computer science at university and I'm also interested in bioinformatics, but I don't have any biology background whatsoever. How should I approach bioinformatics, and can I self-study biology concepts and start doing bioinformatics projects?
@chyokomizo 3 years ago
Great video! Thanks a lot!
@DataProfessor 3 years ago
Thanks for watching!
@UwU-uq9pq 3 years ago
Hi Professor, I have faced a problem when following your code on Google Colab. At the first step of copying files to Google Drive, I encounter a "ModuleNotFoundError". I found that this is caused by the module 'termios', which is only available on Unix systems. I see that you are also using Windows, so how did you solve this problem? The results I found on Google only tell me to switch to a Unix OS.
@samarafroz9852 3 years ago
You're doing such a great job, sir. Please upload more tutorials about AI and drug discovery, and please cover drug design with GAN and AAE.
@DataProfessor 3 years ago
Thanks for the suggestion!
@samarafroz9852 3 years ago
@DataProfessor Welcome, sir.
@garrettmccue2644 3 years ago
Creating the df from the bioactivity data dictionary is taking a long time (30+ min) with no luck. Any suggestions on speeding this up? I am using Jupyter, FYI.
@rhard007 4 years ago
Hi Professor, can you recommend some books for self-study on bioinformatics? Thank you.
@DataProfessor 4 years ago
Hi, there are quite a few books. The O'Reilly book "Bioinformatics Programming Using Python: Practical Programming for Biological Data" seems like a good way to get started with Python coding through bioinformatics examples.
@abolajishiwoku32 3 years ago
As an aspiring data scientist, this project has been really helpful in building my portfolio with original data. Do you think building a portfolio of projects based on past academic papers is useful in today's biotech industry?
@DataProfessor 3 years ago
Yes, definitely. The existing research community is built on doing cutting-edge research that is based on past academic papers. The goal is to find knowledge gaps in the current research literature and to make incremental improvements that would lead to publishing the work itself as a research paper. Hope this helps.
@gayathrim3022 3 years ago
Could you please post a video on extracting important drug-related information (chemical, ADR, etc.) using NLP techniques?
@DataProfessor 3 years ago
Thanks for the suggestion, will definitely consider this for future videos.
@nguyenthanhtrung3768 3 years ago
Special thanks for your useful and practical video. It would be even more applicable if you could show how to carry out this work in R.
@DataProfessor 3 years ago
Great suggestion! I'll definitely consider this for future videos. In the meantime, please refer to the GitHub repository and the research paper of one recent publication from our research group: github.com/chaninlab/anti-sickling pubs.rsc.org/en/content/articlelanding/2018/ra/c7ra12079f
@nguyenthanhtrung3768 2 years ago
@DataProfessor Thank you very much for your valuable suggestions. It would be very helpful if you could advise how to download the necessary data from other databases to apply this to other drug discovery projects.
@abdulrhmana.elshiekh383 3 years ago
You are awesome, man!
@DataProfessor 3 years ago
Hey thanks!
@muhammadjamalahmed8664 4 years ago
I love your work...
@DataProfessor 4 years ago
Thanks!
@ArvindPandeyeng 2 years ago
Can you post the link to the research paper you were referring to?
@MatthewRalston89 2 years ago
This one is great!!
@MatthewRalston89 2 years ago
I find this channel problematic and controversial, up there with Ken Jee and others. So much SEO on data science when the majority of them/us don't have the necessary foundations in mathematics to properly make the most of any data science adjacent topic. I wish that the govt would fund a channel like this because I do believe in your intentions, and the quality here is 5x better than Ken Jee's. But it's not 100x like statquest or 3blue1brown. Anyways, thanks for a *good* topic and keep up the work!
@elshroomness 3 years ago
I am having an issue trying to install the ChEMBL web resource client. This is the error that I am running into: "ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall." Can anyone help? EDIT: I uninstalled Anaconda and reinstalled it, and it's working fine.
@shraddhaverma4111 2 years ago
Sir, can you please recommend any for computational chemistry?
@johntichenor1601 3 years ago
Hey there, I'm getting an AttributeError when setting target = new_client.target. Do you know the reason for this?
@wilzgaming777 2 years ago
Thank you so much Professor, this is a very interesting project. I want to ask: what if at 8:30 (retrieving the bioactivity data) nothing appears? Does it mean there is no data about the bioactivity of my target? Thank you Professor. I am Hawil, a biology student from Indonesia. Love your channel!
@fazlurrahman5017 1 year ago
Following
@hermainrais2280 3 years ago
Hello everyone, can anyone please tell me where I can find purely computational articles in bioinformatics?
@kampco9982 23 hours ago
I am trying to follow along, doing this with the dopamine D2 receptor protein, but after pre-processing the data I noticed that there are some duplicate entries that have the same molecule_chembl_id but different standard_value. Is this normal?
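Repeated rows like the ones described above can be inspected with pandas before deciding what to do with them. The sketch below uses made-up values, and the median is only one possible aggregation policy (an assumption here, not something prescribed in the video); keeping the first record or the lowest value are other options.

```python
import pandas as pd

df = pd.DataFrame({
    "molecule_chembl_id": ["MOL_1", "MOL_1", "MOL_2", "MOL_3", "MOL_3"],
    "standard_value": [100.0, 300.0, 50.0, 10.0, 30.0],
})

# Rows whose compound ID occurs more than once (keep=False marks every copy).
dupes = df[df.duplicated("molecule_chembl_id", keep=False)]
print(dupes)

# One possible policy: collapse repeats to the median value per compound.
collapsed = (
    df.groupby("molecule_chembl_id", as_index=False)["standard_value"].median()
)
print(collapsed)
```

Looking at `dupes` first helps show whether the repeated values are close to each other or differ by orders of magnitude, which may influence the choice of policy.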
@arvinths9025 3 years ago
Sir, is this only for target types in the ChEMBL database, or also for the compound, tissue and assay types shown in the ChEMBL search bar?
@DataProfessor 3 years ago
Hi. We are searching for a "Target" of interest (a specific protein such as Aromatase), and the associated compounds that have been reported to bind to the target protein will be downloaded as our dataset.
@meghanakaranam1 1 year ago
Hi, chembl_webresource_client sometimes gives "http error: 500 internal server error". Can you please suggest an alternative to this package?
@user-rq7em3vx1r 19 days ago
Hi, if any of my fellow bioinformaticians are watching this video, let me know how it helps. I don't mean anything negative. The video seems great. But here you have already built a project and are running it in front of us. How do I build one of my own? Could you please teach us that? Or is this how people generally take inspiration for making projects? I am sorry, I am very new to this.
@koussaisalem4894 1 year ago
Why did you use IC50 to work on? Why don't you use all the features?
@sadiafarzanadiya2800 3 years ago
As a high school student, this has gone over my head. Now I need some sleep.
@fahaddewtoniumfd5428 4 years ago
Hi. Is there any prerequisite for fully understanding this video? I have an undergraduate biology background, with very little experience in programming. Will that be enough?
@DataProfessor 4 years ago
Yes, absolutely. I go at a basic level that you can follow step by step in a guided way. I would also recommend taking an introductory Python course, such as the Kaggle mini-course, to augment your understanding. Please also see this video: kzbin.info/www/bejne/kKW3pWtoZ8d3hKc
@adityakoushik5726 2 years ago
Hello, great tutorial! Is it possible to use this method with your own target protein (not from ChEMBL)?
@DataProfessor 2 years ago
No for Part 1 of this series, which is for target proteins available in the ChEMBL database. Yes for Part 2 and beyond, where you can use whatever dataset you may have compiled from other databases (PubChem or manually from the literature) and follow the same workflow.
@pradumnchavan1745 4 years ago
Thank you sir for this amazing tutorial. I am an MSc bioinformatics student. Can you please make a complete video on another bioinformatics project using a text mining and data curation approach?
@DataProfessor 4 years ago
Thanks for the suggestion 😃
@rajatnandi2175 3 years ago
How do I calibrate the standard units column where the units are given in either nM or µg per ml? Do I need to separate the compounds into two datasets, or is there a line of code that can convert both units into one? Thanks in advance.
@DataProfessor 3 years ago
Hi, there's code to do that in the subsequent Jupyter notebook; you can find it in the video description.
@rajatnandi2175
@rajatnandi2175 3 жыл бұрын
@@DataProfessor ok i will check..thank you very much..
@rajatnandi2175 3 years ago
@@DataProfessor Very sorry to disturb you, but I failed to find the code in the subsequent Jupyter notebook. Would you please give its name or title on GitHub? Thank you again.
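For readers with the same question, here is a minimal sketch of one way to bring mixed units onto a single nM scale. It is not taken from the author's notebook. It assumes the unit labels are "nM" and "ug.mL-1", and that a molecular weight column (hypothetically named mol_weight, in g/mol) is available, since converting a mass concentration to a molar one requires it: nM = (µg/mL) × 10^6 / MW.

```python
import pandas as pd

# Assumption: these unit labels and the "mol_weight" column are illustrative;
# check the actual strings and columns in your own data frame.
def to_nanomolar(value, unit, mol_weight):
    """Convert a concentration to nM. mol_weight is in g/mol."""
    if unit == "nM":
        return float(value)
    if unit == "ug.mL-1":
        # 1 ug/mL = 1e-3 g/L; divide by g/mol to get mol/L, then scale to nM
        return float(value) * 1e6 / mol_weight
    return float("nan")  # unknown unit: leave as missing rather than guess

df = pd.DataFrame({
    "standard_value": [50.0, 0.5],
    "standard_units": ["nM", "ug.mL-1"],
    "mol_weight": [300.0, 500.0],
})

df["value_nM"] = df.apply(
    lambda row: to_nanomolar(
        row["standard_value"], row["standard_units"], row["mol_weight"]
    ),
    axis=1,
)
```

Rows with any other unit come out as NaN, so they can be inspected or dropped explicitly instead of being silently mixed in.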
@alexaiden.forever 3 years ago
It's an amazing series.
@DataProfessor 3 years ago
Thanks for the kind words!
@alexaiden.forever 3 years ago
@@DataProfessor Can we use any other database instead of ChEMBL?
@azkajunaidlife 2 years ago
Lots of appreciation for this series and your work. You are an inspiration to me. Please guide me: I'm stuck on the 4th code cell, where we need to select a target after running the target search. I opened it in Colab and it's showing a red exclamation mark.
@azkajunaidlife 2 years ago
I'm a beginner and new to bioinformatics.
@DataProfessor 2 years ago
Hi, what error message is it giving?
@cooldayka 4 years ago
Dear Professor, is it possible to contact you somehow? I have a small question, not really about this topic but about the course that is available. I am a bit confused about whether I should pay for it or not, since it most probably will not be very beneficial.
@DataProfessor 4 years ago
Hi, you can contact me at dataprofessorofficial@gmail.com
@idanmorad4769 4 years ago
Try to use more of pandas' built-in methods, such as map or apply, instead of looping over the data frame.
@DataProfessor 4 years ago
Thanks Idan for the valuable suggestion!
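As a small illustration of the suggestion above, the sketch below labels compounds with Series.apply instead of an explicit loop. The column name standard_value and the cut-offs (1,000 nM and 10,000 nM) are assumptions chosen for the example; adjust them to whatever your own workflow uses.

```python
import pandas as pd

def classify(ic50_nm):
    """Label a compound by its IC50 in nM (thresholds are illustrative)."""
    if ic50_nm >= 10000:
        return "inactive"
    if ic50_nm <= 1000:
        return "active"
    return "intermediate"

# Hypothetical values standing in for a bioactivity data frame
df = pd.DataFrame({"standard_value": [500.0, 5000.0, 20000.0]})

# apply calls classify once per value and returns a new Series,
# so no manual list-building or index bookkeeping is needed
df["bioactivity_class"] = df["standard_value"].apply(classify)
```

For purely numeric thresholds, pd.cut would be another option; apply is shown here because it maps directly onto the loop it replaces.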
@naganandhireddygunreddy5397 1 year ago
How do I create a data folder in Jupyter Notebook and copy the bioactivity_data CSV file into it? Please help me with this part in Jupyter Notebook; I got stuck here. The video explains how to create the folder and copy the file for Colab notebooks on Google Drive.
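In a local Jupyter notebook, one possible approach is to use Python's standard library rather than the Colab and Google Drive commands. This is a sketch only: the helper name copy_to_data_folder is hypothetical, and the file name in the usage line is assumed from the question above, so replace it with the path of your own CSV file.

```python
import shutil
from pathlib import Path

def copy_to_data_folder(csv_path, folder="data"):
    """Create `folder` if it does not exist and copy `csv_path` into it."""
    target_dir = Path(folder)
    # parents=True creates intermediate folders; exist_ok=True avoids an
    # error when the folder is already there
    target_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the newly created copy
    return Path(shutil.copy(csv_path, target_dir))

# Usage (assumed file name; run in the notebook's working directory):
# copy_to_data_folder("bioactivity_data.csv")
```

The folder is created relative to the notebook's current working directory, which is usually the directory the notebook file lives in.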
@lutfunchowdhury5711 3 years ago
Hello sir, it was such an organized and precise video. It would be great if you could make a video on how to use RStudio for bioinformatics data analysis. Thank you so much. 🇧🇩
@DataProfessor 3 years ago
Thanks for the suggestion. Yes, I could definitely do that.
@benjamintwumasi2480 1 year ago
In the Importing Libraries section of the Colab notebook, I still get a message like "No module named 'chembl_webresource_client'". Please look into that.
@DataProfessor 1 year ago
Have you installed this library before running the code cell?
@benjamintwumasi2480 1 year ago
@@DataProfessor Yeah, that day I realised the mistake was on my end.
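For anyone hitting the same message, a quick way to confirm whether the package is visible to the current kernel is sketched below. The helper name is_installed is hypothetical; the check itself uses only the standard library.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found by the current kernel."""
    return importlib.util.find_spec(module_name) is not None

if not is_installed("chembl_webresource_client"):
    # In a notebook, run the install in its own cell first, e.g.
    #   !pip install chembl_webresource_client
    print("chembl_webresource_client is not installed in this environment.")
```

On Colab the installation has to be repeated whenever the runtime is reset, because each new session starts from a fresh environment.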
@gutsandglory7507 6 months ago
Secondly, what do assay types A and B mean? Thanks.
@joe6372 2 years ago
Data Prof, thanks a lot for the information. You are an amazing teacher. How do I message you for guidance on my project?
@angsumandas1 3 years ago
Sir, I tried hard to apply these techniques to autoimmunity and allergenic drug targets, but I did not succeed. Please show me the path!
@subramanianramajayam2467 2 years ago
I was able to complete the program successfully after some attempts. I have yet to absorb it fully before attempting a new one of my own.
@user-bn8vv8mk8f 3 years ago
Hello! Can you help me? I'm getting this error when importing the libraries: ImportError: cannot import name 'hashlib' from 'requests_cache.backends.base' (/usr/local/lib/python3.7/dist-packages/requests_cache/backends/base.py). Do you know why this is happening?
@user-bn8vv8mk8f 3 years ago
This error occurs when I execute: from chembl_webresource_client.new_client import new_client. Thanks for the guidance.
@DataProfessor 3 years ago
Hi, @VxS replied to your other post.
@lotfibourougaa6209 1 year ago
Hi sir, can you share a research paper that discusses these approaches?