Whole Genome Sequence Analysis | Bacterial Genome Analysis | Bioinformatics 101 for Beginners

Views: 32,034

Bioinformatics Coach

1 day ago

Comments: 96

@humphreyaddy7716 1 year ago

I wish I had discovered this channel long ago. I now have all the resources to become good in the area of bioinformatics.

@bioinformaticscoach 2 years ago

One-on-one coaching: clarity.fm/vincentappiah
Reach out: bioinformaticscoach@gmail.com

@elizabethgyamfi1617 3 years ago

Great work. Simplified presentation. Well done

@ayoajayi280 3 years ago

Hello.

@bioinformaticscoach 2 years ago

One-on-one coaching: calendly.com/bioinformaticscoach

@yusufomowumi4771 3 years ago

This video was very helpful. Can you do a tutorial on how to detect contamination from reads? Thank you!

@alita2220 1 year ago

This is an amazing tutorial, thank you! Because the tutorial's sequence data is short plus long reads, I am swapping a few of the tools for PacBio HiFi data. It teaches me how to fish. It would be great if there were future videos on calling variants!

@bioinformaticscoach 1 year ago

You can watch the tutorial on snippy, bcftools and freebayes.

@naveedkhan-fi6ux 2 years ago

A great piece of work. Awesome explanation that makes it easy to follow. I wish you could upload a video on comparative genomics of fungi to sort out the effectors.

@jesusgiovanimamani5671 10 months ago

Thank you so much. But I have some doubts: I'm using the macOS terminal, and installing from environment.yaml failed. Is the problem the type of OS? Is this tutorial only for Linux commands?

@bluefox_genshin 9 months ago

Hi, I'm also experiencing the same thing. :(

@johirislam8174 1 year ago

Hello. Do these lectures cover WGS data analysis from start to finish on Linux? I mean from quality check to variant calling and variant annotation?

@dr.maqsoodahmad8572 5 months ago

Great work, need more videos

@ldipotet 3 years ago

That's amazing work you have done here! Congrats.

@bioinformaticscoach 3 years ago

Thanks. Expect more of such videos soon.

@ldipotet 3 years ago

@bioinformaticscoach An interesting challenge could be to put all these commands in a CWL pipeline.

@bioinformaticscoach 3 years ago

@ldipotet Yes. That will be interesting. Maybe we can take it up in the future.

@kubrateksen8845 3 years ago

Amazing, we are waiting for more videos.

@SobinSGupta-vq3zn 3 years ago

Your tutorial is amazing, but when I try to follow the same steps, it automatically loads the sequence you used for the demonstration. How can I work with my own sequence following the steps in the video? Please reply as early as possible. I will be really thankful.

@manishvictor5293 2 years ago

Dear Dr. Vappiah, very nice GitHub page and description of it in the video. I am having a problem with ./polish.sh: the program runs fine, but at the end it returns:
cp: cannot stat 'pilon_stage1.fasta': No such file or directory
cat: polishing_process/pilon_stage1.changes: No such file or directory
Could you please help sort out the error?

@bioinformaticscoach 2 years ago

It's likely you missed a step. Try starting the analysis from the beginning.

@muhammadshafiq3242 3 years ago

Hello, Sir, I have a problem with trimming. Could you kindly help me? When I write the skip it does not run for trimming.

@ehecatl3830 3 years ago

Thanks, Dr. You are very good!

@anindorahman2600 2 years ago

Hello sir, I have a query. The command conda env create -f environment.yaml isn't working. It has remained in the "Solving environment" state for 2-3 days and still doesn't finish. I have tried everything, but it doesn't work. Can you please help with this?

@bioinformaticscoach 2 years ago

Try updating your conda. If you still have issues, you can book a session with me and we can look at it.

@anindorahman2600 2 years ago

When are you available, sir? I want to book a session. It still doesn't work.

@bioinformaticscoach 2 years ago

@anindorahman2600 You can request a session here: clarity.fm/vincentappiah

@naveedkhan-fi6ux 1 year ago

Hi. I was following your guideline for BRIG, but I am not able to compare my genomes because it shows an error about the genome being too large. My species' genome size is 41 Mb, so what other tool can I use for genome comparison?

@bioinformaticscoach 1 year ago

You can use Circos. Alternatively, you can book a session with me and we can discuss further.

@MrManikprabhu 1 year ago

Hi, is it possible to make a Venn diagram for five or six genomes?

@leonmaric5055 3 years ago

Very helpful! greetings

@yushanlin2745 2 years ago

Thank you for such an amazing video; it really helps me a lot with my research. I have one point of confusion: is reordering indispensable for bacterial assembly? Does skipping reordering affect pangenome analysis? I have finished MLST and virulence gene detection, and it doesn't seem to matter there. My data is Illumina NovaSeq paired-end, 2×150 bp. I read the RagTag paper and found that its data is long-read genome sequencing (average 15 kbp) from plants. Looking forward to your reply. Thanks again.

@bioinformaticscoach 2 years ago

Reordering is not really necessary for pangenome analysis. But I advise you to do it if you want to generate a draft sequence of your sample, because it maps your contigs to a reference genome and reorders them using the reference as a template. So the sequence you get is better than the raw assembly contigs.

@RuqaiyaTasneem-z5w 8 months ago

Hello, I am having issues trying to create the environment from the yaml file in conda, even after updating conda. It says: warning libmamba Problem type not implemented SOLVER_RULE_STRICT_REPO_PRIORITY. I am using WSL.

@RuqaiyaTasneem-z5w 8 months ago

Could not solve for environment specs. The following packages are incompatible... It's talking about bioperl.

@RuqaiyaTasneem-z5w 8 months ago

I have installed Perl, but it's showing the same issue.

@biozarrice 3 years ago

Good Morning. I think your bioinformatics tutorials are amazing. Could you do a tutorial on genome annotation of eukaryotic organisms?

@bioinformaticscoach 3 years ago

Thanks Fernando for the suggestion. I will consider that.

@sciforlife 3 years ago

I have to work on a project this semester, and I want to do a bioinformatics study on microbial data. I just need a direction: what kinds of studies can I do using bioinformatics techniques or machine learning?

@bioinformaticscoach 3 years ago

That will be nice. First you need to read some papers to get to know what kinds of bioinformatics studies are done in the field. That is why I made this video on bacterial genome analysis. You can do a similar analysis with the pipeline I demonstrated and use the explanation I gave as a guide. For machine learning, you need to first identify your area of interest and look at how machine learning is being applied in that area. There are lots of datasets available for you to use. Just identify your area of interest and you will be able to connect the dots. For example, if I were interested in cancer studies, I would look at how machine learning is used to predict cancer. Get to know what datasets are available and choose the one that works for you.

@sciforlife 3 years ago

@bioinformaticscoach Thank you so much. May God bless you.

@bioinformaticscoach 3 years ago

@sciforlife You are welcome. You can also share it with those who need it.

@muhammadshafiq3242 3 years ago

Very nice tutorial. Can I use Xftp and Xshell instead of Anaconda to do this kind of analysis? Thanks.

@bioinformaticscoach 3 years ago

Anaconda is used to install the tools, so it is important that you install it. But if you have a server that already has all the tools installed, then you don't need to. Xftp and Xshell are used to log in to SSH servers. You use them if you are accessing a remote server.

@muhammadshafiq7141 3 years ago

@bioinformaticscoach Yes, we have these servers provided by our school. Can I use the methods from this tutorial in Xshell and Xftp? I just watched the video.

@bioinformaticscoach 3 years ago

@muhammadshafiq7141 You may want to ask your system admin about this. I personally use a Linux system, so I use my terminal to log in. I have also used MobaXterm on Windows.

@ldipotet 3 years ago

Hi Vincent, I was trying out your pipeline and found that in my setup spades.py fails with the --careful option, so I had to run it with the --isolate option, and the result is the same as yours with --careful. I guess it is due to the spades.py version or some other aspect of my environment, but in my setup --careful runs in Standard mode and raises different exceptions related to some internal compression processes. I'm new to this kind of ecosystem, so what determines the version of the installed software? Your environment.yaml never specifies any version. In my case I do it in my Dockerfile, which first installs my platform and then tailors every specific thing I need for each specific channel. Thanks in advance; a hint on this would be appreciated.

@bioinformaticscoach 3 years ago

Hi @ldg, thanks for the message. With Anaconda, if you don't specify a software version, it uses the most recent one. The --careful option works with SPAdes 3.14 and upwards, so if you got the error, then it's likely your SPAdes version is lower than 3.14. Thanks for the suggestion as well.

@ldipotet 3 years ago

@bioinformaticscoach The version I am running is SPAdes genome assembler v3.15.3. The manual says about --isolate: "This option is not compatible with --only-error-correction or --careful options." Thank you so much for your answer and for the clarification about version management in Anaconda.

@bioinformaticscoach 3 years ago

@ldipotet Yes, it's true about the compatibility. So you have to choose.
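Editor's sketch: the exchange above comes down to passing either --careful or --isolate to spades.py, never both. One way to make that choice explicit is to build the command from a single mode value. This is a minimal illustration, not the tutorial's own script; the function name, the `mode` parameter, and the read and output file names are hypothetical.

```python
from typing import List, Optional


def build_spades_command(
    reads_1: str,
    reads_2: str,
    outdir: str,
    mode: Optional[str] = "careful",
) -> List[str]:
    """Return a spades.py argument list for paired-end reads.

    `mode` is a hypothetical convenience parameter: "careful", "isolate",
    or None for neither flag. Taking one value instead of two booleans
    means the two incompatible flags can never be combined.
    """
    if mode not in ("careful", "isolate", None):
        raise ValueError(f"unknown mode: {mode!r}")
    # -1/-2 are the paired-end read files, -o is the output directory.
    cmd = ["spades.py", "-1", reads_1, "-2", reads_2, "-o", outdir]
    if mode is not None:
        cmd.append(f"--{mode}")
    return cmd


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with the tutorial.
    command = build_spades_command(
        "sample_R1.fastq.gz", "sample_R2.fastq.gz", "sample_assembly", mode="isolate"
    )
    print(" ".join(command))
```

The returned list can be handed to subprocess.run once you have checked the printed command against your installed SPAdes version.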

@nickalbbar 1 year ago

First of all, thanks for this very helpful video. I was following the pipeline but got stuck on an error in the step where you run the reorder_contigs script. It starts to run, but then I get the following message:
"""
Traceback (most recent call last):
  File "extract_reordered.py", line 13, in <module>
    reordered=[i for i in allseq if 'RagTag' in i.id and ID in i.id][0]
IndexError: list index out of range
"""
Then it doesn't generate the P7741.reordered.fasta. I've tried to repeat the process but can't find a solution. What should I do?

@bioinformaticscoach 1 year ago

Hi @nickalbbar. Are you running the pipeline on your own dataset or the data provided in the tutorial?

@nickalbbar 1 year ago

@bioinformaticscoach I'm using the dataset provided in the tutorial.

@bioinformaticscoach 1 year ago

Hi @nickalbbar. I am investigating the issue. I will get back to you

@nickalbbar 1 year ago

@bioinformaticscoach OK! Once again, thank you so much.

@bioinformaticscoach 1 year ago

@nickalbbar In the meantime you can watch this tutorial. I am sure it will be useful: kzbin.info/www/bejne/enzNeIitpaiHeqM

@rajneeshdadwal 1 year ago

I wish to extract the draft genome from the RagTag output. How can I do that?

@bioinformaticscoach 1 year ago

I do this using some Python code. You can modify the extract_reordered.py file and use it to extract the draft sequence. If you still have issues, then you can book a session with me.
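Editor's sketch: the tracebacks quoted in this thread suggest that extract_reordered.py keeps records whose ID contains 'RagTag'. The code below is not that script. It is a standard-library illustration of the same idea, and the function names and the 60-column line wrapping are assumptions. It matches the tag against the whole header line rather than only the record ID, and it raises a clear error instead of an IndexError when nothing matches.

```python
from pathlib import Path
from typing import Iterator, List, Tuple


def read_fasta(path: Path) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            elif header is not None:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def extract_ragtag_records(scaffolds: Path, output: Path, tag: str = "RagTag") -> List[str]:
    """Write records whose header contains `tag` to `output`; return their headers."""
    selected = [(h, s) for h, s in read_fasta(scaffolds) if tag in h]
    if not selected:
        # An empty selection is what triggers "list index out of range"
        # when code indexes [0] on the filtered list.
        raise ValueError(f"no record in {scaffolds} has '{tag}' in its header")
    with open(output, "w") as out:
        for header, seq in selected:
            out.write(f">{header}\n")
            # Wrap sequence lines at 60 characters (an arbitrary choice).
            for i in range(0, len(seq), 60):
                out.write(seq[i:i + 60] + "\n")
    return [h for h, _ in selected]
```

Contigs that could not be placed against the reference keep their original names in the scaffold file, so they are left out of the draft by this filter.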

@josephwestley789 3 years ago

Hello, this is a great tutorial, thank you for putting it together! I am encountering an error when trying to run ./polish.sh. I am getting "Unable to access jarfile /bacterial-genomics-tutorial/apps/pilon.jar". Do you have any idea what might be causing this error? Thanks in advance! EDIT: I am doing this in WSL1, by the way.

@bioinformaticscoach 3 years ago

This pipeline was designed to run directly in bash. If you are getting this error, then you have to modify the script and put the path of the Pilon jar file in it, or check that the Pilon jar file has been downloaded.

@josephwestley789 3 years ago

@bioinformaticscoach EDIT: Thank you for your reply. I had not extracted the jar folder contents. I have done so now, and it appears to be running!

@purvagohil2240 3 years ago

Is this possible for single-end reads from Ion Torrent?

@bioinformaticscoach 3 years ago

Yes, the procedure can be applied. This paper may help: dl.acm.org/doi/10.1145/3093338.3093362

@luisrendon5792 3 years ago

Hello, I'm still having problems with the reorder_contigs.sh step. I've repeated the pipeline several times but I can't continue. How can I solve this? Thanks.

@bioinformaticscoach 3 years ago

What's the error message that is displayed?

@luisrendon5792 3 years ago

@bioinformaticscoach When I execute ./reorder_contigs.sh I get no results. This is what the output told me: FileNotFoundError: [Errno 2] No such file or directory: 'P7741_reordered/ragtag.scaffolds.fasta'

@bioinformaticscoach 3 years ago

@luisrendon5792 It's likely you missed one of the steps. I would like you to take your time and repeat them. Also, are you running the commands on Linux or macOS?

@graphiomics 2 years ago

I also got an error at this step when I used my own sequences: "ragtag.scaffolds.fasta" is not found. It suggests something is wrong with the reference genome. I need your help urgently. Thanks a lot for this tutorial.

@sidratahir3645 2 years ago

@bioinformaticscoach This error occurred while using ./reorder_contigs.sh:
Traceback (most recent call last):
  File "/home/sar/bacterial-genomics-tutorial/extract_reordered.py", line 10, in <module>
    allseq=[i for i in SeqIO.parse(fastafile,'fasta')]
  File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/__init__.py", line 605, in parse
    return iterator_generator(handle)
  File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/FastaIO.py", line 183, in __init__
    super().__init__(source, mode="t", fmt="Fasta")
  File "/home/sar/miniconda3/envs/bacterial-genomics-tutorial/lib/python3.10/site-packages/Bio/SeqIO/Interfaces.py", line 48, in __init__
    self.stream = open(source, "r" + mode)
FileNotFoundError: [Errno 2] No such file or directory: 'P7741_reordered/ragtag.scaffolds.fasta'
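Editor's sketch: several comments report that 'P7741_reordered/ragtag.scaffolds.fasta' cannot be found. Before re-running everything, it can help to check what the scaffolding step actually wrote. The helper below is a hypothetical diagnostic, not part of the tutorial. It assumes the output file name may differ between RagTag releases (for example ragtag.scaffold.fasta instead of ragtag.scaffolds.fasta), which you should verify against your installed version.

```python
from pathlib import Path
from typing import Optional


def find_scaffold_fasta(outdir: str) -> Optional[Path]:
    """Return the RagTag scaffold FASTA in `outdir`, or None if absent."""
    directory = Path(outdir)
    if not directory.is_dir():
        return None
    # The pattern matches both ragtag.scaffolds.fasta and ragtag.scaffold.fasta.
    matches = sorted(directory.glob("ragtag.scaffold*.fasta"))
    return matches[0] if matches else None


if __name__ == "__main__":
    # Directory name taken from the error messages quoted in the comments.
    found = find_scaffold_fasta("P7741_reordered")
    if found is None:
        print("No scaffold FASTA found; check that the scaffolding step ran without errors.")
    else:
        print(f"Found {found}")
```

If the file exists under a different name than the one the script expects, adjusting the path in the script is a smaller change than repeating the whole pipeline.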

@ayoajayi280 3 years ago

Hello. I love this presentation. I am a beginner. Can someone please quickly take me through the system requirements, how I can get or install Linux, and how I can install some of the tools for genome analysis on it? Thanks.

@bioinformaticscoach 3 years ago

First of all, there are different flavors of Linux (Ubuntu, CentOS, etc.). They are all free to download and install. You can install one in a virtual machine using VirtualBox. Once you do that, you can send me a note and we will pick it up from there.

@ayoajayi280 3 years ago

@bioinformaticscoach Thanks. What are the minimum system requirements that would be ideal for analyzing bacterial genomes and installing those tools?

@bioinformaticscoach 3 years ago

@ayoajayi280 I would recommend a Core i7 at 3.40 GHz, 16 GB RAM (32 GB or higher would be great) and 1 TB of storage. I also recommend installing Linux as the main operating system instead of the VirtualBox approach.

@nmg1909 3 years ago

@bioinformaticscoach I love your presentation here. I have been searching for a bacterial population dataset for my research, "Biocorrosion detection in structures". I would appreciate it if you could point me to a link where I can get a microbial population dataset. Thanks.

@bioinformaticscoach 3 years ago

@nmg1909 What I do is search for papers on bacterial genome analysis. They usually list the genomes used, and you can download them. Here is an example of a dataset: cge.cbs.dtu.dk/services/evolution_data.php We can discuss this further on my Facebook page (web.facebook.com/Bioinformatics-Coach-100614805459525) or Twitter (@BioinfoCoach).

@raselbarua4578 3 years ago

Good job

@bioinformaticscoach 3 years ago

Thanks

@billclintonaglomasa6543 4 years ago

Great.

@abdullahijama690 3 years ago

Thanks for your tutorial; I have learnt a lot from it. I have a problem with the bacterial-genomics-tutorial. When I run conda env create --quiet -f environment.yaml, I get this message:
Solving environment: ...working... failed
ResolvePackageNotFound:
  - sratoolkit

@bioinformaticscoach 3 years ago

I have made a modification to the yaml file. Please run the command again and let me know if it works.

@bioinformaticscoach 3 years ago

Please download the updated yaml file, or manually edit the yaml file and remove the line with sratoolkit.

@sheynjila2457 1 year ago

I am encountering the following problem when installing the Python packages:
bacterial-genomics-tutorial> conda env create --quiet -f environment.yaml
Retrieving notices: ...working... done
Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working... failed
ResolvePackageNotFound:
  - porechop
  - mash
  - samtools
  - spades
  - perl-db-file
  - roary
  - sra-tools
  - perl-padwalker
  - sickle-trim
  - bwa
  - mafft
  - minimap2
  - mummer

@bioinformaticscoach 1 year ago

Try updating your conda before installing the packages

@sheynjila2457 1 year ago

@bioinformaticscoach Thanks for the reply. I have updated conda, but nothing has changed.