How to rarefy community data in R with vegan and the tidyverse (CC200)

Views: 11,647

1 day ago

Comments: 29

@silviagonzalezcolino858 6 months ago

Thank you so much for your tutorials!! You make complex things look easier, which is very helpful (especially when analysing data).

@Riffomonas 6 months ago

Thanks!

@de.bora.bora.yt-chan 2 years ago

Thank you so much for this detailed explanation! It was really useful for my own data.

@Riffomonas 2 years ago

Wonderful - I'm glad it was useful! Thanks for watching

@charleslehnen9636 2 years ago

Check out the parameter of vegan's function: "Instead of drawing a plot, return a “tidy” data frame than can be used in ggplot2 graphics. The data frame has variables Site (factor), Sample and Species."
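The quoted description matches the documentation of the `tidy` argument of `rarecurve()` in recent vegan releases; assuming that is the function and argument meant here, a minimal sketch looks like the following. The community matrix is made up for illustration.

```r
library(vegan)

# Toy community matrix: 3 samples (rows) x 5 taxa (columns); values are made up
comm <- matrix(
  c(10, 5, 0, 3, 2,
     8, 0, 4, 6, 1,
    12, 7, 2, 0, 9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), paste0("otu", 1:5))
)

# tidy = TRUE skips the base-graphics plot and returns a long data frame
# with one row per sample and subsample size
curves <- rarecurve(comm, step = 1, tidy = TRUE)

str(curves)
```

The resulting data frame can be passed straight to ggplot2, for example with `Sample` on the x axis, `Species` on the y axis, and lines grouped by `Site`.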

@Riffomonas 2 years ago

Thanks - yeah, I think that's new since I made the video.

@cristianvillenaalemany7972 2 years ago

Thank you very much for all the material, very useful! I have different library sizes in my microbiome data and I would like to normalize them by rarefying to min_n_seqs, since the smallest sample contains more than 12000 reads, as you explained so well. If I use vegan::rrarefy, I obtain a subset of the specified size from my original OTU table. A single random subset might not be representative enough for each sample since there is high diversity. Is there a way you would recommend to carry out rarefaction over multiple iterations and produce a final OTU table in which the values are the average of the multiple subsets? Thanks for your attention.

@Riffomonas 2 years ago

Thanks for watching! Running rrarefy a bunch of times and then averaging the counts is effectively the same as using the relative abundance, which I showed in an earlier episode causes problems. I would suggest running whatever test you're doing on a single subsampling and then repeating it a few times to see if the results change at all. In my experience, the low relative abundance taxa are what change the most, and for most OTU-based analyses they don't come up as significant. If they do, I generally discount them because they're so rare.
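The point about averaging can be checked with a small sketch. The sample below is made up, and the depth of 100 and the 1,000 iterations are arbitrary choices for illustration; `rrarefy()` is vegan's random subsampling function mentioned in the question.

```r
library(vegan)

set.seed(19760620)

# One made-up sample with 1,000 reads spread over 4 taxa, one of them very rare
counts <- matrix(c(600, 300, 99, 1), nrow = 1,
                 dimnames = list("s1", paste0("otu", 1:4)))
depth <- 100

# Rarefy the same sample 1,000 times; each column of `draws` is one subsample
draws <- replicate(1000, rrarefy(counts, sample = depth)[1, ])

# Average count per taxon across all subsamples
avg_counts <- rowMeans(draws)

# Relative abundance of each taxon, scaled to the same depth
scaled_rel_abund <- counts[1, ] / sum(counts) * depth

round(rbind(avg_counts, scaled_rel_abund), 2)
```

The two rows should agree closely, because the expected count of a taxon in a random subsample is the subsampling depth multiplied by its relative abundance.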

@cristianvillenaalemany7972 2 years ago

@Riffomonas I will try as you suggest, it makes a lot of sense. Again, thank you very much for your help! Your videos are awesome!

@meseretmuche6984 2 years ago

Thank you so much for your unlimited help, Dr. If you have a lecture on Hill numbers (q=0, q=1, q=2) for diversity analysis of vegetation, please share it with me.

@Riffomonas 2 years ago

My pleasure! Unfortunately I don’t have anything about Hill numbers.

@oluwafemioyedele 2 years ago

@Pat, I think there are functions for dealing with row names and column names in the tibble package.

@Riffomonas 2 years ago

Ah - you're right! Thanks :) tibble.tidyverse.org/reference/rownames.html

@gimanibe 1 year ago

Thanks Pat. How would you plot the output of drarefy?

@edwinimfumu3221 1 year ago

Hi Sir, thanks for your videos. I rarefied my data using iNEXT. Now I am having trouble plotting the data. Can you show how to resize the plot, legend, etc. when using iNEXT?

@brantainman 2 years ago

@ 5:00 it is suggested that tibbles do not allow row names. I think this is incorrect, and the following code is the tidy way to do it: shared %>% pivot_wider(names_from = name, values_from = values, values_fill = 0) %>% column_to_rownames('Group')

@brantainman 2 years ago

Also, vegan has a great new feature that avoids all the data manipulation for getting tidy data: my_curves

@Riffomonas 2 years ago

That’s great to see!

@Riffomonas 2 years ago

This actually creates a data frame rather than a tibble. A tibble is a special kind of data frame.
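A short sketch of this exchange, assuming the tibble and tidyr packages and R 4.1 or later for the native pipe. The `shared` table here is made up; its column names simply follow the snippet in the comment above.

```r
library(tibble)
library(tidyr)

# Made-up long-format counts: one row per sample/taxon pair
shared <- tibble(
  Group  = c("s1", "s1", "s2", "s2"),
  name   = c("otu1", "otu2", "otu1", "otu3"),
  values = c(10, 5, 8, 4)
)

# Widen to one row per sample, then move the sample names into the row names
wide <- shared |>
  pivot_wider(names_from = name, values_from = values, values_fill = 0) |>
  column_to_rownames("Group")

# The result is a plain data frame, not a tibble
class(wide)
rownames(wide)

# vegan functions accept this directly, or it can be converted to a matrix
as.matrix(wide)
```

So both comments hold: the pipeline is tidy and keeps the sample names as row names, but the object that comes out the other end is an ordinary data frame.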

@sven9r 2 years ago

We rarely see this face @20:44! Pat thinking longer than a nanosecond about one of his 2198321673213 variables.

@Riffomonas 2 years ago

Lol. Plus I think it was the end of a long day at the end of a long week 😂🤓

@GabrielYan-r3g 1 year ago

Thanks Pat. In QIIME2, the number of taxa reaches a plateau once the sampling depth increases to a certain level. Is there a similar approach to find that sampling depth when plotting rarefaction curves in vegan?

@Riffomonas 7 months ago

Sorry, I don't use QIIME and am not really familiar with why you see that. Perhaps because they're using closed reference clustering and it is saturating all of the available taxa in the reference?

@nendinosaurus 1 year ago

Ok nice, but what do you use for bar plots then for example? When you need a single dataframe. Do you use a single subsampling for things like that?

@Riffomonas 7 months ago

I don't usually use barplots 😂 I would take the average value for each sample and plot that as a jittered plot

@bugslutt 1 year ago

If I want to loop the rrarefy command on my data matrix 1000 times and save all the output (to ultimately calculate an average), what code would I use? I've been trying to figure it out and am struggling!
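One way to do this, as a minimal sketch rather than a recommendation: `replicate()` with `simplify = FALSE` keeps every rarefied matrix as its own list element. The community matrix is made up, and the object names (`comm`, `rarefied`, `mean_richness`) are illustrative only. Note that an earlier reply in this thread cautions that averaging the counts themselves is effectively relative abundance, so the sketch also shows averaging a per-iteration statistic (here richness, via vegan's `specnumber()`).

```r
library(vegan)

set.seed(19760620)

# Made-up community matrix: 3 samples x 4 taxa with unequal sequencing depth
comm <- matrix(
  c(50, 30, 15,  5,
    20, 10,  9,  1,
    80, 60, 40, 20),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("s1", "s2", "s3"), paste0("otu", 1:4))
)

# Rarefy every sample to the size of the smallest one
min_n_seqs <- min(rowSums(comm))

# simplify = FALSE returns a list of 1,000 rarefied matrices
rarefied <- replicate(1000, rrarefy(comm, sample = min_n_seqs),
                      simplify = FALSE)

# Option 1: compute a statistic on each iteration, then average it per sample
richness <- sapply(rarefied, specnumber)   # samples in rows, iterations in columns
mean_richness <- rowMeans(richness)

# Option 2: element-wise average of the counts across all iterations
avg_counts <- Reduce(`+`, rarefied) / length(rarefied)

mean_richness
round(avg_counts, 2)
```

Keeping the full list means any other per-iteration statistic can be computed later without rerunning the subsampling.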

@lisakelly4921 2 years ago

If you rarefy to min_n_seqs, is there a risk of losing a significant difference between two groups of samples in your downstream statistical analysis?

@Riffomonas 2 years ago

Hi Lisa, thanks for watching! I think there’s a trade-off. If you increase the min_n_seqs value you will have a better limit of detection but fewer samples. With fewer samples you’ll have less statistical power to detect differences. It might be worth running the analysis at multiple levels to see what happens.