Aligning RNA-seq reads is a two-step process. The first step identifies the coordinates of each read relative to a reference genome or a set of reference transcripts. Because genes and transcripts share similar sequences, most reads cannot be aligned uniquely and end up ambiguously aligned to several candidates. To estimate transcript counts accurately, a second step assigns each ambiguous read proportionally across the transcripts it is compatible with. This step is called Expectation Maximization (EM) of transcript counts. While the math can be a bit confusing, the actual implementation is simple. The Jupyter notebook with an example implementation can be found here:
github.com/lac...
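
To give a feel for the second step, here is a minimal sketch of the EM idea in plain Python. It is not the code from the notebook linked above: the function name `em_transcript_counts`, the `compat` input format, and the `n_iter` and `tol` parameters are made up for illustration, and the sketch assumes all transcripts have the same effective length, which real quantifiers do not.

```python
def em_transcript_counts(compat, n_transcripts, n_iter=100, tol=1e-9):
    """Estimate transcript counts from ambiguous alignments (illustrative sketch).

    compat[i] is the list of transcript indices that read i aligns to.
    Assumption: every transcript has the same effective length.
    """
    if not compat:
        raise ValueError("need at least one aligned read")

    # Start from a uniform guess of relative abundances.
    theta = [1.0 / n_transcripts] * n_transcripts
    counts = [0.0] * n_transcripts

    for _ in range(n_iter):
        # E-step: split each read across its compatible transcripts,
        # in proportion to the current abundance estimates.
        counts = [0.0] * n_transcripts
        for targets in compat:
            total = sum(theta[t] for t in targets)
            for t in targets:
                counts[t] += theta[t] / total

        # M-step: re-estimate abundances from the fractional counts.
        n_reads = sum(counts)
        new_theta = [c / n_reads for c in counts]

        # Stop once the estimates no longer change noticeably.
        converged = max(abs(a - b) for a, b in zip(theta, new_theta)) < tol
        theta = new_theta
        if converged:
            break

    return counts, theta


# Toy data: 3 reads unique to transcript 0, 1 read unique to transcript 1,
# and 4 reads that align to both.
reads = [[0]] * 3 + [[1]] + [[0, 1]] * 4
counts, theta = em_transcript_counts(reads, n_transcripts=2)
print([round(c, 3) for c in counts])
```

In this toy case the unique reads favour transcript 0 three to one, so the four ambiguous reads end up split in the same ratio and the estimated counts settle at about 6 and 2.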