Analysis of transcriptomes in a porcine tissue collection using RNA-seq and genome assembly 10

H. Hornshøj, B. Thomsen, J. Hedegaard, C. Bendixen, O. Madsen, R. Crooijmans, M. Groenen, A. Archibald, L. Rund, L. Schook
Plant and Animal Genome XIX Conference, January 15-19, 2011, San Diego, CA

Abstract:

The release of Sus scrofa genome assembly 10 supports improved annotation of the pig genome and in-depth transcriptome analyses using next-generation sequencing technologies. In this study we analyze RNA-seq reads from a tissue collection comprising 10 separate tissues from Duroc boars and 10 fetal tissues from “Pinky”, a clone of Tabasco that was used for genome sequencing and assembly. Sequencing was carried out on a mixed cDNA library for the “Pinky” tissues and on individual libraries for the remaining tissues, in all cases on the Illumina sequencing platform. Using the TopHat short-read RNA alignment software, we mapped the reads to genome assembly 10. We extracted contig sequences of gene transcripts using the Cufflinks software. Based on this information, we identified expressed genes that are present in the genome assembly. The proportion of these genes that was previously known was roughly estimated by sequence comparison to known genes. Similarly, we searched for genes that are expressed in the tissues but absent from the genome assembly by aligning the reads that did not map to the genome to known gene transcripts. For the genes predicted by Cufflinks to have alternative transcript variants, we computed the occurrence of various alternative splicing events. Finally, we compared the coding sequences represented by the genome and the transcriptome, respectively, to identify possible short sequence variations.
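
The abstract does not state how alternative splicing events were counted. As an illustration only, the sketch below counts one event type, exon skipping, from per-gene isoform exon coordinates such as those that could be parsed from assembled transcripts. The function names, the in-memory data layout, and the coordinates are hypothetical, and the detection rule (one isoform joins two exons directly while another isoform places an exon between them) is an assumed simplification, not the method used in the study.

```python
from collections import Counter
from itertools import permutations


def introns(exons):
    """Return the set of introns (upstream exon end, downstream exon start)."""
    exons = sorted(exons)
    return {(a_end, b_start) for (_, a_end), (b_start, _) in zip(exons, exons[1:])}


def skipped_exons(isoform_a, isoform_b):
    """Return exons of isoform_a that isoform_b skips with a single junction.

    An exon counts as skipped when isoform_b has an intron running from the
    end of the exon before it to the start of the exon after it.
    """
    exons = sorted(isoform_a)
    junctions_b = introns(isoform_b)
    hits = []
    for left, middle, right in zip(exons, exons[1:], exons[2:]):
        if (left[1], right[0]) in junctions_b:
            hits.append(middle)
    return hits


def count_exon_skipping(genes):
    """Count distinct skipped exons per gene across all ordered isoform pairs."""
    counts = Counter()
    for gene_id, isoforms in genes.items():
        found = set()
        for a, b in permutations(isoforms.values(), 2):
            found.update(skipped_exons(a, b))
        counts[gene_id] = len(found)
    return counts


# Hypothetical toy input: gene -> isoform -> list of (start, end) exons.
genes = {
    "geneA": {
        "geneA.1": [(100, 200), (300, 400), (500, 600)],
        "geneA.2": [(100, 200), (500, 600)],
    },
    "geneB": {
        "geneB.1": [(1000, 1100), (1200, 1300)],
    },
}

if __name__ == "__main__":
    for gene_id, n in sorted(count_exon_skipping(genes).items()):
        print(gene_id, n)
```

Other event types, such as alternative donor or acceptor sites and retained introns, would need their own rules on the same exon and intron sets.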