Porkchops and bacon: My two favorite animals – or how whole genome re-sequencing is revealing mosaic origins and footprints of selection in the pig genome

H.J. Megens, G. Larsor, R. Crooijmans, L. Frantz, M. Bosse, Y. Pauel, O. Madsen, L.B. Schook, C.J. R.L. Andersson, M. Groenen
Netherlands Bioinformatics Conference, April 24-25, 2012, De Werelt, The Netherlands


“The Pig” does not exist. What we refer to as pigs are animals that were domesticated in at least two very different areas: China and Asia Minor. The first farmers in Europe brought their pigs from the Fertile Crescent, but over the centuries, continued introgression from local wild populations resulted in extensive contributions from European wild boar to local pigs. At the onset of the Industrial Revolution in Europe, Asian pigs were imported to improve local European pigs, making them more prolific, fatter, and better adapted to living in sties rather than outdoors. Therefore, what we call a ‘European pig’ nowadays, particularly the breeds currently used for commercial breeding, has a highly complex biogeographic past.

To gain fundamental insight into the variation of the domesticated forms of the species, a comprehensive analysis is needed that includes wild relatives and divergent groups such as different subspecies and species. We re-sequenced around 100 complete genomes of European and Asian pigs and wild boar, and five outgroup species. Individuals were sequenced to an average depth of 8 to 10X using Illumina GA2 and HiSeq.

Genome-wide analysis of phylogenetic patterns revealed regions that show an unusual coalescent pattern. Most notable is a large, 50 Mbp region of very low recombination on the X chromosome (LRX) that displays an unusual biogeographic distribution. Various other genomic regions show phylogenetic discordance that could signify either selection or past hybridization events, even between Sus scrofa and related species. The genome-wide average for genetic divergence is congruent with species divergence at the onset of the Pleistocene, 2.5 Ma, but outlier regions can show a much higher divergence time, such as LRX, which has a deep coalescence time of around 6 Ma.
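The relation between sequence divergence and divergence time can be illustrated with a minimal sketch. It assumes a simple strict molecular clock, T = d / (2 * mu); this is not necessarily the dating method used in this study. The function name and the numeric inputs (per-site divergence and substitution rate) are hypothetical values chosen only to show the arithmetic, not results from the data.

```python
def divergence_time_years(per_site_divergence, subs_per_site_per_year):
    """Estimate time since two lineages split under a strict molecular clock.

    Both lineages accumulate substitutions independently after the split,
    so the observed pairwise divergence d is about 2 * mu * T.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    if per_site_divergence < 0:
        raise ValueError("divergence cannot be negative")
    return per_site_divergence / (2.0 * subs_per_site_per_year)


# Hypothetical inputs (assumptions, not values reported in the abstract):
# a rate of 1e-9 substitutions per site per year, and two divergence levels.
assumed_rate = 1e-9
genome_wide = divergence_time_years(0.005, assumed_rate)
outlier_region = divergence_time_years(0.012, assumed_rate)
print(f"genome-wide: {genome_wide / 1e6:.1f} million years")
print(f"outlier region: {outlier_region / 1e6:.1f} million years")
```

Under these assumed inputs, a region with more than twice the genome-wide divergence yields a correspondingly deeper split time, which is the kind of contrast described for outlier regions such as LRX.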

Further analysis of variation in the species revealed a very strong correlation with the genetic recombination landscape. Variation in recombination at the chromosome scale is more pronounced than in many other mammalian species, and as a consequence the interplay between recombination and variation (and divergence) is also highly pronounced. This has important consequences for genetic conservation, but also for interpreting signatures of selection.

Comparisons between Western and Eastern populations, and between wild and domestic populations, revealed a large number of genes that may be under natural selection or selection during domestication. These analyses were based on deviations in the ancestral allele frequencies and on the degree of differentiation of genes between groups of populations. This strategy furthermore revealed around 60 putative QTN, of which a large portion overlaps with known or suspected candidate genes for behavior, body size and conformation, muscle development, fatness, and fertility.
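The abstract does not state which differentiation statistic was used. As one common possibility, the sketch below computes Wright's F_ST for a biallelic site from the allele frequencies in two equally weighted groups, and a per-gene value as a ratio of summed components. The function names and the example frequencies are illustrative assumptions, not part of the study.

```python
def fst_components(p1, p2):
    """Return (numerator, denominator) of Wright's F_ST for one biallelic site.

    p1, p2: frequency of the same allele in two equally weighted groups.
    Denominator is the total expected heterozygosity H_T; the numerator is
    H_T minus the mean within-group heterozygosity H_S.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return h_total - h_within, h_total


def site_fst(p1, p2):
    """F_ST at a single site; defined as 0.0 when the site is monomorphic."""
    num, den = fst_components(p1, p2)
    return num / den if den > 0 else 0.0


def gene_fst(freq_pairs):
    """F_ST over several sites in a gene, as a ratio of summed components."""
    nums, dens = zip(*(fst_components(p1, p2) for p1, p2 in freq_pairs))
    total_den = sum(dens)
    return sum(nums) / total_den if total_den > 0 else 0.0


# Hypothetical allele frequencies (group A, group B) at three sites in one gene.
example_gene = [(0.9, 0.1), (0.5, 0.5), (1.0, 0.0)]
print(round(gene_fst(example_gene), 3))
```

Summing numerators and denominators across sites before dividing keeps low-diversity sites from dominating the per-gene value; genes in the upper tail of the genome-wide distribution would then be candidates for closer inspection.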