Towards whole genome sequencing (WGS)-based Salmonella source attribution: Patterns of source distribution of Salmonella Typhimurium revealed by large-scale WGS

Routine application of whole genome sequencing (WGS) in the monitoring and surveillance of foodborne pathogens has created a wealth of publicly available genomes and associated metadata, particularly for prevalent pathogens such as Salmonella enterica serotype Typhimurium (ST). It is now possible to study population structures of such pathogens from various sources, at unprecedented scales and in a dynamic spatiotemporal context with constant feeds of new data.

In this study, we aimed to (1) build a comprehensive phylogeny of ST; (2) identify phylogenetic lineages associated with particular sources; and (3) characterize the identified groups and lineages, with the purpose of exploring WGS as a method for ST source tracking.

Genomes of 1,268 ST isolates (between 1930 and 2014) were studied, including (1) clinical isolates from CDC (n=107) representing the diversity of ST molecular subtypes (PFGE and MLVA) and (2) public genomes from various sources and geographic locations (46 US states and 39 countries) (n=1,161). A comprehensive phylogeny was constructed using core-genome SNPs. Gene content and pseudogenes were compared among major population groups. The overall metabolic potential of selected strains from major lineages was evaluated with Biolog Phenotype Microarrays. Associations between specific lineages and particular sources (e.g., poultry, swine, bovine, and avian) were statistically evaluated.
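The abstract does not name the statistical test used to evaluate lineage-source association. As a minimal sketch of one common way to test for this kind of overrepresentation, the Python below computes a one-sided Fisher's exact (hypergeometric) p-value for the enrichment of one source within one clade. The function name and all counts other than the total of 1,268 isolates are hypothetical and chosen only for illustration; they are not results from this study.

```python
from math import comb


def enrichment_p_value(observed, clade_size, source_total, population_size):
    """One-sided hypergeometric tail, P(X >= observed).

    X is the number of isolates from a given source that would fall in a
    clade of `clade_size` isolates if clade membership were independent of
    source. This is a stand-in test, assumed here for illustration only.
    """
    upper = min(clade_size, source_total)
    total = comb(population_size, clade_size)
    tail = sum(
        comb(source_total, i)
        * comb(population_size - source_total, clade_size - i)
        for i in range(observed, upper + 1)
    )
    return tail / total


# Hypothetical counts: only the population size (1,268) comes from the text.
p = enrichment_p_value(
    observed=30,        # made-up: isolates from the source inside the clade
    clade_size=40,      # made-up: isolates in the clade
    source_total=200,   # made-up: isolates from the source overall
    population_size=1268,
)
print(f"one-sided p-value: {p:.3g}")
```

In practice such a test would be repeated for each clade-source pair, so a multiple-testing correction would normally be applied to the resulting p-values.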

Ten major population groups were identified. Clustering of isolates from the same source was observed in multiple cases, including clades in which isolates from poultry, bovine, swine, and avian samples were overrepresented. Evolutionary analyses suggest that at least two of these clades originated in recent decades and appear to be associated with meat production. Representative isolates from an avian clade and a swine clade displayed metabolic profiles distinct from those of other isolates, featuring a broad inability to utilize multiple substrates.

This study built a comprehensive population structure of ST, which provides a framework for evaluating and exploring WGS for foodborne pathogen source attribution.