Machine learning reveals the dynamic importance of accessory sequences for Salmonella outbreak clustering

Description: The dataset comprises a collection of Salmonella genomes associated with 58 historical outbreaks curated from the GenomeTrakr database and peer-reviewed PubMed articles. The outbreaks are linked to a diverse set of lineages that can be subdivided into 10 different serovars. The genomic data is accompanied by detailed contextual information describing the collection date, geographical origin and isolation source of the outbreak cases. The dataset was compiled to train classifiers to predict outbreak cluster labels from genomic data and infer a parsimonious set of outbreak predictive markers from the resulting classifiers.
Authors: Liu, Chao Chun; Simon Fraser University; ORCID iD 0000-0002-8645-2598
Hsiao, William; Simon Fraser University; ORCID iD 0000-0002-1342-4043
Keywords: Bacterial genomics
Salmonella
Whole genome sequencing
Enteric outbreaks
Infectious diseases
Field of Research: Biological sciences > Microbiology > Microbial genetics
Publication Date: 2024-02-20
Publisher: Federated Research Data Repository / dépôt fédéré de données de recherche
Funder: Genome British Columbia; 286GET
Mitacs; MITACS Accelerate
URI: https://doi.org/10.20383/103.0884
Related Identifiers: 
This dataset is derived from
Geographic Coverage: Argentina
Australia
Bangladesh
Belgium
Canada
Ecuador
France
Mauritius
Mexico
Turkey
Vietnam
New Jersey, United States
Minnesota, United States
Ohio, United States
Appears in Collections: SFU Research Data

Files in Dataset 
No files uploaded
Download the entire dataset using Globus Transfer. This method requires a Globus account and a software installation.

Access to this dataset is subject to the following terms:
Creative Commons Attribution 4.0 International (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Citation
Liu, C., Hsiao, W. (2024). Machine learning reveals the dynamic importance of accessory sequences for Salmonella outbreak clustering. Federated Research Data Repository. https://doi.org/10.20383/103.0884