The value of genomic analysis
Genetic heritability is responsible for 30% of individual health outcomes, but genetic information isn’t widely used to guide disease prevention and care. Each individual carries 4-5 million genetic variants, each with a differing influence on health-related traits. The cost of sequencing a genome has fallen drastically in recent years, and sequence data shows potential for more widespread use. However, reading the sequence accurately and interpreting it meaningfully remain obstacles to broad adoption.

Improving the accuracy of genomic analysis

-
Identifying disease-causing variants in cancer patients
Researchers wanted to understand whether incorporating automated deep learning technology would improve the detection of disease-causing variants in patients with cancer. In a cross-sectional study of 2,367 prostate cancer and melanoma patients in the US and Europe, published in JAMA, DeepVariant found disease-causing variants in 14% more individuals than prior state-of-the-art methods.
-
Building large-scale cohorts for genetic discovery research
Large cohorts of sequenced individuals are the foundation for discovering novel genetic associations with disease. We developed best practices for generating cohorts that substantially improve on previous methods and that have been adopted by the UK Biobank for its large-scale sequencing efforts.
Improving genetic association discovery with machine learning

Our partners in genomics research
-
DeepVariant’s precisionFDA Truth Challenge V2 submission using PacBio HiFi reads achieved the highest single-technology accuracy, a result featured on the PacBio blog and in a Nature Biotechnology retrospective. The collaboration also launched DeepConsensus, which improves HiFi yield and read quality compared to existing consensus basecalling methods.
-
The Regeneron Genetics Center, one of the world’s largest human genomic research efforts, has adopted DeepVariant and re-trained specialized models for both internal projects and the delivery of 200,000 exomes to the UK Biobank.
-
Benedict Paten’s lab at UC Santa Cruz collaborated with Google on PEPPER-DeepVariant, which won best accuracy in the Oxford Nanopore Technologies category of the precisionFDA Truth Challenge V2. The accompanying paper was published in Nature Methods.
-
NVIDIA Clara Parabricks Pipelines software provides a suite of GPU-accelerated bioinformatics tools to support DNA and RNA applications. Its implementation of DeepVariant processes a 30x whole human genome, from FASTQ to VCF, in less than 25 minutes on NVIDIA’s latest A100 GPU.