Familial long-read sequencing increases yield of de novo mutations

Studies of de novo mutation (DNM) have typically excluded some of the most repetitive and complex regions of the genome because these regions cannot be unambiguously mapped with short-read sequencing data. To better understand the genome-wide pattern of DNM, we generated long-read sequence data from an autism parent-child quad with an affected female where no pathogenic variant had been discovered in short-read Illumina sequence data. We deeply sequenced all four individuals by using three sequencing platforms (Illumina, Oxford Nanopore, and Pacific Biosciences) and three complementary technologies (Strand-seq, optical mapping, and 10X Genomics). Using long-read sequencing, we initially discovered and validated 171 DNMs across two children, a 20% increase in the number of de novo single-nucleotide variants (SNVs) and indels when compared to short-read callsets. The number of DNMs further increased by 5% when considering a more complete human reference (T2T-CHM13) because of the recovery of events in regions absent from GRCh38 (e.g., three DNMs in heterochromatic satellites). In total, we validated 195 de novo germline mutations and 23 potential post-zygotic mosaic mutations across both children; the overall true substitution rate based on this integrated callset is at least 1.41 × 10⁻⁸ substitutions per nucleotide per generation. We also identified six de novo insertions and deletions in tandem repeats, two of which represent structural variants. We demonstrate that long-read sequencing and assembly, especially when combined with a more complete reference genome, increase the number of DNMs by >25% compared to previous studies, providing a more complete catalog of DNM than short-read data alone.

PMID: 35290762 DOI: 10.1016/j.ajhg.2022.02.014

Am J Hum Genet. 2022 Mar 9;S0002-9297(22)00065-9.


Other Contributors

Michelle D Noyes 1, William T Harvey 1, David Porubsky 1, Arvis Sulovari 1, Ruiyang Li 1, Nicholas R Rose 1, Peter A Audano 1, Katherine M Munson 1, Alexandra P Lewis 1, Kendra Hoekzema 1, Tuomo Mantere 2, Tina A Graves-Lindsay 3, Ashley D Sanders 4, Sara Goodwin 5, Melissa Kramer 5, Younes Mokrab 6, Michael C Zody 7, Alexander Hoischen 8, Jan O Korbel 4, W Richard McCombie 5, Evan E Eichler 9


  • 1Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA.
  • 2Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Laboratory of Cancer Genetics and Tumor Biology, Cancer and Translational Medicine Research Unit and Biocenter Oulu, University of Oulu, 90220 Oulu, Finland.
  • 3McDonnell Genome Institute, Washington University, St. Louis, MO 63108, USA.
  • 4European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany.
  • 5Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA.
  • 6Department of Human Genetics, Sidra Medicine, PO Box 26999, Doha, Qatar; Weill Cornell Medicine, PO Box 24144, Doha, Qatar; College of Health and Life Sciences, Hamad Bin Khalifa University, PO Box 34110, Doha, Qatar.
  • 7New York Genome Center, New York, NY 10013, USA.
  • 8Department of Human Genetics, Radboud University Medical Center, 6500 Nijmegen, the Netherlands; Radboud Institute of Medical Life Sciences and Department of Internal Medicine and Radboud Center for Infectious Diseases, Radboud University Medical Center, 6500 Nijmegen, the Netherlands.
  • 9Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA 98195, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA.