The human genome is officially fully sequenced. Now what?
The takeaway: Researchers sequenced the first gap-free human genome. More than 20 years after the human genome’s initial publication, researchers were finally able to sequence the once-elusive, complex genomic regions popularly called “junk DNA,” which may, as it turns out, not be junky after all.
You probably saw the headline and might have even read the story: In the early spring of 2022, researchers published the first gapless sequence of the human genome.
This feat caps a decades-long effort coordinated by the U.S. Department of Energy and the National Institutes of Health, which sought to identify all of the sequences in the human genome. This work has helped researchers understand diseases, identify mutations linked to different kinds of cancer, advance forensic science, and boost commercial development of DNA-based products for genomics research.
Why is the Human Genome Project important, and what were its goals? The Human Genome Project was originally slated to last 15 years, from 1990 to 2005, but its timeline was accelerated by advances in sequencing technology and automation, and the project wrapped up in 2003. The publication of the initial human genome and every subsequent update has proven to be a catalyst for genomic research developments, from personalized medicine to ongoing mega projects such as the Earth BioGenome Project.
“Generating a truly complete human genome sequence represents an incredible scientific achievement, providing the first comprehensive view of our DNA blueprint,” said Eric Green, MD, PhD, director of the National Human Genome Research Institute. “This foundational information will strengthen the many ongoing efforts to understand all the functional nuances of the human genome, which in turn will empower genetic studies of human disease.”
Human Genome Sequenced—Buuuuuuut Not Exactly
When the human genome was declared “fully mapped” in 2003, it actually was not 100 percent mapped. Instead, it was technically completed to the resolution that techniques at the time allowed. The genomics community has improved significantly on those technical limitations in the past 20 years and that is why last year’s announcement was so important. And interesting.
The remaining 8 percent of the genome to be mapped was made up of highly repetitive DNA sequences that—back in the early 2000s—were deemed “unreadable” due to technological and computational limitations.
The human genome is made up of about 3 billion base pairs. Short-read NGS makes a genome of that size easier to process by breaking it into short fragments, which are amplified and sequenced. The resulting short reads are assembled into longer contigs (contiguous DNA segments), which are then placed in the correct order to create a larger, continuous sequence. This works well for distinct regions of the genome, such as coding regions, but hits a serious roadblock where a sequence contains many repetitive parts, because near-identical short reads cannot be placed unambiguously. Those repetitive elements, which were difficult to decipher, were originally called “junk” or “dark” DNA.
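To make the repeat problem concrete, here is a minimal toy sketch in Python. It is not how any real assembler works: the flanking sequences, the “CAT” repeat unit, and the read lengths are all made-up assumptions, and the model ignores coverage depth, sequencing errors, and paired reads. It only illustrates that two genomes differing in the number of repeat copies can yield exactly the same set of short reads, while reads long enough to span the repeat tell them apart.

```python
def reads(genome: str, length: int) -> set:
    """Return every distinct substring ("read") of the given length."""
    return {genome[i:i + length] for i in range(len(genome) - length + 1)}


def toy_genome(repeat_copies: int) -> str:
    """Build a hypothetical genome: unique flanks around a tandem repeat."""
    left_flank = "ACGTTGCA"   # made-up unique sequence
    right_flank = "GGATCCTA"  # made-up unique sequence
    return left_flank + "CAT" * repeat_copies + right_flank


genome_a = toy_genome(10)  # repeat region is 30 bases long
genome_b = toy_genome(15)  # repeat region is 45 bases long

# Reads shorter than the repeat region never see both flanks at once,
# so the copy number is invisible to them.
short_reads_identical = reads(genome_a, 8) == reads(genome_b, 8)

# Reads of 40 bases span the whole 30-base repeat in genome_a,
# so the two genomes no longer produce the same reads.
long_reads_identical = reads(genome_a, 40) == reads(genome_b, 40)

print("Short reads identical:", short_reads_identical)
print("Long reads identical:", long_reads_identical)
```

In this toy setup the 8-base reads cannot distinguish 10 repeat copies from 15, which is the kind of ambiguity that left repetitive regions unresolved for so long.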
“Ever since we had the first draft human genome sequence, determining the exact sequence of complex genomic regions has been challenging,” said Evan Eichler, PhD, researcher at the University of Washington School of Medicine and a leader in the effort, in a statement. “I am thrilled that we got the job done. The complete blueprint is going to revolutionize the way we think about human genomic DNA variation, disease, and evolution.”
Feat Lauded as the “Method of the Year”
Advances in genomic sequencing have allowed us to map the complete human genome. While more established short-read sequencing has boomed in popularity as its cost has plummeted, this sequencing effort was made possible through long-read methods. Long-read sequencing, the approach that helped make this completion of the human genome possible, was recently named “Method of the Year” by the journal Nature Methods. Long reads, the journal noted, have been critical in ushering in new lab discoveries and are fueling projects such as the Vertebrate Genomes Project and the Telomere-to-Telomere Consortium, the group that completed the human genome sequence. In contrast to its predecessor, short-read sequencing, which caps out at about 500 base pairs in read length, long-read sequencing has achieved reads that stretch to 1 million base pairs.
This new genomics tool is allowing us to learn more about humanity and its diversity, and it is giving rise to population-level data that encompass a wide range of people, including those who until now have not been the subject of many genomic studies. Long-read technology, the article added, is also of interest to “researchers working on cancers with copy number aberrations and unstable genomes, such as esophageal and ovarian cancers. Long-read approaches are generally better for detecting and characterizing the complex genome rearrangements and structural variation typical of many cancers.”
Long-read sequencing is also adept at tackling more difficult sections of a genome, such as sections with recurring short tandem repeats of base pairs, and is a key component of the coming genomic era “in which sequencing will be applied in many ways and become more widespread,” the journal noted.
Junk DNA: What was learned from sequencing it?
What is one benefit of mapping the human genome? The newly completed sequencing of repetitive DNA, once called “junk” because it was initially thought to do little, has revealed that these regions actually play an important role in human biology.
“Nonetheless, biochemical functions have been identified for an increasing fraction of DNA elements traditionally seen as ‘Junk DNA,’” notes a recent study. “These findings have been interpreted as fundamentally undermining the ‘Junk DNA’ concept.”
In fact, years ago, satellite DNA, a type of so-called junk DNA, was found to play a role in keeping chromosomes together inside a cell’s nucleus during cell division. This process is critical to preventing errors when copying DNA. In studies using fruit fly and mouse cells, researchers noticed that when that process was disrupted, cells lost critical regions of the genome and subsequently died.
“When they removed a protein that normally binds to mouse satellite DNA, the cells again formed micronuclei and did not survive,” noted one account. “The similar findings from both fruit fly and mouse cells [led researchers] to believe that satellite DNA is essential for cellular survival, not just in model organisms, but across species that embed DNA into the nucleus—including humans.”
Why are scientists reconsidering the purpose of junk DNA? Human junk DNA may even be able to take credit for our big brains. A study published in early 2023 suggests that junk DNA may have facilitated the growth of human brains with large lobes and complex information systems. Under this hypothesis, ancestral junk DNA acquired the ability to code for proteins. In their work, the authors compared the genomes of humans, chimpanzees, and macaques and located 74 examples of junk DNA that had transformed into protein-encoding DNA.
What’s next for sequencing?
Though a maturing technology, sequencing is by no means done with improvements. Here are some advancements to look forward to:
- Simple, small technology tools available to labs that are easier to use and may even fit in a pocket.
- Even lower sequencing costs, fueled by computational advances.
- Greater accuracy—reads that are perfect or nearly perfect.
- Faster processes that won’t break the bank.
To learn more about sequencing products and solutions from IDT, please visit the xGen™ NGS home page.