The ancient Japanese art of origami is based on the idea that nearly any design - a crane, an insect, a samurai warrior - can be made by taking the same blank sheet of paper and folding it in different ways.
The human body faces a similar problem. The genome inside every cell of the body is identical, but the body needs each cell to be different: an immune cell fights off infection; a cone cell helps the eye detect light; the heart’s myocytes must beat endlessly.
In a paper appearing online this week in the journal Cell, researchers at Baylor College of Medicine, Rice University, the Broad Institute of MIT and Harvard, and Harvard University describe the results of a five-year effort to map, in unprecedented detail, how the 2-meter-long human genome folds inside the nucleus of a cell. Their results show that the cell, like a microscopic origamist, modulates its function by folding the genome into an almost limitless variety of shapes.
A centerpiece of the new study is the first reliable catalog of loops spanning the entire human genome. For decades, scientists have examined the regions in the close vicinity of a gene to understand how it is regulated. But as the genome folds, sequences far from a gene loop back and come into contact with those nearby elements.
Looping has been a blind spot for modern biology. “For over a century, scientists have known that DNA forms loops inside of cells, and that knowing where the loops are is incredibly important,” said co-first author Suhas Rao, a researcher at the Center for Genome Architecture at Baylor. “But mapping the positions of all those loops was long thought to be an insurmountable challenge.”
The researchers showed that the 3 billion DNA letters of the human genome are partitioned into roughly 10,000 loops, a surprisingly small number. (Prior work on loops had suggested that the genome contains over a million.)
“In the early days of human genome sequencing, scientists believed that humans had hundreds of thousands of genes. The genome project revealed far fewer genes than everyone was expecting,” said Dr. Erez Lieberman Aiden, senior author of the study, director of the Center for Genome Architecture, and an assistant professor of molecular and human genetics at Baylor College of Medicine and the departments of computer science and computational and applied mathematics at Rice University. “The fact that there are so few loops is a similar surprise.”
The team’s research showed that, although few in number, DNA loops play an essential role in nearly every process inside the cell. That’s because many loops have genes at one end. When the loop forms, the gene turns on.
“Folding drives function,” said co-first author Miriam Huntley, a Ph.D. student in the Harvard School of Engineering and Applied Sciences working with Aiden. At the other end of these loops, far away from the genes that they regulate, lie hitherto unknown genetic switches buried deep in so-called junk DNA.
“Our maps of looping revealed thousands of hidden switches that scientists didn’t know about before,” said Huntley. “In the case of genes that can cause cancer or other diseases, knowing where these switches are is vital.”
The team also discovered a series of rules about how and where loops can form.
“If DNA were a shoestring, you could make a loop anywhere. But within the cell, the formation of loops is highly constrained,” said Rao. “The loops we see almost all span fewer than 2 million genetic letters; they rarely overlap; and they are almost always associated with a single protein, called CTCF.”
CTCF is known to be involved in the regulation of the 3D structure of chromatin, the building block of chromosomes.
“The most stunning discovery was about how CTCF proteins form a loop,” said Dr. Eric Lander, a corresponding author on the paper. “Even when they are far apart, the CTCF elements that form a loop must be pointing at each other – forming a genomic yin and yang.” Lander is director of the Broad Institute, professor of biology at MIT, and professor of systems biology at Harvard Medical School.
Interestingly, the team found that the largest loops in the genome are only present in women. Huntley pointed out that “the copy of the X chromosome that is off in females contains gigantic loops that are up to 30 times the size of anything we see in males.”
The team further found that many of the loops present in humans are also present in mice, implying that these specific folds have been preserved over nearly one hundred million years of evolution.
“Our findings suggest that mammals share not only similar 1D genome sequences, but also similar 3D genome folding patterns,” said Aiden, also a McNair Scholar at Baylor.
In origami, all designs, no matter how complex, can be created using just two fundamental folds: the mountain fold and the valley fold. Rao noted that loops play a similar role for the genome.
“The loop is the fundamental fold in the cell’s toolbox. We found that the formation and dissolution of DNA loops inside the nucleus enables different cells to create an almost endless array of distinct 3D folds and, in so doing, accomplish an extraordinary variety of functions,” he said.
Rao, Huntley, Lander, and Aiden’s co-authors are Neva C. Durand, Ivan D. Bochkov, Adrian L. Sanborn, Ido Machol and Arina D. Omer, at the Baylor College of Medicine, and Elena K. Stamenova and James T. Robinson, at the Broad Institute.
This work was supported by the McNair Medical Institute, the National Science Foundation, the National Institutes of Health, the Cancer Prevention and Research Institute of Texas, IBM, Google, and NVIDIA, a company focusing on visual computing and the art and science of computer graphics.