‘You can’t predict the middle from what’s on the edge’
Other research teams had modeled what human centromeres might look like based on their immediate flanking sequences.
“I didn’t believe that this approach would identify the functional centromere,” Henikoff said. He and his colleagues have spent many years studying how and why centromeres evolved to do their crucial job. “What we think we know about how centromeres have evolved would say that you can’t predict what’s in the middle from what’s on the edge.”
So he and his team decided to take a different approach. Ignoring the edges, they isolated certain proteins found only at centromeres from two different types of human cells, male and female, and looked at the DNA stuck to those centromere proteins in an unbiased way.
The term “unbiased” is important, Henikoff said: The researchers didn’t make any assumptions about what they might find before they found it. Such assumptions are common, and often useful, in DNA sequencing studies, but they didn’t apply here. The techniques the team used weren’t available when the human genome sequence was first published.
And what they found wasn’t exactly more of the same, as researchers had previously hypothesized by looking at the edge sequences. Actually, it was even more of the same.
Instead of a complex series of slightly different higher-order repeats, as in the edges, Henikoff’s team found just two small stretches of DNA that repeat, over and over. Those paired repeats dominated centromeres in the male and female human cells the researchers tested. The team also found them in a publicly available database of another complete human genome.
Precision where precision is required
Their findings showed that centromeres are even more uniformly repetitive than their flanking sequences. And that is almost certainly not an accident, Henikoff said. He thinks that what made the centromeres so hard to sequence in the first place is also what allows them to work so perfectly every time a cell divides.
“The way I interpret it is that chromosome segregation has to be as close to 100 percent as is physically possible,” Henikoff said. “Centromeres are really precise because they have to work so well.”
If chromosomes aren’t perfectly distributed when cells divide, the consequences are often dire. Depending on the cell type, the wrong number of chromosomes could trigger cancer or even kill the entire organism – miscarriages are often due to imperfect chromosome shuffling early in embryonic growth.
The researchers also found that the centromere proteins they’d used to access centromere sequences are precisely positioned along that repetitive stretch of DNA, together forming a protein-DNA unit that repeats at an exact frequency. That’s different from the higher-order repeats just to the sides of centromeres, where there’s almost no pattern in how similar proteins bind the DNA. When cells divide, the machinery responsible for partitioning chromosomes to each progeny cell attaches to those precisely repeated protein-DNA units – their regularity may be crucial for that process, Henikoff said.
The team’s findings not only help fill in gaps in the human genome sequence, they may also help researchers build better human artificial chromosomes, very small engineered chromosomes with applications in basic and applied research, Henikoff said.
Current artificial chromosomes were constructed with sequences from the edges of human centromeres and have high failure rates, Henikoff said. He thinks it may be possible to lower those failure rates by using a true centromere’s sequence. His team is now testing that idea.
But for now, he’s happy to have simply made some headway on a biological mystery.
“We’re thrilled about being able to understand something that I didn’t know we’d ever understand,” Henikoff said. “It makes sense, finally.”