Cracking the DNA-protein recognition code

Computational biologist Harmen Bussemaker prefers to solve problems by starting from scratch. At NBIC’s 2012 Conference he showed how he investigates DNA-protein interactions.

Harmen Bussemaker (Photo by Thijs Rooimans).

Thirty years ago, shortly after the first crystal structure of a DNA-protein complex was published, biologists believed that DNA-protein binding would soon be understood. However, with every new structure published, it became clearer that the code was not straightforward. Instead, protein-DNA recognition became one of the ‘mysteries’ of modern biology. Today, there are grounds for renewed optimism about cracking the code, Harmen Bussemaker believes. “There is so much more data available now, even compared to only five years ago. Not only microarray data, but also high-throughput sequencing data. The picture is getting much, much sharper. I’m pretty optimistic that this Holy Grail can be found.”

Hox protein family

Bussemaker’s optimism is partly based on his own work, especially a recent study on the binding of so-called Hox proteins to DNA (Cell, December 2011). In Drosophila, eight different Hox proteins control the development of all the different tissues and thus guarantee a correct morphology. Genetic mutations have dramatic consequences that are unique to each Hox gene: an extra set of wings for Ultrabithorax, a leg growing from the head for Antennapedia. “Our hypothesis is that protein complex formation can modify the binding specificities significantly.” Bussemaker and colleagues called in massive experimental and computational power to test the hypothesis. For the Drosophila Hox proteins, the study revealed that individual members of the family acquire novel DNA recognition properties when they bind the cofactor Extradenticle. Bussemaker: “Other transcription factors undoubtedly use similar strategies.”

Right questions

Bussemaker takes a special interest in DNA-binding proteins, and in transcription factors in particular. Why? “They are the middle-managers of cell biology. Transcription factors get things going; they instruct many genes at the same time. They are the hubs in the network, which also means that they are important targets for medicines.”

Bussemaker is a physicist by training but switched to bioinformatics because of the many intriguing questions in biology. The essence of being a good computational biologist also lies in asking the right questions, says Bussemaker. “The creative process is the most important. I like to take a step backwards and start thinking ‘What is the real question?’ Solving a problem from scratch creates real innovations and the optimal solution.”

Harmen Bussemaker is Associate Professor in the Department of Biological Sciences at Columbia University, New York City, and a core faculty member of Columbia’s Center for Computational Biology and Bioinformatics. He presented a keynote lecture entitled ‘Dissecting transcription factor networks using high-throughput sequencing and quantitative genetics’. An extended interview with Harmen Bussemaker will be published in the November 2012 issue of Interface.

Author: Marga van Zundert