Guarding the reliability of protein structures

Making errors during 3D model building and refinement is inevitable. Fortunately, many errors can be detected and fixed prior to publication and deposition, says Gerard Kleywegt. 

Gerard Kleywegt (Photography: Thijs Rooimans)

Kleywegt’s mission to improve the validation of structures took shape gradually over his career in structural biology. He was shocked when he first discovered errors in published structures, and became determined to do something about it. He went on to develop many software tools for validating protein crystal structures. Nowadays, he leads a team of 24 people that manages the Protein Data Bank in Europe (PDBe).

Strict rules

Since February 2008, the experimental data underpinning 3D structures must be deposited at the Protein Data Bank (PDB). This decision was taken by the international scientific advisory board of the Worldwide Protein Data Bank (wwPDB) organisation after it became clear that published papers occasionally had to be retracted because of serious flaws in the structures or, even worse, fabricated data. Kleywegt explains: “The problem is that many biologists rely on crystallographers to interpret the experimental data from X-ray experiments correctly. However, interpretation of electron density maps can be difficult when the data are of limited resolution. There’s a lot of subjectivity involved and even experienced people can make mistakes.”

Joining partners

The PDBe is one of the four partner sites of the wwPDB that jointly manage the PDB. The other partners are Protein Data Bank Japan, the RCSB Protein Data Bank (USA) and the Biological Magnetic Resonance Data Bank (USA). “We extend our attention to currently available structure elucidation techniques. Presently, for example, we are working on a pipeline of software tools for validation of new structures from X-ray data,” says Kleywegt. “The aim is to facilitate protein storage by fast and simple validation procedures. Each structure receives a number of quality scores that are visualised on sliding colour scales: from red (poor score) to blue (better scores). Once the pipeline is ready, it will also be implemented at the other three wwPDB sites.” In the meantime, validation pipelines for models derived from 3D Electron Microscopy and NMR data are also being developed. “I hope that once all these validation tools are in place, we’re going to minimise the occurrence of avoidable and embarrassing errors in the future,” says Kleywegt.

Gerard Kleywegt is professor of Structural Molecular Biology (University of Uppsala) and head of the Protein Data Bank in Europe (PDBe) at EMBL-EBI. At NBIC2012 he presented a keynote lecture about the importance of appropriate protocols and validation procedures in 3D protein structure modelling.

Author: Lilian Vermeer