Introduction to the PDB
Queen's University Protein Function Discovery
and Department of Biomedical and Molecular Sciences
Molecular Modelling and Crystallographic Computing Facility

Introduction to the Protein Data Bank

The Protein Data Bank is run by the Research Collaboratory for Structural Bioinformatics (RCSB), which is composed of Rutgers University, the San Diego Supercomputer Center, the Center for Advanced Research in Biotechnology, and the University of Wisconsin.

The following is now out of date.

Please go to the RCSB web site and follow the tutorials listed on the left-hand menu ("Site Tutorials").

  1. Searching for structures in the PDB

    The PDB currently contains 29956 structures. Searching the databank can be difficult unless you know the 4-character PDB ID code. Enter your search term in the search box and check the PDB ID, Authors or Full Text Search button as appropriate. You can uncheck the "match exact word" checkbox to get more hits, but you may also want to try variations on your query. Despite the instructions in the PDB query tutorial, I have found, for example, that the terms "domain and swap", "domain and swapped" and "domain and swapping" gave different results.
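As a rough illustration of the two points above, the following Python sketch decides whether a search term has the shape of a PDB ID and builds the query variations worth trying one by one. It is only a sketch under stated assumptions: the pattern (one digit followed by three letters or digits) is an assumed description of the 4-character code, the helper names `looks_like_pdb_id` and `query_variants` are hypothetical, the ID `1abc` is a made-up example, and nothing here talks to the PDB site itself.

```python
import re

# Assumption: a 4-character PDB ID is one digit followed by three
# letters or digits. This pattern is illustrative, not an official rule.
PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(term):
    """Return True if the search term has the shape of a 4-character PDB ID."""
    return PDB_ID_PATTERN.fullmatch(term.strip()) is not None


def query_variants(fixed_word, variable_words):
    """Build one query string per variant of the word that may be inflected."""
    return [f"{fixed_word} and {word}" for word in variable_words]


# Decide which search button suits each term ("1abc" is a made-up ID).
for term in ("1abc", "domain"):
    kind = "PDB ID" if looks_like_pdb_id(term) else "full-text"
    print(f"{term!r}: try a {kind} search")

# List the variations to enter one at a time, since each may return different hits.
for query in query_variants("domain", ["swap", "swapped", "swapping"]):
    print(query)
```

Running each variant separately and comparing the hit lists is the simplest way to see whether the search treats them differently.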

  2. Biological Unit

    Structures in the PDB determined by crystallography may not contain the coordinates for the complete structure as it exists in solution, because part of the structure may be related to the rest by crystallographic symmetry. Only the unique portion of the structure is stored in the database. To look at the complete structure, you can either rely on programs like PyMOL to display symmetry-related peptide chains, or download the biological unit from the PDB itself.

    Once you have found a structure in the PDB, click on Download/Display File from the Summary Information page. Near the bottom of that page, you will see "Download the Biological Unit File:", which lists a file containing the Distinct Biological Assembly. Unfortunately, this file contains the parts of the structure as separate models, and programs like PyMOL tend to show separate models as a movie rather than all together on the screen. There is more information about the Biological Unit in a tutorial at the PDB web site.
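One way around the separate-models problem is to merge the models into a single one before loading the file. The Python sketch below does this on the text of a PDB-format file: it drops the MODEL and ENDMDL records and gives each chain in each model its own chain ID so that the copies remain distinguishable. It is a minimal sketch, not a complete converter: the function name `merge_models` is hypothetical, it assumes the standard fixed-column layout with the chain ID in column 22, it does not renumber atom serial numbers, and it gives up if more chains are needed than there are single-character IDs.

```python
import string

# Candidate single-character chain IDs, tried in this order.
CHAIN_IDS = string.ascii_uppercase + string.ascii_lowercase + string.digits

# Record types that carry a chain ID in column 22 (index 21).
CHAIN_RECORDS = ("ATOM", "HETATM", "ANISOU", "TER")


def merge_models(pdb_text):
    """Return PDB text with all models merged into one set of chains."""
    chain_map = {}  # (model counter, original chain ID) -> new chain ID
    used = set()    # chain IDs already handed out
    model = 0
    out = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            model += 1          # start of the next copy
            continue            # drop the MODEL record itself
        if record == "ENDMDL":
            continue            # drop the matching end-of-model record
        if record in CHAIN_RECORDS and len(line) > 21:
            original = line[21]
            key = (model, original)
            if key not in chain_map:
                if original != " " and original not in used:
                    new_id = original   # keep the ID if it is still free
                else:
                    new_id = next((c for c in CHAIN_IDS if c not in used), None)
                    if new_id is None:
                        raise ValueError("ran out of single-character chain IDs")
                used.add(new_id)
                chain_map[key] = new_id
            line = line[:21] + chain_map[key] + line[22:]
        out.append(line)
    return "\n".join(out) + "\n"
```

To use it, read the downloaded biological unit file into a string, pass it to `merge_models`, and write the result to a new file. Within PyMOL itself, the `all_states` setting (`set all_states, on`) or the `split_states` command may achieve a similar display without editing the file.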