FORTRAN programs for crystallography
Queen's University Protein Function Discovery
and Department of Biomedical and Molecular Sciences
Molecular Modelling and Crystallographic Computing Facility

My FORTRAN programs for crystallography

These programs are found locally in /software/misc/bin_${OS}, where "${OS}" is either Linux or Linux_x86_64. As far as I know, all of these programs work as I meant them to work, though there are no guarantees. If you find bugs or problems, want a new program, or have suggestions for a better way, then please let me know! Some of these programs are called by my scripts.

These programs are "free." You may do with them as you please.

The individual programs are listed below, but you may also download gzipped tar files of the source and Linux executables. The individual programs are more likely to be up to date. Binaries are not available for both Linux platforms for every program. I've been rewriting some of the small programs in Python for greater portability, and you'll find them on my scripts page.

Structure analysis and manipulation

  • atom_cmpl atomic complementarity between two pdb files (i.e. the minimum distance from each atom in one PDB file to the atoms in the other PDB file) source
  • average average B factors for a pdb file for plotting source, Linux
  • cell create a pdb file with all atoms within one unit cell from an input pdb file source, Linux
  • close_contact calculate close contacts between two structures in two pdb files -- useful for examining ligand-protein interactions source, Linux
  • coordiff calculate atomic position differences between two pdb files source
  • ddm calculate a difference distance matrix from two structures source, Linux
  • phipsi calculate phi/psi angles from a pdb file source, Linux
  • phipsidiff calculate the change in phi/psi angles from the output of running phipsi on two pdb files source, Linux
  • distance calculate the distance between two pdb files source
  • rmsdiff calculate RMS differences between two pdb files (the files must contain an identical number of atoms for the program to run, and the atoms must be in the same order for the result to be meaningful) source, Linux
  • rmsdiff_nowat as above, but ignores any water molecules (residue names HOH, TIP3 and WAT) source, Linux
  • rmstor calculate the RMS differences in the phi/psi backbone torsion angles between two phipsi output files (again, there must be the same number of atoms in each file) source, Linux
  • symm_apply apply a crystallographic or user-specified symmetry operator to the contents of a PDB file. source, Linux
  • symm_calc source, Linux
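As a rough illustration of what rmsdiff and rmsdiff_nowat are described as doing, here is a minimal Python sketch. It is not the FORTRAN source. The function names (read_atoms, rms_difference) are hypothetical, it assumes standard fixed-column PDB ATOM/HETATM records, and it assumes no superposition is done before the comparison, which the descriptions above do not state either way.

```python
import math

# Residue names that rmsdiff_nowat is described as ignoring.
WATER_NAMES = {"HOH", "TIP3", "WAT"}


def read_atoms(pdb_text, skip_waters=False):
    """Return a list of (x, y, z) tuples from ATOM/HETATM records.

    Assumes fixed PDB columns: residue name in 18-20 (21 for a
    four-letter name such as TIP3), x in 31-38, y in 39-46, z in 47-54.
    """
    coords = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if skip_waters and line[17:21].strip() in WATER_NAMES:
            continue
        coords.append(
            (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        )
    return coords


def rms_difference(coords_a, coords_b):
    """RMS positional difference between two equally long atom lists.

    Atoms are paired by position in the list, so the result is only
    meaningful when both files list the atoms in the same order.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both files must contain the same number of atoms")
    if not coords_a:
        raise ValueError("no atoms to compare")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Pairing atoms purely by position is why the atom count must match and why the order matters: the sketch has no way to notice that atom 17 in one file is a different atom from atom 17 in the other.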

Crystallographic data analysis and manipulation

  • ortho_fract orthogonalize/fractionalize a coordinate (not a file of coordinates) -- also prints out the matrix for fractionalization or orthogonalization. source, Linux
  • reciprocal - convert real cell to reciprocal cell and vice versa source, Linux
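For readers who want to see the arithmetic behind a program like ortho_fract, the following Python sketch builds an orthogonalization matrix from the cell parameters and applies it in both directions. It is only a sketch. The axis convention is an assumption (a along x, b in the xy plane, as in the PDB convention), since the page does not say which convention ortho_fract uses, and all function names are hypothetical.

```python
import math


def cell_volume(a, b, c, alpha, beta, gamma):
    """Unit cell volume from cell lengths and angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    )


def orth_matrix(a, b, c, alpha, beta, gamma):
    """Matrix taking fractional to orthogonal coordinates.

    Assumed convention: a along x, b in the xy plane.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    vol = cell_volume(a, b, c, alpha, beta, gamma)
    return [
        [a, b * cg, c * cb],
        [0.0, b * sg, c * (ca - cb * cg) / sg],
        [0.0, 0.0, vol / (a * b * sg)],
    ]


def orthogonalize(m, frac):
    """Apply the matrix to one fractional coordinate (u, v, w)."""
    return tuple(sum(m[i][j] * frac[j] for j in range(3)) for i in range(3))


def fractionalize(m, xyz):
    """Invert the upper-triangular matrix by back-substitution."""
    w = xyz[2] / m[2][2]
    v = (xyz[1] - m[1][2] * w) / m[1][1]
    u = (xyz[0] - m[0][1] * v - m[0][2] * w) / m[0][0]
    return (u, v, w)
```

Because the matrix is upper triangular under this convention, fractionalizing a single coordinate needs only back-substitution rather than a general matrix inverse.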

Sequence manipulation and analysis

  • Now superseded by the Python script.
    seq_convert source (not available for Linux; try the Python script instead)
  • seq_mw source (not yet available for Linux)
      usage:  seq_mw [options...] <file_in >file_out
        where [options] are:
          -no_res {only give summary counts for atoms}
          -all  {list summary plus counts for each amino acid type}
          -part or -part_spec_vol  {calculate partial specific volume}
          -h or -help or -?  {this help message}
      When single-letter code is input it may have any number (up to 999)
      of letters per line, but it may not contain spaces.
      When three-letter code is input it may have up to 250 residues per
      line, but there must be one and only one space between each residue.
  • charge calculate unit charges for a sequence (using the simple-minded assumption that K,R = +1, H = +0.5, D,E = -1). For plotting to compare sequences. Same options as for hydropathy. source (not yet available for Linux)
  • hydropathy calculate hydropathy (hydrophobicity) values (Kyte and Doolittle) from a sequence source (not yet available for Linux)
      usage:  hydropathy [options...] <file_in >file_out
        where [options] are:
          -3      {meaning that the sequence is given in 3-letter code, 
                   one per line}
          -edseq  {meaning that the sequence is given in 1-letter code, one
                   per line with blank lines for insertions as output by EDSEQ}
           <default sequence format is single letter code (up to 1000 per line
           but can be as few as 1) with no blanks>
          -blank  {meaning that insertions will be written as a blank.}
                   <default is to write the sequence number and a "#" sign for
                   the residue name>
          -h      {this help message}
    The hydrophobicity table comes from Kyte and Doolittle, scaled so that fully charged is 0 and maximum hydrophobic is 9.0. The program adjusts these values to the range -4.5 to +4.5 (the 0 to 9.0 form comes from copying the Kyte and Doolittle C code in their paper ... why they did it this way I have no idea!)

    AA       Arg/R  Lys/K  Asp/D  /B     Asn/N  Ser/S  Glu/E  His/H  /Z
    h-pathy  0.0    0.6    1.0    1.0    1.0    3.7    1.0    1.3    1.0

    AA       Gln/Q  Thr/T  Gly/G  /X     Ala/A  Pro/P  Val/V  Tyr/Y  Cys/C
    h-pathy  1.0    3.8    4.1    4.1    6.3    2.9    8.7    3.2    7.0

    AA       Met/M  Ile/I  Leu/L  Trp/W  Phe/F  Insertion/-
    h-pathy  6.4    9.0    8.3    3.6    7.3    -9994.5
  • variability calculate sequence variability (as often used for antibody sequences)
        var = (# diff aa types) / (freq of most common)
    Expects input as one residue per line and equal numbers of lines for each sequence (i.e. aligned with blank lines for gaps) source
  • variability_all Like variability but over all residues in the list of sequences
        var = (# diff aa types) / (freq of most common)
    Expects input as one residue per line and equal numbers of lines for each sequence (i.e. aligned with blank lines for gaps) source
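The variability formula above can be sketched in a few lines of Python. This is an illustration, not the FORTRAN program. It assumes that "freq of most common" means the fraction of sequences carrying the most common residue at that position, and that blank entries (gaps) are left out of the count. The function names are hypothetical.

```python
from collections import Counter


def variability(column):
    """Variability at one aligned position.

    column holds one residue per sequence; blank entries are treated
    as gaps and skipped (an assumption of this sketch).
    """
    residues = [r for r in column if r.strip()]
    if not residues:
        return 0.0
    counts = Counter(residues)
    # Fraction of the sequences that carry the most common residue.
    top_freq = counts.most_common(1)[0][1] / len(residues)
    return len(counts) / top_freq


def variability_by_position(sequences):
    """Apply variability() to every position of aligned sequences.

    sequences is a list of equally long lists, one residue per entry,
    mirroring the one-residue-per-line input described above.
    """
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("all sequences must have the same number of lines")
    return [variability(col) for col in zip(*sequences)]
```

With four sequences reading A, A, A and G at one position there are two residue types and the most common has frequency 0.75, so the value is 2 / 0.75, about 2.67. A fully conserved position gives 1.0.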


  • Now superseded by the Python script.
    window_avg calculate a moving-window average of data from an input file (expects the file to have two columns of data, i.e. X and Y values). Typically used to provide a smoothed plot of some feature versus residue number for a protein sequence. source
      usage:  window_avg [options...] <file_in >file_out
        where [options] are:
          -w  for averaging (the default window size is 7)
          -h      {this help message}    
  • Now superseded by the Python script.
    mean_stdev calculate the mean and standard deviation for an input data file (ignoring lines that begin with "#"). source, Linux
      usage:  mean_stdev [options...] <file_in >file_out
        where [options] are:
          -c icol {number of input columns of data}
          -h   {this help message}
      Example:  tail +2 native-mutant.ddm | mean_stdev -c 3
      will ignore the one-line header of the input file and
      use the first three columns of data in the file.
  • vm_calc calculate the Matthews coefficient (Vm) given the unit cell, the molecular weight of the asymmetric unit and the number of asymmetric units. The solvent content is also calculated, using the formula (1-1.23/Vm). source, Linux
      usage: vm_calc <a> <b> <c> <alpha> <beta> <gamma> <mol_mass> <nasym>
             cell angles will be set to 90. if entered as 0.0
      e.g.  vm_calc 123 44 60 90 99 90 35000 4 
        cell_vol     320722.2      vm   2.291  solv_cont   0.463
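The vm_calc arithmetic is simple enough to check by hand, and a short Python sketch makes the steps explicit. This is not the FORTRAN source, and the function name matthews is hypothetical. It takes the cell volume from the standard triclinic formula, divides by the total molecular mass in the cell to get Vm, and applies the solvent-content formula quoted above. Treating an angle of 0.0 as 90 follows the usage text.

```python
import math


def matthews(a, b, c, alpha, beta, gamma, mol_mass, nasym):
    """Return (cell_volume, Vm, solvent_content).

    Cell lengths in Angstroms, angles in degrees, mol_mass in Daltons
    per asymmetric unit, nasym = number of asymmetric units in the cell.
    """
    # As in the usage text, an angle entered as 0.0 is taken to be 90.
    alpha, beta, gamma = (90.0 if x == 0.0 else x for x in (alpha, beta, gamma))
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    volume = a * b * c * math.sqrt(
        1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg
    )
    vm = volume / (mol_mass * nasym)
    solvent = 1.0 - 1.23 / vm
    return volume, vm, solvent


if __name__ == "__main__":
    vol, vm, solv = matthews(123, 44, 60, 90, 99, 90, 35000, 4)
    print("cell_vol %12.1f  vm %7.3f  solv_cont %7.3f" % (vol, vm, solv))
```

Run with the cell from the example above (123 44 60 90 99 90 35000 4), the sketch gives the same cell volume, Vm and solvent content as the vm_calc output shown there, to the printed precision.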

Last revised: Thursday, 07-Feb-2013 11:16:50 EST