My scripts for crystallographic data manipulation
Except for the setup script, these scripts are found locally
in /software/misc/scripts, and some run
FORTRAN programs that are found in /software/misc/.
The FORTRAN programs are also freely available. As far as I know, all of these
scripts work as I meant them to work, though
there are no guarantees. If you find bugs or problems, want a new program,
or have suggestions for a better way, then please let me know!
These programs are "free." You may do with them as you please, but please
let me know if you find bugs or have questions about their use.
Setup script for all crystallographic software
These scripts set environment variables (e.g. $PATH) and
aliases for various software packages. They are designed to be user friendly: your PATH will
not grow in length if you set up the environment repeatedly.
- setup Run with source /software/setup <progname>
<progname> may contain version information, e.g.,
source /software/setup ccp4_4.1.1
This calls one of the following two shell scripts, depending on which shell you are using:
- setup.csh Works with the tcsh shell.
- setup.sh The bash shell variant of the above
(works with sh and zsh as well).
- add_path
Used in the setup scripts above, e.g.:
export PATH=`add_path -sh /software/progname/v4.3.2.1/bin_Linux/`
to add that directory to the PATH environment variable only if it isn't already there.
- remove_path
Used in the setup scripts above, e.g.:
export PATH=`remove_path -re -sh /software/progname/`
to remove all instances of the "progname" directory from the PATH environment variable.
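As a rough illustration of the idea behind add_path and remove_path (not the actual scripts, whose source is not shown here), the following Python sketch adds a directory only if it is absent and removes every entry that matches a regular expression. The function names mirror the script names, but the signatures, the choice to append rather than prepend, and the trailing-slash handling are all assumptions.

```python
import re


def add_path(directory, path, sep=":"):
    """Return `path` with `directory` appended only if it is not already present.

    Assumption: entries that differ only by a trailing slash count as equal.
    """
    entries = [p for p in path.split(sep) if p]
    wanted = directory.rstrip("/") or "/"
    if all((p.rstrip("/") or "/") != wanted for p in entries):
        entries.append(directory)
    return sep.join(entries)


def remove_path(pattern, path, sep=":"):
    """Return `path` without any entry matched by the regular expression `pattern`."""
    regex = re.compile(pattern)
    return sep.join(p for p in path.split(sep) if p and not regex.search(p))


if __name__ == "__main__":
    start = "/usr/bin:/bin"
    once = add_path("/software/progname/v4.3.2.1/bin_Linux/", start)
    twice = add_path("/software/progname/v4.3.2.1/bin_Linux/", once)
    print(once == twice)                            # PATH did not grow on the second call
    print(remove_path("/software/progname/", twice))
```

Because add_path is idempotent, sourcing the setup script many times leaves PATH the same length, which is the behaviour described above.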
Miscellaneous useful scripts
- list_symm.py
Python script that lists the symmetry operators for any (or all) space groups
in both (x,y,z) format and in matrix form (rotation first, then translation).
This requires that you have the cctbx (Computational
Crystallography Toolbox) installed. E.g.:
./list_symm.py p212121
19 4 4 P 21 21 21
1 x,y,z [1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0] [0.0, 0.0, 0.0]
2 x+1/2,-y+1/2,-z [1.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, -1.0] [0.25, 0.25, 0.0]
3 -x,y+1/2,-z+1/2 [-1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, -1.0] [0.0, 0.25, 0.25]
4 -x+1/2,-y,z+1/2 [-1.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0] [0.25, 0.0, 0.25]
- mean_stdev.py
Python script that calculates the mean and standard deviation of the data in an input file
(ignoring lines that begin with "#").
- pdb_get.py A Python script to download PDB files from rcsb.org.
Options include:
- -c to specify mmCIF format and
- -s to include downloading of a structure factor file, if present.
Multiple codes can be listed on the command line to download multiple files at once. For example
pdb_get.py -s 1f83 1f82
Compressed files are uncompressed using gunzip.
- stats.py
Python library of useful statistical calculations. Can be used to calculate basic statistics for any file read
via stdin or given as a filename on the command line. It splits columns of data into separate data sets and calculates
mean, stdev, median,
max, and min for each. Other functions include:
avg_dev (average deviation), var (variance), skew,
kurtosis, mode, histogram, lsq (least-squares fit).
- window_avg.py
Python script for the calculation of a moving-window average of data from an input file (expects the file to
have two columns of data, i.e. X and Y values). Typically used to provide a smoothed
plot of some feature versus residue number for a protein sequence.
Options include:
- -w # or --window=# to
specify a window of size # (default 7)
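The moving-window average performed by window_avg.py can be sketched as follows. This is a minimal illustration, not the script itself: the function names are hypothetical, the window is assumed to be centred on each point, windows are assumed to be truncated at the ends of the data, and skipping "#" lines is borrowed from the mean_stdev.py description rather than documented for window_avg.py.

```python
def read_xy(lines):
    """Parse two-column X Y data, skipping blank lines and lines starting with '#'."""
    points = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        x, y = line.split()[:2]
        points.append((float(x), float(y)))
    return points


def window_average(points, window=7):
    """Return (x, mean_y) pairs, each mean taken over a window centred on x.

    Assumption: near the ends of the data the window is truncated rather
    than padded, so fewer points contribute to the first and last means.
    """
    half = window // 2
    smoothed = []
    for i, (x, _) in enumerate(points):
        lo = max(0, i - half)
        hi = min(len(points), i + half + 1)
        ys = [y for _, y in points[lo:hi]]
        smoothed.append((x, sum(ys) / len(ys)))
    return smoothed


if __name__ == "__main__":
    data = read_xy(["# residue value", "1 1.0", "2 2.0", "3 3.0", "4 4.0", "5 5.0"])
    for x, y in window_average(data, window=3):
        print(x, y)
```

An odd window size keeps the window symmetric about each residue, which is presumably why the default is 7.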
Sequence manipulations and analysis
- seq_convert.py
Python script to convert amino acid sequences
from 1-letter to 3-letter code and vice versa. Can also read SEQRES records
or determine the sequence from the coordinates (using MyPDB.py) in PDB files.
Try seq_convert.py --help for instructions.
- seq_mw.py
Python script to calculate molecular mass and count
individual atom types from a sequence fed via standard input.
Requires the above seq_convert.py
- seq_pattern.py
Python script to search for repetitive patterns in sequences. Input sequence is expected to be in single letter code.
Lines beginning with '>', such as in PIR/FASTA format files, are ignored. End-of-line numbers and spaces
are also ignored.
Patterns are entered as regular expressions, e.g.:
seq_pattern.py -p '[GP].{9,12}[TV]' < file.seq
This will find repetitions of a pattern that begins with glycine or proline, followed by between 9 and 12
other amino acids, followed by threonine or valine.
For more information on regular expressions, see the Regex HOWTO,
the Python 2.3 Quick Reference (re module), or
Regular Expression for Protein Motif Search.
- text_highlight.py
Python script for highlighting patterns in a text file. Can be used with
seq_pattern.py, above. E.g.
seq_pattern.py -p '[GP].{9,12}[TV]' < file.seq | text_highlight.py -p 'A.*S' --colour=bold,blue,yellow_back
- variability.py
Python script to calculate sequence variability from pre-aligned sequences.
Requires the above seq_convert.py
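The regular-expression search described for seq_pattern.py above can be sketched in a few lines using Python's standard re module. This is an illustration under stated assumptions, not the script's actual code: the function names are hypothetical, all digits and whitespace are dropped (a simplification of "end-of-line numbers and spaces"), and positions are reported 1-based.

```python
import re


def clean_sequence(lines):
    """Join sequence lines, skipping '>' header lines and dropping digits and spaces."""
    parts = []
    for line in lines:
        if line.startswith(">"):
            continue
        parts.append(re.sub(r"[\d\s]", "", line))
    return "".join(parts)


def find_pattern(sequence, pattern):
    """Return (start, end, text) for each non-overlapping match; start is 1-based."""
    return [(m.start() + 1, m.end(), m.group(0))
            for m in re.finditer(pattern, sequence)]


if __name__ == "__main__":
    lines = [">example", "AAGAAAAAAA 10", "AATAA 15"]
    seq = clean_sequence(lines)
    for start, end, text in find_pattern(seq, r"[GP].{9,12}[TV]"):
        print(start, end, text)
```

Note that re.finditer reports non-overlapping matches only, and that .{9,12} is greedy, so the longest spacing that still permits a match is tried first.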
Dealing with multiple conformations for O and PROTIN/REFMAC
- strip_mult
Awk script to strip multiple conformations
out of one file and write them to two separate files for rebuilding with
O. The original "chain_id" is maintained to enable working with
multi-subunit structures (or multiple molecules in the asymmetric unit).
- re_mult
Restores multiple conformations to one
PDB file for refinement with PROTIN/REFMAC from the two PDB files
that were used for rebuilding in O.
Diffraction Data analysis and format conversion
- axial_refl
Strip axial reflections out of scalepack output to look for systematic absences.
- denzocell
Strip unit cell and crystal orientation values from the denzo integration log file.
- denzohist
Strip histograms from the denzo integration log file.
- denzostats
Strip chi**2 values from the denzo integration log file.
- denzolog_strip
Runs denzocell, denzohist and denzostats at one time.
- scale2xplor
Convert scalepack merged I and sig(I) to F and sig(F) for X-plor usage. Option to
convert negative I's to -sqrt(|I|).
- scalepack_cell
Strip out the last refined unit cell value from the scalepack log file for use in refinement scripts.
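The intensity-to-amplitude conversion mentioned for scale2xplor above can be illustrated per reflection as follows. This is only a sketch: the function name is hypothetical, sig(F) is estimated by first-order error propagation (sig(F) = sig(I) / (2 sqrt(|I|))), negative intensities are assumed to be rejected unless the option is set, and the fallback for I = 0 is an arbitrary assumption. The actual script may handle these cases differently, and file parsing is omitted because the input layout is not described here.

```python
import math


def intensity_to_amplitude(i_obs, sig_i, keep_negative=False):
    """Convert one (I, sigI) pair to (F, sigF).

    Returns None when I is negative and keep_negative is False (assumed
    behaviour). With keep_negative=True a negative I gives F = -sqrt(|I|).
    """
    if i_obs < 0 and not keep_negative:
        return None
    magnitude = math.sqrt(abs(i_obs))
    f_obs = math.copysign(magnitude, i_obs)
    if magnitude > 0:
        # First-order propagation of the error in I through F = sqrt(I).
        sig_f = sig_i / (2.0 * magnitude)
    else:
        # Assumed fallback for I = 0, where the propagation formula diverges.
        sig_f = math.sqrt(sig_i)
    return f_obs, sig_f


if __name__ == "__main__":
    print(intensity_to_amplitude(100.0, 10.0))
    print(intensity_to_amplitude(-4.0, 2.0))
    print(intensity_to_amplitude(-4.0, 2.0, keep_negative=True))
```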
Structure analysis and format conversion
Refinement and Molecular Replacement statistics
- refmacR
Strip out R, Rfree, CC, CCfree, FOM, FOMfree from refmac log files for plotting with gnuplot.
- ref.R.plt
Sample gnuplot file for printing 'ref.R', where 'ref.R' is the output from refmacR.
- oversig
Convert AMORE rotation and translation values to peak divided by sigma.
- xplorR
Strip out R and Rfree from X-plor log files for plotting with gnuplot.
- rf_oversig
Convert X-plor rotation function values to peak divided by sigma from X-plor RF log files.
- tf_oversig
Convert X-plor translation function values to peak divided by sigma.
Last revised: Monday, 15-Sep-2008 13:44:17 EDT