SHELX Program Page
At last there is a paper that should be cited whenever any program
that begins with "SHELX..." (including the Bruker SHELXTL) is used:
"A short history of SHELX". Sheldrick, G.M. (2008). Acta Cryst.
A64, 112-122.
It is available as "Open Access".
The book "Crystal Structure Refinement: A Crystallographer's Guide to SHELXL" edited by Peter Müller (IUCr/Oxford, 2006) is recommended reading for anyone planning to refine small-molecule or macromolecular structures.
A useful quick online source of practical information for macromolecular
applications of SHELXL (structure refinement) and SHELXC/D/E (experimental
phasing) is provided by the CCP4 Wiki.
On June 1st 2008 our fax number will change to: +49-551-3922582.
Introduction
This homepage
provides general information about the SHELX system of crystallographic
programs, and should be read by anyone planning to use the programs for
the first time. See below for how to obtain
the programs. Existing users should check this site at regular intervals
for the latest information.
The current full release of the complete SHELX system is still 97-2 (24 March
1998), but the new macromolecular phasing programs SHELXC, SHELXD and SHELXE
have been added and the Linux and Windows executables have been recompiled
with the new Intel Fortran-95 compiler and are particularly fast on Pentium-4
systems (see benchmarks).
The new Windows executables should run under Windows 95, 98, ME, NT, 2000
and XP but not under DOS or Windows 3.1. They should be started from an MSDOS
input window. The SHELX users' list has been added to this homepage and is
being updated at regular intervals. Please let me know if your email address
is given incorrectly or if you wish to be removed from the list for any
reason. SHELXC and SHELXE are still at beta-test stage and will expire at the
end of 2003 (and will then be replaced by a new version). Registered users
of SHELX-97 may download SHELXC, SHELXD and SHELXE using the same procedure
as before; no further application form is required. The documentation for
these programs is in
the files fastphas.pdf and shelx-de.pdf.
Several tutorials
have been provided by experienced SHELX users; the corresponding test data
files are in the tutor subdirectory on the SHELX fileserver. This
directory also contains the commented .ins file (and an .hkl
file) for a typical refinement of tetragonal lysozyme since this is frequently
used as a test crystal (kindly provided by Peter Mueller). New sets of
RNA and DNA restraints were added in February 2000 (thanks to Peter S.
Klosterman and Chad C. Sines respectively). The notes for the July 2000 SHELX
Workshop at the ACA Meeting in St. Paul are in the MSWord file aca2000.doc
on the SHELX ftp server. These are now somewhat dated but contain background
information to SHELXD and an introduction to SHELX for first-time
macromolecular users. Further details may be found in Volume F of
the International Tables for Crystallography.
To obtain the
programs it is first necessary to send us a completed
application form by fax or post.
What is SHELX?
SHELX is a
set of programs for crystal structure determination from single-crystal
diffraction data. The first version of SHELX was written at the end of
the 1960s. The gradual emergence of a relatively portable FORTRAN subset
enabled it to be distributed (in compressed form including test data as
one box of punched cards) in 1976. SHELX-76 survived unchanged - the extremely
compact globally optimized code proved resistant to mutations - until major
advances in direct methods theory made an update of the structure solution
part necessary (SHELXS-86). Rewriting and validating the least-squares
refinement part proved more difficult, but was finally achieved with the
release of SHELXL-93. During this time operating systems such as RDOS,
VMS and MSDOS, under which FORTRAN and SHELX flourished, rose and fell.
Even punched cards became obsolete (except in Florida). The current version
SHELX-97 is essentially upwards compatible with SHELX-76, for example the
format of the reflection file remained unchanged (Microsoft please note).
These programs are used in well over 50% of small-molecule structure determinations.
Although SHELX was originally intended only for small molecules, improvements
in computing performance and data collection methods have led to increased
use of SHELX for macromolecules, especially the location of heavy atoms
from MAD, SIR and SAD data using SHELXS (and recently SHELXD and SHELXE),
and the refinement of proteins against high-resolution data (2.5 Å or better)
using SHELXL.
SHELX-97 consists of the following programs:
SHELXS - Structure solution by Patterson and direct methods
SHELXC - Preparation of files for macromolecular phasing with SHELXD and SHELXE
SHELXD - Structure solution for difficult problems (and location of heavy atom sites)
SHELXE - Phases from SHELXD heavy atom sites (and density modification)
SHELXL - Structure refinement (the version SHELXH is for large structures)
CIFTAB - Tables for publication via (small molecule) CIF format
SHELXA - Post-absorption corrections (for emergency use only)
SHELXPRO - Protein interface to SHELX
SHELXWAT - Automatic water divining for macromolecules
The program
SHELXA was kindly donated by an anonymous user. It applies "absorption
corrections" by fitting the observed to the calculated intensities as in
the program DIFABS. SHELXA is intended for EMERGENCY USE ONLY, e.g. when
the world's only crystal falls off before there is time to make proper
absorption corrections. Under no circumstances should the results be published;
the anonymous author does not wish to be cited in this non-existent publication
because it might ruin his reputation!
How to obtain SHELX
SHELX-97 is
currently available for downloading from the internet and on CDROM. It is free
to academics and requires
a license fee of US$2499 for for-profit organizations. One license
covers the use of all the programs for an unlimited time on an unlimited
number of computers at one geographical location. This license income is
essential for supporting the distribution and further development of the
programs; we do not make a profit. A two-month free trial is available
to for-profit users; at the end of that time they should either pay the
license fee or send a signed declaration on company notepaper that they
have destroyed all their copies of SHELX. Academic users who request the
programs on CDROM are expected to contribute US$99 (plus FedEx costs for
some countries); this may be waived for poorer countries without adequate
internet access. Applications should be made by post or by fax (NOT email)
using the application form. Users intending to
download the programs will receive the necessary instructions by email, so
please make sure that the email address on the form is correct and legible!
As part of the license agreement, users are expected to cite SHELX-97 in
any publication in which it proved useful. It is understood that the author
has no liability for any damage or loss caused by the programs; they may
prove addictive!
Potential users
of the Bruker AXS SHELXTL version (primarily intended for small molecules)
or the program XPREP (very useful for macromolecules) should contact
demolicense@rt.bruker-axs.nl.
List of files on fileserver and CDROM
To obtain a password for the ftp site, you first have to send a completed
application form! The fileserver and CDROM contain the following files and
subdirectories:
'applfrm.htm' - application form in HTML format.
Subdirectory 'unix' contains the sources of all programs for relatively
standard UNIX systems. These should also compile successfully on most other
operating systems. Note that the sources of the beta-test SHELXC and SHELXE
have not yet been released.
Subdirectory 'doc' contains the full manual in MSWord format, one file per
chapter. It is designed to print on letter sized paper.
Subdirectory 'ps' contains the full manual in Postscript format, one file per
chapter. It is also designed to print on letter sized paper.
Subdirectory 'egs' contains the test jobs and other example files.
Subdirectory 'cde-data' contains the test jobs for macromolecular phasing
with shelxc, shelxd and shelxe.
Subdirectory 'sgi' contains the SGI IRIX executables; they should run under
relatively recent versions of IRIX, but have been compiled for compatibility
rather than speed.
Subdirectory 'linux' contains the Linux executables for Intel (and compatible)
processors (compiled using the new Intel Fortran-95 version 7.0).
Subdirectory 'win' contains executables compiled with Intel Fortran-95
version 7.0 that should run under Windows 95, 98, NT, 2000 or XP.
It is recommended that they are started
from an MSDOS input window. Note that you can define a batch file that
can set the PATH etc. on opening such a window.
Subdirectory 'mp' contains the Linux executable, source files and make file
for Kay Diederichs' multi-processor version of SHELXL.
Subdirectory 'dig-unix' contains executables for Digital UNIX systems for
Alpha CPUs; they may also run under Tru64 UNIX.
Subdirectory 'tutor' contains extra tutorials and test data for macromolecular
refinement.
Subdirectory 'bench' contains the sources, test data and readme file for the
SHELX-based benchmarks.
'shelx.pdf' - SHELX manual, as in subdirectories ps and doc, but in pdf
format with links from the subject index. Converted and kindly made available
by Mitch Miller.
'fastphas.pdf' and 'shelx-de.pdf' - Instructions for using the new programs
SHELXC, SHELXD and SHELXE (not yet included in the other versions of the
manual).
'aca2000.doc' - MSWord format notes distributed at the SHELX Workshop at the
ACA2000 Meeting. Dated but contains background information to the use of
SHELXD for ab initio structure solution and location of anomalous scatterers
in MAD experiments etc., as well as advice on refining macromolecules with
SHELXL.
'new-dna.dic' and 'new-rna.dic' - DNA and RNA restraints in SHELXL format,
kindly provided by Peter S. Klosterman and Chad C. Sines respectively.
'sfac.dsp' - table of f', f" and linear absorption coefficients for most
elements as a function of wavelength; useful for planning SAD and MAD
experiments and refinement against Laue data.
In addition, the main directory contains gzip compressed tar files (*.tgz)
of the above subdirectories. These are recommended for faster downloading.
How to install SHELX-97
In many cases
it will be possible to use the precompiled versions provided. The executable
programs (and the CIFTAB format files) should simply be copied from the
appropriate directory on the CDROM or ftp site to a directory on your machine.
This directory should be specified in the PATH so that the executables
can be found. On UNIX systems the lazy way is to copy the programs into
/usr/bin or /usr/local/bin; under Windows they are usually copied to C:\EXE
and this directory name is then added to the PATH specified in a batch
file that is called when you open the MSDOS input window. You may also
wish to copy the documentation and examples files. By way of example, for
a PC running Linux, the file linux.tgz and (as required) the test data,
tutorial and .pdf documentation files should
be fetched to your working directory using a browser. The compressed
tar files can then be unzipped and extracted:
gunzip *.tgz
tar -xvf linux.tar
etc., which will
create the subdirectory linux containing the Linux executables.
The executables can be copied to /usr/local/bin or to /usr/bin (needs system
manager privileges!). Note that this will also put the CIFTAB format files
ciftab.def, ciftab.rta
and ciftab.rtm into /usr/local/bin, which is where the precompiled
Linux version of CIFTAB expects to find them. For IRIX they are assumed
to be in /usr/bin. Note that CIFTAB will first look for the format files
in the current working directory, so that individual users may create special
versions (which may also use different filename extensions).
cp linux/* /usr/local/bin
The Adobe Acrobat
reader can be used to print the files fastphas.pdf, shelx-de.pdf
and shelx.pdf (rather long!).
The installation is now complete!
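As a quick check that the copied executables can actually be found via the PATH, something like the following Python sketch could be used. The lower-case executable names are an assumption based on the program names listed on this page; adjust them to match the files you actually installed.

```python
import shutil

# Program names taken from the list on this page, lower-cased.
# ASSUMPTION: the installed executable files use exactly these names.
SHELX_PROGRAMS = ["shelxs", "shelxc", "shelxd", "shelxe", "shelxl",
                  "ciftab", "shelxa", "shelxpro", "shelxwat"]


def find_on_path(names, path=None):
    """Map each program name to its full path, or None if it is not found.

    `path` may be given to search a specific directory list instead of the
    PATH environment variable.
    """
    return {name: shutil.which(name, path=path) for name in names}


if __name__ == "__main__":
    for name, location in find_on_path(SHELX_PROGRAMS).items():
        print(f"{name:10s} {location or 'NOT FOUND on PATH'}")
```

Any program reported as not found either was not copied or lives in a directory that is missing from the PATH.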
Program compilation (where necessary)
The UNIX version
(sources in the unix subdirectory) has been designed to be easy
to compile on a wide range of UNIX (and other) systems, but of course you
will need a FORTRAN compiler! The programs are in the course of being converted
to Fortran-95 in order to use the dynamic memory allocation, so it is best to
use a Fortran-95 compiler and ignore all warning messages about obsolete
features etc. The resulting compiled programs do not need
any environment variables or hidden files to run; it is simply necessary
that the executable program is accessible via the PATH or an alias.
For example, for Tru64 UNIX the switch -fast is used to optimize the code for
the current processor:
f95 shelxl.f shelxlv.f -fast -o shelxl
For the new Intel compiler under Linux the compilations were performed
using e.g.:
ifc shelxl.f shelxlv.f -axKMW -pad -nbs -static -Vaxlib -o shelxl
On modern IRIX systems -Ofast should be substituted for -fast here. IT IS
NECESSARY TO BE VERY CAREFUL ABOUT OPTIMIZATION. The safest approach is to
compile without any optimization first (e.g. -g rather than -O3), run the
ags4, sigi and 6rxn tests, and rename the resulting output files *.res,
*.lst, *.fcf and *.pdb. Then recompile with highest optimization
(e.g. -O3), rerun the tests, and use the UNIX diff instruction to
compare the results with those from the unoptimized version. Small differences
in the last decimal place do not matter, and of course the CPU times will
differ, but if there are significant differences then the optimization
level should be lowered and the tests repeated. For some systems only the
shelxlv.f file (containing the rate-determining routines) can be compiled
with the highest optimization level; shelxl.f must be compiled at a lower
level. SHELXS should be compiled just like SHELXL; the same applies
to SHELXH, the large version of SHELXL, which uses shelxh.f and shelxlv.f
etc. The rate-determining routines for SHELXS are in shelxsv.f,
the rest in shelxs.f. One commented
line near the start of SHELXL, SHELXH and SHELXS needs to be changed if
these programs should write MSDOS format ASCII text files rather than UNIX
format when run on a UNIX system. This is useful for a heterogeneous UNIX/MSDOS
network, because the UNIX versions of all SHELX programs can read MSDOS
format files, but not vice versa. The remaining
programs do not require optimization (except possibly SHELXA and SHELXPRO)
so are straightforward to compile, e.g.
f95 shelxpro.f -o shelxpro
However note
that one line near the start of the SHELXPRO source may need to be altered
to define the size unit of the binary files used for maps for the program
O, see the comments in the source. SHELX does not itself use binary files;
this avoids many incompatibilities, especially on networks of mixed computer
types. Unlike SHELXL and SHELXS, there are some intentional deviations
from the strict FORTRAN-77 standard in these programs. REAL*8 and list-directed
reading of internal files are used in several cases, and SHELXPRO uses
types INTEGER*2 and BYTE in order to produce binary map files for O. Most
FORTRAN compilers have no problems with these extensions, but may output
warning messages.
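The comparison of optimized and unoptimized test output described above can be partly automated. The following Python sketch is not part of SHELX; it assumes that the output files consist of whitespace-separated tokens and treats two lines as equivalent when their numbers differ by no more than one unit in the last printed decimal place. Lines reporting CPU times will still show up as differences and have to be ignored by eye.

```python
from decimal import Decimal, InvalidOperation


def _as_decimal(token):
    """Return the token as a finite Decimal, or None if it is not a number."""
    try:
        value = Decimal(token)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def lines_match(a, b, tol=1):
    """True if two lines agree token by token, allowing numeric tokens to
    differ by at most `tol` units in the last printed decimal place."""
    tokens_a, tokens_b = a.split(), b.split()
    if len(tokens_a) != len(tokens_b):
        return False
    for x, y in zip(tokens_a, tokens_b):
        dx, dy = _as_decimal(x), _as_decimal(y)
        if dx is None or dy is None:
            # Non-numeric tokens (atom names, keywords) must be identical.
            if x != y:
                return False
            continue
        # One unit in the last decimal place of the less precise number.
        exponent = max(dx.as_tuple().exponent, dy.as_tuple().exponent)
        if abs(dx - dy) > tol * Decimal(1).scaleb(exponent):
            return False
    return True


def differing_lines(path_a, path_b, tol=1):
    """Return the 1-based numbers of lines that differ significantly."""
    with open(path_a) as fa, open(path_b) as fb:
        lines_a, lines_b = fa.read().splitlines(), fb.read().splitlines()
    diffs = [i for i, (a, b) in enumerate(zip(lines_a, lines_b), start=1)
             if not lines_match(a, b, tol)]
    if len(lines_a) != len(lines_b):
        # One file has extra lines; report the first unmatched line number.
        diffs.append(min(len(lines_a), len(lines_b)) + 1)
    return diffs
```

Calling differing_lines on the renamed unoptimized output and the corresponding optimized output (the file names are up to you) returns an empty list when only last-digit noise is present.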
CIFTAB
will search the current directory for a specified format file, and if it
doesn't find it there it will look for it in a directory that is defined
in the source. Unless this is edited before compiling, the directory is
set to /usr/bin, so if the executable programs are located in /usr/bin
the format files ciftab.* should be there too.
Parallel processing and vector machines
Kay Diederichs has adapted SHELXL to OpenMP and a Linux executable together
with a make file and the modified sources may be found in the subdirectory
mp. On a two-CPU Xeon system this achieves almost the maximum theoretical
speedup. It is expected that users will be able to port this version to
other OpenMP systems. Questions about this version should be sent to
Kay.Diederichs@uni-konstanz.de with a copy to gsheldr@shelx.uni-ac.gwdg.de.
SHELXL and
SHELXS are designed to run very efficiently on vector computers (such as
older Cray and Convex machines).
SHELXH - version of SHELXL for very large structures
SHELXH is a
special version of SHELXL for the refinement of very large structures (with
more than about 10000 unique atoms). The only difference between
shelxh.f
and shelxl.f is the first FORTRAN statement in which the array dimensions
are specified by means of a PARAMETER statement. If even SHELXH is not
large enough, you will need to change the dimensions of the arrays A and
B as explained in the comments at the start of the source file. Large versions
of SHELXS, SHELXPRO and SHELXA may be created in the same way, but it is
rather unlikely that they will ever be required. Further details are provided
by comments in the respective sources. SHELXL will print a suitable error
message if it is necessary to increase the dimensions of the large arrays
A or B.
A little care
and fine-tuning may be required so that such large structures can be refined
efficiently. If the computer does not have enough physical memory available,
or if the 'maximum vector length' is set too large, SHELXH will run in
disk exercising mode. This 'maximum vector length' refers to the number
of reflections that are processed in one vector run, which may be smaller
than the number in the input/output buffer. Some trial and error is needed
to set the maximum allowed value so that the physical memory is fully exploited
with a minimum of disk I/O for the virtual memory swap file. This number
is set as the fourth parameter on the L.S. or
CGLS instruction,
and should be a multiple of 8; a good value to try for a 64MB computer
is 64 (the third number on the L.S. or CGLS instruction is
almost always zero). The array B is used as working space for these vectors
(CGLS and L.S.) as well as for the least-squares matrix (L.S.).
If the array B is not big enough, the program will use a smaller maximum
vector run.
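To illustrate where the maximum vector length goes, here is a minimal Python sketch that writes such an instruction line. It follows the description above (fourth number is the maximum vector length, a multiple of 8; third number zero). Setting the second number to zero as well is an assumption made for this illustration, and the function name is hypothetical.

```python
def least_squares_instruction(cycles, max_vector, keyword="L.S."):
    """Build an L.S. or CGLS line whose fourth number is the maximum
    vector length, rounded down to a multiple of 8."""
    if keyword not in ("L.S.", "CGLS"):
        raise ValueError("keyword must be 'L.S.' or 'CGLS'")
    if max_vector < 8:
        raise ValueError("maximum vector length must be at least 8")
    max_vector -= max_vector % 8  # enforce a multiple of 8
    # ASSUMPTION: the second number is left at zero, like the third.
    return f"{keyword} {cycles} 0 0 {max_vector}"


print(least_squares_instruction(10, 64))          # L.S. 10 0 0 64
print(least_squares_instruction(20, 70, "CGLS"))  # CGLS 20 0 0 64
```

Finding the best value for a given machine still requires the trial and error described above.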
Macromolecular phasing with SHELXC/D/E
The new program SHELXC is designed to provide a simple and fast way of setting
up the files for the programs SHELXD (heavy atom location) and SHELXE (phasing
and density modification) that are used for macromolecular phasing by the MAD,
SAD, SIR and SIRAS methods. These three programs may be run in batch mode or
called from a GUI such as hkl2map (Thomas Schneider & Thomas Pape).
SHELXC is much less versatile than the Bruker Nonius XPREP program for this
purpose, but if you are sure of the space group and there are no problems with
the indexing or twinning and the f' and f" parts of the
scattering factors do not need to be refined, SHELXC should be adequate.
SHELXC can read either HKL2000 .sca files or SHELX .hkl files (F-squared
unless the -f switch is given after the filename to specify F).
To transfer data from CCP4
it is advisable to generate .sca files using CCP4i. Scripts for batch mode
under UNIX are described here but users are encouraged to call one or more of
these programs from their own GUI-based high throughput pipelines.
MAD example
shelxc jia <<EOF
NAT jia_nat.hkl
HREM jia_hrem.sca
PEAK jia_peak.sca
INFL jia_infl.sca
LREM jia_lrem.sca
CELL 96.00 120.00 166.13 90 90 90
SPAG C2221
FIND 8
NTRY 10
EOF
shelxd jia_fa
shelxe jia jia_fa -s0.6 -m20
shelxe jia jia_fa -s0.6 -m20 -i
In this example (kindly donated by Zbigniew Dauter; Li et al., Nature
Struct. Biol. 7 (2000) 555-559), Se-Met MAD data at four
wavelengths are used to calculate the FA-values and phase shifts alpha
that are written to the file jia_fa.hkl. The native (S-Met) data are read
from jia_nat.hkl and written to jia.hkl. The file jia_fa.ins is prepared using
the given cell, space group, FIND and NTRY instructions as well as a suitable
SHEL command to truncate the resolution. SHELXD then searches for 8 (FIND)
selenium atoms using 10 attempts (NTRY), and SHELXE is run for 20 cycles
(-m) of density modification for both heavy atom enantiomorphs
(-i inverts) with a solvent content (-s) of 0.6. The protein
phases are written to jia.phs and jia_i.phs resp.
If NAT is not specified, SHELXC would analyze the four MAD
datasets to
generate the (SeMet) native data jia.hkl, in which case -h should be
specified for SHELXE since the selenium atoms are present in the native
structure. For MAD at least two wavelengths are required, at least one of
which should be PEAK or INFL.
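For users who call these programs from their own pipelines, the SHELXC input shown above can be generated programmatically. The following Python sketch only assembles the text between the EOF markers and checks the MAD rule just stated; the function names are hypothetical and only keywords that appear in the examples on this page are used.

```python
MAD_KEYWORDS = ("HREM", "PEAK", "INFL", "LREM")


def mad_datasets_ok(datasets):
    """Check the MAD rule: at least two wavelengths, at least one of which
    is PEAK or INFL. `datasets` is a list of (keyword, filename) pairs."""
    wavelengths = [key for key, _ in datasets if key in MAD_KEYWORDS]
    return len(wavelengths) >= 2 and any(
        key in ("PEAK", "INFL") for key in wavelengths)


def shelxc_input(datasets, cell, space_group, find, ntry):
    """Return the instruction text that is fed to SHELXC on standard input.

    `cell` is passed as a string so that the number formatting is kept.
    """
    lines = [f"{key} {filename}" for key, filename in datasets]
    lines += [f"CELL {cell}", f"SPAG {space_group}",
              f"FIND {find}", f"NTRY {ntry}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    jia = [("NAT", "jia_nat.hkl"), ("HREM", "jia_hrem.sca"),
           ("PEAK", "jia_peak.sca"), ("INFL", "jia_infl.sca"),
           ("LREM", "jia_lrem.sca")]
    assert mad_datasets_ok(jia)
    print(shelxc_input(jia, "96.00 120.00 166.13 90 90 90", "C2221", 8, 10),
          end="")
```

The resulting text could then be piped to "shelxc jia" in place of the here-document used in the shell example.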
SAD example
This example of thaumatin phasing by means of the native
sulfur anomalous
signal (Debreczeni et al., Acta Cryst. D59 (2003) 688-696) uses
1.55 Å resolution in-house CuKalpha data:
shelxc thau <<EOF
SAD thau-nat.hkl
CELL 58.036 58.036 151.29 90 90 90
SPAG P41212
FIND 9
DSUL 8
MIND -3.5
NTRY 100
EOF
shelxd thau_fa
shelxe thau thau_fa -h -s0.5 -m20
shelxe thau thau_fa -h -s0.5 -m20 -i
The anomalous differences are extracted from the native
data so only one
data file is required. The sites specified by FIND consist of one methionine
and 8 super-sulfurs, which are then resolved into disulfides using the DSUL
instruction that is passed on to SHELXD. Alternatively one could try to find
the individual sulfurs with:
SHEL 999 2.0
FIND 17
MIND -1.7
Here the resolution cutoff has been reduced from 2.1 Å
(which SHELXC
would have suggested) to 2.0 Å to improve the chances of resolving the
sulfurs. The SHEL, FIND, MIND and NTRY instructions are transferred to the
file thau_fa.ins for the sulfur atom location with SHELXD. Note that the
phases can be improved further by using more SHELXE cycles than the usual 20.
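The two FIND values used for thaumatin follow from simple counting: each disulfide counts as one super-sulfur when DSUL is used, or as two atoms when individual sulfurs are sought. A small Python sketch of this arithmetic (the function name is hypothetical):

```python
def find_count(single_sulfurs, disulfides, resolve_disulfides):
    """Number of sites to give on FIND for sulfur SAD.

    Each disulfide is one super-sulfur site, or two sites when the
    individual sulfur atoms are to be located.
    """
    per_bridge = 2 if resolve_disulfides else 1
    return single_sulfurs + per_bridge * disulfides


# One methionine and 8 disulfides, as in the thaumatin example:
print(find_count(1, 8, resolve_disulfides=False))  # 9
print(find_count(1, 8, resolve_disulfides=True))   # 17
```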
SIRAS example
This involves the solution of the thaumatin structure using
the above 1.55 Å data as native and 2.0 Å CuKalpha
data from a quick iodide soak. SIRAS usually gives the best results for
iodide soaks, but it is also possible in this case to use SIR (change 'SIRA'
to 'SIR') or iodine SAD (change 'SIRA' to 'SAD').
shelxc thaui <<EOF
NAT thau-nat.hkl
SIRA thau-iod.hkl
CELL 58.036 58.036 151.29 90 90 90
SPAG P41212
FIND 17
NTRY 10
MIND -3.5 -0.1
EOF
shelxd thaui_fa
shelxe thaui thaui_fa -s0.5 -m20
shelxe thaui thaui_fa -s0.5 -m20 -i
Critical parameters
In general the critical parameters for locating heavy atoms with SHELXD are:
The resolution cutoff. In the MAD case this is best determined by finding
where the signed correlation coefficient between the anomalous differences for
wavelengths with the highest anomalous signal (PEAK and HREM or PEAK and INFL)
falls below about 30%. For SAD a less reliable guide is where
delta(F)/sigma(F) falls below about 1.3, and for S-SAD with
CuKalpha the data can be truncated where the mean I/sigma
for the native data falls below 30.
The estimated number of sites (FIND) should be within
about 20% of the
true number. For SeMet or S-SAD phasing there should be a sharp drop in the
occupancy after the last true site. For iodide soaks, a good rule of thumb is
to start with a number of iodide sites equal to the number of amino acids in
the asymmetric unit divided by 15. If after SHELXD occupancy refinement the
occupancy of the last site is more than 0.2 it might be worth increasing this
number, and vice versa.
If heavy atoms can lie on special positions (as is the
case with an
iodide soak) the rejection of atoms on special positions should be switched
off by giving the second MIND parameter as -0.1 (as in the above example).
For MAD, a CC of 40 to 50% indicates a good solution; for
SAD etc. values around 30% may well be correct, especially if the same solution
or group of solutions has the highest values of CC, CC(Weak) and PATFOM, and
they are well separated from the values for the non-solutions.
For SHELXE, a big difference in the contrast between
the two heavy-atom enantiomorphs usually indicates a good solution. However in
the case of SIR, both have the same contrast but one gives the inverted protein
structure. The contrast is also the same for both if the heavy-atom
substructure is centrosymmetric. In the case of SAD both heavy atom enantiomers
then give the correct structure; for SIR the result is an uninterpretable
double image.
Sometimes it is necessary to use many (several hundred)
cycles (-m) if the starting phase information is weak but the
resolution is
good, and it may be necessary to try different values for the solvent content
(-s). Good quality MAD data, a high solvent content and/or high
resolution for
the native data can lead to maps of high quality that can be autotraced (e.g.
with wARP) immediately. The .phs files contain h, k, l, F, fom,
phi and sigma(F)
and can be read directly into XtalView or converted to CCP4 .mtz
format using f2mtz, e.g. for further density modification exploiting NCS using
the CCP4 program DM. Note that if the inverted heavy atom enantiomorph is the
correct one, the corresponding phases are in the *_i.phs file and SHELXE may
have inverted the space group (e.g. P41 to P43), which
should be taken into account when moving to other programs!
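For a quick look at a .phs file outside XtalView or CCP4, the seven columns listed above can be read with a few lines of Python. This is only a sketch: it assumes that the columns are separated by whitespace, which may not hold if wide numbers run together, and the record and function names are hypothetical.

```python
from typing import NamedTuple


class PhsRecord(NamedTuple):
    h: int
    k: int
    l: int
    f: float      # structure factor amplitude F
    fom: float    # figure of merit
    phi: float    # phase
    sigf: float   # sigma(F)


def parse_phs_line(line):
    """Parse one reflection line: h, k, l, F, fom, phi, sigma(F)."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError(f"expected 7 columns, found {len(fields)}")
    h, k, l = (int(x) for x in fields[:3])
    f, fom, phi, sigf = (float(x) for x in fields[3:])
    return PhsRecord(h, k, l, f, fom, phi, sigf)


def read_phs(path):
    """Read all non-blank lines of a .phs file into a list of records."""
    with open(path) as handle:
        return [parse_phs_line(line) for line in handle if line.strip()]
```

For example, read_phs("jia.phs") would return one record per reflection, from which a mean figure of merit is easily computed.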
The data files for these examples are available for
downloading from the subdirectory cde-data.
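Some of the rules of thumb given above under the critical parameters can be written down as small helper functions. The Python sketch below is purely illustrative: the function names and the per-shell input format are assumptions, and the thresholds are simply the values quoted in the text.

```python
def initial_iodide_sites(n_residues):
    """Starting FIND value for an iodide soak: number of amino acids in
    the asymmetric unit divided by 15 (at least one site)."""
    return max(1, round(n_residues / 15))


def suggest_find_change(last_site_occupancy, threshold=0.2):
    """Advise on FIND from the refined occupancy of the last site."""
    if last_site_occupancy > threshold:
        return "increase"
    if last_site_occupancy < threshold:
        return "decrease"
    return "keep"


def mad_resolution_cutoff(shells, min_cc=30.0):
    """Pick the resolution cutoff for MAD data.

    `shells` is a list of (d_min, cc_percent) pairs ordered from low to
    high resolution, where cc_percent is the signed correlation
    coefficient between the anomalous differences of two wavelengths.
    Returns d_min of the last shell before the CC falls below `min_cc`,
    or None if even the first shell is below it.
    """
    cutoff = None
    for d_min, cc in shells:
        if cc < min_cc:
            break
        cutoff = d_min
    return cutoff


shells = [(8.0, 90.0), (4.0, 70.0), (3.0, 45.0), (2.5, 25.0), (2.0, 10.0)]
print(mad_resolution_cutoff(shells))   # 3.0
print(initial_iodide_sites(300))       # 20
print(suggest_find_change(0.35))       # increase
```

The shell values in the usage lines are invented numbers chosen only to exercise the functions.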
Frequently asked questions (small molecules)
Q1: Please
send me a copy of SHELX-76. I am afraid that I cannot use the new version
because my diffractometer measures F-values, not intensities.
A: Buy
a CCD detector. They measure intensities! In fact, diffractometers measure
intensities too. You just need the right data reduction program. If you
are desperate you can even feed SHELXL with F-values using HKLF 3.
Q2: When
I start SHELXL on my PC the disk rattles loudly for several hours and smoke
comes out of the back. Is this a bug?
A: You
must be trying to run SHELX under some version of Windows! The best solution
is to reformat the hard disk and install Linux. However the current release
should produce less smoke.
Q3: The
referee rejected my paper because the weighted R-factor was too high and
because the stupid program had forgotten to fix the y coordinate of one
atom to fix the origin in space group P21. What should
I do?
A: Try
another journal; if you emphasize the 'biological relevance' enough, they
may not notice the R-factor! Note that wR2 (based on intensities and all
data) is of necessity 2 to 3 times higher than wR1 (based on F and leaving
out reflections with say F<4sigma). Unfortunately SHELXL cannot work
out wR1, because the weighting scheme for intensities does not apply to
F-values. It is better to quote the unweighted R1 (with or without a 4sigma
threshold) anyway, because it is too easy to cheat on wR2 by modifying
the weights! It is no longer necessary or desirable to fix the origin by
fixing coordinates; the program applies appropriate floating origin restraints
automatically when they are needed.
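To see why wR2 comes out so much higher than an F-based R-factor, the two indices can be sketched in Python using their usual definitions: R1 = sum||Fo|-|Fc|| / sum|Fo| and wR2 = sqrt( sum w(Fo^2-Fc^2)^2 / sum w(Fo^2)^2 ). The function names are hypothetical, and the weights are simply supplied by the caller (unit weights in the test values); this is an illustration only and does not reproduce the SHELXL weighting scheme.

```python
import math
from typing import Sequence


def r1(f_obs: Sequence[float], f_calc: Sequence[float]) -> float:
    """Unweighted R-factor based on F."""
    num = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    den = sum(abs(fo) for fo in f_obs)
    return num / den


def wr2(fsq_obs: Sequence[float], fsq_calc: Sequence[float],
        weights: Sequence[float]) -> float:
    """Weighted R-factor based on F-squared (intensities)."""
    num = sum(w * (o - c) ** 2 for o, c, w in zip(fsq_obs, fsq_calc, weights))
    den = sum(w * o ** 2 for o, w in zip(fsq_obs, weights))
    return math.sqrt(num / den)
```

For small errors Fo^2-Fc^2 is roughly 2|Fo|(|Fo|-|Fc|), so the relative discrepancy on F-squared is about twice that on F; for example, Fo = (10, 20) and Fc = (9, 22) give R1 = 0.10 but a unit-weight wR2 of about 0.21.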
Q4: The
program tells me to refine extinction, this does reduce the R-factor but
the extinction parameter becomes very large although my crystal could hardly
be described as 'perfect'. Is this reasonable?
A: No.
The most likely causes of large apparent extinction are: (a) you have input
F with HKLF 4, (b) a few reflections that should be very strong have been
measured as weak because they were cut off by the beam-stop, (c) your counter
was saturating and an inadequate dead-time correction was made (in the
case of an image plate this is an 'overload'), or (d) your counter was
defective or the energy discrimination was set wrongly. Overloads may be
eliminated by 'OMIT h k l' if necessary.
Q5: The
structure could only be solved in P1, not P-1, but on refinement some of
the bond lengths and U-values are wildly different in the two molecules.
If I use SAME the geometries of the two molecules become very similar but
how do I restrain the Uij components of equivalent atoms to be the same?
A: You
could use EADP, but it might be better to look for the inversion
center instead, otherwise you will probably be 'Marshed'.
Q6: I included
batch numbers in the .hkl file and BASF parameters in the .ins
file, but the stupid program still didn't refine the batch scale factors!?
A: You
need MERG 0 (the default MERG 2 will average the batch numbers).
Q7: How
do I obtain the molecular replacement program PATSEE?
A: PATSEE
has been maintained by its author, Ernst Egert, since he moved from Goettingen
to the University of Frankfurt. He can be contacted by fax (+49-69-7982-9128)
or email (bolte@chemie.uni-frankfurt.de).
Q8: What
should I do about 'may be split' warnings?
A: Probably
nothing. The program prints out this warning whenever it might be possible
to interpret the anisotropic displacement of an atom in terms of two discrete
sites. Such atoms should be checked (e.g. with the help of an ORTEP plot)
but in many cases the single-site anisotropic description is still eminently
suitable.
Q9: I get
the message ' ** UNSET FREE VARIABLE FOR ATOM ... **' but I haven't used
any 'free variables'!?
A: There
is a typo in your atom coordinates, e.g. a decimal point missing or replaced
by a comma. Alternatively you may have really referenced a free variable
that wasn't defined by FVAR!
Q10: The
program prints out a Flack x parameter of 0.3 with an esd of 0.05. Is the
crystal racemically twinned?
A: Not
necessarily! The Flack parameter estimated by the program in the final
structure factor calculation ignores correlations with all other parameters
(except the overall scale factor). Since these parameters may have refined
so as best to fit a wrong absolute structure, it is quite possible to get
an estimate of about 0.3 for the Flack parameter when the true value is
1, i.e. the structure needs to be inverted and is not racemically twinned.
On the other hand a value close to zero with a small esd is a strong indication
that the absolute structure is correct. If there is any doubt the Flack
parameter should be refined together with all the other parameters using
TWIN and BASF.
Q11: How
can I produce nice tables from the final .cif file to pad out my
thesis?
A: Run
CIFTAB and specify the formats 'rta' (Angstroms) or 'rtm' (SI units). This
will produce a Rich text format (rtf) file that can be read directly
into MSWord or other word processors. The tables will be formatted, but
it is then easy to add a personal touch with the word processor.
Frequently asked questions (macromolecules)
Q12:
The manual is too long and was clearly written for small molecule crystallographers
who still seem to have time to read that sort of thing. How about a short
guide for stressed protein crystallographers?
A: Print
out (and maybe even read!) the files fastphas.pdf and shelx-de.pdf.
This should help
with MAD, SAD, SIR etc. phasing. Thomas Schneider's tutorials on MAD phasing
and refinement of triclinic lysozyme are particularly recommended.
Q13: How
do I transfer my data, including Rfree flags, from X-PLOR etc. to
SHELX?
A: Use
the 'Y' option in SHELXPRO to convert the .fob file to .hkl,
and the 'I' option to convert .pdb to .ins. Although SHELXL
prefers intensities, for macromolecules it is OK to continue to use F-values
if you were using them with X-PLOR or CNS. In CCP4, the mtz2various program
can write SHELX format files. The Bruker Nonius XPREP program provides
a space-group general option for transferring Rfree flags from one
dataset to another, taking equivalents into account. It can also process
MAD or SIRAS data to generate input files for SHELXD etc.
Q14: I have
a non-standard ligand, how do I make the topology file?
A: SHELXL
doesn't have a topology file; the restraints etc. are all included in the
.ins file. A good way to generate such restraints is to find a suitable fragment
in the CSD, then use the 'J' option in SHELXPRO. If it's not in the CSD,
you could do a quick small-molecule structure determination (using SHELX
of course) and feed that into SHELXPRO.
Q15: Why
are the R-factors different from those output by X-PLOR, TNT etc.?
A: Check
that you are using the same data (F or intensity, resolution, Rfree
flags ?) and that the bulk solvent model is not causing problems (it tends
to interact with the B-values, so it might be best to do a few refinement
cycles first to sort it out).
Q16: After
using SHELXPRO to prepare the .ins file from a PDB file and then
running SHELXL, I get the message: 'No match for 2 atoms in DFIX'
but otherwise everything seems OK.
A: This
message probably refers to the fact that SHELXPRO labels the oxygens of
the carboxy-terminus OT1 and OT2 so that different bond length restraints
can be applied than to the same type of amino-acid when it is within a
peptide chain, in which case the message can be safely ignored. Other such
messages should always be investigated carefully, they may indicate missing
or bad restraints or bad initial connectivity (which can be corrected using
BIND and FREE).
Q17: I can
solve the structure by molecular replacement in the space group P32
but the R-factors are high and the Rsym for P3221
was not much higher than for P32. What should I do?
A: Your
structure may well be merohedrally twinned, but don't panic! The E-statistics
can be calculated using e.g. SHELXS, SHELXD or XPREP; <|E^2-1|>
<< 0.736 would also indicate twinning. All you need to do in this
particular case is to include the two instructions:
TWIN 0 1 0 1 0 0 0 0 -1
BASF 0.3
in your .ins
file and repeat the refinement job! If the BASF parameter (you can
find it in the .lst or .res file) refines to a value intermediate
between 0 and 1, and Rfree drops significantly, you are winning.
No other special action is needed, SHELXPRO and XtalView can be used in
the usual way because the .fcf file is effectively 'detwinned'.
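What the two instructions in this answer describe can be sketched in Python, assuming the usual two-component model in which the twin matrix (read row by row) maps each hkl onto the index of the second domain, and the calculated intensity is the sum of the two contributions weighted by (1 - BASF) and BASF. The function names are hypothetical and the sketch ignores the overall scale factor.

```python
from typing import Sequence, Tuple

# The nine numbers of the TWIN instruction above, read row by row.
TWIN_LAW = ((0, 1, 0),
            (1, 0, 0),
            (0, 0, -1))


def apply_twin_law(hkl: Sequence[int],
                   matrix: Sequence[Sequence[int]] = TWIN_LAW
                   ) -> Tuple[int, ...]:
    """Index of the reflection from the second twin domain."""
    return tuple(sum(matrix[i][j] * hkl[j] for j in range(3))
                 for i in range(3))


def twinned_intensity(i_hkl: float, i_twin: float, basf: float) -> float:
    """Two-component twin: weighted sum of the overlapping intensities."""
    return (1.0 - basf) * i_hkl + basf * i_twin
```

With this matrix the reflection (1, 2, 3) overlaps with (2, 1, -3), and applying the matrix twice returns the original index, as expected for a twofold twin operation.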
Q18: Where
is the best place to look to see where the model needs improving?
A: Atoms
that are involved in restraint violations, large positive or negative difference
electron density, regions with high and/or very anisotropic displacement
parameters, violations of non-crystallographic symmetry, and residues in
disallowed regions of the Ramachandran plot. The .lst file contains
most of this information, and SHELXPRO may be used to plot displacement
parameters or NCS differences as a function of residue number as well as
Ramachandran and Kleywegt plots.
Q19: When
is it justified to refine anisotropically?
A: In
general if the resolution is worse than about 1.5 A, it is unlikely to
be worth trying, but it depends on the completeness and quality of the
data as well as the percentage of the solvent. A drop in Rfree of
about 1% or more might be considered to justify full anisotropic refinement.
In borderline cases tighter restraints, including ISOR for all atoms,
may well be required. However anisotropic refinement of selected heavier
atoms (iron and sulfur in metalloproteins, selenium for native selenomet
data) may well be justified at lower resolution.
Q20: When
should I add hydrogen atoms?
A: As
late as possible because they cost computer time, though including hydrogens
usually brings a drop in Rfree of between 0.5 and 1.0%. In many
cases it is more trouble than it is worth to include the OH hydrogens;
they tend to have higher B-values and are more difficult to position automatically.
If one is unlucky, the stupid program will put two OH hydrogens along the
same hydrogen bond, and the combination of antibumping restraints and the
riding hydrogen model can then distort the rest of the structure.
Q21: How
can I obtain real esds on the structural parameters?
A: If
high resolution data are available - there must be appreciably more data
than parameters - and the structure is not too large, it may be possible
to obtain rigorous esds by matrix inversion. The structure should first
be refined to convergence with CGLS, setting the second parameter
to -1 to calculate Rfree, then a further refinement should
be performed against all data by deleting this second parameter, and finally
a single full-matrix cycle should be performed with L.S. 1 and zero
damping and a zero shift multiplier (DAMP 0 0) in which all restraints
have been removed. Often BLOC 1 will be needed to reduce the size
of the matrix by leaving out the displacement parameters; the inversion
will then also be more stable. BOND, RTAB, HTAB and MPLA
instructions may be needed to define the dependent parameters for which
esds are required. SHELXL uses the full correlation matrix to estimate
the standard deviations in all dependent parameters, with the exception
of the angles between least-squares planes for which an approximate treatment
is used. Although the matrix inversion in SHELXL is both efficient and robust,
if the data to parameter ratio is too small or the matrix is too large
it may fail. In principle recompiling SHELXL with 8-byte reals (for many
compilers the switch -r8 suffices) should reduce numerical problems
on matrix inversion at the cost of doubling the memory requirements, but
in practice this has surprisingly little effect.
Q22: SHELXL
complains that it does not have enough memory, what should I do?
A: Use
the larger version SHELXH. If even this is not large enough, you will have
to change the dimensions of the arrays A and B and recompile the program.
This is explained in the comments at the start of the file shelxl.f
as well as in the section on compiling in this
homepage. If the problem occurs when estimating standard deviations, use
BLOC to break up the matrix into smaller blocks.
Q23: What
does 'nan' mean?
A: 'Not
a number'. Of course no self-respecting crystallographic program should
ever generate nans, so it means that something has gone very seriously
wrong with the calculation. Check the .lst file for other warning
messages and in particular the list of disagreeable restraints for
indication of the cause. Perhaps you are simply trying to refine more parameters
than the data can support. It is always a good idea to introduce changes
in small steps rather than changing everything at once. Note that MORE
3 can be used to write more diagnostic information to the .lst
file (which may then get rather large). It could be a compiler optimizing
error, maybe it is worth trying a different type of computer system. Often
refining first with STIR prevents the refinement from exploding.
Q24: What
is the worst resolution that is acceptable for: (a) solution of a structure
by direct methods using SHELXD or SHELXS, (b) refinement with SHELXL?
A: Direct
methods assume randomly distributed resolved atoms. Direct methods are
crucially dependent on having atomic resolution data, say better than 1.2A.
A good rule of thumb is that at least one half of the theoretically possible
number of reflections between 1.1 and 1.2A should have been measured with
I>2sigma for direct methods to be successful, though this rule can be relaxed
somewhat for centrosymmetric structures and structures containing heavier
atoms.
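The rule of thumb above can be checked with a few lines of Python. This is only a sketch: it assumes that each reflection is supplied as a (d-spacing in A, I, sigma(I)) tuple and that the theoretically possible number of unique reflections in the shell (n_possible) has been obtained elsewhere, e.g. from a data-reduction program. The function names are hypothetical.

```python
from typing import Iterable, Tuple


def shell_fraction_observed(reflections: Iterable[Tuple[float, float, float]],
                            n_possible: int,
                            d_min: float = 1.1,
                            d_max: float = 1.2,
                            cutoff: float = 2.0) -> float:
    """Fraction of possible reflections in the shell with I > cutoff*sigma."""
    observed = sum(
        1 for d, intensity, sigma in reflections
        if d_min <= d <= d_max and sigma > 0 and intensity > cutoff * sigma
    )
    return observed / n_possible


def direct_methods_plausible(reflections, n_possible: int) -> bool:
    """True if at least half of the 1.1-1.2 A shell is observed."""
    return shell_fraction_observed(reflections, n_possible) >= 0.5
```

As the text notes, the threshold of one half may be relaxed somewhat for centrosymmetric structures and structures containing heavier atoms, so a False result is a warning rather than a verdict.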
The resolution
is not so critical for the location of heavy atoms from delta-F data, provided
that the minimum distance between heavy atoms is much greater than the resolution;
a resolution of 3.5A is sufficient for a selenomet MAD experiment, provided
that the data were measured with a high redundancy (at least 4!).
SHELXL lacks
the energy terms used by e.g. X-PLOR or CNS for refinement against low-resolution
data. This imposes an effective limit of about 2.5A for SHELXL refinement,
but this limit may be extended a little to lower resolution if NCS restraints
can be used.
SHELX-based benchmarks of some current computer systems
To benchmark
PC's and Workstations for a mix of typical number-crunching crystallographic
calculations, a SHELX benchmark may be performed using the files in the
subdirectory bench on the SHELX ftp server. They are based on the
tests distributed with the programs with minor changes to artificially
increase the computer time requirements. The programs were recompiled for
these tests using the latest compiler versions.
log2000 - like the log test but with TREF replaced by TREF 2000.
cumos10 - like the cumos2 test but with PATT 2 replaced by PATT 10.
6rxn - exactly as in the 6rxn test distributed with SHELX.
7rxn - like 6rxn, but with CGLS 10 -1 replaced by L.S. 1 and BLOC 1.
The first two
require shelxs, the last two shelxl. The total CPU time in
seconds (t) for all four is then the benchmark time: the performance in
Vax units is given to a good approximation by 200000/t (the Vax780 mainframe
and later the Microvax II, the computer of choice for crystallographic
computing for many years, both ran at about 1 Vax given enough memory).
| Computer | Sum of 4 CPU times (s) | Vax units |
| 3.06 GHz P4 / RAMBUS / Intel 7.0 | 3.1 + 2.9 + 3.7 + 5.9 = 15.6 | 12821 |
| 500 MHz AXP 21264 / Tru64 UNIX | 12.9 + 10.1 + 17.2 + 23.1 = 63.3 | 3160 |
| 300 MHz SGI R12000 IRIX 6.5 | 16.5 + 7.5 + 25.1 + 42.8 = 91.9 | 2176 |
The first and
third tests involve primarily floating point arithmetic; the final (full-matrix
refinement) test involves a lot of memory access and may benefit from faster
memory or more cache. Taken together the four tests represent a realistic
mix for crystallographic calculations.
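The conversion from benchmark time to Vax units described above (200000/t, with t the sum of the four CPU times in seconds) is easily reproduced; the short Python sketch below uses the individual times quoted for the three machines. The function name vax_units is hypothetical.

```python
from typing import Sequence


def vax_units(cpu_times: Sequence[float]) -> float:
    """Performance in Vax units from the four benchmark CPU times (s)."""
    return 200000.0 / sum(cpu_times)


if __name__ == "__main__":
    machines = {
        "3.06 GHz P4": (3.1, 2.9, 3.7, 5.9),
        "500 MHz AXP 21264": (12.9, 10.1, 17.2, 23.1),
        "300 MHz SGI R12000": (16.5, 7.5, 25.1, 42.8),
    }
    for name, times in machines.items():
        print(f"{name}: {sum(times):.1f} s, {vax_units(times):.0f} Vax units")
```

To compare a new machine, run the four jobs, add up the CPU times and pass them to vax_units in the same way.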
Support and bug reporting
The author
is happy to provide advice by email (gsheldr@shelx.uni-ac.gwdg.de) or fax
(+49-551-392582 - after June 1st 2008 -3922582) but not phone. Questions already
answered in this file
or in the full documentation may be moved to the bottom of the pile! In
particular he would like to be informed of any suspected bugs in the programs
or of errors or lack of clarity in the documentation; the current release
has benefitted enormously from such contributions by users.
Important announcements
about new versions etc. will be posted on this SHELX homepage and on appropriate
crystallographic email newsgroups.
|
|