On-line Help for Ligand Expo
Name, Formula and SMILES Search
This form searches the chemical component dictionary by molecular
name, molecular formula, or SMILES description. A number of search
types are supported:
Chemical Component Identifier search options:
- Matching the chemical component identifier (3-letter-code)
- Matching components that are chemically similar to the one identified by a 3-letter-code,
using a fingerprint comparison.
Molecular formula search options:
- Exact all atom formula matches
- Exact heavy atom formula matches
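- An exact formula subset query will match any dictionary component whose formula contains exactly the specified count of each element in the query, whatever other elements are present. For instance, a query for C6 N2 will match any formula containing exactly six carbons and two nitrogens.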
- A formula subset query will find molecules whose formula contains at least the number of atoms of each element in the query formula. For instance, C6 H7 Hg N2 O2 S will match C8 H10 Hg N2 O4 S.
- A close formula query will find molecules whose compositions differ by no more than three atoms (+/- 3) for any element in the query formula.
Note: formulas must be entered with spaces, e.g.: C6 H11 N2 O7 P.
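The formula search options above can be illustrated with a short Python sketch. This is not the Ligand Expo implementation; the function names (parse_formula, exact_subset, subset, close) are hypothetical, and reading "close" as "every query element is within three atoms of the target count" is an assumption based on the description above.

```python
from collections import Counter


def parse_formula(text):
    """Parse a space-separated formula such as 'C6 H7 Hg N2 O2 S' into element counts."""
    counts = Counter()
    for token in text.split():
        symbol = token.rstrip("0123456789")   # element symbol, e.g. 'Hg'
        digits = token[len(symbol):]          # trailing count, may be empty
        counts[symbol] += int(digits) if digits else 1
    return counts


def exact_subset(query, target):
    """Each query element occurs in the target with exactly the same count."""
    return all(target.get(el, 0) == n for el, n in query.items())


def subset(query, target):
    """Each query element occurs in the target at least as often as in the query."""
    return all(target.get(el, 0) >= n for el, n in query.items())


def close(query, target, tolerance=3):
    """Each query element count differs from the target by at most `tolerance` (assumed reading)."""
    return all(abs(target.get(el, 0) - n) <= tolerance for el, n in query.items())


query = parse_formula("C6 H7 Hg N2 O2 S")
target = parse_formula("C8 H10 Hg N2 O4 S")
print(subset(query, target))        # the subset example given above
print(exact_subset(query, target))  # counts of C, H and O differ
print(close(query, target))         # largest difference is H: 10 vs 7
```

Splitting on whitespace is what makes the space-separated input format necessary: without the spaces, a string such as "C6H11" would be read as a single token.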
Molecular name search options:
- Exact name matches.
- Exact substring name queries will find chemical components where
the name entered is contained within the dictionary name. For
instance, pyridine will match 2-aminopyridine.
- The similar name option will match molecules whose names are
lexicographically similar but not exactly the same. In other words,
the names may differ in a small number of characters. For instance, a
search for pyridine will match uridine and pyrimidine.
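The three name search modes can be sketched in Python as follows. The help text does not say how name similarity is measured, so the use of Levenshtein edit distance, the threshold of two edits, and the names edit_distance and name_matches are all assumptions made for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]


def name_matches(query, name, mode, max_edits=2):
    """Hypothetical matcher for the 'exact', 'substring' and 'similar' modes."""
    query, name = query.lower(), name.lower()
    if mode == "exact":
        return query == name
    if mode == "substring":
        return query in name
    if mode == "similar":
        # Similar but not identical: at least one edit, at most max_edits (assumed cutoff).
        return 0 < edit_distance(query, name) <= max_edits
    raise ValueError(f"unknown mode: {mode}")


print(name_matches("pyridine", "2-aminopyridine", "substring"))
print(name_matches("pyridine", "uridine", "similar"))
print(name_matches("pyridine", "pyrimidine", "similar"))
```

Under this assumed measure, both uridine and pyrimidine are two edits away from pyridine, which is consistent with the examples given above.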
SMILES search options:
- Match components containing the target SMILES pattern.
- Match components which match the chemical fingerprint of the target SMILES pattern.
InChI search options:
Begin your search for similar compounds by entering the InChI
or InChIKey format chemical description as a starting query. You can
then link to similar compounds within the chemical component dictionary
from the Search Result Summary.
Instance Search
An instance search takes a 3-letter-code component identifier as
input and returns a table of all PDB entries containing
that component. The Display option presents the information in two ways:
- Selecting PDB codes will give you a list of all PDB entries that contain the component you searched for.
- Selecting PDB codes + coordinates will output a longer list containing each individual instance of the chemical component, including "per chain" instances within one PDB entry. This sort of search is helpful for comparing the coordinates of individual ligand instances, for example when the coordinates of a flexible ligand differ from chain to chain.
Each instance is individually available for download in PDB, MOL/SDF, and mmCIF formats.
You can also launch the MarvinSketch viewer to look at the chemical structure in this instance.
Browse Search
The Browse feature allows the user to explore the content of the wwPDB
chemical dictionary in a number of categories. Menus are provided to
select amino acids, nucleotides, selected top-selling pharmaceuticals,
and common aromatic ring systems. Searches are performed by finding
structures containing a SMILES pattern or by comparison to a chemical
fingerprint. The chemical fingerprint consists of approximately 1000 individual
chemical features such as the presence of common functional groups
or ring systems. Fingerprints are considered similar if their
Tanimoto similarity score is greater than 0.8.
Within your query results are links to the chemical component
listing, downloadable coordinates, and clickable links to search for
further analogs by similar name, similar SMILES string, or similar
chemical formula.
To measure similarity or distance in chemical space we precompute
a chemical fingerprint for each chemical component in our dictionary.
Our chemical fingerprint was developed by Christian Laggner for the
OpenBabel software system.
The fingerprint contains the SMILES patterns for approximately 1000
chemical features such as the presence of common functional groups
or ring systems. Each component is tested for the presence of the
chemical patterns in the fingerprint. The results are stored in a bit
vector in which each bit is set to 1 or 0 to denote the presence or absence of a particular feature.
Two fingerprints (A and B) are compared by using the Tanimoto similarity score,
a value between 0 and 1, which is defined as:
Tanimoto score = (A .AND. B) / ( A + B - (A .AND. B) )
- A is the number of bits set in fingerprint A
- B is the number of bits set in fingerprint B
- (A .AND. B) is the number of bits set after calculating the bitwise logical AND between A and B
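As an illustration of the score defined above, here is a minimal Python sketch that stores each fingerprint as an integer used as a bit vector. The helper names (make_fingerprint, popcount, tanimoto), the feature indices, and the choice to return 0.0 when neither fingerprint has any bit set are assumptions for this example; they do not describe how Ligand Expo stores or compares its fingerprints.

```python
def make_fingerprint(feature_indices):
    """Build a bit vector (as an int) with bit i set for each feature index i present."""
    bits = 0
    for i in feature_indices:
        bits |= 1 << i
    return bits


def popcount(bits):
    """Number of bits set to 1."""
    return bin(bits).count("1")


def tanimoto(a, b):
    """Tanimoto score = (A .AND. B) / (A + B - (A .AND. B)), using set-bit counts."""
    common = popcount(a & b)
    total = popcount(a) + popcount(b) - common
    # Convention chosen for this sketch: two empty fingerprints score 0.0.
    return common / total if total else 0.0


# Made-up feature indices for two hypothetical components.
fp_a = make_fingerprint(range(9))    # features 0..8 present
fp_b = make_fingerprint(range(10))   # features 0..9 present

score = tanimoto(fp_a, fp_b)         # 9 shared bits / (9 + 10 - 9) = 0.9
print(score, score > 0.8)            # 0.8 is the similarity cutoff quoted for Browse searches
```

The denominator counts every bit set in either fingerprint exactly once, so the score is 1 only when the two fingerprints are identical and 0 when they share no features.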
Note that the SMILES and fingerprint comparisons may produce
some unanticipated results. The SMILES comparison will match substructures
of the target molecule within any other component in our chemical dictionary.
For small or simple targets (e.g. alanine), this may result in a large number of matches.
Results of the chemical fingerprint comparison reflect the bias of the
patterns in our chemical fingerprint. This comparison
may be useful for locating molecules which have some common features, but
it does not have the selectivity of a substructure match.
Other Tutorial Information
- The features of Ligand Expo are also described in this 2008 EMBO course talk and tutorial.