Document Text (Pages 31-40)

In Silico Drug Design of Biofilm Inhibitors of Staphylococcus epidermidis

by Al-mulla, Aymen Faraoun, MS

Page 31

Table 2.1 Typical Costs of Experiments (Young, 2009)

2.1.2 Drug Design
Drug design, sometimes referred to as rational drug design (or more
simply rational design), is the inventive process of finding new
medications based on knowledge of biological targets (Madsen et al.,
2002). Rational drug design can be broadly divided into two categories:

1. Development of small molecules with desired properties toward targets,
biomolecules (proteins or nucleic acids), whose functional roles in cellular
processes and 3D structural information are known. This approach to drug
design is well established and is applied extensively by the pharmaceutical
industry.

2. Development of small molecules with predefined properties toward
targets whose cellular functions and structural information may be
known or unknown (Mandal et al., 2009).
In the most basic sense, drug design involves the design of small molecules
that are complementary in shape and charge to the biomolecular target
with which they interact and to which they will therefore bind. The
identification of a potential drug target is valuable at the early stages of
drug research and development. Because of limits on throughput, accuracy
and cost, experimental techniques cannot be applied
widely. Therefore, the development of in Silico target identification
algorithms, as a fast and low-cost strategy, has

Page 32

been receiving more and more attention worldwide. Developing a fast and
accurate target identification and prediction method is of great
importance for the discovery of targeted drugs, the construction of
drug-target interaction networks and the analysis of small-molecule
regulatory networks (Markus et al., 2007).

2.1.3 In Silico Drug Design
In Silico is a term that means “computer aided”. The phrase was coined
in 1989 by analogy with the Latin phrases in vivo, in vitro, and in situ.
In Silico drug design therefore means rational design in which drugs are
designed or discovered using computational methods. According to
Kubinyi (1999), most drugs in the past were discovered by
coincidence or by trial and error; in other words, serendipity played
an important role in finding new drugs.
The current trend in drug discovery is a shift from discovery to design,
which requires understanding the biochemistry of the disease and its
pathways, identifying disease-causing proteins and then designing compounds
that are capable of modulating the role of these proteins. This has become
common practice in the biopharmaceutical industry. Both experimental and
computational methods play significant roles in drug discovery and
development, and most of the time they complement each other
(Bajorath, 2002).
The main aim of computer-aided drug design (CADD) is to bring the best
chemical entities to experimental testing while reducing costs and
late-stage attrition (Kapetanovic, 2008). CADD involves:

Page 33

1. Computer-based methods to make the drug discovery and development
process more efficient.

2. Building up chemical and biological information databases about ligands
and targets/proteins to identify and optimize novel drugs.

3. Devising in Silico filters to calculate drug-likeness or pharmacokinetic
properties of chemical compounds prior to screening, to enable early
detection of compounds that are likely to fail in clinical stages
and to enhance detection of promising entities.
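As an illustration of the kind of drug-likeness filter described in item 3, the sketch below applies Lipinski's rule of five to a small compound list. The rule itself is not named in this chapter and is used here only as a familiar example; the compound names and descriptor values are hypothetical, and the descriptors are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str               # hypothetical identifier
    mol_weight: float       # molecular weight in daltons
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count how many of the four rule-of-five limits a compound exceeds."""
    checks = [
        c.mol_weight > 500,
        c.logp > 5,
        c.h_bond_donors > 5,
        c.h_bond_acceptors > 10,
    ]
    return sum(checks)


def passes_filter(c: Compound, max_violations: int = 1) -> bool:
    """Keep compounds with at most `max_violations` violations (assumed policy)."""
    return lipinski_violations(c) <= max_violations


# Hypothetical library with made-up descriptor values.
library = [
    Compound("cmpd_A", 320.4, 2.1, 2, 5),
    Compound("cmpd_B", 712.9, 6.3, 6, 12),
    Compound("cmpd_C", 540.0, 4.2, 1, 8),
]

kept = [c.name for c in library if passes_filter(c)]
print(kept)
```

A filter of this kind is cheap to run before any screening, which is why it fits the early-detection role described above.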
Various computational techniques can contribute at different stages of
the drug discovery process (Hoeltje et al., 2003).
The two major disciplines of CADD that shape the modern drug discovery
process and are capable of accelerating it are bioinformatics and
cheminformatics. In general:

Bioinformatics is applied to target identification
(generally proteins/enzymes), target validation, understanding the protein,
evolution and phylogeny and protein modeling (Lengauer, 2001).

Cheminformatics is applied to the
management and maintenance of information related to chemical
compounds and their properties, and importantly to the identification of
novel bioactive compounds and to lead optimization. In addition,
cheminformatic methods are extensively used in in Silico ADME
(Absorption, Distribution, Metabolism and Elimination) prediction and
related issues that help reduce the late-stage failure of
compounds (Hoeltje et al., 2003).

Page 34

Why computer aided drug discovery?
Besides the significant costs and time associated with bringing a new drug
to the market (Bleicher et al., 2003), some of the major reasons for the
pharmaceutical industry to look for alternative or complementary
methods to experimental screening are:

a. In a survey study, five of 40,000 compounds tested in animals reach
human testing, and only one of the five that reach clinical trials is
finally approved (Kapetanovic, 2008).

b. On the other hand, the tremendous increase in chemical space and in the
number of target proteins/receptors increases the demand for
high-throughput screening (HTS) and in turn calls for new lead
identification strategies (rational approaches) to reduce
costs and enhance efficacy.

c. Advances in computing software and hardware have
enabled reliable computational methods.
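The attrition figures in item a can be turned into simple rates. The short calculation below reads the numbers literally as stated (5 of 40,000, then 1 of 5) and assumes the two stages are sequential; it is an arithmetic illustration, not a model of any real pipeline.

```python
from fractions import Fraction

# Figures quoted in item a (Kapetanovic, 2008).
tested_in_animals = 40_000
reach_human_testing = 5
approved = 1

# Fraction of animal-tested compounds that reach human testing.
p_reach_human = Fraction(reach_human_testing, tested_in_animals)

# Fraction of those that are finally approved.
p_approved_given_human = Fraction(approved, reach_human_testing)

# Overall fraction approved, assuming the stages are sequential.
p_overall = p_reach_human * p_approved_given_human

print(f"reach human testing: {float(p_reach_human):.4%}")
print(f"approved, given human testing: {float(p_approved_given_human):.0%}")
print(f"approved overall: {float(p_overall):.4%}")
```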

2.1.4 Strategies of In Silico Design
In Silico drug design can follow either of two strategies, depending on
what is known about the target, namely the availability of its primary
sequence and 3D structure. These two strategies are:

A-Structure Based Drug Design
Structure-based drug design (SBDD) is one of the earliest techniques
used in drug design. Drug targets are typically key molecules involved in
a specific metabolic or cell signaling pathway that is known, or believed,

Page 35

to be related to a particular disease state. Drug targets are most often
proteins and enzymes in these pathways. Drug compounds are designed to
inhibit, restore or otherwise modify the structure and behavior of
disease-related proteins and enzymes. SBDD uses the known 3D geometrical
shape or structure of proteins to assist in the development of new drug
compounds. The 3D structure of protein targets is most often derived from
X-ray crystallography or nuclear magnetic resonance (NMR) techniques.
X-ray and NMR methods can resolve the structure of proteins to a
resolution of a few angstroms (Rao and Srinivas, 2011).

However, structure-based drug design is not a single tool or technique.
It is a process that incorporates both experimental and computational
techniques. It is generally the preferred method of drug design, since it
has the highest success rate. In the drug design stage of SBDD, docking is
the preferred tool for giving a computational prediction of compound
activity (Young, 2009). The following steps are commonly used in SBDD:

I. Target Determination
A drug target is a biomolecule that is involved in signaling or metabolic
pathways specific to a disease process. Biomolecules play critical
roles in disease progression by communicating through either
protein-protein interactions or protein-nucleic acid interactions, leading
to the amplification of signaling events and/or alteration of metabolic
processes. In structure-based drug design, obtaining a known 3D structure
of the target is the initial step in target identification. The structure is
usually determined either by X-ray crystallography or by NMR to identify
its binding site, the so-called active site (Mandal et al., 2009).

Page 36

Homology Modelling
If crystallographic coordinates or 2D NMR models are not available,
then a homology model is usually the next best way of determining the
protein structure. A homology model is a three-dimensional protein
structure that is built up from fragments of crystallographic models. Thus,
the shape of an α-helix may be taken from one crystal structure, the shape
of a β-sheet taken from another structure, and loops taken from other
structures. These pieces are put together and optimized to give a structure
for the complete protein. Often, a few residues are exchanged for similar
residues, and some may be optimized from scratch. Homology models may
be very accurate or very marginal, depending upon the degree of identity
and similarity that the protein bears to other proteins with known crystal
structures. Since the homology model building process depends upon
crystal structure coordinates for similar proteins (called
templates), a crucial factor to consider is how similar the unknown sequence
should be to the template protein. A number of metrics have been suggested
for this. One of the most conservative metrics suggests that there should be
over 70% sequence identity (not similarity) with the template in order to
get a homology model that can be trusted. Other metrics suggest having
over 30% or 40% sequence identity with the template. One study showed
that having 60% or more sequence identity gave a success rate greater than
70%. Higher sequence identity reduces the error rate, although
as many as 10% of homology models may still have a root mean square
deviation (RMSD) greater than 5 Å (which represents the error cutoff) (Young, 2009).
In order to clarify the seemingly disparate metrics mentioned in the
previous paragraph, Rost carried out an extensive study of how
much sequence identity is needed to get a good homology model as a
function of the number of aligned residues. For a small region of 25

Page 37

aligned residues, 60% identity was necessary. For a large region of 250
aligned residues, templates with over 20% identity could give good
homology models (Rost, 1999).
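The length dependence described above can be sketched numerically. The first function below computes percent identity over the aligned (ungapped) positions of a pairwise alignment. The second evaluates a length-dependent identity threshold using the curve as commonly quoted from Rost (1999); its constants (480, 0.32, 1000) are reproduced from memory and should be treated as an assumption to check against the original paper. The toy alignment is hypothetical.

```python
import math


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over positions where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def rost_identity_threshold(n_aligned: int, offset: float = 0.0) -> float:
    """Assumed form of the Rost (1999) curve: offset + 480 * L**(-0.32 * (1 + exp(-L / 1000))).

    The result is capped at 100%, which matters only for very short alignments.
    """
    if n_aligned < 1:
        raise ValueError("n_aligned must be positive")
    exponent = -0.32 * (1.0 + math.exp(-n_aligned / 1000.0))
    return min(100.0, offset + 480.0 * n_aligned ** exponent)


# Hypothetical toy alignment: 5 ungapped positions, 4 identical.
print(percent_identity("ACDE-G", "ACEEHG"))

# Thresholds for the two alignment lengths discussed in the text.
for n in (25, 250):
    print(n, round(rost_identity_threshold(n), 1))
```

Under these assumed constants the threshold evaluates to roughly 63% for 25 aligned residues and roughly 21% for 250, in line with the 60% and 20% figures quoted above.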
The metrics used by Rost are somewhat less conservative than some of
the other metrics. Rost’s results also reflect improvements in homology
model software and methodology compared with earlier work. Percent
similarity is also a useful metric to examine. If several potential templates
have essentially the same percent identity, then the one with the highest
percent similarity may be chosen. Researchers may also choose the one in
which the crystal structure has the best resolution (Young, 2009).
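The template selection heuristic in this paragraph can be written as a short ranking routine. In the sketch below, templates whose identity is within a tolerance of the best identity are treated as having "essentially the same" identity, and are then ordered by percent similarity and, for equal similarity, by crystallographic resolution. The tolerance value, the combined ordering and the template records are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Template:
    template_id: str     # hypothetical label, not a real structure identifier
    identity: float      # percent sequence identity with the target
    similarity: float    # percent sequence similarity with the target
    resolution: float    # crystal structure resolution in angstroms (lower is better)


def rank_templates(templates: List[Template],
                   identity_tolerance: float = 1.0) -> List[Template]:
    """Rank near-best-identity templates by similarity, then by resolution."""
    if not templates:
        return []
    best_identity = max(t.identity for t in templates)
    # Templates within the tolerance of the best identity count as tied.
    tied = [t for t in templates
            if best_identity - t.identity <= identity_tolerance]
    # Higher similarity first; for equal similarity, finer resolution first.
    return sorted(tied, key=lambda t: (-t.similarity, t.resolution))


# Hypothetical candidates with made-up values.
candidates = [
    Template("tmplA", 42.0, 58.0, 2.4),
    Template("tmplB", 41.5, 63.0, 2.8),
    Template("tmplC", 41.8, 63.0, 1.9),
    Template("tmplD", 30.0, 70.0, 1.5),
]

ranked = rank_templates(candidates)
print([t.template_id for t in ranked])
```

Note that tmplD is excluded despite its high similarity and fine resolution, because its identity falls well below that of the other candidates.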

Protein Folding

Another method for target identification is protein folding. This is a
difficult process which starts with the primary sequence only and runs a
calculation that tries an incredibly large number of conformers. This is an
attempt to compute the correct shape of the protein based on the
assumption that the correct shape is the lowest-energy conformer. This
assumption is not always correct, since some proteins are folded, with the
help of chaperones, into conformers that are not at the lowest energy.
It is also difficult to write an algorithm that can determine when disulfide
bonds should be formed. So, sometimes protein folding gives an accurate
model, and sometimes it gives a rather poor model. The real problem with
protein folding is that there is no reliable way to tell whether it has given
an accurate model. There are only some checks that provide some
circumstantial evidence that the model might be good or bad. For example,
one can check if hydrophilic residues are on the exterior of the protein and
hydrophobic residues are on the interior. So, pharmaceutical companies
can be justifiably hesitant to spend millions of dollars on research and
development based on a folded protein model when there is no way to have

Page 38

confidence in the accuracy of that model. For this reason, protein folding
tends to be the last resort for building three-dimensional protein models.
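The circumstantial check mentioned above, that hydrophobic residues should lie in the interior and hydrophilic residues on the exterior, can be approximated crudely by comparing mean distances from the centre of the model. The sketch below is only a rough proxy under stated assumptions: one representative coordinate per residue, a roughly globular model, and an assumed set of hydrophobic one-letter codes. The toy coordinates are hypothetical.

```python
import math

# Assumed hydrophobic residue set (one-letter codes); scales differ on this.
HYDROPHOBIC = frozenset("AVLIMFWC")


def burial_check(residues):
    """Return (mean distance of hydrophobic residues, mean distance of others)
    from the geometric centre of all residues.

    residues: list of (one_letter_code, (x, y, z)) tuples.
    """
    if not residues:
        raise ValueError("no residues given")
    n = len(residues)
    center = tuple(sum(xyz[i] for _, xyz in residues) / n for i in range(3))
    hydrophobic, hydrophilic = [], []
    for code, xyz in residues:
        distance = math.dist(xyz, center)
        (hydrophobic if code in HYDROPHOBIC else hydrophilic).append(distance)
    if not hydrophobic or not hydrophilic:
        raise ValueError("need at least one residue of each class")
    return (sum(hydrophobic) / len(hydrophobic),
            sum(hydrophilic) / len(hydrophilic))


# Hypothetical toy model: hydrophobic residues near the centre, polar ones outside.
toy_model = [
    ("L", (2.0, 0.0, 0.0)), ("V", (-2.0, 0.0, 0.0)),
    ("I", (0.0, 2.0, 0.0)), ("F", (0.0, -2.0, 0.0)),
    ("K", (8.0, 0.0, 0.0)), ("D", (-8.0, 0.0, 0.0)),
    ("S", (0.0, 8.0, 0.0)), ("E", (0.0, -8.0, 0.0)),
]

mean_hydrophobic, mean_hydrophilic = burial_check(toy_model)
print(mean_hydrophobic < mean_hydrophilic)
```

As the text stresses, a favourable result is only circumstantial evidence; it does not establish that a folded model is accurate.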
Homology model building has two important advantages over protein
folding. First, it is more accurate on average. Second, and more
importantly, the researcher can get a better estimate of whether the
homology model is likely to be qualitatively and quantitatively accurate,
based on the degree of similarity to a known structure. The role and
reliability of homology model building is increasing as the number of
available crystal structures increases. Knowing the three-dimensional
structure of a protein is only the beginning of understanding it. It is also
important to understand the mechanism of chemical reactions involving
that protein, where it is expressed in the body, the pharmacophoric
description, and the mechanism of binding with chemical inhibitors
(Young, 2009).

II. Ligands Search
Much of drug design is a refinement process. In this process, successive
changes are made on molecular structures in order to improve activity.
However, the process needs to get started with some compounds having at
least marginal activity. There are often a couple of known inhibitors from
previous studies on the target, or very similar targets. There often needs to
be at least one known inhibitor in order to provide a reference for the
development of an assay (Young, 2009).
In Silico screening of chemical compound databases for the identification
of novel chemotypes is termed virtual screening (VS). VS is generally
performed on commercial, public or private 2-dimensional or 3-dimensional
chemical structure databases. Virtual screening is employed
to reduce the number of compounds to be tested in experimental stages,
thereby focusing effort on more reliable entities for lead discovery and
Page 39

optimization (Koh, 2003; Köppen, 2009; Subramaniam et al., 2008;
Rester, 2008).
The costs associated with the virtual screening of chemical compounds
are significantly lower when compared to screening of compounds in
experimental laboratories. Virtual screening methods are mainly driven by
the availability of the existing knowledge. Depending on already existing
knowledge on the drug targets and potential drugs, these methods fall
mainly in these two categories (Figure 2.1):
1. Structure based virtual screening (SBVS).
2. Ligand based virtual screening (LBVS).
In the absence of receptor structural information and when one or more
bioactive compounds are available, ligand based virtual screening (LBVS)
is generally utilized (McInnes, 2007; Reddy et al., 2007; Jain, 2004).
This screening method can be carried out by either of the following
approaches:
a. Similarity search: Similarity searching is performed when a single
bioactive compound is available. The basic principle behind similarity
searching is to screen databases for compounds similar to the backbone
of the lead molecule.
b. Pharmacophore based virtual screening: A pharmacophore is the
three-dimensional geometry of interaction features that a molecule must
have in order to bind in a protein’s active site. These include such features as
hydrogen bond donors and acceptors, aromatic groups, and bulky
hydrophobic groups. When one or several bioactive compounds are
available, pharmacophore based virtual screening is performed. The
principle behind the pharmacophore is that a set of chemical features and
their arrangement in 3-dimensional space are responsible for the bioactivity of
the compound (Young, 2009). By utilizing the chemical features of already
known bioactive compounds, a pharmacophore model is built, which later

Page 40

is used to screen a database of unknown compounds to find
chemical compounds with similar chemical features (Figure 2.2).
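The similarity search described in item a can be sketched with structural fingerprints compared by the Tanimoto coefficient. Neither fingerprints nor the Tanimoto coefficient is named in this chapter; they are used here as one common way of quantifying structural similarity. The fingerprints below are hypothetical sets of feature indices, and the 0.7 threshold is an arbitrary choice for illustration.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto coefficient: shared features divided by total distinct features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query: FrozenSet[int],
                      database: Dict[str, FrozenSet[int]],
                      threshold: float = 0.7) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical fingerprint of the single known bioactive compound.
lead = frozenset({1, 2, 3, 4})

# Hypothetical database fingerprints.
database = {
    "cmpd_x": frozenset({1, 2, 3, 4, 5}),
    "cmpd_y": frozenset({1, 2, 9}),
    "cmpd_z": frozenset({1, 2, 3, 4}),
}

hits = similarity_search(lead, database)
print(hits)
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the ranking logic stays the same.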

Figure 2.1 Schematic representation of virtual screening methods (Leach and Gillet,

Figure 2.2 Example of a pharmacophore model (Yang, 2010)

© 2009 All Rights Reserved.