Case Studies

Virtual predictive tool rationalizes lead compound identification for new drug discovery

The client

Identifying and validating lead compounds, the best candidates from among several million molecules, is traditionally a long and laborious part of new drug discovery that relies on ‘brute force’ experimental verification. Virtual predictive tools that claim to rationalize the process in silico are available, but their efficacy has remained questionable. Errors in formulating new drugs can be calamitous: adverse results during clinical trials, risks to patient safety, and huge costs.

The Indian Institute of Science (IISc), a leading scientific research institution, chose Infosys to collaborate on innovations in research informatics. One successful result was a new virtual predictive Ligand Identification and Matching tool. The tool improved the lead compound identification and validation process, reducing adverse results during clinical evaluation and enhancing post-launch drug safety.

IISc is a premier Indian research institution with a legacy of cutting-edge research across scientific disciplines going back nearly a century. It offers post-graduate and doctoral research programs to more than 2,000 researchers across 48 specialized departments. Surveys have rated IISc first among Indian institutions for research output and first among South Asian universities.

IISc sought a research informatics partner to co-develop a new virtual predictive tool that would rationalize the identification and validation of lead compounds for new drug development.

Business need

The traditional process of identifying lead compounds (the best among a lead molecule series with sufficient target potency and selectivity) from new drug molecules is long and laborious, and it entails experimental verification: several million molecules must be screened for binding affinity. IISc needed co-development support to optimize the process and improve the efficacy of the conceptual virtual predictive Ligand Identification and Matching tool.

The algorithms and methodologies behind the available tools had proved unsatisfactory: sensitivity and specificity were inadequate, cost and infrastructure requirements were high, and inputs frequently resulted in systemic errors.
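For readers less familiar with the two screening measures named above, the following is a minimal Python sketch of how they are conventionally computed. The function names and the counts are hypothetical and chosen only for illustration; they are not figures from the IISc project.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of genuinely active molecules that the screen flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of inactive molecules that the screen correctly rejects."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening outcome (illustrative numbers only).
true_pos, false_neg = 80, 20     # active molecules: found vs. missed
true_neg, false_pos = 900, 100   # inactive molecules: rejected vs. wrongly flagged

print(sensitivity(true_pos, false_neg))  # 0.8
print(specificity(true_neg, false_pos))  # 0.9
```

A tool with low specificity produces many false positives, each of which costs a round of experimental verification.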

The research organization required a new algorithm for the tool that could cope with the sheer bulk of molecules to be screened and eliminate false positives.

Our solution

The virtual screening tool was based on a novel method for predicting inhibitors, which researchers would then use to identify and optimize leads.

The approach addressed the case where preliminary binding affinity results of chemical inhibition were available but the atomic details of the target active site were unknown. A consensus profile was computed from a group of known inhibitors, based on the groups of atoms (moieties) that aligned best across their Simplified Molecular Input Line Entry System (SMILES) strings.

The method used a cost-effective computational technique to identify homologues. Molecules that matched part or all of the profile were selected to reconstruct sub-structures or moieties of the final lead compound, which had the optimized inhibition characteristics. Researchers then tested the binding affinity of these screened chemical homologues and suggested combinations to improve the binding characteristics.
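The case study does not disclose the actual alignment method, so the following Python sketch only illustrates the general idea of a consensus profile and partial or full matching. It assumes a deliberately crude stand-in: the profile is the set of fixed-length text fragments shared by the known inhibitors' SMILES strings. All function names, the fragment length `k`, and the toy SMILES strings are hypothetical. Plain substring matching ignores the fact that one molecule can be written as many different SMILES strings, so this is not a chemistry-aware implementation.

```python
from __future__ import annotations

from collections import Counter


def fragments(smiles: str, k: int) -> set[str]:
    """All distinct length-k substrings of a SMILES string."""
    return {smiles[i:i + k] for i in range(len(smiles) - k + 1)}


def consensus_profile(inhibitors: list[str], k: int = 4,
                      min_share: float = 1.0) -> set[str]:
    """Fragments present in at least `min_share` of the known inhibitors."""
    counts: Counter[str] = Counter()
    for smiles in inhibitors:
        counts.update(fragments(smiles, k))  # each fragment counted once per molecule
    needed = min_share * len(inhibitors)
    return {frag for frag, n in counts.items() if n >= needed}


def match_score(candidate: str, profile: set[str], k: int = 4) -> float:
    """Fraction of the profile found in the candidate (0.0 = none, 1.0 = all)."""
    if not profile:
        return 0.0
    return len(fragments(candidate, k) & profile) / len(profile)


# Toy strings for illustration only; not inhibitors from the project.
known = ["c1ccccc1S(=O)(=O)N", "Cc1ccccc1S(=O)(=O)N", "c1ccccc1S(=O)(=O)NC"]
profile = consensus_profile(known)

for candidate in ["CCO", "c1ccccc1S(=O)C", "Clc1ccccc1S(=O)(=O)N"]:
    print(candidate, round(match_score(candidate, profile), 2))
```

In this toy run the first candidate matches none of the profile, the second matches part of it, and the third matches all of it, mirroring the "part or all of the profile" selection described above. A production tool would need canonical or graph-based substructure comparison rather than raw text.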

The novel algorithm from Infosys for the Ligand Identification and Matching tool improved sensitivity while lowering computational costs. Set up in a framework leveraging access to multiple services and applications, the tool was validated with the family of COX-2 inhibitors.

Working in this new frontier of pharmacological research, Infosys developed a novel algorithm for the virtual predictive tool, which reduced the cycle time for lead identification. Relational chemical databases handled the large volumes of data, with molecules annotated and indexed in a tabular format.
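As a rough illustration of storing annotated, indexed molecules in a relational table, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The case study names neither the database product nor the schema, so the table name, columns, index, scores, and `shortlist` helper are all assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Hypothetical schema: one row per molecule, with a free-text annotation
# and a screening score against a consensus profile.
conn.execute("""
    CREATE TABLE molecules (
        id          INTEGER PRIMARY KEY,
        smiles      TEXT NOT NULL UNIQUE,
        annotation  TEXT,
        match_score REAL
    )
""")
# Index the score so shortlisting does not scan every row.
conn.execute("CREATE INDEX idx_molecules_score ON molecules (match_score)")

# Toy rows with made-up scores, for illustration only.
rows = [
    ("CCO", "toy non-match", 0.0),
    ("Clc1ccccc1S(=O)(=O)N", "toy full match", 1.0),
    ("c1ccccc1S(=O)C", "toy partial match", 0.5),
]
conn.executemany(
    "INSERT INTO molecules (smiles, annotation, match_score) VALUES (?, ?, ?)",
    rows,
)


def shortlist(connection: sqlite3.Connection, min_score: float) -> list:
    """Return (smiles, score) pairs at or above the threshold, best first."""
    cursor = connection.execute(
        "SELECT smiles, match_score FROM molecules "
        "WHERE match_score >= ? ORDER BY match_score DESC",
        (min_score,),
    )
    return cursor.fetchall()


print(shortlist(conn, 0.5))
```

Keeping scores and annotations in indexed columns lets candidate shortlists be produced with a single query, even as the number of molecules grows.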


The Ligand Identification and Matching tool proved its efficacy in identifying leads, reducing adverse results during clinical evaluation, and improving post-launch drug safety. It achieved:

  • Rationalization of drug design addressing issues of safety and efficacy
  • Fast and accurate predictive high-throughput screening (HTS)
  • Validation through correction of false positive experimental data by deterministic methods

The tool developed by Infosys helped rationalize and shortlist candidate molecules by efficiently managing various attributes of chemical molecules, resulting in a shorter cycle time for lead identification and validation. Use of reusable components helped keep costs low while the modular, scalable, and extensible technology framework provided unprecedented flexibility.

Future enhancements

The tool is being integrated with other Web services and custom applications to build a platform based on a service-oriented architecture. This workbench will provide complete support to chemistry and pharmacology researchers in lead identification, validation, and optimization.