
Title
Combining Different Types of Information in Similarity Searching and Library Design
Speaker
Peter Willett, Sheffield University, United Kingdom
Abstract
Similarity and diversity methods play an important role in modern approaches to drug discovery. The talk will discuss ways of enhancing such methods by combining different types of information.
The Tanimoto coefficient is widely used for similarity searching in chemical databases, but different similarity coefficients are known to yield different database rankings. We have recently carried out a comparison of 22 different similarity coefficients when used to search the IDAlert and NCI AIDS databases with 2D fragment bit-strings. While there are differences between the performances of the individual coefficients, the best results are obtained by using data fusion methods to combine the rankings resulting from different coefficients. This suggests a simple and inexpensive way of enhancing the performance of chemical database systems.
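As a rough illustration of the data fusion idea, the sketch below ranks a toy database with three similarity coefficients and then combines the rankings by summing rank positions. It is a minimal sketch under stated assumptions, not the procedure used in the study: the choice of coefficients (Tanimoto, cosine, Russell-Rao), the rank-sum fusion rule, the representation of bit-strings as Python sets of set-bit positions, and all function names and data are illustrative only.

```python
from math import sqrt

# Bit-strings are modelled as sets of "on" bit positions (an assumption
# made for brevity; real systems use packed bit vectors).

def tanimoto(a, b):
    common = len(a & b)
    denom = len(a) + len(b) - common
    return common / denom if denom else 0.0

def cosine(a, b):
    denom = sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0

def russell_rao(a, b, n_bits):
    return len(a & b) / n_bits

def rank_positions(scores):
    """Map each molecule id to its rank (1 = most similar); ties broken by id."""
    ordered = sorted(scores, key=lambda mol: (-scores[mol], mol))
    return {mol: pos for pos, mol in enumerate(ordered, start=1)}

def fuse_by_rank_sum(rankings):
    """Combine several rankings by summing rank positions (lower sum = better)."""
    totals = {}
    for ranking in rankings:
        for mol, pos in ranking.items():
            totals[mol] = totals.get(mol, 0) + pos
    return sorted(totals, key=lambda mol: (totals[mol], mol))

def fused_search(query, database, n_bits):
    """Rank the database once per coefficient, then fuse the rankings."""
    coefficients = [
        tanimoto,
        cosine,
        lambda a, b: russell_rao(a, b, n_bits),
    ]
    rankings = [
        rank_positions({mol: coeff(query, bits) for mol, bits in database.items()})
        for coeff in coefficients
    ]
    return fuse_by_rank_sum(rankings)

# Hypothetical toy data: four "molecules" and one query fingerprint.
query = {1, 2, 3, 4}
database = {
    "A": {1, 2, 3, 4, 5},
    "B": {1, 2},
    "C": {1, 2, 3, 4, 5, 6, 7, 8, 9, 10},
    "D": {20, 21},
}
print(fused_search(query, database, n_bits=32))  # ['A', 'B', 'C', 'D']
```

In this toy case Russell-Rao places C above B while the other two coefficients do the opposite, and the fused list settles the disagreement from the combined evidence. Fusion costs little beyond the individual searches, since it operates only on the ranked lists.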
SELECT is a program for the design of combinatorial libraries that are both structurally diverse and composed of drug-like molecules (in terms of characteristics such as molecular weight, partition coefficient, etc.). A limitation of SELECT is that it uses rather simple criteria to weight the various contributions to the overall effectiveness of a potential library design. We have therefore recently enhanced the program by replacing its underlying genetic algorithm (GA) with a multiple-objective genetic algorithm (MOGA). This optimization strategy ensures that the full range of types of library can be considered in a design, thus providing a greater degree of control over the precise characteristics of the libraries suggested by the program.
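The difference between a weighted single-objective score and a multiple-objective treatment can be sketched as follows. This is not SELECT's code: it assumes that the multiple-objective search ranks candidates by Pareto dominance, and the two objectives (a diversity shortfall and a deviation from a drug-like property profile, both to be minimised), the candidate values and the function names are all hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better
    on at least one (all objectives are minimised)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_ranks(objectives):
    """Rank of a candidate = number of other candidates that dominate it
    (0 means non-dominated)."""
    return [
        sum(dominates(other, cand) for j, other in enumerate(objectives) if j != i)
        for i, cand in enumerate(objectives)
    ]

def weighted_sum(objectives, weights):
    """Single-objective alternative: collapse each candidate to one number."""
    return [sum(w * x for w, x in zip(weights, cand)) for cand in objectives]

# Hypothetical candidate libraries:
# (diversity shortfall, deviation from drug-like profile)
libraries = [
    (0.2, 0.9),  # very diverse, poor property profile
    (0.5, 0.5),  # balanced
    (0.9, 0.1),  # less diverse, good property profile
    (0.6, 0.6),  # dominated by the balanced design
    (0.9, 0.9),  # dominated by every other design
]

print(pareto_ranks(libraries))  # [0, 0, 0, 1, 4]
```

A weighted sum returns a single best design for whichever weights were fixed in advance, whereas the rank-0 set keeps the very diverse, the balanced and the property-focused designs together, leaving the choice among the trade-offs to the user.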