Title: Similarity-based virtual screen using enhanced Siamese deep learning methods

Abstract

Traditional drug production is a long and complex process. Virtual screening is a computational technique that allows chemical compounds to be screened at an acceptable time and cost. Several databases contain information on various aspects of biologically active substances. Simple statistical tools are difficult to apply to these databases because of the enormous amount of information they hold and the complexity of their data samples, which include structurally heterogeneous molecules. Many techniques have been established for capturing the biological similarity between a test compound and a known target ligand in ligand-based virtual screening (LBVS). However, although these methods perform well compared with their predecessors, especially on molecules with homogeneous active elements, their performance is unsatisfactory on structurally heterogeneous molecules. Deep learning models have recently achieved considerable success in a variety of disciplines because of their powerful generalization and feature extraction capabilities. The Siamese network has also been used in similarity models for more complicated data samples, especially heterogeneous ones. The main aim of this study is to enhance the performance of similarity searching, especially for structurally heterogeneous molecules. The Siamese architecture is enhanced with two similarity distance layers and one fused layer to further improve the similarity measurements between molecules; in some models, additional layers are added after the fused layer to improve retrieval recall. Several deep learning methods are used in this architecture: long short-term memory (LSTM), gated recurrent unit (GRU), one-dimensional convolutional neural network (CNN1D), and two-dimensional convolutional neural network (CNN2D). A series of experiments was carried out on real-world datasets, and the results showed that the proposed methods outperformed the existing methods.
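
The data flow described above (a shared encoder, two distance layers, one fused layer, and an output score) can be illustrated with a minimal, untrained sketch. Everything specific here is an assumption for illustration only: the abstract does not name the two distance functions, so element-wise absolute and squared differences are used as stand-ins; fusion is assumed to be concatenation; a single dense layer with tanh replaces the LSTM, GRU, CNN1D, or CNN2D encoders; and all function names, weights, and fingerprints are hypothetical.

```python
import math
import random


def make_encoder(n_in, n_hidden, seed=0):
    """Random weights for one dense layer; both branches share these weights."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]


def encode(weights, x):
    """Shared branch: dense layer with tanh (stand-in for the paper's encoders)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def abs_diff_layer(a, b):
    """First distance layer (assumed): element-wise absolute difference."""
    return [abs(p - q) for p, q in zip(a, b)]


def sq_diff_layer(a, b):
    """Second distance layer (assumed): element-wise squared difference."""
    return [(p - q) ** 2 for p, q in zip(a, b)]


def siamese_score(weights, out_w, out_b, x1, x2):
    """Encode both inputs, fuse the two distance vectors, map to a score in (0, 1)."""
    h1, h2 = encode(weights, x1), encode(weights, x2)
    fused = abs_diff_layer(h1, h2) + sq_diff_layer(h1, h2)  # fusion by concatenation
    z = out_b + sum(w * f for w, f in zip(out_w, fused))
    return 1.0 / (1.0 + math.exp(-z))


# Toy binary fingerprints (made-up values, not real molecules).
query = [1, 0, 1, 1, 0, 0, 1, 0]
candidate = [1, 0, 0, 1, 0, 1, 1, 0]

enc = make_encoder(n_in=8, n_hidden=4)
out_w = [-1.0] * 8  # untrained: negative weights so a larger distance lowers the score
out_b = 2.0

same = siamese_score(enc, out_w, out_b, query, query)
diff = siamese_score(enc, out_w, out_b, query, candidate)
print(same, diff)
```

Because both branches share one set of encoder weights, identical inputs give zero distance vectors and the highest score this output layer can produce, and swapping the two inputs leaves the score unchanged. In a trained model the encoder and output weights would be learned from labeled molecule pairs rather than fixed by hand.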
