Predicting the biological properties of potential SARS-CoV-2 inhibitors using graph theory and machine learning
The expected solution should represent chemical structures as graphs consisting of vertices (nodes) and edges (links), where the edges represent the relationships between vertices. The nodes may be functional groups and rings, while the edges could be the linkages between these conjugated systems. Attempts should be made to generate hypotheses correlating functional groups and rings with the experimental properties.

An algorithm needs to be developed that can establish correlations between the chemical and biological knowledge of molecules. This algorithm should be implemented in a GUI-based program (compatible with Mac, Linux and Windows systems) that identifies the rules describing the complex relationship between the substructures and properties of the molecules, and how the substructures are positively or negatively correlated with the observed experimental value. The relationship can be based on two-dimensional or three-dimensional descriptors. Finally, the algorithm should be tested on the test set to predict the biological properties of the test molecules, and the quality of the predictions should be verified using different metrics.

Dataset: The experimental and curated datasets (antiviral compounds with known activity against SARS-related viruses) as well as the test sets are provided, along with the list of functional groups and rings to be used as special functional groups.
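The graph representation and the substructure-to-property correlation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed solution: the class `FragmentGraph`, the helpers `make_graph`, `fragment_correlations` and `rmse`, the fragment labels, and the activity values are all hypothetical and are not taken from the provided datasets. Pearson correlation between fragment presence (0/1) and activity is used here as one simple way to flag positive or negative association; RMSE stands in for one of the "different metrics".

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class FragmentGraph:
    """Molecule as a graph: nodes are functional groups/rings, edges are linkages."""
    nodes: dict = field(default_factory=dict)  # node id -> fragment label
    edges: set = field(default_factory=set)    # undirected linkages (frozensets of ids)

    def add_node(self, node_id, label):
        self.nodes[node_id] = label

    def add_edge(self, a, b):
        if a == b or a not in self.nodes or b not in self.nodes:
            raise ValueError("an edge must join two distinct existing nodes")
        self.edges.add(frozenset((a, b)))

    def labels(self):
        return set(self.nodes.values())

    def linked_pairs(self):
        """Alphabetically sorted label pair for every linkage."""
        return {tuple(sorted(self.nodes[n] for n in e)) for e in self.edges}


def make_graph(labels, links):
    """Build a graph from a list of fragment labels and (index, index) links."""
    g = FragmentGraph()
    for i, label in enumerate(labels):
        g.add_node(i, label)
    for a, b in links:
        g.add_edge(a, b)
    return g


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def fragment_correlations(graphs, activities):
    """Correlate presence (1/0) of each fragment label with measured activity."""
    all_labels = sorted(set().union(*(g.labels() for g in graphs)))
    return {
        label: pearson([1.0 if label in g.labels() else 0.0 for g in graphs],
                       activities)
        for label in all_labels
    }


def rmse(observed, predicted):
    """Root-mean-square error, one possible metric for prediction quality."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Made-up toy data (NOT from the provided datasets): fragment graph + activity.
training = [
    (make_graph(["phenyl", "amide"], [(0, 1)]), 7.0),
    (make_graph(["phenyl", "hydroxyl"], [(0, 1)]), 5.0),
    (make_graph(["amide", "pyridine"], [(0, 1)]), 8.0),
    (make_graph(["hydroxyl", "pyridine"], [(0, 1)]), 5.5),
]
graphs = [g for g, _ in training]
activities = [y for _, y in training]

correlations = fragment_correlations(graphs, activities)
for label, r in sorted(correlations.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:10s} r = {r:+.2f}")
```

In a real solution the fragment graphs would be derived from the supplied structures and the supplied list of special functional groups and rings, typically with a cheminformatics toolkit, and the sign and magnitude of each correlation would be one candidate "rule" to validate on the test set.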