Develop a reinforcement learning-based algorithm to identify lead molecules by emulating ligand-protein interactions
Devise a reinforcement learning (RL) algorithm in which agents learn to identify the correct binding pose for different active sites in viral proteins, guided by a reward-penalty scheme. The environment for the agents will be the active site and its neighbouring residues within 10 Å, together with the functional groups near each atom of the ligand. Agent strategies might include various scoring parameters for residue-ligand interactions (e.g. agents that orient towards particular residues because they favour hydrogen bonding perform better than agents that don’t). Another strategy could be to supply the algorithm with many “poses” of ligands bound to proteins, so that it learns to distinguish better poses (low binding energy) from poor poses (high binding energy) and implements a suitable reward-penalty scheme.

Training will be performed on a variety of antiviral drug targets, excluding the COVID-19 main protease (SARS-CoV-2 Mpro, PDB: 6LU7), which is held out for validation. Once an RL model is obtained, it will first be used to validate known inhibitor binding poses against Mpro (6LU7). After this validation, the model can be used to identify hits that share binding poses similar to those of known inhibitors. Such approaches are quite novel, and only a few examples are available in the literature (e.g. DOI 10.1126/sciadv.aap7885).

The final output of the algorithm should be a predicted binding pose, its root-mean-square error (RMSE) relative to known inhibitor binding poses, and the key residue interactions. The accuracy of the RL model will be judged by the RMSE results for correct pose identification in Mpro. Top small-molecule hits can be identified from the CAS antiviral database to further demonstrate the significance of the algorithm.

Training set: active sites and their neighbourhoods in different viral proteins (the RL environment), complexed with inhibitors, are to be used for training.
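
The three quantitative pieces named above (the 10 Å neighbourhood, the reward-penalty signal, and the RMSE between a predicted and a known pose) can be sketched in a few lines. This is a minimal illustration only, not a prescribed implementation. The function names, the `hbond_bonus` weight, and the reward formula (drop in binding energy plus a bonus per hydrogen bond) are assumptions made for the example. It also assumes that atoms are plain (x, y, z) coordinates in Å and that predicted and reference poses list their atoms in the same order.

```python
import math
from typing import List, Sequence, Tuple

Coord = Tuple[float, float, float]  # (x, y, z) in angstroms


def neighbourhood(site_atoms: Sequence[Coord],
                  protein_atoms: Sequence[Coord],
                  cutoff: float = 10.0) -> List[Coord]:
    """Return the protein atoms lying within `cutoff` of any active-site atom."""
    return [p for p in protein_atoms
            if any(math.dist(p, s) <= cutoff for s in site_atoms)]


def pose_rmse(predicted: Sequence[Coord], reference: Sequence[Coord]) -> float:
    """RMSE between two poses with identical atom ordering (no realignment)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(math.dist(p, r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))


def reward(old_energy: float, new_energy: float,
           n_hbonds: int, hbond_bonus: float = 0.1) -> float:
    """Hypothetical reward-penalty signal for one agent move.

    Positive when the move lowers the binding energy, negative (a penalty)
    when it raises it; each hydrogen bond adds an assumed fixed bonus.
    """
    return (old_energy - new_energy) + hbond_bonus * n_hbonds


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    predicted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # shifted 1 Å along z
    print(pose_rmse(predicted, reference))  # 1.0
```

In practice the coordinates would come from the structures of the inhibitor complexes, and the energy and hydrogen-bond counts from whichever scoring function is chosen; neither is specified here.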