GNINA Models

Thanks to Andrew McNutt, who converted the weights of the original Caffe models to PyTorch, all GNINA models are available in gninatorch.

You can find more information about the Caffe implementation at gnina/models.

Loading GNINA Models

The pre-trained models can be loaded as follows:

from gninatorch import gnina

model, ensemble = gnina.setup_gnina_model(model_name)

where model_name accepts the same values as the --cnn argument of GNINA (see Supported GNINA Models and Supported GNINA Ensembles of Models). The returned ensemble is a boolean flag indicating whether the loaded model is an ensemble of models.
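
For example, a single pre-trained model can be loaded by name (a minimal sketch; "crossdock_default2018" is just one of the supported values listed below):

from gninatorch import gnina

# "crossdock_default2018" is a single model, so the ensemble flag is False
model, ensemble = gnina.setup_gnina_model("crossdock_default2018")
assert not ensemble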

A single model returns log_CNNscore and CNNaffinity:

assert not ensemble

# Grid-based representation of the protein-ligand binding site
# x: torch.Tensor

log_CNNscore, CNNaffinity = model(x)
CNNscore = torch.exp(log_CNNscore)
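
The input grid x is normally generated from real structures, for example with libmolgrid [SK20]. As a minimal sketch, assuming the default2018-style input of 28 atom-type channels (14 receptor + 14 ligand) on a 48 x 48 x 48 grid, a dummy tensor can be used to exercise a model end to end:

import torch

from gninatorch import gnina

model, ensemble = gnina.setup_gnina_model("redock_default2018")
model.eval()

# Dummy all-zero grid of shape (batch, channels, x, y, z); real grids are
# computed from atomic coordinates (grid shape is an assumption here)
x = torch.zeros(1, 28, 48, 48, 48)

with torch.no_grad():
    log_CNNscore, CNNaffinity = model(x)

CNNscore = torch.exp(log_CNNscore)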

An ensemble of models returns log_CNNscore, CNNaffinity, and CNNvariance:

assert ensemble

# Grid-based representation of the protein-ligand binding site
# x: torch.Tensor

log_CNNscore, CNNaffinity, CNNvariance = model(x)
CNNscore = torch.exp(log_CNNscore)
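
CNNvariance is computed across the members of the ensemble. Assuming it is the variance of the CNNaffinity predictions, its square root gives a rough uncertainty estimate (continuing the snippet above):

# Assumption: CNNvariance is the variance of CNNaffinity across members
CNNaffinity_std = torch.sqrt(CNNvariance)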

Warning

In contrast to GNINA, which returns CNNscore, the PyTorch models return log_CNNscore.

Supported GNINA Models

The following models are provided (an example of loading an individual model follows the list):

  • default2017 [RHI+17]

  • redock_default2018 or redock_default2018_[1-4] [FMS+20]

  • general_default2018 or general_default2018_[1-4] [FMS+20]

  • crossdock_default2018 or crossdock_default2018_[1-4] [FMS+20]

  • dense or dense_[1-4] [FMS+20]
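
The [1-4] suffix selects an individual member of the corresponding five-model family. For example (a sketch; any of the names above works the same way):

from gninatorch import gnina

# Load one member of the dense family; this is a single model, not an ensemble
model, ensemble = gnina.setup_gnina_model("dense_3")
assert not ensemble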

Supported GNINA Ensembles of Models

The following ensembles of models are also available:

  • default [MFA+21]

  • redock_default2018_ensemble [FMS+20]

  • general_default2018_ensemble [FMS+20]

  • crossdock_default2018_ensemble [FMS+20]

  • dense_ensemble [FMS+20]

default is the default model used by GNINA. See [MFA+21] for more information.
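
Ensembles are loaded through the same entry point; a minimal sketch:

from gninatorch import gnina

# "default" is an ensemble, so the flag is True and the model
# returns log_CNNscore, CNNaffinity, and CNNvariance
model, ensemble = gnina.setup_gnina_model("default")
assert ensemble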

Note

If you are using the pre-trained models, please cite accordingly.

Building Your Own Ensemble

You can build your own ensemble of models as follows:

from gninatorch import gnina

model = gnina.load_gnina_models([model_name1, model_name2, ...])

The default model used by GNINA corresponds to the following ensemble:

from gninatorch import gnina

names = [
    "dense",
    "general_default2018_3",
    "dense_3",
    "crossdock_default2018",
    "redock_default2018_2",
]

model = gnina.load_gnina_models(names)

The default ensemble was chosen to balance accuracy and inference speed. See [MFA+21] for more information.

Inference with GNINA Models

Inference with the pre-trained GNINA models is provided by gninatorch.gnina:

python -m gninatorch.gnina -h

Note

The gninatorch.gnina script loosely corresponds to running GNINA with the --score_only argument. Not all features are yet implemented.

The gninatorch.gnina script takes as input a .types file listing protein-ligand pairs:

receptor_1.pdb ligand_1.sdf
receptor_1.pdb ligand_2.sdf
receptor_2.pdb ligand_1.sdf
...

Protein and ligand files can be in any of the file formats supported by Open Babel.
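
Such a file can be written with a few lines of Python (a sketch; the structure file names are placeholders):

from pathlib import Path

# Placeholder receptor-ligand pairs; replace with paths to real structures
pairs = [
    ("receptor_1.pdb", "ligand_1.sdf"),
    ("receptor_1.pdb", "ligand_2.sdf"),
    ("receptor_2.pdb", "ligand_1.sdf"),
]

Path("PL.types").write_text("".join(f"{rec} {lig}\n" for rec, lig in pairs))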

Pose (CNNscore) and binding affinity (CNNaffinity) predictions for the protein-ligand complexes listed in PL.types, using the crossdock_default2018_ensemble ensemble of models, can be obtained as follows:

python -m gninatorch.gnina \
    PL.types \
    --cnn crossdock_default2018_ensemble

References

[FMS+20]

Paul G Francoeur, Tomohide Masuda, Jocelyn Sunseri, Andrew Jia, Richard B Iovanisci, Ian Snyder, and David R Koes. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. J. Chem. Inf. Model., 60(9):4200–4215, 2020.

[MFA+21]

Andrew T McNutt, Paul Francoeur, Rishal Aggarwal, Tomohide Masuda, Rocco Meli, Matthew Ragoza, Jocelyn Sunseri, and David Ryan Koes. GNINA 1.0: molecular docking with deep learning. J. Cheminform., 13(1):1–20, 2021.

[RHI+17]

Matthew Ragoza, Joshua Hochuli, Elisa Idrobo, Jocelyn Sunseri, and David Ryan Koes. Protein–ligand scoring with convolutional neural networks. J. Chem. Inf. Model., 57(4):942–957, 2017.

[SK20]

Jocelyn Sunseri and David R Koes. libmolgrid: graphics processing unit accelerated molecular gridding for deep learning applications. J. Chem. Inf. Model., 60(3):1079–1084, 2020.