diff --git a/cookbook/assets/tyk2_lig_ejm_31_openfold_portal.png b/cookbook/assets/tyk2_lig_ejm_31_openfold_portal.png new file mode 100644 index 0000000..7527e14 Binary files /dev/null and b/cookbook/assets/tyk2_lig_ejm_31_openfold_portal.png differ diff --git a/cookbook/openfold3_to_openfe.ipynb b/cookbook/openfold3_to_openfe.ipynb index e048e05..dce7eea 100644 --- a/cookbook/openfold3_to_openfe.ipynb +++ b/cookbook/openfold3_to_openfe.ipynb @@ -17,7 +17,7 @@ "source": [ "## Introduction\n", "\n", - "This notebook offers a brief preview of some of the steps you may need to go through to calculate binding affinities using [OpenFE](https://docs.openfree.energy/en/latest/) with [OpenFold3](https://openfold-3.readthedocs.io/en/latest/) generated structures.\n", + "Here we show a brief preview of some of the steps you may need to go through to calculate binding affinities using [OpenFE](https://docs.openfree.energy/en/latest/) with [OpenFold3](https://openfold-3.readthedocs.io/en/latest/) generated structures.\n", "\n", "Notably, we will demonstrate how to generate a set of predictions, clean them up for use in molecular dynamics simulations, and then apply them to both relative binding free energy (RBFE) and absolute binding free energy (ABFE) methods." ] @@ -35,14 +35,14 @@ }, { "cell_type": "markdown", - "id": "1e89d5ef-e362-4335-9cbb-9752341b29f9", + "id": "f0d1c067-d461-4a28-b154-396f1517161b", "metadata": {}, "source": [ "## Overview\n", "\n", "In this example notebook we demonstrate how to predict and run simulations for different ligands bound to tyrosine kinase 2 (TYK2). The dataset comes from the [Wang et al. JACS 2015 dataset](https://pubs.acs.org/doi/10.1021/ja512751q).\n", "\n", - "
\n", + "
\n", "\n", "\n", "**The tutorial is broken into the following steps:**\n", @@ -74,9 +74,15 @@ "id": "2fea29c3", "metadata": {}, "source": [ - "## 1. Run cofolding with OpenFold3 on a set of TYK2 ligands\n", - "\n", - "### Preparing the Input JSON for OpenFold\n", + "## 1. Run cofolding with OpenFold3 on a set of TYK2 ligands" + ] + }, + { + "cell_type": "markdown", + "id": "ef3bdb1f-504a-4311-8e0a-c28265b006cb", + "metadata": {}, + "source": [ + "### Preparing the input JSON for OpenFold\n", "\n", "First we need to create an input JSON file for OpenFold. This step requires assembling two key pieces of information:\n", "\n", @@ -84,6 +90,7 @@ "- Protein sequence (extracted from the PDB file)\n", "\n", "In this workflow, we begin with an SDF file containing the ligands.\n", + "\n", "We extract the SMILES representation for each ligand and store them in a dictionary alongside their ligand identifiers." ] }, @@ -96,12 +103,14 @@ }, "outputs": [], "source": [ - "ligands_file = \"assets/tyk2_ligands.sdf\"\n", "from rdkit.Chem.Descriptors3D import Asphericity\n", "from rdkit import Chem\n", "\n", + "ligands_file = \"assets/tyk2_ligands.sdf\"\n", + "\n", "ligands_dict = {}\n", "sppl = Chem.SDMolSupplier(ligands_file, removeHs=True)\n", + "\n", "for mol in sppl:\n", " smi = Chem.MolToSmiles(mol)\n", " name = mol.GetProp(\"_Name\")\n", @@ -143,7 +152,7 @@ "id": "80da7967-8046-44e4-aef7-c81ac6712f98", "metadata": {}, "source": [ - "We then create the JSON file that will be the OpenFold input." + "Next we create the JSON file that will be the OpenFold input." ] }, { @@ -199,47 +208,351 @@ "source": [ "### Running OpenFold\n", "\n", - "#### Defining the output settings\n", + "#### Defining the OpenFold settings\n", "\n", - "By default OpenFold3 outputs `.mmcif` files, but OpenFE requires `.pdb` files. We can ask OpenFold to generate this instead by defining it in the output settings.\n", + "we define the following settings in a `settings.yaml` file to be passed to OpenFold:\n", "\n", - "To do this, create an `output_settings.yml` file to specify .pdb as the output file format:\n", + "```json\n", + "model_update:\n", + " presets:\n", + " - predict\n", + " - low_mem\n", + " custom:\n", + " settings:\n", + " memory:\n", + " eval:\n", + " use_cueq_triangle_kernels: true\n", + " use_deepspeed_evo_attention: false\n", "\n", - "```\n", - "output_writer_settings:\n", - " # change output format to pdb (default: mmcif):\n", - " structure_format: pdb\n", - "```\n", "\n", + "```\n", "#### Running OpenFold\n", "\n", "With all these inputs, you can now run OpenFold:\n", "\n", - "`run_openfold predict --query_json=queries.json --runner_yaml output_settings.yml`" + "`run_openfold predict --query_json=queries.json --runner_yaml settings.yaml`" ] }, { "cell_type": "markdown", - "id": "b4adde29-282a-45f6-96a6-c8de0700a266", + "id": "9c1ab9ee-71f5-4587-bea1-35097621c039", "metadata": {}, "source": [ - "## 2. Processing the OpenFold3 output for use with OpenFE\n", + "#### Evaluating OpenFold output\n", + "\n", + "confidence scores for each ligand are at ``assets/of3_tyk2_output/{lig_name}/seed_42/*confidences.json``\n", "\n", + "If you are using the OpenFold portal, you can view the confidence scores interactively:\n", + "\n", + "
\n" + ] + }, + { + "cell_type": "markdown", + "id": "9481baf0-9f31-41f2-acf2-ec9b62552345", + "metadata": {}, + "source": [ "
\n", "⚠️ Important Note:\n", - " Here we use the first model for each complex predicted by OpenFold3. In practice you would filter through all predicted models, possibly using pose busters (github.com/maabuu/posebusters) and some other scoring metrics to identify the most plausible inputs.\n", + " Here we use the first model for each complex predicted by OpenFold3. In practice you would filter through all predicted models using confidence scores, and possibly using pose busters (github.com/maabuu/posebusters).\n", "
" ] }, { "cell_type": "markdown", - "id": "eaf920be-9e16-402b-9fc5-8e1d4f3712cf", + "id": "721074db-8266-4c3e-b592-64220e0cfd8f", + "metadata": {}, + "source": [ + "## 2. Running OpenFE Protocols with OpenFold structures" + ] + }, + { + "cell_type": "markdown", + "id": "249ec515-cfad-4f49-8dba-ee2438b1adf4", + "metadata": {}, + "source": [ + "We will demonstrate how to use OpenFold generated structures with three different OpenFE protocols:\n", + "Absolute Binding Free Energy (ABFE), Separated Topologies RBFE, and Hybrid Topology RBFE.\n" + ] + }, + { + "cell_type": "markdown", + "id": "6419b4ec-6513-47ca-bb66-0eb17a7224c3", + "metadata": {}, + "source": [ + "### Creating an Absolute Binding Free Energy (ABFE) Transformation\n", + "\n", + "The ABFE protocol requires the least amount of post-processing of the OpenFold outputs.\n", + "\n", + "For the sake of simplicity, we will only create a Transformation for `lig_jmc_27`, but this process is generalizable to the rest of the ligands in the series.\n" + ] + }, + { + "cell_type": "markdown", + "id": "cdeee4e3-ce8c-4151-9ef7-5a6017ac3dcb", + "metadata": {}, + "source": [ + "#### Loading and assigning partial charges to the ligand(s)" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "09725e7d-b959-4221-b027-1488b6fea22a", + "metadata": {}, + "outputs": [], + "source": [ + "import openfe\n", + "supp = Chem.SDMolSupplier(\"assets/tyk2_ligands.sdf\", removeHs=False)\n", + "all_ligands = [openfe.SmallMoleculeComponent.from_rdkit(mol) for mol in supp]\n", + "ligand = [lig for lig in all_ligands if lig.name == 'lig_jmc_27'][0]" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "0b78ec14-9ef9-4068-84ab-3271e6780d40", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Generating charges: 100%|█████████████████████████| 1/1 [00:00<00:00, 1.49it/s]\n" + ] + } + ], + "source": [ + "from openfe.protocols.openmm_utils.omm_settings import OpenFFPartialChargeSettings\n", + "from openfe.protocols.openmm_utils.charge_generation import bulk_assign_partial_charges\n", + "\n", + "# Generate partial charges\n", + "charge_settings = OpenFFPartialChargeSettings(partial_charge_method=\"nagl\")\n", + "\n", + "charged_ligands = bulk_assign_partial_charges(\n", + " molecules=[ligand],\n", + " overwrite=False,\n", + " method=charge_settings.partial_charge_method,\n", + " toolkit_backend=charge_settings.off_toolkit_backend,\n", + " generate_n_conformers=1,\n", + " nagl_model=None,\n", + " processors=1,\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "a84b9dfa-e92a-43fa-baae-a725778122cb", + "metadata": {}, + "source": [ + "#### Fixing & protonating the protein structure\n", + "\n", + "The PDB contains the heavy atoms of the protein but is missing hydrogens and caps. Here we call PDBFixer to add hydrogens at pH 7 and add protein caps." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "ea82c47d-9794-424c-8d27-a9c6e69d29e8", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/atravitz/micromamba/envs/openfe-notebooks/lib/python3.12/pty.py:95: DeprecationWarning: This process (pid=74212) is multi-threaded, use of forkpty() may lead to deadlocks in the child.\n", + " pid, fd = os.forkpty()\n" + ] + } + ], + "source": [ + "# now run the reference protein through pdbfixer\n", + "! pdbfixer assets/of3_tyk2_output/lig_jmc_27/seed_42/lig_jmc_27_seed_42_sample_1_model.pdb --add-atoms=all --add-residues --output=openfold_to_openfe/lig_jmc_27_seed_42_sample_1_model_fixed.pdb" + ] + }, + { + "cell_type": "markdown", + "id": "8c55d160-558f-426f-8a8c-9ae46d191cff", + "metadata": {}, + "source": [ + "Then we can load the reference protein to a ProteinComponent." + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "88bbbbca-4f6a-4496-bea9-f621ffd8183e", + "metadata": { + "scrolled": true + }, + "outputs": [ + { + "data": { + "text/plain": [ + "ProteinComponent(name=tyk2)" + ] + }, + "execution_count": 7, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "protein = openfe.ProteinComponent.from_pdb_file(\"openfold_to_openfe/lig_jmc_27_seed_42_sample_1_model_fixed.pdb\", name=\"tyk2\")\n", + "protein" + ] + }, + { + "cell_type": "markdown", + "id": "4d11803b-6d4b-4507-8c98-6ff40f75680e", + "metadata": {}, + "source": [ + "We must also define a SolventComponent which we will use to define our Transformations." + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "30373ee0-d777-4828-b89a-5a8fdbc56c41", + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "SolventComponent(name=O, Na+, Cl-)" + ] + }, + "execution_count": 8, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "solvent = openfe.SolventComponent()\n", + "solvent" + ] + }, + { + "cell_type": "markdown", + "id": "e1ef09d2-ec8b-4577-8510-c67cb89fc5b5", + "metadata": {}, + "source": [ + "#### Creating the ABFE ``Transformation``" + ] + }, + { + "cell_type": "markdown", + "id": "695c61c4-3fa6-4e1a-939f-5349e6829627", + "metadata": {}, + "source": [ + "Create the `ChemicalSystems` defining the end states of the simulation:" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "987a1657-82d0-4310-9fe7-4eecd289632a", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "ChemicalSystem(name=complex, components={'ligand': SmallMoleculeComponent(name=lig_jmc_27), 'protein': ProteinComponent(name=tyk2), 'solvent': SolventComponent(name=O, Na+, Cl-)})\n", + "\n", + "ChemicalSystem(name=solvent, components={'protein': ProteinComponent(name=tyk2), 'solvent': SolventComponent(name=O, Na+, Cl-)})\n" + ] + } + ], + "source": [ + "# Create the two end states\n", + "\n", + "# state A is the complex\n", + "stateA = openfe.ChemicalSystem({\n", + " 'ligand': ligand,\n", + " 'protein': protein,\n", + " 'solvent': solvent,\n", + "}, name=\"complex\")\n", + "\n", + "# state B is the system without the ligand\n", + "stateB = openfe.ChemicalSystem({\n", + " 'protein': protein,\n", + " 'solvent': solvent,\n", + "}, name=\"solvent\")\n", + "\n", + "print(stateA)\n", + "print(\"\")\n", + "print(stateB)" + ] + }, + { + "cell_type": "markdown", + "id": "e32fb4ef-dbf5-4a9a-90c4-b64d9e5af6ce", + "metadata": {}, + "source": [ + "Next we create the Protocol. We use the default settings, except to help us execute in parallel we set the number of repeats to 1." + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "037cb959-8f0c-4a09-844a-30e39590eabf", "metadata": {}, + "outputs": [], "source": [ - "### Preparing & extracting the structures\n", + "from openfe.protocols.openmm_afe import AbsoluteBindingProtocol\n", "\n", - "Now that we have a set of predicted structures, we must extract the relevant information from the PDB files.\n", - "This means:\n", + "settings = AbsoluteBindingProtocol.default_settings()\n", + "settings.protocol_repeats = 1\n", + "protocol = AbsoluteBindingProtocol(settings=settings)" + ] + }, + { + "cell_type": "markdown", + "id": "6bf8af4c-9894-4af3-84f8-2ed3830120ad", + "metadata": {}, + "source": [ + "Finally we create the `Transformation` object." + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "ef7a9ebf-e0d7-4dda-aac5-d4428b120814", + "metadata": {}, + "outputs": [], + "source": [ + "abfe_transformation = openfe.Transformation(\n", + " stateA=stateA,\n", + " stateB=stateB,\n", + " mapping=None,\n", + " protocol=protocol, # use protocol created above\n", + " name=f\"{ligand.name}\"\n", + ")" + ] + }, + { + "cell_type": "markdown", + "id": "34ec9c8d-cb7f-4cad-9794-b73629a11246", + "metadata": {}, + "source": [ + "This transformation can now be run using `openfe quickrun`." + ] + }, + { + "cell_type": "markdown", + "id": "df8cb66d-c201-4a22-b87a-a1e5501d4be4", + "metadata": {}, + "source": [ + "### Preparing OpenFold outputs for Relative Binding Free Energies" + ] + }, + { + "cell_type": "markdown", + "id": "9dd96548-7190-4851-b1ed-fb2d8c1bd87b", + "metadata": {}, + "source": [ + "RBFE transformations require additional preparation, including:\n", "\n", "a) Choosing a reference protein structure which will be used for all binding affinity predictions\n", "\n", @@ -250,8 +563,14 @@ "\n", " - This is so that we can extract the ligands and use them with the chosen structure.\n", "\n", - "c) Fixing & protonating the reference protein structure.\n", - "\n", + "c) Fixing & protonating the reference protein structure, as we did above for the ABFE protein structure.\n" + ] + }, + { + "cell_type": "markdown", + "id": "02a82dbd-5a01-43c5-af33-654b706a5721", + "metadata": {}, + "source": [ "\n", "#### a) Choosing a reference protein structure\n", "\n", @@ -262,8 +581,8 @@ }, { "cell_type": "code", - "execution_count": 4, - "id": "25ef33bf-fb1a-45da-96e4-004f77ef7d34", + "execution_count": 12, + "id": "c7b239c1-eaf3-466e-84ef-0700eab300e6", "metadata": {}, "outputs": [ { @@ -333,7 +652,7 @@ }, { "cell_type": "markdown", - "id": "6056e755-0da0-4712-b4ae-7a721c94851d", + "id": "193dc76a-6e43-4e25-8b9d-5c0e4a79c3e9", "metadata": {}, "source": [ "#### b) Aligning & extracting the necessary structures\n", @@ -343,7 +662,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 13, "id": "26c10e11-f001-44e5-87a4-2939ead048a8", "metadata": {}, "outputs": [ @@ -405,7 +724,7 @@ }, { "cell_type": "markdown", - "id": "fe3ca792-4615-4be3-8a41-feb0819e968d", + "id": "e5b5141e-efe0-4bea-89e9-4896c00f4311", "metadata": {}, "source": [ "
\n", @@ -418,7 +737,7 @@ }, { "cell_type": "markdown", - "id": "10b42b56-2f47-4215-8397-fcf7473c4860", + "id": "6bf56d93-222a-4651-b8cf-f451f0360878", "metadata": {}, "source": [ "#### c) Fixing & protonating the reference protein structure\n", @@ -427,11 +746,20 @@ ] }, { - "cell_type": "code", - "execution_count": 6, - "id": "ea82c47d-9794-424c-8d27-a9c6e69d29e8", + "cell_type": "code", + "execution_count": 14, + "id": "fdfd428e-42af-49d3-9857-7a1b086f9ab2", "metadata": {}, - "outputs": [], + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/Users/atravitz/micromamba/envs/openfe-notebooks/lib/python3.12/pty.py:95: DeprecationWarning: This process (pid=74212) is multi-threaded, use of forkpty() may lead to deadlocks in the child.\n", + " pid, fd = os.forkpty()\n" + ] + } + ], "source": [ "# now run the reference protein through pdbfixer\n", "! pdbfixer openfold_to_openfe/p_aligned/protein.pdb --add-atoms=all --add-residues --output=openfold_to_openfe/p_aligned/protein_fixed.pdb" @@ -439,29 +767,23 @@ }, { "cell_type": "markdown", - "id": "89e52270-514c-4b21-8179-07c7319ff468", + "id": "1153dbd0-998a-4f3f-8e47-a950c806d1dd", "metadata": {}, "source": [ - "## 3. Creating OpenFE components from the inputs\n", - "\n", - "Now we can create OpenFE components from the files we prepared above.\n", - "\n", - "### Loading ligands and assigning partial charges\n", - "\n", - "First we load the ligands and assign them partial charges using OpenFF's NAGL model." + "#### d) Loading and assigning partial charges to the aligned ligands" ] }, { "cell_type": "code", - "execution_count": 7, - "id": "fc97de03", + "execution_count": 15, + "id": "992d53f3-d086-4283-9d6a-055be01588ad", "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ - "Generating charges: 100%|███████████████████████| 10/10 [00:05<00:00, 1.85it/s]\n" + "Generating charges: 100%|███████████████████████| 10/10 [00:05<00:00, 1.83it/s]\n" ] } ], @@ -494,189 +816,23 @@ }, { "cell_type": "markdown", - "id": "0d1e9f9d-ebe0-4fd5-be6f-fe03f7db75ee", - "metadata": {}, - "source": [ - "Then we can load the reference protein to a ProteinComponent." - ] - }, - { - "cell_type": "code", - "execution_count": 8, - "id": "88bbbbca-4f6a-4496-bea9-f621ffd8183e", - "metadata": {}, - "outputs": [ - { - "name": "stderr", - "output_type": "stream", - "text": [ - "/Users/atravitz/micromamba/envs/openfe-notebooks/lib/python3.12/site-packages/gufe/components/explicitmoleculecomponent.py:59: UserWarning: Molecule (name='tyk2') doesn't have any hydrogen atoms present. If this is unexpected, consider loading the molecule with `removeHs=False`\n", - " warnings.warn(\n" - ] - } - ], - "source": [ - "protein = openfe.ProteinComponent.from_pdb_file(\"openfold_to_openfe/p_aligned/protein.pdb\", name=\"tyk2\")" - ] - }, - { - "cell_type": "markdown", - "id": "7936a780-99a5-4bcb-addf-64646b9f3a4d", - "metadata": {}, - "source": [ - "Finally, let's also define a SolventComponent which we will use to define our Transformations." - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "id": "30373ee0-d777-4828-b89a-5a8fdbc56c41", - "metadata": {}, - "outputs": [], - "source": [ - "solvent = openfe.SolventComponent()" - ] - }, - { - "cell_type": "markdown", - "id": "8a83aff9-a356-4a3f-aeed-cc3a0e989cf5", - "metadata": {}, - "source": [ - "## 4. Creating a Binding Free Energy Transformation\n", - "\n", - "### 4a) Absolute Binding Free Energy\n", - "Using these inputs, we can create ABFE Transformations. For the sake of simplicity, we will only create a Transformation for `lig_jmc_27`, but this process is generalizable to the rest of the ligands in the series.\n", - "\n", - "First we create the `ChemicalSystems` defining the end states of the simulation:" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "id": "c405e50c-1df3-4bef-8278-393550e1fc6a", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "SmallMoleculeComponent(name=lig_jmc_27)" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# choose the first ligand\n", - "ligand = [l for l in charged_ligands if l.name == 'lig_jmc_27'][0]\n", - "ligand" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "id": "987a1657-82d0-4310-9fe7-4eecd289632a", - "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "(ChemicalSystem(name=state_A, components={'ligand': SmallMoleculeComponent(name=lig_jmc_27), 'protein': ProteinComponent(name=tyk2), 'solvent': SolventComponent(name=O, Na+, Cl-)}),\n", - " ChemicalSystem(name=state_B, components={'protein': ProteinComponent(name=tyk2), 'solvent': SolventComponent(name=O, Na+, Cl-)}))" - ] - }, - "execution_count": 11, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "# Create the two end states\n", - "\n", - "# state A is the complex\n", - "stateA = openfe.ChemicalSystem({\n", - " 'ligand': ligand,\n", - " 'protein': protein,\n", - " 'solvent': solvent,\n", - "}, name=\"state_A\")\n", - "\n", - "# state B is the system without the ligand\n", - "stateB = openfe.ChemicalSystem({\n", - " 'protein': protein,\n", - " 'solvent': solvent,\n", - "}, name=\"state_B\")\n", - "\n", - "stateA, stateB" - ] - }, - { - "cell_type": "markdown", - "id": "b576d9b0-b81e-4dc1-8bf0-26fc1585b757", - "metadata": {}, - "source": [ - "Next we create the Protocol. We use the default settings, except to help us execute in parallel we set the number of repeats to 1." - ] - }, - { - "cell_type": "code", - "execution_count": 12, - "id": "037cb959-8f0c-4a09-844a-30e39590eabf", - "metadata": {}, - "outputs": [], - "source": [ - "from openfe.protocols.openmm_afe import AbsoluteBindingProtocol\n", - "\n", - "settings = AbsoluteBindingProtocol.default_settings()\n", - "settings.protocol_repeats = 1\n", - "protocol = AbsoluteBindingProtocol(settings=settings)" - ] - }, - { - "cell_type": "markdown", - "id": "8a643c9b-5510-4ae4-9b10-0af779821e66", - "metadata": {}, - "source": [ - "Finally we create the `Transformation` object." - ] - }, - { - "cell_type": "code", - "execution_count": 13, - "id": "ef7a9ebf-e0d7-4dda-aac5-d4428b120814", - "metadata": {}, - "outputs": [], - "source": [ - "abfe_transformation = openfe.Transformation(\n", - " stateA=stateA,\n", - " stateB=stateB,\n", - " mapping=None,\n", - " protocol=protocol, # use protocol created above\n", - " name=f\"{ligand.name}\"\n", - ")" - ] - }, - { - "cell_type": "markdown", - "id": "f89a9800-9d25-470c-845f-87479095806e", + "id": "cc023ddb-d32a-4f78-bea1-a4f9efddf902", "metadata": {}, "source": [ - "This transformation can now be run using `openfe quickrun`." + "### Creating a Separated Topologies RBFE Transformation" ] }, { "cell_type": "markdown", - "id": "f9e4f4a1-ed28-4d33-b73b-865dc5023b93", + "id": "a90d2205-1c68-4349-95b1-e20f607d4a86", "metadata": {}, "source": [ - "### 4b) Separated Topologies Relative Binding Free Energy Transformation\n", - "\n", - "We can instead build a SepTop RBFE Transformation. Here we will do the transformation between `lig_jmc_27` and `lig_ejm_46`." + "We can now SepTop RBFE Transformation. Here we will do the transformation between `lig_jmc_27` and `lig_ejm_46`." ] }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 16, "id": "7ad01abd-097c-466d-9a3f-74a67e4d9015", "metadata": {}, "outputs": [], @@ -687,7 +843,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 17, "id": "71d4ff18-a837-443d-bc02-69554e7d76cf", "metadata": {}, "outputs": [], @@ -711,7 +867,7 @@ }, { "cell_type": "markdown", - "id": "ebde65ab-8a9c-4426-a0c4-91aefd6dea68", + "id": "54484f2f-347b-498e-b5e2-60e8c8db446a", "metadata": {}, "source": [ "Next we create the Protocol. We use the default settings, except to help us execute in parallel we set the number of repeats to 1." @@ -719,7 +875,7 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 18, "id": "3e343c11-b693-4700-b1ea-2fa7951afa88", "metadata": {}, "outputs": [], @@ -733,7 +889,7 @@ }, { "cell_type": "code", - "execution_count": 17, + "execution_count": 19, "id": "a531a0cb-3eda-4d25-89d8-347f470d40ea", "metadata": {}, "outputs": [], @@ -749,7 +905,7 @@ }, { "cell_type": "markdown", - "id": "d59f5d1b-f078-49c2-9965-9cb8abc64a71", + "id": "db4344c5-c300-4b6e-aa0a-a7cae9c6380f", "metadata": {}, "source": [ "This transformation can now be run using `openfe quickrun`." @@ -757,17 +913,94 @@ }, { "cell_type": "markdown", - "id": "316727a4-aa8b-43dd-916d-b73244ee829a", + "id": "6e16e781-c6dc-4a62-85a7-0bc59678e916", "metadata": {}, "source": [ - "### 4c) Creating a Hybrid Topology Relative Binding Free Energy Transformation\n", + "### Creating a Hybrid Topology Relative Binding Free Energy Transformation\n", "\n", "Finally we create a Hybrid Topology RBFE Transformation. Again, as per the SepTop example above, we will do the transformation between `lig_jmc_27` and `lig_ejm_46`." ] }, + { + "cell_type": "markdown", + "id": "eaf920be-9e16-402b-9fc5-8e1d4f3712cf", + "metadata": {}, + "source": [] + }, { "cell_type": "code", - "execution_count": 18, + "execution_count": 20, + "id": "25ef33bf-fb1a-45da-96e4-004f77ef7d34", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Extracting ligand from lig_ejm_50_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_jmc_23_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_43_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_42_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_31_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_47_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_jmc_27_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_46_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_ejm_48_seed_42_sample_1_model.pdb\n", + "Extracting ligand from lig_jmc_28_seed_42_sample_1_model.pdb\n", + "\n", + "Largest ligand is lig_jmc_27 with volume 0.5209626577896912\n" + ] + } + ], + "source": [ + "import pathlib\n", + "import MDAnalysis as mda\n", + "from MDAnalysis.analysis import align\n", + "from rdkit import Chem, rdBase\n", + "from rdkit.Chem import AllChem\n", + "from rdkit.Chem.Descriptors3D import Asphericity\n", + "import warnings\n", + "\n", + "# don't show warnings that are expected and not an issue\n", + "warnings.filterwarnings(action=\"ignore\", message=r\"Unit cell dimensions not found\") \n", + "\n", + "# gather all the complexes and align to a single structure\n", + "# which has the largest ligand by volume \n", + "of_results_root = pathlib.Path(\"assets/of3_tyk2_output/\")\n", + "output_root = pathlib.Path(\"openfold_to_openfe\")\n", + "temp_dir = pathlib.Path(output_root,\"temp_ligands\")\n", + "temp_dir.mkdir(exist_ok=True,parents=True)\n", + "# load all the output pdbs and calculate ligand volumes\n", + "ligand_volumes = {}\n", + "\n", + "for output_file in of_results_root.glob(\"*/*/*.pdb\"):\n", + " ligand_name = output_file.name.split('_seed')[0]\n", + " print(f\"Extracting ligand from {output_file.name}\")\n", + " with warnings.catch_warnings(): \n", + " warnings.simplefilter(\"ignore\")\n", + " u = mda.Universe(str(output_file))\n", + " ligand = u.select_atoms(\"resname LIG\")\n", + " # write the ligand to a temporary file\n", + " ligand.write(str(temp_dir / f\"{ligand_name}.pdb\"))\n", + " # now load back with rdkit\n", + " rdkit_mol = Chem.MolFromPDBFile(str(temp_dir / f\"{ligand_name}.pdb\"), removeHs=True)\n", + " # now use a template from the SMILES definitions to make sure the bond orders are correct\n", + " template_mol = Chem.MolFromSmiles(ligands_dict[ligand_name])\n", + " # don't show rdkit's \"WARNING: More than one matching pattern found - picking one\" for every ligand\n", + " with rdBase.BlockLogs(): \n", + " rdkit_mol = AllChem.AssignBondOrdersFromTemplate(template_mol, rdkit_mol)\n", + " # calculate the volume\n", + " vsa = Asphericity(rdkit_mol)\n", + " ligand_volumes[ligand_name] = vsa\n", + "\n", + "# find the ligand with the largest volume\n", + "largest_ligand = max(ligand_volumes, key=ligand_volumes.get)\n", + "print(f\"\\nLargest ligand is {largest_ligand} with volume {ligand_volumes[largest_ligand]}\")" + ] + }, + { + "cell_type": "code", + "execution_count": 21, "id": "31d18f6d-b938-4de3-9b2f-3c6e9bedd827", "metadata": {}, "outputs": [], @@ -797,7 +1030,7 @@ }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 22, "id": "b5186710-f89e-4c86-998f-4e9fe076a2d8", "metadata": {}, "outputs": [], @@ -819,7 +1052,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 23, "id": "c8db1e94-c146-4daf-bdb2-4383b4fcd0d6", "metadata": {}, "outputs": [ @@ -873,7 +1106,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 24, "id": "7d3c436c-fd93-4a3e-9933-e43b00783931", "metadata": {}, "outputs": [], @@ -897,7 +1130,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 25, "id": "b161b5b6-7718-4a47-b834-e3d88082bb31", "metadata": {}, "outputs": [], @@ -927,7 +1160,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 26, "id": "3bbd3d6d-35f0-490f-8fdd-4debc7874e23", "metadata": {}, "outputs": [], @@ -957,7 +1190,7 @@ }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 27, "id": "8b374e77-64fb-48a1-b9ae-9b1dba74ebb5", "metadata": {}, "outputs": [], @@ -992,11 +1225,19 @@ "id": "6cae5156-f80f-4520-baee-4199774d441a", "metadata": {}, "source": [ - "## 5. Creating networks and running the simulations\n", + "## 5. Running the ``Transformation`` simulations\n", "\n", "In practice, one would create entire network of Transformations, and then run all the simulations.\n", "This notebook will not cover these aspects. To find out more about this, please see the [OpenFE Tutorials](https://docs.openfree.energy/en/latest/tutorials/index.html)." ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f09a0d21-c0d3-4c39-9c9d-b0b1d61be2ae", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": {