.. _installation_guide:

Installation Guide
==================
DDGWizard consists of three components: the feature calculation pipeline, which processes raw ΔΔG data and outputs feature-enriched ΔΔG data with 1547 features; the DDGWizard dataset, which contains 15752 ΔΔG data entries; and the accurate ΔΔG prediction model.

This section explains how to install the dependencies needed to run DDGWizard (nothing needs to be installed to access the DDGWizard dataset; it can be downloaded directly).

Installation prerequisites:

- CentOS 7 or Ubuntu
- GCC version higher than 4.8.5
- Conda version higher than 23.0
- Git
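The prerequisites above can be checked from a shell before starting. The following is a minimal sketch and not part of DDGWizard: the function names ``version_gt`` and ``check_prerequisites`` are hypothetical, and the version comparison assumes GNU ``sort -V`` is available.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with DDGWizard) for checking the prerequisites.

# Succeeds when version $1 is strictly higher than version $2,
# following the "higher than" wording of the prerequisites.
# Relies on the -V (version sort) option of GNU sort.
version_gt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Prints one line per prerequisite and returns non-zero if any check fails.
check_prerequisites() {
    local failed=0 gcc_ver conda_ver

    if command -v gcc >/dev/null 2>&1; then
        # Newer GCC releases may print only the major version for -dumpversion,
        # so try -dumpfullversion first and fall back for older releases.
        gcc_ver="$(gcc -dumpfullversion 2>/dev/null || gcc -dumpversion)"
        if version_gt "$gcc_ver" "4.8.5"; then
            echo "gcc $gcc_ver: ok"
        else
            echo "gcc $gcc_ver: need a version higher than 4.8.5"; failed=1
        fi
    else
        echo "gcc: not found"; failed=1
    fi

    if command -v conda >/dev/null 2>&1; then
        # "conda --version" prints e.g. "conda 23.1.0"; keep the second field.
        conda_ver="$(conda --version 2>&1 | awk '{print $2}')"
        if version_gt "$conda_ver" "23.0"; then
            echo "conda $conda_ver: ok"
        else
            echo "conda $conda_ver: need a version higher than 23.0"; failed=1
        fi
    else
        echo "conda: not found"; failed=1
    fi

    if command -v git >/dev/null 2>&1; then
        echo "git: ok"
    else
        echo "git: not found"; failed=1
    fi

    return "$failed"
}
```

Calling ``check_prerequisites`` after sourcing the snippet prints the status of each tool; it does not check the operating system.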

.. _`the Characterization part`:

Feature Calculation Pipeline (for Generating Feature-Enriched ΔΔG Data)
------------------------------------------------------------------------------
This subsection is for users who need the feature calculation pipeline. The pipeline processes raw ΔΔG input data and outputs feature-enriched data with 1547 calculated features, which can support further analysis, feature selection, and machine learning.

The installation steps are as follows.

1. Git clone the DDGWizard repository

.. code-block::

   $ git clone https://github.com/bioinfbrad/DDGWizard.git

2. Configure and install the Conda virtual environment

The Conda environment configuration file, ``Environment.yml``, is located in ``DDGWizard/src``.
Open this file with a text editor (e.g., nano, vim, or vi). Here we use nano as an example:

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ nano Environment.yml

Modify the ``prefix`` entry, which is on the last line, so that it points to your local Conda envs folder.

If you do not know the path to your local Conda envs folder, you can find it with the command:

.. code-block::

   $ conda info --envs

After the change, the last line should read ``prefix: /path/to/your_conda/envs/DDGWizard``.
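The manual edit of the ``prefix`` line can also be scripted. This is a minimal sketch under stated assumptions: ``set_env_prefix`` is a hypothetical helper, the file is assumed to contain a line starting with ``prefix:``, GNU ``sed -i`` is assumed, and the envs path must not contain the ``|`` or ``&`` characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DDGWizard).
# Rewrites the "prefix:" line of a Conda environment file in place.
#   $1: path to Environment.yml
#   $2: path to the local Conda envs folder
set_env_prefix() {
    local yml="$1"
    local envs_dir="${2%/}"   # drop one trailing slash, if present
    sed -i "s|^prefix:.*|prefix: ${envs_dir}/DDGWizard|" "$yml"
}
```

For example, ``set_env_prefix Environment.yml "$(conda info --base)/envs"`` would work if the environments live in the default ``envs`` folder under the Conda base directory, which is an assumption about the local setup.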

Once you have changed the ``prefix`` in ``Environment.yml``, use Conda to create the virtual environment and install the dependencies. This may take some time.

.. code-block::

   $ conda env create -f Environment.yml

3. Download NCBI-BLAST-2.13.0+

DDGWizard needs the NCBI-BLAST-2.13.0+ program to carry out multiple sequence alignment (MSA).
Users can visit the NCBI-BLAST-2.13.0+ download page to download the ``ncbi-blast-2.13.0+-x64-linux.tar.gz`` file. We recommend downloading this file to ``DDGWizard/src/``. Users can also use ``wget`` to download it:

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ wget https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.13.0/ncbi-blast-2.13.0+-x64-linux.tar.gz

Then copy this compressed file to ``DDGWizard/bin/ncbi_blast_2_13_0+/`` and extract it there. Use the following commands (assuming the file has been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ cp ncbi-blast-2.13.0+-x64-linux.tar.gz ../bin/ncbi_blast_2_13_0+/
   $ cd ../bin/ncbi_blast_2_13_0+/
   $ tar -zxvf ncbi-blast-2.13.0+-x64-linux.tar.gz
   $ cp -r ncbi-blast-2.13.0+/* .

NCBI-BLAST-2.13.0+ is a "United States Government Work" under the terms of the United States Copyright Act. Please read and accept the license file in its folder before proceeding.
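The copy, extract, and copy-again sequence used for this archive (and for the similar archives in later steps) can be condensed with the ``--strip-components`` option of ``tar``. The following is only a sketch: ``extract_flat`` is a hypothetical helper, and it assumes the archive contains a single top-level folder, as the ``cp -r`` step above implies.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DDGWizard).
# Extracts a .tar.gz archive into a target folder and drops the archive's
# top-level folder, so the contents land directly in the target.
#   $1: path to the archive
#   $2: target folder (created if it does not exist)
extract_flat() {
    local archive="$1" dest="$2"
    mkdir -p "$dest"
    tar -zxf "$archive" -C "$dest" --strip-components=1
}
```

Run from ``DDGWizard/src/``, a call such as ``extract_flat ncbi-blast-2.13.0+-x64-linux.tar.gz ../bin/ncbi_blast_2_13_0+/`` would place the BLAST files directly in the target folder without leaving a copy of the archive there.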

4. Configure Modeller

The Modeller software is used for homology or comparative modeling of protein three-dimensional structures. DDGWizard uses Modeller to build PDB structure files of mutants from the wild-type PDB structure files supplied by the user.

Modeller is already installed when the Conda environment is created, but DDGWizard can only call it once you have obtained a Modeller license and configured it.

Please visit the official Modeller website to register an account. Modeller is distributed under the "Academic End-User Software License Agreement for MODELLER" terms. Follow the instructions there, and read and accept the terms to obtain a license key.

Then enter the license key in the configuration file of the installed Modeller, which is located under the Conda envs folder. Use the following command:

.. code-block::

   $ nano </path/to/your_conda/envs/DDGWizard/>lib/modeller-10.6/modlib/modeller/config.py

Replace ``XXXX`` with your license key, then save and close the file.
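The placeholder replacement can also be done without an editor. This is a hedged sketch: ``set_modeller_key`` is a hypothetical helper, it relies only on the ``XXXX`` placeholder mentioned above being present in ``config.py``, it assumes GNU ``sed -i``, and it assumes the key contains no ``|`` or ``&`` characters.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DDGWizard or Modeller).
# Replaces the XXXX placeholder in a Modeller config.py with a license key.
#   $1: path to config.py
#   $2: license key obtained from the Modeller website
set_modeller_key() {
    local config="$1" key="$2"
    # Refuse to touch the file when the placeholder is absent,
    # e.g. because a key has already been entered.
    if ! grep -q 'XXXX' "$config"; then
        echo "no XXXX placeholder found in $config" >&2
        return 1
    fi
    sed -i "s|XXXX|${key}|" "$config"
}
```

The function returns a non-zero status and leaves the file unchanged when no placeholder is found.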

The software dependencies in steps 5-11 are optional for the DDGWizard feature calculation pipeline. If one of them is not installed, the feature values it calculates are simply not output.

To calculate more features, install the optional dependencies below. To test the feature calculation pipeline right away, note that it can already run at this point (for usage, see the section Generate Feature-Enriched ΔΔG data).

Before running, make sure the DDGWizard programs have executable permission (step 12). Return to the DDGWizard program folder and execute the command:

.. code-block::

   $ cd </path/to/DDGWizard/>
   $ chmod -R +x .

To use the DDGWizard prediction model, users must also complete steps 5-8 (downloading Ring 3.0 requires applying for permission, which may take some time).

(Optional) 5. Download FoldX 5.0

Users can download the FoldX 5.0 program to allow DDGWizard to calculate energy terms of proteins. FoldX is available in an academic version and a commercial version; the academic version is sufficient for DDGWizard. Please visit the FoldX 5.0 application page to register an account, then read and accept the "FoldX Academic License" terms to download the ``foldx5Linux64.zip`` file. Copy this compressed file to ``DDGWizard/bin/FoldX_5.0/`` and extract it. Use the following commands (assuming the file has been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ cp foldx5Linux64.zip ../bin/FoldX_5.0/
   $ cd ../bin/FoldX_5.0/
   $ unzip foldx5Linux64.zip

(Optional) 6. Download Ring 3.0

Users can download the Ring 3.0 application to allow DDGWizard to calculate residue interaction information. Please visit the Ring 3.0 application page to apply, and wait for permission to download. Read and accept the Ring 3.0 license to obtain the ``ring-3.0.0.tgz`` file. Copy this compressed file to ``DDGWizard/bin/ring-3.0.0/`` and extract it. Use the following commands (assuming the file has been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ cp ring-3.0.0.tgz ../bin/ring-3.0.0/
   $ cd ../bin/ring-3.0.0/
   $ tar -zxvf ring-3.0.0.tgz
   $ cp -r ./ring-3.0.0/* .

(Optional) 7. Download DisEMBL

Users can download the DisEMBL program to allow DDGWizard to calculate disorder information of proteins. Please visit the DisEMBL download page to download the ``DisEMBL-1.4.tgz`` file. Copy this compressed file to ``DDGWizard/bin/DisEMBL_1_4/`` and extract it. Use the following commands (assuming the file has been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ cp DisEMBL-1.4.tgz ../bin/DisEMBL_1_4/
   $ cd ../bin/DisEMBL_1_4/
   $ tar -zxvf DisEMBL-1.4.tgz
   $ cp -r ./DisEMBL-1.4/* .

DisEMBL uses the GPL 2.0 open-source license. Please read and accept the license file in its folder before proceeding.

(Optional) 8. Configure DSSP

DSSP is used to calculate the RSA (relative surface area) and secondary structures from PDB files.

To allow DDGWizard to use DSSP, go to the ``bin`` folder of the DDGWizard environment in your local Conda envs folder and copy ``mkdssp`` as ``dssp``:

.. code-block::

   $ cd </path/to/your_conda/envs/DDGWizard/bin/>
   $ cp mkdssp dssp

(Optional) 9. Install Bio3D

Users can install the Bio3D package to allow DDGWizard to calculate atomic fluctuations based on NMA (normal mode analysis). This requires R as a prerequisite (it can be downloaded and installed from the official R website). Then start R and install the Bio3D package from within the R session:

.. code-block::

   $ R
   install.packages("bio3d")

(Optional) 10. Download PROFbval

PROFbval relies on the Ubuntu environment. To address cross-platform compatibility, we provide container images that users can download. This requires Docker or Singularity as a prerequisite.

Please download the two files ``myprof.tar`` (128MB) and ``myprof.sif`` (360MB) from https://zenodo.org/records/12817843, and copy them to ``DDGWizard/src/Prof_Source``. Use the following commands (assuming the files have been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ cp ./myprof.tar ./Prof_Source
   $ cp ./myprof.sif ./Prof_Source

DDGWizard will automatically call the programs within the container images. Users only need either Docker or Singularity. If you choose Docker, an additional step is required:

.. code-block::

   $ docker load -i </path/to/DDGWizard/>src/Prof_Source/myprof.tar

PROFbval uses the GPL 3.0+ open-source license. Please read and accept its license before proceeding.
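Before deciding whether the extra ``docker load`` step applies, it can help to see which container runtime is on the ``PATH``. The sketch below is only a convenience check for the user: ``detect_runtime`` is a hypothetical name, and how DDGWizard itself selects a runtime is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DDGWizard).
# Prints the first container runtime found on PATH:
# "docker", "singularity", or "none".
detect_runtime() {
    if command -v docker >/dev/null 2>&1; then
        echo docker
    elif command -v singularity >/dev/null 2>&1; then
        echo singularity
    else
        echo none
    fi
}
```

If the function prints ``docker``, the ``docker load`` command above is the additional step to run; if it prints ``none``, one of the two runtimes has to be installed first.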

(Optional) 11. Download SIFT 6.2.1

Users can download the SIFT 6.2.1 program to allow DDGWizard to predict the impact of amino acid substitutions on protein function. Please visit the SIFT 6.2.1 download page or use ``wget`` to download the ``sift6.2.1.tar.gz`` file. Copy this compressed file to ``DDGWizard/bin/sift6_2_1/`` and extract it. Use the following commands (the ``wget`` line can be skipped if the file has already been downloaded to ``DDGWizard/src/``):

.. code-block::

   $ cd </path/to/DDGWizard/>src/
   $ wget https://s3.amazonaws.com/sift-public/nsSNV/sift6.2.1.tar.gz
   $ cp sift6.2.1.tar.gz ../bin/sift6_2_1/
   $ cd ../bin/sift6_2_1/
   $ tar -zxvf sift6.2.1.tar.gz
   $ cp -r sift6.2.1/* .

SIFT 6.2.1 uses a non-commercial license. Please read and accept the license file in its folder before proceeding.

12. Make sure the DDGWizard programs have executable permission

The DDGWizard programs need executable permission to run.
Return to the DDGWizard program folder and execute the command:

.. code-block::

   $ cd </path/to/DDGWizard/>
   $ chmod -R +x .

.. _`the Prediction Part`:

ΔΔG Prediction Model (for Predicting ΔΔG)
-----------------------------------------------

This subsection is for users who need the ΔΔG prediction model.

To use DDGWizard's ΔΔG prediction model, users must complete steps 1-8 (steps 5-8 are no longer optional) and step 12 of the Feature Calculation Pipeline installation above. Steps 9-11 are not required.
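Since the prediction model depends on several third-party folders being populated, a quick check can catch a skipped step. The following is a rough sketch: ``check_components`` is a hypothetical helper, the folder names are the ones used in steps 3 and 5-7 above, and it only tests that each folder is non-empty. It does not verify the Modeller license (step 4), the ``dssp`` copy (step 8), or that any program actually runs.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DDGWizard).
# Reports which third-party folders under <root>/bin contain files.
#   $1: path to the DDGWizard program folder
# Returns non-zero if at least one folder is missing or empty.
check_components() {
    local root="${1%/}" dir missing=0
    for dir in ncbi_blast_2_13_0+ FoldX_5.0 ring-3.0.0 DisEMBL_1_4; do
        if [ -d "$root/bin/$dir" ] && [ -n "$(ls -A "$root/bin/$dir")" ]; then
            echo "ok       $dir"
        else
            echo "missing  $dir"
            missing=1
        fi
    done
    return "$missing"
}
```

A call such as ``check_components </path/to/DDGWizard/>`` prints one line per folder, so a ``missing`` entry points back to the corresponding installation step.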