Predict ΔΔG
This guide is intended to show users how to use DDGWizard to predict ΔΔG.
1. Running template
We first give a template command for running DDGWizard to predict ΔΔG, and then explain each parameter in detail.
Run the program as follows (predicting ΔΔG also requires a prepared Blast database):
$ conda activate DDGWizard
$ cd </path/to/DDGWizard/>
$ python Predict_ddG_Executable.py \
--pred_dataset_path src/Sample_Pred.csv \
--db_folder_path </folder/to/save/Blast_database/> \
--db_name <the_name_to_assign_for_Blast database> \
--if_reversed_data 0 \
--blast_process_num 4 \
--mode whole \
--process_num 4
2. Parameter details
Below are the details of the parameters for the ΔΔG prediction program:
(1). --pred_dataset_path
This parameter is the path to a csv file that contains the basic information about the mutations for which ΔΔG is to be predicted. The folder DDGWizard/src contains a sample file, Sample_Pred.csv, that you can use directly for testing and as a reference. Part of this file is shown here, followed by a detailed description of each column:

| PDB | Amino Acid Substitution | Chain ID | pH | T |
|---|---|---|---|---|
| 1SHG | Y57H | A | 7 | 25 |
| 2AFG | C117I | A | 7 | 25 |
| 2LZM | M102L | A | 3 | 52 |
| … | … | … | … | … |

Description of attributes for each column in the table file:
a. PDB: Provide a PDB identifier so that the program can automatically download the PDB file.
b. Amino Acid Substitution: Describes the amino acid substitution caused by the mutation. It consists of the one-letter code of the wild-type amino acid, the sequential number of the mutation site, and the one-letter code of the mutant amino acid. For example, Y57H replaces tyrosine (Y) at site 57 with histidine (H).
c. Chain ID: Indicate the protein chain where the mutation site is located.
d. pH: Specify the pH at which the mutation occurs. If you have no specific requirement or preference regarding pH, you can simply specify 7.
e. T: Specify the temperature at which the mutation occurs. If you have no specific requirement or preference regarding temperature, you can simply specify 25.
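Before starting a long run, it can help to check that an input file follows the column layout described above. The following is a minimal sketch, not part of DDGWizard: it assumes the csv header uses exactly the column names shown in the table, and the helper names parse_substitution and check_rows are hypothetical. It only accepts plain integer site numbers, which is a simplification.

```python
import csv
import io
import re

# Assumption: the csv header matches the column names in the table above.
COLUMNS = ["PDB", "Amino Acid Substitution", "Chain ID", "pH", "T"]
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Wild-type letter, site number, mutant letter, e.g. Y57H.
SUBSTITUTION = re.compile(rf"^([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_substitution(text):
    """Split e.g. 'Y57H' into ('Y', 57, 'H'); raise ValueError if malformed."""
    match = SUBSTITUTION.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid substitution: {text!r}")
    wild, site, mutant = match.groups()
    return wild, int(site), mutant


def check_rows(handle):
    """Return a list of problems found in a csv file-like object."""
    reader = csv.DictReader(handle)
    missing = [c for c in COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    for number, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            parse_substitution(row["Amino Acid Substitution"] or "")
            float(row["pH"] or "")
            float(row["T"] or "")
        except ValueError as error:
            problems.append(f"line {number}: {error}")
        if not (row["Chain ID"] or "").strip():
            problems.append(f"line {number}: empty Chain ID")
    return problems


# Two rows taken from the sample table above.
sample = io.StringIO(
    "PDB,Amino Acid Substitution,Chain ID,pH,T\n"
    "1SHG,Y57H,A,7,25\n"
    "2LZM,M102L,A,3,52\n"
)
print(check_rows(sample))  # prints []
```

To check a real file, pass an open file object instead of the in-memory sample, for example open("src/Sample_Pred.csv", newline="").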
(2). --db_folder_path
This parameter is the path to the folder containing the Blast database that you have prepared.
(3). --db_name
This parameter is the name of the Blast database that you have prepared.
(4). --if_reversed_data
This parameter takes a value of 0 or 1. A value of 0 means that ΔΔG is predicted only for the mutations provided in the file, while a value of 1 means that ΔΔG is also predicted for the reverse mutations of the mutations provided.
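To make the term concrete: the reverse of a mutation is conventionally the substitution from the mutant residue back to the wild-type residue at the same site. The sketch below only illustrates that notation; the function name reverse_substitution is hypothetical, and how DDGWizard builds reverse mutations internally is not described here.

```python
import re


def reverse_substitution(substitution):
    """Swap the wild-type and mutant letters, e.g. 'Y57H' becomes 'H57Y'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution.strip())
    if match is None:
        raise ValueError(f"not a valid substitution: {substitution!r}")
    wild, site, mutant = match.groups()
    return f"{mutant}{site}{wild}"


# Substitutions taken from the sample table.
for forward in ["Y57H", "C117I", "M102L"]:
    print(forward, "->", reverse_substitution(forward))
```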
(5). --blast_process_num
This parameter takes an integer greater than 0 and less than 200. It is the number of processes (multiprocessing) that DDGWizard will use for sequence alignment.
(6). --mode
Use the default value, whole.
(7). --process_num
This parameter takes an integer greater than 0 and less than 200. It is the number of processes (multiprocessing) that DDGWizard will use for calculating the optimal features.
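If you launch predictions from a script, the parameter rules above can be checked before the program starts. The following is a minimal sketch under stated assumptions: build_command is a hypothetical helper, not part of DDGWizard, and the database folder and name in the usage lines are placeholders. It uses only the flags listed in this guide.

```python
import shlex


def build_command(pred_dataset_path, db_folder_path, db_name,
                  if_reversed_data=0, blast_process_num=4, process_num=4):
    """Return the argument list for Predict_ddG_Executable.py after basic checks."""
    if if_reversed_data not in (0, 1):
        raise ValueError("if_reversed_data must be 0 or 1")
    for name, value in (("blast_process_num", blast_process_num),
                        ("process_num", process_num)):
        # Both process counts must be integers greater than 0 and less than 200.
        if not (isinstance(value, int) and 0 < value < 200):
            raise ValueError(f"{name} must be an integer between 1 and 199")
    return [
        "python", "Predict_ddG_Executable.py",
        "--pred_dataset_path", str(pred_dataset_path),
        "--db_folder_path", str(db_folder_path),
        "--db_name", str(db_name),
        "--if_reversed_data", str(if_reversed_data),
        "--blast_process_num", str(blast_process_num),
        "--mode", "whole",  # the guide asks for the default value
        "--process_num", str(process_num),
    ]


command = build_command(
    pred_dataset_path="src/Sample_Pred.csv",
    db_folder_path="/path/to/blast_db",  # hypothetical placeholder
    db_name="my_db",                     # hypothetical placeholder
)
print(shlex.join(command))
```

The returned list can be passed to subprocess.run(command, cwd=..., check=True), with cwd set to the top-level DDGWizard directory and the DDGWizard conda environment already active.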
3. Output
There will be an output csv file Pred_ddG.csv located in DDGWizard/src/Pred_Res/, which will record all prediction results.
4. Notes
(1). Run DDGWizard from the top-level directory of the program (cd into that directory first).
(2). DDGWizard supports multi-process handling itself. To fully utilize your computer's resources, we recommend using the multi-process parameters provided by DDGWizard rather than launching multiple instances of DDGWizard yourself.
If you do need to run multiple instances of DDGWizard at the same time, do not run them from the same folder: the program synchronizes files within its folder, so instances sharing a folder can cause synchronization errors. Instead, make multiple copies of the DDGWizard folder and run each instance in its own copy.
(3). Do not place your files in the top-level folder of DDGWizard. DDGWizard will automatically clean files in the top-level folder to maintain multi-process synchronization.
(4). The complete log file is saved at the path DDGWizard/src/log.txt.
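Note (2) asks for a separate copy of the DDGWizard folder for each simultaneous instance. The following is a minimal sketch of making those copies; make_instance_copies is a hypothetical helper, not part of DDGWizard, and the demonstration uses a temporary stand-in folder rather than a real installation.

```python
import shutil
import tempfile
from pathlib import Path


def make_instance_copies(source_dir, target_root, count):
    """Copy the DDGWizard folder `count` times, one folder per instance."""
    source = Path(source_dir).resolve()
    root = Path(target_root)
    root.mkdir(parents=True, exist_ok=True)
    copies = []
    for index in range(1, count + 1):
        destination = root / f"{source.name}_{index}"
        # copytree raises FileExistsError if the destination already exists.
        shutil.copytree(source, destination)
        copies.append(destination)
    return copies


# Demonstration on a temporary stand-in folder.
with tempfile.TemporaryDirectory() as tmp:
    stand_in = Path(tmp) / "DDGWizard"
    (stand_in / "src").mkdir(parents=True)
    for copy in make_instance_copies(stand_in, Path(tmp) / "instances", 2):
        print(copy.name)
```

Choose a target folder outside the DDGWizard folder itself, since note (3) warns that files placed in its top-level folder are cleaned automatically.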