Predict ΔΔG for Saturation Mutagenesis

This guide shows how to use DDGWizard to predict ΔΔG for saturation mutagenesis.

Saturation mutagenesis means mutating the original amino acid residue at a given site to every other possible amino acid. In practice, users often need to predict the ΔΔG of saturation mutagenesis at one site or at all sites, so that they can assess, across a wide range of candidates, which mutations may enhance the thermostability of the protein.
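As an illustration of the definition above, the following minimal sketch enumerates the substitutions that saturation mutagenesis covers at one site. The function name `saturation_mutations` and the label format (wild type, site number, mutant, e.g. `Y57A`) are assumptions made for this example; they are not taken from DDGWizard and do not describe the layout of the csv file it generates.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def saturation_mutations(wt_aa: str, site_number: int) -> list[str]:
    """Return labels such as 'Y57A' for every substitution at one site.

    Hypothetical helper for illustration only; not part of DDGWizard.
    """
    wt_aa = wt_aa.upper()
    if len(wt_aa) != 1 or wt_aa not in AMINO_ACIDS:
        raise ValueError(f"unknown amino acid: {wt_aa!r}")
    # Skip the wild-type residue itself: Y -> Y is not a mutation.
    return [f"{wt_aa}{site_number}{aa}" for aa in AMINO_ACIDS if aa != wt_aa]


mutations = saturation_mutations("Y", 57)
print(len(mutations))   # 19
print(mutations[:3])    # ['Y57A', 'Y57C', 'Y57D']
```

For a single site this gives 19 candidate mutations. For full-site saturation mutagenesis the same enumeration is repeated at every site of the protein.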

To meet this need, we provide a program that quickly generates the csv file required for saturation mutagenesis. This file serves as the input for DDGWizard to predict the ΔΔG of saturation mutagenesis.

1. Running example

We first provide two examples of running this program, followed by a detailed explanation of its parameters. The protein 1SHG is used as a case study.

To predict the ΔΔG of saturation mutagenesis at a single site (e.g. amino acid site 57), run the program with:

$ conda activate DDGWizard
$ cd </path/to/DDGWizard/>
$ python Utility_Tool.py \
   --pdb_id 1SHG \
   --chain_id A \
   --site_number 57 \
   --wt_aa Y \
   --pH 7 \
   --T 25

To predict the ΔΔG of full-site saturation mutagenesis, run the program with:

$ conda activate DDGWizard
$ cd </path/to/DDGWizard/>
$ python Utility_Tool.py \
   --pdb_id 1SHG \
   --site_number all \
   --pH 7 \
   --T 25

2. Parameter details

The parameters of the saturation mutagenesis program are described below:

(1). --pdb_id

Provide a PDB identifier so that the program can automatically download the PDB file.

(2). --chain_id

Indicate the protein chain where the mutation site is located.

If you intend to predict the ΔΔG of full-site saturation mutagenesis and have set --site_number to all, you do not need to provide this parameter. The program will automatically match the chain identifier for all possible mutations.

(3). --site_number

Provide the site number of the mutation to predict.

If you intend to predict the ΔΔG of full-site saturation mutagenesis, please provide the value all.

(4). --wt_aa

Provide the wild-type amino acid at the mutation site (e.g. Y, as in the example above).

If you intend to predict the ΔΔG of full-site saturation mutagenesis and have set --site_number to all, you do not need to provide this parameter. The program will automatically match the wild-type amino acid for all possible mutations.

(5). --pH

Specify the pH at which to predict the ΔΔG for the mutations.

(6). --T

Specify the temperature at which to predict the ΔΔG for the mutations.
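The parameter rules above can be summarized in a short Python sketch that assembles the command line for either mode. The helper `build_utility_tool_args` is hypothetical and not part of DDGWizard. It uses only the flags documented in this guide, and it assumes that single-site runs need both --chain_id and --wt_aa, which is inferred from the running example rather than stated explicitly.

```python
import shlex


def build_utility_tool_args(pdb_id, site_number, pH, T, chain_id=None, wt_aa=None):
    """Build the argument list for Utility_Tool.py from the documented flags.

    Hypothetical helper for illustration only; not part of DDGWizard.
    """
    full_site = str(site_number) == "all"
    # Assumption: a single-site run needs the chain and the wild-type residue.
    if not full_site and (chain_id is None or wt_aa is None):
        raise ValueError("single-site runs need both chain_id and wt_aa")

    args = ["python", "Utility_Tool.py", "--pdb_id", pdb_id]
    if not full_site:
        args += ["--chain_id", chain_id]
    args += ["--site_number", str(site_number)]
    if not full_site:
        args += ["--wt_aa", wt_aa]
    args += ["--pH", str(pH), "--T", str(T)]
    return args


single = build_utility_tool_args("1SHG", 57, 7, 25, chain_id="A", wt_aa="Y")
full = build_utility_tool_args("1SHG", "all", 7, 25)
print(shlex.join(single))
print(shlex.join(full))
```

The returned list can be passed to `subprocess.run(args, cwd=...)` with `cwd` set to the DDGWizard folder, after activating the DDGWizard conda environment.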

3. Output

The program will generate an output csv file Pred.csv located in DDGWizard/src/.

This csv file can be directly used as input for the DDGWizard prediction program, enabling quick preparation for ΔΔG prediction of saturation mutagenesis:

$ conda activate DDGWizard
$ cd </path/to/DDGWizard/>
$ python Predict_ddG_Executable.py \
    --pred_dataset_path ./src/Pred.csv \
    --db_folder_path </folder/to/save/Blast_database/> \
    --db_name <the_name_to_assign_for_Blast_database> \
    --if_reversed_data 0 \
    --blast_process_num 4 \
    --mode whole \
    --process_num 4
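Before starting the prediction, it can be useful to check that Pred.csv was actually written and is not empty. The sketch below uses only the Python standard library and makes no assumption about the column names or layout of the file; the helper `preview_csv` is hypothetical and not part of DDGWizard. The relative path assumes the script is run from the DDGWizard folder.

```python
import csv
from pathlib import Path


def preview_csv(path, n=5):
    """Return the first n rows and the total row count of a csv file.

    Hypothetical helper for illustration only; not part of DDGWizard.
    """
    with Path(path).open(newline="") as handle:
        rows = list(csv.reader(handle))
    return rows[:n], len(rows)


pred_path = Path("src/Pred.csv")  # relative to the DDGWizard folder
if pred_path.is_file():
    head, total = preview_csv(pred_path)
    print(f"{total} rows in {pred_path}")
    for row in head:
        print(row)
else:
    print(f"{pred_path} not found; run Utility_Tool.py first")
```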