Below are the details of the parameters that the program uses to generate the complete ΔΔG feature set:
(1).
file, which contains the raw data you want to use to generate the ΔΔG feature set.
We list some of the contents of this file here, and provide detailed descriptions of each column's attributes in the table file:

+------+-------------------------+----------+------+-----+-----+
| PDB  | Amino Acid Substitution | Chain ID | ddG  | pH  | T   |
+======+=========================+==========+======+=====+=====+
| 1AAR | K6E                     | A        | 0.53 | 5   | 25  |
+------+-------------------------+----------+------+-----+-----+
| 1AAR | K6Q                     | A        | 0.26 | 5   | 25  |
+------+-------------------------+----------+------+-----+-----+
| 1AAR | H68E                    | A        | 0.77 | 5   | 25  |
+------+-------------------------+----------+------+-----+-----+
| ...  | ...                     | ...      | ...  | ... | ... |
+------+-------------------------+----------+------+-----+-----+

Description of attributes for each column in the table file:
a.
PDB: Provide a PDB identifier from the RCSB database. The program uses this identifier to download the PDB file automatically.
b.
Amino Acid Substitution: The one-letter code of the wild-type amino acid, the sequence number of the mutation site, and the one-letter code of the mutant amino acid, which together describe the amino acid substitution caused by the mutation. For example, K6Q means that the lysine at position 6 of the protein sequence is replaced by glutamine.
c.
Chain ID: The protein chain in which the mutation site is located.
d.
ddG: The ΔΔG value from your own raw dataset. For machine learning, this value can serve as the regression target. If you only need to generate features, you can set this attribute to any numerical value without affecting the generation of the other features.
e.
pH: The pH at which the mutation occurs.
f.
T: The temperature at which the mutation occurs.
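The Amino Acid Substitution format described above can be checked before a dataset is passed to the program. The following is a minimal sketch, not part of DDGWizard: the names `Substitution` and `parse_substitution` are hypothetical, and it assumes only the 20 standard one-letter codes and plain integer positions (no insertion codes or negative numbering).

```python
import re
from typing import NamedTuple

# The 20 standard one-letter amino acid codes (an assumption of this sketch).
_AA = "ACDEFGHIKLMNPQRSTVWY"
# Wild-type residue, position, mutant residue, e.g. "K6Q".
_SUBSTITUTION = re.compile(rf"^([{_AA}])(\d+)([{_AA}])$")


class Substitution(NamedTuple):
    wild_type: str
    position: int
    mutant: str


def parse_substitution(code: str) -> Substitution:
    """Split a code such as 'K6Q' into wild type, position, and mutant."""
    match = _SUBSTITUTION.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a valid substitution code: {code!r}")
    wild_type, position, mutant = match.groups()
    return Substitution(wild_type, int(position), mutant)


print(parse_substitution("K6Q"))
# Substitution(wild_type='K', position=6, mutant='Q')
```

A row whose code fails this check (for example `K6`, which has no mutant residue) raises `ValueError`, so malformed entries can be found before feature generation starts.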
(2).
--db_folder_path
This parameter specifies the path to the folder that contains the BLAST database you have prepared.
(3).
--db_name
This parameter specifies the name of the BLAST database you have prepared.
(4).
--if_reversed_data
This parameter takes a value of 0 or 1. A value of 0 generates features only for the direct mutations, while a value of 1 also generates features for the reverse mutations of the mutations provided.
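To illustrate what a reverse mutation is, the sketch below swaps the wild-type and mutant residues of a substitution code. It is not DDGWizard's implementation, and the function names are hypothetical. Negating ΔΔG for the reverse mutation is a convention often assumed for such data; the text above does not state how DDGWizard assigns this value.

```python
def reverse_substitution(code: str) -> str:
    """Swap wild-type and mutant residues: 'K6Q' becomes 'Q6K'."""
    code = code.strip()
    if len(code) < 3 or not code[1:-1].isdigit():
        raise ValueError(f"not a valid substitution code: {code!r}")
    return code[-1] + code[1:-1] + code[0]


def reverse_record(code: str, ddg: float) -> tuple:
    """Return the reverse mutation with a negated ddG.

    Assumption: the sign flip is a common convention, not something
    the DDGWizard documentation above confirms.
    """
    return reverse_substitution(code), -ddg


print(reverse_substitution("K6Q"))     # Q6K
print(reverse_record("H68E", 0.77))    # ('E68H', -0.77)
```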
(5).
--blast_process_num
This parameter takes an integer greater than 0 and less than 200. It sets the number of processes (multiprocessing) that DDGWizard uses for sequence alignment.
(6).
--mode
Please provide the default value, whole.
(7).
--process_num
This parameter takes an integer greater than 0 and less than 200. It sets the number of processes (multiprocessing) that DDGWizard uses for generating features.
(8).
--container_type
This parameter takes a value of D, S, or - (default). D uses Docker as the container system, S uses Singularity as the container system, and - skips running PROFbval.
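The value constraints listed for parameters (2) to (8) can be summarized as an argument parser. This is an illustrative sketch only, not DDGWizard's own parser: which flags are required is an assumption, `process_count` and `build_parser` are hypothetical names, and parameter (1) is left out because its flag name is not given above.

```python
import argparse


def process_count(text: str) -> int:
    """Accept an integer greater than 0 and less than 200."""
    value = int(text)
    if not 0 < value < 200:
        raise argparse.ArgumentTypeError(f"expected 0 < n < 200, got {value}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Illustrative parser only")
    # Marking these flags as required is an assumption of this sketch.
    parser.add_argument("--db_folder_path", required=True)
    parser.add_argument("--db_name", required=True)
    parser.add_argument("--if_reversed_data", type=int, choices=[0, 1], required=True)
    parser.add_argument("--blast_process_num", type=process_count, required=True)
    parser.add_argument("--mode", default="whole")
    parser.add_argument("--process_num", type=process_count, required=True)
    parser.add_argument("--container_type", choices=["D", "S", "-"], default="-")
    return parser


# Example values below are placeholders, not real paths or database names.
example = build_parser().parse_args([
    "--db_folder_path", "/path/to/blast_db",
    "--db_name", "example_db",
    "--if_reversed_data", "1",
    "--blast_process_num", "4",
    "--process_num", "8",
])
print(example.mode, example.container_type)
```

Checking the ranges up front means that an out-of-range process count is rejected immediately, before any alignment or feature generation begins.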
4. Output
The program writes an output CSV file, features_table.csv, to DDGWizard/src/Feature_Res/. This file records the complete set of generated features.
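As a quick check after a run, the output file can be inspected with Python's standard `csv` module. This is a minimal sketch with a hypothetical helper name, assuming the file is comma-delimited, UTF-8 encoded, and starts with a header row; the actual column names are not listed here.

```python
import csv
from pathlib import Path


def summarize_features(csv_path):
    """Return the column names and the number of data rows in a CSV file."""
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])           # first row is assumed to be the header
        row_count = sum(1 for _ in reader)  # remaining rows are data
    return header, row_count


if __name__ == "__main__":
    # Path taken from the text above, relative to the folder containing DDGWizard.
    path = Path("DDGWizard/src/Feature_Res/features_table.csv")
    if path.exists():
        columns, rows = summarize_features(path)
        print(f"{rows} rows, {len(columns)} columns")
```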
(1). Before running DDGWizard, cd to the top-level directory of the program and execute the program from there.
(2). DDGWizard handles multiprocessing itself. If you want to make full use of your computer's resources, we recommend using the multi-process parameters provided by DDGWizard rather than running multiple instances.
We do not recommend implementing your own multi-process handling around DDGWizard.
If you do need to run multiple instances of DDGWizard at the same time, do not run them from the same folder: the program synchronizes files within its folder, so concurrent instances can cause synchronization errors.
Instead, make multiple copies of the DDGWizard folder and run each instance separately in its own copy.
(3).
Do not place your files in the top-level folder of DDGWizard. DDGWizard will automatically clean files in the top-level folder to maintain multi-process synchronization.
(4).
The complete log file is saved at the path DDGWizard/src/log.txt.
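For note (2), the per-instance copies can be created with a short script. This is a sketch under stated assumptions, not a tool shipped with DDGWizard: `make_instance_copies` is a hypothetical helper, and the destination folder should be outside the DDGWizard folder itself.

```python
import shutil
from pathlib import Path


def make_instance_copies(source, dest_root, count):
    """Copy the program folder `count` times so each instance has its own folder."""
    source = Path(source)
    dest_root = Path(dest_root)  # keep this outside `source` to avoid recursive copies
    dest_root.mkdir(parents=True, exist_ok=True)
    copies = []
    for index in range(1, count + 1):
        target = dest_root / f"{source.name}_{index}"
        # copytree raises FileExistsError if the target folder already exists.
        shutil.copytree(source, target)
        copies.append(target)
    return copies
```

For example, `make_instance_copies("DDGWizard", "instances", 3)` would create `instances/DDGWizard_1` to `instances/DDGWizard_3`. Each instance is then started from inside its own copy, as note (1) requires.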