How to set up CS-HM-Rosetta calculations

CS-HM-Rosetta (Chemical Shift-Homology Modeling-Rosetta) is a protocol that combines structural restraints derived from homologous proteins with any additional NMR data used in the CS-Rosetta, RASREC-Rosetta, or AutoNOE-Rosetta protocols to model high-resolution structures [1].

Source availability:

1. Rosetta must be installed first. It is available to academic and non-academic users under license (see here).
2. The CS-HM-Rosetta scripts required to obtain structural restraints from homologous proteins are available in the downloads folder of toolbox version 3.5.

Steps to run CS-HM-Rosetta:

Step1: Generate an alignment file (in .hhr format) using HHpred, either through the webserver (see here) or with a standalone HHsearch installation (available here). With the standalone software, run the following command from within the hhsearch directory. The -i option specifies the FASTA file, the -d option specifies the database file (obtained by concatenating all the .hhm files from the pdb70 database, available here), and the -o option names the output alignment file.

$ ./hhsearch -i aLP.fasta -d hhsearch_db.hhm -o aLP.hhr
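
The database concatenation described in Step1 can be sketched as follows. This is a minimal sketch, assuming the pdb70 .hhm files have already been unpacked into a single directory; the function name build_hhm_db and the directory name in the usage comment are hypothetical and not part of HHsearch or the toolbox.

```shell
# Concatenate every .hhm file in a directory into one database file.
# build_hhm_db is an illustrative helper name, not an HHsearch tool.
build_hhm_db() {
    src_dir=$1
    out_file=$2
    # Start from an empty output so that reruns do not append duplicates.
    : > "$out_file"
    for f in "$src_dir"/*.hhm; do
        # Skip the unexpanded pattern when the directory has no .hhm files.
        [ -f "$f" ] || continue
        cat "$f" >> "$out_file"
    done
}

# Usage (pdb70_hhm is an assumed directory name):
#   build_hhm_db pdb70_hhm hhsearch_db.hhm
```

Write the output file outside the source directory, or give it a name that does not end in .hhm while it is being built, so that the loop never reads its own output.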

Step2: After generating the alignment file (.hhr), update the rosetta_cm.conf file in the CS-HM-Rosetta/cm_scripts directory to match the compiler available on your system. An example .conf file is shown below:

boinc_tag           rosetta_cm
method              loopbuild_threading loopbuild_threading_cst_relax

# alignment filtering options
max_templates       20
max_pct_id          1.0
max_template_pct_id 1.0
max_e_value         1000.0
mini_compile_mode   release

# If MACOS, then use the clang compiler
mini_compiler       clang
# If LINUX, then use the gcc compiler
#mini_compiler       gcc

# standard boinc priority
priority 0

Here, max_templates is the maximum number of templates to include; max_pct_id sets the maximum percentage identity; max_template_pct_id is the maximum percentage identity of templates to the query sequence; max_e_value is the threshold used to filter alignments by HHsearch E-value; mini_compile_mode is the compilation mode (either release or debug); and mini_compiler tells the protocol to use either the gcc compiler (for UNIX/Linux systems) or the clang compiler (for macOS).
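
The compiler choice described above can also be made from the shell. The following is a minimal sketch, assuming the .conf file contains one mini_compiler line per compiler (active or commented out with #), as in the example; pick_compiler and set_mini_compiler are hypothetical helper names and not part of the toolbox.

```shell
# Pick clang on macOS (uname reports "Darwin") and gcc elsewhere.
pick_compiler() {
    case "$(uname -s)" in
        Darwin) echo clang ;;
        *)      echo gcc ;;
    esac
}

# Activate the mini_compiler line for the wanted compiler and comment out
# the others. Lines that do not define mini_compiler are copied unchanged.
set_mini_compiler() {
    conf=$1
    want=$2
    tmp="$conf.tmp"
    awk -v want="$want" '
        /^#?mini_compiler[ \t]/ {
            if ($2 == want) printf "mini_compiler       %s\n", $2
            else            printf "#mini_compiler       %s\n", $2
            next
        }
        { print }
    ' "$conf" > "$tmp" && mv "$tmp" "$conf"
}

# Usage:
#   set_mini_compiler rosetta_cm.conf "$(pick_compiler)"
```

If the wanted compiler has no line in the file at all, this sketch leaves every mini_compiler entry commented out, so check the result before running Step3.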

Step3: Once the rosetta_cm.conf is adjusted, the following commands need to be executed to obtain the distance restraints:

$ cd CS-HM-Rosetta/cm_scripts/bin
$ ./predict_distances.pl ../../input/rtt.hhr ../../input/rtt.fasta -aln_format hhsearch -outfile rtt.cst

While the predict_distances.pl script runs, a series of PDB structures is downloaded to estimate the distance restraints.
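
Because the script needs both the alignment and the FASTA file, checking that they are in place before the run can save a failed start. A minimal sketch of such a check is given below; require_inputs is a hypothetical helper and not part of the toolbox, and the paths in the usage comment are the ones from the Step3 command.

```shell
# Return 0 only if every listed file exists and is non-empty.
# require_inputs is an illustrative helper name.
require_inputs() {
    for f in "$@"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            return 1
        fi
    done
    return 0
}

# Usage, from within CS-HM-Rosetta/cm_scripts/bin:
#   require_inputs ../../input/rtt.hhr ../../input/rtt.fasta &&
#     ./predict_distances.pl ../../input/rtt.hhr ../../input/rtt.fasta \
#         -aln_format hhsearch -outfile rtt.cst
```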

Step4: The distance restraints obtained in Step3 are applied during structure calculation with any of the protocols listed below:

Tutorial: CS-Rosetta
Tutorial: RASREC
Tutorial: AutoNOE

Example input and expected output files are provided under the CS-HM-Rosetta directory within the toolbox.


References