Calculations with (NOE) distance restraints

2.4. Adding NOE distance upper limit files

Adding distance restraints in CYANA .upl format is straightforward:

$ setup_target -target SgR145 -method rasrec -cyana_upl SgR145.manual.upl SgR145.final.upl 
[ ... ]
============================================================
STORED: method options as new setup 'rasrec_standard'...
- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -
Method: rasrec
Choosen Options: 
   cs nmr_data/SgR145.tab
   cyana_upl nmr_data/SgR145.manual.upl nmr_data/SgR145.final.upl
   fasta fragments/SgR145.fasta
   frags fragments/SgR145.frags3.dat.gz fragments/SgR145.frags9.dat.gz
   rdc nmr_data/SgR145.med1.rdc nmr_data/SgR145.med2.rdc
------------------------------------------------------------  

With all restraint files added, we proceed by generating a CS-ROSETTA calculation directory:

$ setup_run -target SgR145 -method rasrec -dir run_with_restraints -job slurm 

The output of this command includes the trace of the CYANA upl file conversion. The upl files were based on the untrimmed sequence, which is 20 residues longer at the N-terminus. As seen below, the sequence offset was detected correctly.

[...]
convert cycana-upl files to Rosetta cst ...
trying to match the following sequences
-------------------------------KILCLA----------------VTAVDQ-------------------T-V-S-------------GIVS-----------------------G-VFIL-G-----------------------------------I-N-L-R-----------A-LIQLLG-K
LVSVANQIPQGKILCLAEGEGRNACFLASLGYEVTAVDQSSVGLAKAKQLAQEKGVKITTVQSNLADFDIVADAWEGIVSIFCHLPSSLRQQLYPKVYQGLKPGGVFILEGFAPEQLQYNTGGPKDLDLLPKLETLQSELPSLNWLIANNLERNLDEGAYHQGKAALIQLLGQKLEH
KILCLA  is in  LVSVANQIPQGKILCLAEGEGRNACFLASLGYEVTAVDQSSVGLAKAKQLAQEKGVKITTVQSNLADFDIVADAWEGIVSIFCHLPSSLRQQLYPKVYQGLKPGGVFILEGFAPEQLQYNTGGPKDLDLLPKLETLQSELPSLNWLIANNLERNLDEGAYHQGKAALIQLLGQKLEH
with offset  20
['inputs/flags_rasrec', 'inputs/nmr_data/SgR145.manual_noQF.cst', 'inputs/nmr_data/SgR145.manual_QFall.cst', 'inputs/setup_init.tpb']
convert cycana-upl files to Rosetta cst ...
trying to match the following sequences
-----------V--------LVSVANQI-QGKILCLA----RNACFLASLGYEVTAVDQSSVGLAKAKQLAQEKGVKITTVQSNLADFDIVADAW-GIVSIFCHL-SSLRQQLY-KVYQGL--GGVFILEGFA-EQLQYNTGG-KDLDLL-KLETLQSEL-SLNWLIANNLERNL--------KAALIQLLGQKLE
LVSVANQIPQGKILCLAEGEGRNACFLASLGYEVTAVDQSSVGLAKAKQLAQEKGVKITTVQSNLADFDIVADAWEGIVSIFCHLPSSLRQQLYPKVYQGLKPGGVFILEGFAPEQLQYNTGGPKDLDLLPKLETLQSELPSLNWLIANNLERNLDEGAYHQGKAALIQLLGQKLEH
KVYQGL  is in  LVSVANQIPQGKILCLAEGEGRNACFLASLGYEVTAVDQSSVGLAKAKQLAQEKGVKITTVQSNLADFDIVADAWEGIVSIFCHLPSSLRQQLYPKVYQGLKPGGVFILEGFAPEQLQYNTGGPKDLDLLPKLETLQSELPSLNWLIANNLERNLDEGAYHQGKAALIQLLGQKLEH
with offset  20
[....]
Method  rasrec  has been setup in run_with_restraints/SgR145 ...
Enjoy!
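
The offset detection shown in the trace can be pictured with a small Python sketch. This is not the toolbox's actual code: the function name find_offset, the 6-residue fragment length, and the gapped-string representation of the upl sequence are assumptions made for illustration. The idea is to take a contiguous fragment from the sequence implied by the upl file, locate it in the target sequence, and use the difference of the two positions as the offset.

```python
import re

def find_offset(gapped_seq, target_seq, min_len=6):
    """Hypothetical helper: offset between upl numbering and target numbering.

    gapped_seq holds the residues seen in the upl file at their original
    positions, with '-' wherever no residue is known.
    """
    # Look for contiguous runs of at least min_len known residues.
    for match in re.finditer(r"[A-Z]{%d,}" % min_len, gapped_seq):
        fragment = match.group()[:min_len]
        pos_in_target = target_seq.find(fragment)
        if pos_in_target >= 0:
            # Difference between the position in the untrimmed numbering
            # and the position in the trimmed target sequence.
            return match.start() - pos_in_target
    return None  # no fragment could be placed

# Start of the first pair of sequences from the trace above.
upl_seq = "-" * 31 + "KILCLA" + "-" * 16 + "VTAVDQ"
target_seq = "LVSVANQIPQGKILCLAEGEGRNACFLASLGYEVTAVDQ"

print(find_offset(upl_seq, target_seq))  # 20
```

With this offset, a residue number from the upl file would be shifted by 20 to obtain the numbering of the trimmed target sequence.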

During the generation of the run directory, the upl files were trimmed and converted to ROSETTA cst format. For CYANA upl files, the quality factor (labelled #QF in the upl files; i.e. the probability that the restraint is correctly assigned) is used to separate restraints roughly into strong (QF=1) and weak (QF

$ ls run_with_restraints/SgR145/nmr_data
SgR145.final_noQF.cst  SgR145.final_noQF.cst.centroid  SgR145.final_QFall.cst  SgR145.final_QFall.cst.centroid  SgR145.manual_noQF.cst

The mapping method is chosen with the -cst_map_mode option:

$ setup_run -target SgR145 -method rasrec -dir run_with_restraints -job slurm -cst_map_mode simple

The default mapping, simple, adds 1 Å to the upper-limit distance for each CEN-mapped atom of a restraint. It also scales down the force constant of the respective restraint by adding 1 Å to the standard deviation (which starts at 0.3 Å for unmapped restraints). This scales the force constant to ca. 25% of its initial value for the first mapped atom and down by another 40% for the second mapped atom. This scheme is admittedly simple and to some extent arbitrary. However, we obtained good results for restraint sets from methyl (ILV) labelled, deuterated protein samples[1]. In this case, more complicated schemes (experimental choices: aadep, aadep_padonly, aadep_mid, aadep_mid_sd, aadep_mid_sdfix) that attempted to take into account different mapping distances for different types of amino-acid side chains could not improve upon the simple scheme. The experimental choices might, however, be advantageous when dealing with more diverse sets of restraints than those from a purely ILV-labelled sample. Until further benchmarking results are available, there is no data on which to recommend the aadep_XXX choices.
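
The arithmetic of the simple scheme can be checked with a short Python sketch. The function name simple_mapping and the example upper limit of 5.0 Å are hypothetical. The sketch also assumes that the force constant is inversely proportional to the standard deviation. This assumption is used only because it reproduces the percentages quoted above; the actual functional form in ROSETTA may differ.

```python
BASE_SD = 0.3  # standard deviation of an unmapped restraint (Angstrom)
PAD = 1.0      # padding added per CEN-mapped atom (Angstrom)

def simple_mapping(upper_limit, n_mapped):
    """Hypothetical helper: padded upper limit, padded sd, relative force constant."""
    new_upper = upper_limit + PAD * n_mapped
    new_sd = BASE_SD + PAD * n_mapped
    # Assumption: force constant scales as 1/sd.
    relative_force = BASE_SD / new_sd
    return new_upper, new_sd, relative_force

for n_mapped in (0, 1, 2):
    upper, sd, rel = simple_mapping(5.0, n_mapped)
    print(f"mapped atoms: {n_mapped}  upper: {upper:.1f}  sd: {sd:.1f}  relative force: {rel:.2f}")
```

Under this assumption, one mapped atom gives a relative force constant of 0.3/1.3, roughly 23%, and the second mapped atom reduces it further by a factor of 1.3/2.3, a drop of roughly 43%.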

2.5. Adding arbitrary restraint data directly in ROSETTA cst format

You can also add any other restraint data that has already been converted to ROSETTA cst format using:

$ setup_target -method rasrec -target SgR145 -restraints rosetta.cst.fa -centroid_stage_restraints rosetta.cst.centroid
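
For readers preparing such files by hand, the following Python sketch formats a single upper-limit distance restraint as one line of a cst file. It assumes the AtomPair/BOUNDED layout described in the general ROSETTA constraint file documentation. The function name atom_pair_line, the lower bound of 1.5 Å, and the tag NOE are illustrative assumptions, and the lines written by the toolbox itself may look different.

```python
def atom_pair_line(atom1, res1, atom2, res2, upper, lower=1.5, sd=0.3, tag="NOE"):
    """Hypothetical helper: one AtomPair restraint with a BOUNDED function.

    Assumed layout: AtomPair atom1 res1 atom2 res2 BOUNDED lower upper sd tag
    """
    return (
        f"AtomPair {atom1} {res1} {atom2} {res2} "
        f"BOUNDED {lower:.2f} {upper:.2f} {sd:.2f} {tag}"
    )

# Example: 4.5 Angstrom upper limit between HA of residue 12 and H of residue 15.
print(atom_pair_line("HA", 12, "H", 15, 4.5))
```

A file for -centroid_stage_restraints would need to refer only to atoms that exist in centroid mode, for example backbone atoms or CEN.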

Note that while a CYANA upl restraint file is always mapped automatically to CEN atoms if it includes side-chain atoms, you can switch off automatic mapping for restraint files given by the option -restraints. For example,

$ setup_run -method rasrec -target SgR145 -nocst_mapping

renders a file given by -restraints that contains side-chain atoms unavailable for ROSETTA sampling in centroid mode.


References