Schudoma et al.,
Nucl. Acids Res. 38: 970-980.
|RLooM - Help|
Currently, the RLooM database can be searched or browsed using the query form at http://rloom.mpimp-golm.mpg.de/cgi-bin/index.py. To browse the database, simply click on one of the four links Hairpins, Single Strand Segments, Internal Loops, or Multiloops. This results in a list of sets of loops of the chosen type, grouped by length or, in the case of multiloops, by number of stems (selecting a number of stems then displays the length-ordered list mentioned above). Selecting one of those sets yields a list of available cluster sets for the selected set. Selecting such a cluster set results in a list of the clusters within the set, which can then be viewed in more detail.
Searching the database requires submitting either a nucleotide sequence (A, C, G, T, U, I + ambiguity codes), optionally with a tolerated number of mismatches, or a Protein Data Bank (PDB) id with an optional chain id. A sequence-based query searches the database for all loops with the given sequence, using exact matching for plain sequences or a regular expression-based search for sequences containing ambiguity codes. A PDB-id-based query returns all known loops within the respective PDB structure.
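The two matching modes described above can be illustrated with a minimal Python sketch. It is not the RLooM implementation: the mapping table follows the standard IUPAC nucleotide codes, while treating T as U, keeping I as a literal character, and counting mismatches position by position are assumptions made for this example. The function names are hypothetical.

```python
import re

# IUPAC nucleotide ambiguity codes mapped to RNA character classes.
# ASSUMPTION: T is treated as U, and I (inosine) only matches itself.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U", "T": "U", "I": "I",
    "R": "[AG]", "Y": "[CU]", "S": "[CG]", "W": "[AU]",
    "K": "[GU]", "M": "[AC]", "B": "[CGU]", "D": "[AGU]",
    "H": "[ACU]", "V": "[ACG]", "N": "[ACGU]",
}

def sequence_to_regex(query):
    """Translate a query sequence into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in query.upper()))

def matches(query, loop_sequence, k=0):
    """True if loop_sequence fits the query with at most k mismatches."""
    query, loop_sequence = query.upper(), loop_sequence.upper()
    if len(query) != len(loop_sequence):
        return False
    if k == 0:
        # Exact or ambiguity-code search: the whole sequence must match.
        return sequence_to_regex(query).fullmatch(loop_sequence) is not None
    # Mismatch-tolerant search: count positions that violate the query.
    mismatches = sum(
        re.fullmatch(IUPAC[q], b) is None
        for q, b in zip(query, loop_sequence)
    )
    return mismatches <= k
```

For example, matches("GNRA", "GAAA") accepts a GNRA tetraloop, while matches("GNRA", "GAUA") rejects it unless k is raised to 1.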
A valid PDB id consists of a digit followed by three alphanumeric characters. The id is case-insensitive. Optionally, a (case-sensitive) chain id with a leading ':' can be appended to the PDB id. This limits the search to loops found in the specified chain.
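As an illustration, the id format described above can be checked with a short Python sketch. The function name is hypothetical, and two details are assumptions not stated in the text: the chain id is taken to be a single non-blank character, and the PDB id is normalized to lower case.

```python
import re

# One digit, three alphanumeric characters, then an optional ':' and chain id.
# ASSUMPTION: the chain id is a single non-whitespace character.
PDB_QUERY = re.compile(r"([0-9][A-Za-z0-9]{3})(?::(\S))?")

def parse_pdb_query(text):
    """Split a query such as '1EHZ:A' into (pdb_id, chain_id or None)."""
    match = PDB_QUERY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a valid PDB id query: {text!r}")
    pdb_id, chain_id = match.groups()
    # The PDB id is case-insensitive, the chain id is case-sensitive.
    return pdb_id.lower(), chain_id
```

With this sketch, "1EHZ:A" and "1ehz:A" select the same chain, whereas "1ehz:a" refers to a different chain id.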
For each loop, the RLooM database stores information obtained directly from its source PDB file or computed using external tools. Here, we describe the individual records shown on the result page for a loop cluster or an individual loop.
For loop clusters, the title bar contains the loop length, loop type, and cluster id, as well as the cutoff used for forming the cluster. For individual loops, the length, loop type, and loop id are given. For loop clusters, the section Representative Structure provides information on the structure chosen to represent the cluster. Pages for individual loops contain the same fields:
The next four fields are obtained by scanning the loop structure with the program MC-Annotate.
The following sections provide visualizations of the loop structure. The 3D Structure is displayed using the Java applet Jmol, while the Structure Graph is computed by a tool developed in our group.
The Cluster Members (cluster only) section shows all members of the cluster, sorted and grouped by their base-sequence. For individual loops, the Structural Clusters selection allows browsing the different clusters the loop belongs to in different cluster sets.
In addition to traditional database searches, the RLooM database allows submitting a 3D RNA structure (e.g. from homology modeling), which is then scanned for unpaired regions (i.e., loops). The database then proposes the loop structures that geometrically fit best, either to replace an existing loop or to add a new one, for instance at the end of a helix.
The user uploads a PDB file containing the coordinates of an RNA structure. It is important that the file complies with current PDB format standards. This holds especially for the column boundaries of ATOM/HETATM records, which must not be overstepped. To help us improve the quality and usability of RLooM, we kindly ask you to send us your PDB file in case RLooM cannot handle it properly.
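Before uploading, the column boundaries can be checked locally. The following Python sketch is not part of RLooM; it tests only a few numeric fields of ATOM/HETATM records against the fixed column ranges of the PDB format (serial 7-11, resSeq 23-26, x 31-38, y 39-46, z 47-54). The function name is hypothetical, and the check is deliberately incomplete.

```python
def check_atom_record(line):
    """Return a list of problems found in one PDB line (empty if none)."""
    problems = []
    line = line.rstrip("\n")
    if not line.startswith(("ATOM  ", "HETATM")):
        return problems  # other record types are not checked in this sketch
    if len(line) > 80:
        problems.append("line longer than 80 columns")
    # Column ranges are 1-based and inclusive, as in the PDB format description.
    fields = {
        "serial": (7, 11, int),
        "resSeq": (23, 26, int),
        "x": (31, 38, float),
        "y": (39, 46, float),
        "z": (47, 54, float),
    }
    for name, (start, end, convert) in fields.items():
        text = line[start - 1:end]
        try:
            convert(text)
        except ValueError:
            problems.append(
                f"{name} (columns {start}-{end}) is not a valid number: {text!r}"
            )
    return problems
```

A record that has been shifted by even a single column typically fails one of these conversions, which is the kind of overstepped boundary described above.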
If the user already has an idea of the location where the new loop should be inserted, they can submit an XML-like modeling query directly (see below for a description). Otherwise, they can let RLooM scan the submitted structure for suitable anchor locations, which yields a list of anchor locations. Clicking an item in the list enters a portion of script code describing the location of the unpaired region into the query text field. The user then has to specify a sequence for the new loop and can optionally give additional parameters within the script.
Three parameters can be adjusted directly in the form: the cluster set that should be used, the maximum distance between the anchors of a loop and a target structure such that the inserted loop gives a valid model, and the threshold distance defining when a clash occurs between the new loop and the target molecule.
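To make the role of the two distance parameters concrete, here is a minimal Python sketch. It is only an illustration under strong assumptions: the text does not state which atoms are compared or how the candidate loop is superimposed, so the sketch simply assumes that the loop has already been placed in the coordinate frame of the target and that all coordinates are (x, y, z) tuples. Both function names are hypothetical.

```python
import math

def anchors_fit(loop_anchors, target_anchors, max_anchor_distance):
    """True if every loop anchor lies within the allowed distance
    of its corresponding target anchor (pairs are matched by order)."""
    return all(
        math.dist(a, b) <= max_anchor_distance
        for a, b in zip(loop_anchors, target_anchors)
    )

def has_clash(loop_atoms, target_atoms, clash_threshold):
    """True if any loop atom is closer to a target atom than the threshold.
    ASSUMPTION: a simple all-against-all distance test."""
    return any(
        math.dist(a, b) < clash_threshold
        for a in loop_atoms
        for b in target_atoms
    )
```

Under these assumptions, a candidate would be kept only if anchors_fit(...) is true and has_clash(...) is false. Raising the maximum anchor distance admits more candidates, while raising the clash threshold rejects more of them.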
Modeling loops using the RLooM application is performed using a simple XML-like script language -- RLML. Three parameters can be adjusted: the template data set that should be used, the maximum distance between the anchors of a loop and a target structure such that the inserted loop gives a valid model, and the threshold distance defining when a clash occurs between the new loop and the target molecule.
A single command is enclosed between tags specifying the loop-type of the query.
<x>...</x>, with x = hairpin|segment|internal|multiloop
Each command has a number of anchors (hairpins/segments: 2, internal loops: 4, multiloops: 6+):
<anchor>ANCHOR ID</anchor>, with ANCHOR ID = RI:C, R=resSeq, I=iCode, C=chainID
The anchor-tag has an optional parameter id, which can be used for specifying the sequence of the anchors. By default, <anchor>-tags are processed in order of appearance.
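The anchor notation RI:C can be taken apart as in the following Python sketch. The function name is hypothetical; that the insertion code is a single optional letter, that the chain id is a single character, and that negative residue numbers are allowed are assumptions based on the PDB format rather than on the RLML description.

```python
import re

# resSeq (integer), optional iCode (one letter), ':' and chainID.
# ASSUMPTION: iCode and chainID are single characters, as in PDB files.
ANCHOR_ID = re.compile(r"(-?\d+)([A-Za-z]?):(\S)")

def parse_anchor(anchor_id):
    """Split an anchor id such as '27:A' or '27B:A' into its parts."""
    match = ANCHOR_ID.fullmatch(anchor_id.strip())
    if match is None:
        raise ValueError(f"not a valid anchor id: {anchor_id!r}")
    res_seq, i_code, chain_id = match.groups()
    return int(res_seq), i_code, chain_id
```

For instance, parse_anchor("27B:A") yields residue number 27, insertion code B, and chain A; an anchor without an insertion code returns an empty string in the second position.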
Finally, each command requires a query:
<query>SEQUENCE</query>, with SEQUENCE being a nucleotide sequence (wildcards are allowed.)
The <query>-tag has three optional parameters: k, force, and mcsearch. The parameter k specifies the tolerated number of mismatches; force denotes whether suitable candidate loops with a sequence different from the query shall be artificially mutated to match the query sequence. The parameter mcsearch, if set to true, allows a valid MC-Search script (see the example pattern below; for details see e.g. http://major.iric.ca, or study the output of MC-Annotate) to be submitted instead of the query sequence. By default, k is set to 0, force to false, and mcsearch to true.
The optional <remodel>-tag specifies a non-wildcard nucleotide sequence that loop candidates should be mutated into (<remodel>SEQUENCE</remodel>).
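The pieces described above can be assembled into a complete command, as sketched in the following Python helper. The helper is hypothetical and not part of RLooM. In particular, writing the optional parameters as XML-style attributes (for example k="1") is an assumption, since the exact parameter syntax is not spelled out here, and mcsearch is always written explicitly so that the sketch does not depend on the default.

```python
# Number of anchors per loop type, as listed in the description above
# (multiloops need at least six, the other types an exact number).
ANCHOR_COUNT = {"hairpin": 2, "segment": 2, "internal": 4, "multiloop": 6}

def build_rlml(loop_type, anchors, query, k=0, force=False,
               mcsearch=False, remodel=None):
    """Assemble one RLML command as a multi-line string."""
    if loop_type not in ANCHOR_COUNT:
        raise ValueError(f"unknown loop type: {loop_type!r}")
    required = ANCHOR_COUNT[loop_type]
    if loop_type == "multiloop":
        valid = len(anchors) >= required
    else:
        valid = len(anchors) == required
    if not valid:
        raise ValueError(f"{loop_type} needs {required} anchors")
    # ASSUMPTION: parameters are given as XML-style attributes.
    attributes = (
        f' k="{k}" force="{str(force).lower()}"'
        f' mcsearch="{str(mcsearch).lower()}"'
    )
    lines = [f"<{loop_type}>"]
    lines += [f"<anchor>{anchor}</anchor>" for anchor in anchors]
    lines.append(f"<query{attributes}>{query}</query>")
    if remodel is not None:
        lines.append(f"<remodel>{remodel}</remodel>")
    lines.append(f"</{loop_type}>")
    return "\n".join(lines)

print(build_rlml("hairpin", ["10:A", "15:A"], "GNRA", k=1))
```

The printed command encloses two anchors and a GNRA query with one tolerated mismatch between hairpin tags. The anchor ids 10:A and 15:A are made-up values for illustration.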
A sample pattern that searches for tetraloop hairpins of the GNRA sequence motif with two arbitrary flanking bases, any base pair between the first and last loop bases and including two intraloop base stacks:
sequence (A0 NGNRAN)
Please cite our 2010 Nucleic Acids Research article:
Schudoma C, May P, Nikiforova V, and Walther D