The following is intended as documentation on how to submit jobs to the RaptorX server and retrieve results. For a detailed explanation of the algorithms deployed, please see the relevant papers listed at http://raptorx.uchicago.edu/about/#cite
The manual is composed of the following sections:
- Creating an Account and Submitting New Jobs
- How to create a user account and submit sequences for structure prediction
- Job Monitoring and Job Availability
- How to obtain information on the status of all your submitted jobs
- Predicted Secondary Structure
- How to interpret the data produced by the secondary structure prediction procedure
- Tertiary Structure and Functional Prediction
- How to interpret the data produced by the structure prediction procedure
- Disorder Prediction
- A disorder prediction procedure for the entire target sequence is performed for a structure prediction job
- Domain Parsing
- A domain parsing procedure for the entire target sequence is also performed for a structure prediction job
- Custom Alignment
- How to interpret the data produced by the custom alignment procedure.
2. Creating An Account And Submitting New Jobs
A new user account tied to your email address is automatically created when you submit your first prediction job. Job submission is done by clicking “New Job” in the top menu of this page. This will display a form where the user can use the tab menu to select between submitting an “Alignment Job” or a “Structure Prediction Job”.
The “Job Identification” section allows the user to provide a job name (default is ‘my job’) and an email address to be used for notification when the job has finished (if you are already logged in this field will be pre-filled). The email provided here will also serve as the username by which the job account is identified on the server for accessing results at a later time.
When your account is first created (after you have submitted your first job) you are automatically logged in to the server on the machine you are using at that point in time. If the login from a previous session has expired, or the account needs to be accessed from a machine other than the one on which it was initially created, you will need to go to the server front page http://raptorx.uchicago.edu and supply your account email in the login field on the right. A few minutes after submitting the form you will receive an email with a hyperlink to the page containing the jobs for the account.
The “Sequences for Prediction” section is where the user submits one or more sequences in FASTA format. The sequence(s) can either be supplied by copy-and-pasting into the text box or by uploading a flat text file containing the data.
The job parameters can be different depending on whether an “Alignment Job” or a “Structure Prediction Job” is being submitted.
For an “Alignment Job” indicate the structure(s) you wish the supplied sequence(s) to be aligned with in the “Align to These Structures” section. Enter the PDB ID in the text box and select the desired structure from the drop-down menu that appears. Repeat to add additional structures to the list. Under “Alignment Options”, check the types of alignment you wish to generate. The options given are:
- “Optimal pairwise alignment” which returns the best possible pairwise alignments between the target sequence and the selected templates.
- “Probabilistic sampling” which returns a user-specified number of alternative alignments sampled according to the alignment probability distribution generated by the CRF model.
- “Multiple template alignment” which returns a multiple protein alignment between the selected templates and the input target sequence.
For a “Structure Prediction Job” specify the parameters in “Job Settings”. Specify the prediction type in the drop-down menu (select between doing “Structure prediction”, “Secondary structure prediction” or both) and whether to use multiple-template threading when multiple good templates are available for the target.
After all the above settings are done, press the submit button to queue the job on the server. Successful submission will redirect the user to a page of pending and finished jobs for the account used. Please note that each user can have no more than 20 sequences pending prediction at any point in time and a single job can contain at most 10 sequences. Further, the results of a job are only stored for 14 days after the job is completed.
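Before submitting, it can be convenient to check a FASTA file locally against the limits stated above. The following is a minimal sketch, not part of the RaptorX server: the function names `parse_fasta` and `check_job` are hypothetical, and only the two limits (10 sequences per job, 20 sequences pending per user) are taken from this manual.

```python
from typing import List, Tuple

MAX_SEQUENCES_PER_JOB = 10  # limit stated in this manual
MAX_PENDING_SEQUENCES = 20  # limit stated in this manual


def parse_fasta(text: str) -> List[Tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs."""
    records: List[Tuple[str, str]] = []
    header = None
    chunks: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_job(text: str, already_pending: int = 0) -> List[Tuple[str, str]]:
    """Return the parsed records, or raise ValueError if a stated limit is exceeded.

    already_pending is the number of sequences the user believes are still
    pending on the server (hypothetical input; the server is not queried).
    """
    records = parse_fasta(text)
    if not records:
        raise ValueError("no FASTA records found")
    if any(not seq for _, seq in records):
        raise ValueError("at least one record has an empty sequence")
    if len(records) > MAX_SEQUENCES_PER_JOB:
        raise ValueError("a single job can contain at most 10 sequences")
    if already_pending + len(records) > MAX_PENDING_SEQUENCES:
        raise ValueError("no more than 20 sequences can be pending at one time")
    return records
```

For example, `check_job(open("targets.fasta").read(), already_pending=5)` would raise an error if the hypothetical file `targets.fasta` held more than 10 sequences.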
3. Job Monitoring And Job Availability
To track pending and finished jobs the user needs to be logged in to the server. Refer to Section 2 of this manual for login instructions.
Once logged in to the server, selecting “My Jobs” in the menu at the top of the page displays a job overview page similar to the one depicted below. Here the status of each prediction in the job is given along with overall information on the predictions being done for each sequence submitted. To track the job status in real time, simply refresh the page and the completion status of the prediction submitted for each sequence in a job will be updated. Clicking on a sequence name will take the user to the result page for this sequence.
4. Tertiary Structure And Function Prediction
Click on a structure job in the overview to display a summary page similar to the one depicted below (the numbers used in parentheses hereafter correspond to the labels in the figure).
In a structure prediction job, a protein structure is built for each of the 10 top-ranked alignments between the target sequence and the structures in the template library. The interface provides the rank of the currently selected alignment result (1), with the highest-ranked model (the one based on the best template) selected by default. The quality of a selected structure model can be judged from the reported alignment score (2). The score falls between 0 and 100, with 100 indicating a perfect model. The PDB code of the template for the currently selected structure model is also provided; clicking it will take you to the structure record at the Protein Data Bank http://www.pdb.org. You can switch between alternative models by clicking the “View Alternative Model” button (4). Further, the full SCOP http://scop.berkeley.edu/ classification of the template for the currently selected model is given if available. Clicking the links will take you to the relevant record in the SCOP database (5).
A Jmol viewer providing a visualization of the currently selected model is loaded underneath. Using the mouse you can rotate the structure and zoom in on it. Right-clicking the model will bring up a menu of further options for changing the visualization (6). To the right of the structure viewer a menu for controlling the representation of the currently selected model is available. Here the user can zoom in on the structure, switch between coloring modes, and switch to a wire-frame display of the structure (7).
The alignment of the target and template sequences used for constructing the current model is displayed below the Jmol viewer. Each position in the alignment is color-coded according to the chemical nature of the residue. The scheme used is: Red=Hydrophobic, Blue=Acidic, Magenta=Basic, Green=Hydroxyl+Amine. A '*' under a pair of aligned residues signifies that the residues are identical, while ':' signifies that the aligned residues belong to the same functional group. Hovering over aligned residues will highlight the target residue in the Jmol viewer (8).
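The '*' and ':' annotation described above can be reproduced for a downloaded alignment with a few lines of code. The sketch below is illustrative only. This manual does not list which residues fall into each color group, so the grouping used here is an assumption borrowed from the widely used ClustalW coloring convention, and the function names are hypothetical.

```python
# ASSUMPTION: residue membership follows the ClustalW colouring convention;
# this manual names the four groups but does not list their members.
GROUPS = {
    "hydrophobic (red)": set("AVFPMILW"),
    "acidic (blue)": set("DE"),
    "basic (magenta)": set("RK"),
    "hydroxyl+amine (green)": set("STYHCNGQ"),
}


def residue_group(residue: str):
    """Return the name of the group a residue belongs to, or None if unknown."""
    for name, members in GROUPS.items():
        if residue.upper() in members:
            return name
    return None


def annotate(target: str, template: str) -> str:
    """Build the marker line for two aligned, equal-length sequences.

    '*' marks identical residues, ':' marks residues in the same group,
    and ' ' marks everything else, including gap positions ('-').
    """
    if len(target) != len(template):
        raise ValueError("aligned sequences must have the same length")
    marks = []
    for a, b in zip(target.upper(), template.upper()):
        if a == "-" or b == "-":
            marks.append(" ")
        elif a == b:
            marks.append("*")
        elif residue_group(a) is not None and residue_group(a) == residue_group(b):
            marks.append(":")
        else:
            marks.append(" ")
    return "".join(marks)
```

Calling `annotate("MKV-D", "MRLAE")` under these assumptions yields one identical position, three same-group positions, and a blank under the gap.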
The right-hand column provides information on the status of the prediction job (9) and links for download of the prediction results, including the PDB files for the 10 top-ranked models with corresponding alignments, the set of alignments between the target sequence and all structures in the template library used, a list containing the complete ranking of all alignments in ascending order according to GDT score, and a BLAST search result of the target sequence against the non-redundant PDB database (10). Below the box with download links a brief user guide for the Jmol viewer is given (11).
5. Secondary Structure Prediction
Click on a secondary structure job in the overview to display a summary page similar to the one depicted below (the numbers used in parentheses hereafter correspond to the labels in the figure).
Secondary structure prediction is provided in two modes, using a three-state and an eight-state model. You can switch between the two modes using the blue tab menu (1).
Hovering over a residue will display the exact distribution of secondary structure classes in a popup box appearing next to the residue (2). The legend for the color-coding of the structure classes can be found in the right-hand column (5).
The right-hand column provides information on the status of the prediction job (3) and links for download of the prediction results, including the full class distribution for both models and the most likely secondary structure class sequence from the three-state model in PSIPRED-like format (Buchan, et al., 2010) (4).
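The relationship between the per-residue class distribution and the "most likely class sequence" can be sketched as follows. This is an illustration under stated assumptions: the exact layout of the downloadable files is not described in this manual, so the input here is simply a list of probability tuples, and the labels H, E, C (helix, strand, coil) are the conventional three-state names rather than something confirmed by the server output.

```python
from typing import Sequence

# ASSUMPTION: conventional three-state labels (helix, strand, coil).
THREE_STATE = ("H", "E", "C")


def most_likely_classes(
    distributions: Sequence[Sequence[float]],
    labels: Sequence[str] = THREE_STATE,
) -> str:
    """Pick the highest-probability class for each residue.

    distributions holds one probability vector per residue, ordered like
    labels. Ties resolve to the class that appears first in labels.
    """
    result = []
    for probs in distributions:
        if len(probs) != len(labels):
            raise ValueError("each distribution needs one value per label")
        best = max(range(len(labels)), key=lambda i: probs[i])
        result.append(labels[best])
    return "".join(result)
```

The same function could be used with eight labels for the eight-state model by passing a different `labels` sequence.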
6. Disorder Prediction
If a structure prediction job has been submitted, a disorder prediction for the entire target sequence is also done. Graphics comparable to those described for secondary structure prediction are used to visualize the probability that a given residue is either in a disorder segment (marked in red) or non-disorder segment (marked in blue). Hovering over the residue will display the exact probabilities.
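Per-residue disorder probabilities are often easier to work with once they are collapsed into contiguous segments. The sketch below shows one way to do that; the 0.5 cutoff and the function name `disordered_segments` are assumptions made for illustration, since this manual does not state which threshold separates the red and blue residues.

```python
from typing import List, Sequence, Tuple


def disordered_segments(
    probabilities: Sequence[float], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Return (start, end) residue ranges, 1-based and inclusive, where the
    disorder probability is at or above threshold.

    ASSUMPTION: threshold=0.5 is an illustrative default, not a server setting.
    """
    segments: List[Tuple[int, int]] = []
    start = None
    for position, prob in enumerate(probabilities, start=1):
        if prob >= threshold:
            if start is None:
                start = position  # a new disordered run begins here
        elif start is not None:
            segments.append((start, position - 1))  # the run ended one residue back
            start = None
    if start is not None:
        # The sequence ends inside a disordered run.
        segments.append((start, len(probabilities)))
    return segments
```

With the probabilities `[0.9, 0.8, 0.2, 0.1, 0.6, 0.7, 0.9]`, residues 1 to 2 and 5 to 7 would be reported as disordered segments.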
7. Domain Parsing
If a structure prediction job has been submitted, RaptorX explores whether the target sequence appears to consist of multiple domains or is a single folding unit. If multiple domains are found, the domain parsing results will be available in table format outlining the span of each segment, the Pfam family it is predicted to belong to, and the E-value for this assignment.
8. Custom Alignment
Click on an alignment job in the job overview to obtain a job summary page similar to the one depicted below (the numbers used in parentheses hereafter correspond to the labels in the figure).
Clicking the drop-down box brings up a selection menu from which it is possible to switch between alternative alignments (1). The alignment of the target and template sequences will be displayed after a selection is made and the “Display” button is pressed. Each position in the alignment is color-coded according to the chemical nature of the residue. The scheme used is: Red=Hydrophobic, Blue=Acidic, Magenta=Basic, Green=Hydroxyl+Amine (5). A '*' under a pair of aligned residues signifies that the residues are identical, while ':' signifies that the aligned residues belong to the same functional group (2).
As with a “Structure Prediction Job”, the right-hand column provides information on the status of the job (3) and links for download of the alignment results (4), including the set of alignments between the target sequence and all structures in the template library.