Help



1. Summary


Contact prediction methods based on direct coupling analysis (DCA) enable de novo structure prediction of proteins with unprecedented accuracy. However, these methods require thousands of protein sequences to achieve high accuracy, which limits their usability. Here, we introduce PconsC3, a method that accurately predicts contacts for families an order of magnitude smaller than is feasible with DCA, thus increasing the share of protein domain families of unknown structure with accurate contact predictions from 12% to 54%.

Input features comprise contact predictions by plmDCA, GaussDCA and PhyCMAP, secondary structure prediction by PSIPRED 3.0, and solvent accessibility prediction by NetSurfP 1.1. PhyCMAP can be replaced by another contact predictor; we have successfully used CMapPro with similar accuracy. Additionally, CD-HIT is run to generate statistics about the alignment (i.e. alignment depth at different sequence similarity cut-offs). The initial layer of PconsC3 takes these features as input and uses a random forest to predict a score for each possible contact. In contrast to previous work, PconsC3 applies pattern recognition already in the first layer. This results in an intermediate contact map. Each subsequent layer uses all the initial features plus the output of the previous layer, given as a window of 11 by 11 residues around the current contact.
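The 11 by 11 window passed from one layer to the next can be illustrated with a small sketch. This is not the PconsC3 implementation: the function name `window_around`, the toy contact map and the zero padding at the edges of the map are assumptions made only for illustration, since the text above does not state how border positions are handled.

```python
def window_around(contact_map, i, j, size=11, pad=0.0):
    """Return a size x size window of contact_map centred on residue pair (i, j).

    Positions outside the map are filled with `pad` (an assumption; the
    edge handling used by PconsC3 is not described on this page).
    """
    half = size // 2
    n = len(contact_map)
    window = []
    for di in range(-half, half + 1):
        row = []
        for dj in range(-half, half + 1):
            r, c = i + di, j + dj
            if 0 <= r < n and 0 <= c < n:
                row.append(contact_map[r][c])
            else:
                row.append(pad)
        window.append(row)
    return window


# Toy 20-residue "intermediate contact map" (hypothetical scores that simply
# decrease with sequence separation).
n = 20
toy_map = [[1.0 / (1 + abs(a - b)) for b in range(n)] for a in range(n)]

w = window_around(toy_map, 2, 15)
# Flatten the window so it could be appended to the other per-pair features.
flat_features = [score for row in w for score in row]
```

Flattening the window gives 121 additional values per residue pair, which a later layer could use together with the initial features.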

This work used the EGI infrastructure with the support of IN2P3-IRES, INFN-CLOUD-BARI and TR-FC1-ULAKBIM.

Fig. 1: Workflow of PconsC3.


2. Usage


Input to the server is one or several (up to five) amino acid sequences in FASTA format. You can either paste the sequences into the text area provided or upload a file containing them.

Example input:
>1qj8A
ATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKSRTASSGDYNKN
QYYGITAGPAYRINDWASIYGVVGVGYGKFQTTEYPTYKNDTSDYGFSYGAGLQFNPMEN
VALDFSYEQSRIRSVDVGTWIAGVGYRF
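Before submitting, an input can be checked locally against the rules above. The following is a minimal sketch, not the validation the server performs: the function names are hypothetical, and restricting sequences to the 20 standard amino acid letters is an assumption.

```python
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: standard residues only
MAX_SEQUENCES = 5  # the server accepts up to five sequences per job


def parse_fasta(text):
    """Split FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def check_submission(text):
    """Parse the input and raise ValueError if it breaks the stated limits."""
    records = parse_fasta(text)
    if not 1 <= len(records) <= MAX_SEQUENCES:
        raise ValueError(f"expected 1 to {MAX_SEQUENCES} sequences, got {len(records)}")
    for name, seq in records:
        unexpected = sorted(set(seq) - VALID_RESIDUES)
        if not seq or unexpected:
            raise ValueError(f"{name}: empty sequence or unexpected characters {unexpected}")
    return records


example = """>1qj8A
ATSTVTGGYAQSDAQGQMNKMGGFNLKYRYEEDNSPLGVIGSFTYTEKSRTASSGDYNKN
QYYGITAGPAYRINDWASIYGVVGVGYGKFQTTEYPTYKNDTSDYGFSYGAGLQFNPMEN
VALDFSYEQSRIRSVDVGTWIAGVGYRF
"""
records = check_submission(example)
```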



3. Output


The server outputs the contact predictions of the PconsC3 method in plain text format. The predicted contacts are also displayed graphically. Additionally, a zipped folder containing the input sequence, the predicted contacts (plain text and graphical representation) and the multiple sequence alignments is available for download.
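The plain text predictions can be post-processed with a few lines of code. The sketch below assumes that each line holds two residue indices and a score separated by whitespace; this layout is an assumption and should be checked against the downloaded file. The function names, the sample data and the minimum sequence separation of 5 are likewise illustrative choices.

```python
def read_contacts(text):
    """Parse lines of the assumed form 'i j score' into (i, j, score) tuples."""
    contacts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or line.lstrip().startswith("#"):
            continue  # skip blank, short or comment lines
        try:
            i, j, score = int(fields[0]), int(fields[1]), float(fields[2])
        except ValueError:
            continue  # skip lines that do not match the assumed layout
        contacts.append((i, j, score))
    return contacts


def top_contacts(contacts, count, min_separation=5):
    """Return the `count` highest-scoring contacts at least `min_separation` apart in sequence."""
    kept = [c for c in contacts if abs(c[0] - c[1]) >= min_separation]
    return sorted(kept, key=lambda c: c[2], reverse=True)[:count]


# Hypothetical sample data, not real server output.
sample = """\
1 8 0.91
2 40 0.15
3 5 0.88
10 30 0.67
"""
best = top_contacts(read_contacts(sample), count=2)
```

Here the pair (3, 5) is dropped despite its high score because its sequence separation is below the chosen threshold.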



4. References


Skwark MJ, Michel M, Hurtado DM, Ekeberg M, Elofsson A. "Accurate contact predictions for thousands of protein families using PconsC3."
Skwark MJ, Raimondi D, Michel M, Elofsson A (2014) "Improved Contact Predictions Using the Recognition of Protein Like Contact Patterns." PLoS Computational Biology 10(11).


5. Contact


Arne Elofsson group

Department for Biochemistry and Biophysics
The Arrhenius Laboratories for Natural Sciences
Stockholm University
SE-106 91 Stockholm, Sweden

Science for Life Laboratory
Box 1031, 17121 Solna, Sweden

E-mail:   arne@bioinfo.se
Phone:   (+46)-8-16 4672
Fax:   (+46)-8-15 3679