CompaRNA - on-line benchmarks of RNA structure prediction methods

Laboratory of Bioinformatics and Protein Engineering

RNA structure prediction methods tested by CompaRNA

CompaRNA tests 44 methods for RNA secondary structure prediction. Some are installed locally on the server; the others are accessed as web servers over the Internet.

35 methods for RNA secondary structure prediction are installed locally:

Method	Description	Size limit	Reference
Afold	Evaluates internal loops of RNA secondary structure with optimized nearest-neighbor model energy functions (version 11.01.2006).	---	Ogurtsov et al. 2006
Carnac	Combines three features: energy minimization, phylogenetic comparison and sequence conservation in order to predict an RNA secondary structure (version 2008, pre 0.34).	---	Touzet and Perriquet, 2004
CentroidAlifold	An extension of the CentroidFold program which takes multiple sequences as input (version 0.0.9).	---	Hamada et al 2011
CentroidFold	Uses generalized centroid estimators which maximize the expected weighted true predictions of base pairs in the predicted structure (version 0.0.9).	---	Sato et al 2009
ContextFold	Uses rich parameterized machine learning models (over 70,000 free parameters - version 1.0).	---	Zakov and Ziv-Ukelson, 2011
Contrafold	Uses conditional log-linear models (CLLMs), a flexible class of probabilistic models which generalize upon stochastic context-free grammars (SCFGs) by using discriminative training and feature-rich scoring (version 2.02).	---	Do et al. 2006
DAFS	Simultaneous aligning and folding of RNA sequences by dual decomposition (version 0.0.2, Sep 15, 2012 + Vienna 1.8.5 + GLPK 4.45).	---	Sato et al., 2012
DotKnot	DotKnot is a heuristic method for pseudoknot prediction in a given RNA sequence. DotKnot extracts stem regions from the secondary structure probability dot plot calculated by RNAfold. Recursive H-type pseudoknots and intramolecular kissing hairpins are constructed and their presence in the sequence is verified. The detected pseudoknots can then be further analysed using bioinformatics or laboratory techniques (ver 1.3.1).	---	Sperschneider and Datta, 2010
Fold	A program from the RNAstructure package for single sequence secondary structure prediction by free energy minimization (RNAstructure ver 5.3).	---	Reuter & Mathews, 2010
HotKnots	A heuristic algorithm which iteratively forms stable stems using a free energy minimization criterion to identify promising candidate stems (version 2.0).	500 nt	Ren et al. 2005
IPknot	Predicts the maximum expected accuracy (MEA) structure using integer programming with a threshold cut (version 0.0.2).	---	Sato et al, 2011
MCFold	MC-Fold uses a Nucleotide Cyclic Motif (NCM) fusion process to generate a pool of secondary structures, from which the final prediction is selected (version Mar 17 2008 17:48:11).	---	Parisien & Major, 2008
MXScarna	Performs fast structural multiple alignment of RNA sequences using a progressive alignment based on the pairwise structural alignment algorithm of SCARNA (ver. 2.1).	---	Tabei and Asai, 2009
Mastr	Uses an MCMC sampling approach in a simulated annealing framework, where both structure and alignment are optimized by making small local changes. The score combines the log-likelihood of the alignment, a covariation term and the base pair probabilities (ver. 1.0).	---	Lindgreen et al. 2007
MaxExpect	A program from the RNAstructure package for secondary structure prediction by maximizing expected accuracy (RNAstructure ver 5.3).	---	Gloor & Mathews, 2009
McQFold	Markov Chain Monte Carlo (MCMC) sampling of secondary structures with pseudoknots (version 30.05.2006).	---	Metzler & Nebel 2008
Multilign	Finds the lowest free energy secondary structure common to more than two homologous sequences. Uses multiple iterations of Dynalign to predict the conserved structure (RNAstructure ver 5.3).	---	Xu and Mathews, 2010
Murlet	A variant of the Sankoff algorithm, which uses an efficient scoring system that reduces the time and space requirements (version 0.0.1).	---	Kiryu et al. 2007
PETfold_ver_20	Predicts the consensus RNA secondary structure from an RNA alignment (release from May 20, 2013, last modified at May 20, 2013). Uses Vienna 1.8.5.	---	Seemann et al, 2008
PETfold_ver_pre_20	Predicts the consensus RNA secondary structure from an RNA alignment (version 2.0pre - pre-release from Nov 3, 2011, last modified at Nov 3, 2011). Uses Vienna 1.8.5.	---	Seemann et al, 2008
PPfold	A new version of Pfold that can predict the consensus secondary structure of RNA alignments through a stochastic context-free grammar coupled to an evolutionary model (version 2.0).	---	Sükösd et al, 2011
Pknots	A dynamic programming algorithm for "optimal" RNA pseudoknot prediction (version 1.05). Uses the Turner rules and finds the minimum free energy structure.	---	Rivas & Eddy, 1999
PknotsRG	PknotsRG uses the same model as Pknots, but instead of searching for the optimal minimum free energy structure it applies a heuristic approach, so it is not guaranteed to find the minimum free energy structure. PknotsRG is dedicated to pseudoknot prediction (version 1.03).	---	Reeder et al. 2007
ProbKnot	A program from the RNAstructure package for fast prediction of RNA secondary structure including pseudoknots. Assembles maximum expected accuracy structures from computed base-pairing probabilities (RNAstructure ver 5.3).	---	Bellaousov & Mathews, 2010
RNASLOpt	Predicts stable, locally optimal secondary structures represented by stack configurations (version 2011-11-01).	---	Li and Zhang, 2011
RNASampler	A sampling-based program that predicts common RNA secondary structure motifs in a group of related sequences (ver. 1.3).	---	Xu and Stormo, 2007
RNAalifold	Computes the minimum free energy structure that is simultaneously formed by a set of aligned sequences. Additionally, uses sophisticated handling of alignment gaps, and RIBOSUM-like scoring matrices (Vienna Package ver. 1.8.3).	---	Bernhart et al. 2008
RNAfold	RNA structure prediction program that comes with the Vienna package. Predicts MFE structures and base pair probabilities based on the dynamic programming algorithm originally developed by M. Zuker and P. Stiegler. The partition function algorithm is based on work by McCaskill (Vienna Package ver. 1.8.3).	---	Hofacker 2004
RNAshapes	Unique suboptimal structures (shapes) are selected based on an abstract representation of RNA secondary structure which is inspired by the dot bracket representation known from the Vienna RNA package. The user can choose from five different types of shape resolution corresponding to different abstraction levels (version 2.1.5).	500 nt	Steffen et al. 2006
RNAsubopt	Calculates all suboptimal secondary structures within a user defined energy range above the MFE (Vienna Package ver. 1.8.3).	500 nt	Hofacker 2004
RNAwolf	Predicts an extended structure (including non-canonical base-pairs and structures composed of 2-diagrams). The allowed base-pairs can contain all 4x4 nucleotides and the nucleotide bonds are explicitly annotated with the paired edges and isostericity information (version 0.3.2.0).	2000 nt	Höner zu Siederdissen et al, 2011
RSpredict	Takes into account sequence covariation and employs effective heuristics for improving accuracy (ver. 26.05.2009).	---	Spirollari et al. 2009
Sfold	Statistical sampling of all possible structures. The sampling is weighted by partition function probabilities. The ensemble centroid (EC) structure is used as a final prediction (ver. 2.0-20080807).	---	Ding et al. 2004
TurboFold	The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences (RNAstructure ver 5.3).	---	Harmanci et al, 2011
UNAFold	An integrated collection of programs that simulate folding, hybridization, and melting pathways for one or two single-stranded nucleic acid sequences. Folding (secondary structure) prediction for single-stranded RNA or DNA combines free energy minimization, partition function calculations and stochastic sampling (version 3.8).	8000 nt	Markham and Zuker, 2008

The methods have different limits on RNA sequence length, which is important to keep in mind when evaluating the results of tests run on the methods listed above. These limits are due to CompaRNA server restrictions, not to restrictions imposed by the methods themselves. For RNAsubopt, the number of possible suboptimal RNA secondary structures grows exponentially with sequence length (as stated by the authors of the method), and the computational cost for sequences longer than 500 nt becomes prohibitive for a regular workstation. For RNAshapes, computing the secondary structure of sequences longer than 500 nt is not feasible because of the execution time. For HotKnots, the limit was set because the program fails to run successfully on RNAs longer than 500 nt.
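The exponential growth mentioned for RNAsubopt can be illustrated with a rough sketch. The code below is not RNAsubopt itself and uses no energy model: it counts all secondary structures that are combinatorially possible on a chain of n nucleotides, assuming any two positions may pair and a hairpin loop needs at least `min_loop` unpaired bases. The function name and the default `min_loop=3` are illustrative assumptions. The count is only an upper bound on what an energy-range enumeration would return.

```python
def count_structures(n, min_loop=3):
    """Count pseudoknot-free secondary structures on n nucleotides.

    Assumption: sequence and energies are ignored; any two positions may
    pair as long as at least `min_loop` bases lie between them.
    """
    counts = [1] * (n + 1)  # chains too short to pair have one (open) structure
    for length in range(min_loop + 2, n + 1):
        total = counts[length - 1]  # case 1: the last nucleotide is unpaired
        # case 2: the last nucleotide pairs with position k (1-based)
        for k in range(1, length - min_loop):
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

Running the loop at the bottom shows how quickly the count grows as the sequence gets longer, which is why a fixed length cap is a practical safeguard for exhaustive enumeration.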

CompaRNA also tests 9 web server methods for RNA secondary structure prediction:

Method	Description	Size limit	Reference
Alterna	Dynamic programming algorithm that minimizes the energy density sum and free energy of an RNA structure.	100 nt	Aksay et al. 2007
CMfinder	CMfinder is an RNA motif prediction tool. This tool performs well on unaligned sequences with long extraneous flanking regions, and in cases when the motif is only present in a subset of sequences. It is an expectation maximization algorithm using covariance models for motif description, carefully crafted heuristics for effective motif search, and a novel Bayesian framework for structure prediction combining folding energy and sequence covariation. CMfinder also integrates directly with genome-scale homology search, and can be used for automatic refinement and expansion of RNA families.	500 nt	Yao et al, 2006
CRWrnafold	A new version of RNAfold utilizing statistical potentials derived from comparative data.	3000 nt	Gardner et al, 2011
CentroidHomfoldLAST	An extension of the CentroidFold program using additional homologous sequences collected automatically by the LAST program (gamma = 4).	500 nt	Hamada et al, 2011
Cylofold	Simulates the folding process in a coarse-grained manner by choosing helices based on established energy rules. The steric feasibility of the chosen set of helices is checked during the folding process using a highly coarse-grained 3D model of the RNA structures.	300 nt	Bindewald et al. 2010
NanoFolder	Predicts the base pairing of potentially pseudoknotted multistrand RNA nanostructures.	2000 nt	Bindewald et al, 2011
RDfolder	RNA folding by energy weighted Monte Carlo simulation.	100 nt	Ying et al. 2004
Vsfold4	Uses di-nucleotide pairing energies for short-range interactions and, for long-range entropy interactions, an entropy-loss model based on the accumulated sum of the entropy of bonding between each base pair, weighted inversely by the correlation of the RNA sequence (the Kuhn length).	449 nt	Dawson et al. 2006
Vsfold5	An upgraded version of Vsfold4 capable of predicting pseudoknots.	449 nt	Dawson et al. 2006
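
The size limits in the two tables can be used to decide which limited methods will accept a sequence of a given length. The sketch below copies those limits into a dictionary; methods listed with "---" have no stated limit and are left out. The names `SIZE_LIMITS_NT` and `methods_accepting` are hypothetical, and treating each limit as inclusive is an assumption, since the tables do not say whether a sequence of exactly the limit length is accepted.

```python
# Size limits in nucleotides, copied from the two tables above.
# Methods shown with "---" have no stated limit and are omitted here.
SIZE_LIMITS_NT = {
    "HotKnots": 500,
    "RNAshapes": 500,
    "RNAsubopt": 500,
    "RNAwolf": 2000,
    "UNAFold": 8000,
    "Alterna": 100,
    "CMfinder": 500,
    "CRWrnafold": 3000,
    "CentroidHomfoldLAST": 500,
    "Cylofold": 300,
    "NanoFolder": 2000,
    "RDfolder": 100,
    "Vsfold4": 449,
    "Vsfold5": 449,
}


def methods_accepting(length_nt, limits=SIZE_LIMITS_NT):
    """Return the size-limited methods whose limit covers length_nt.

    Assumption: limits are inclusive (length_nt <= limit is accepted).
    """
    return sorted(name for name, limit in limits.items() if length_nt <= limit)


if __name__ == "__main__":
    print(methods_accepting(600))
```

For a 600 nt sequence, only the methods with limits of 2000 nt or more remain, in addition to all methods that have no stated limit.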