CompaRNA - on-line benchmarks of RNA structure prediction methods

All RNAs from the RNAstrand dataset
robustness test for Rfam families


All RNAs from the RNAstrand dataset were used to benchmark RNA secondary structure prediction methods.

Input for comparative methods consisted of a sequence collection or an alignment containing the query sequence and all sequences from the seed alignment of the Rfam family identified for the query.
 
   Updated: Oct. 3, 2012
   Base pair definition: extended
   Type of RNA structures: all (including pseudoknotted)
   RNA sequence length range: 20 to 30000
   Number of RNA sequences: 1242
   Robustness test: yes, on RNAs with Rfam family assigned

In the summary below, ranks have been assigned only to methods for which at least 40% of the comparisons with other methods are valid, i.e. at least 40% of the comparisons (21 in this ranking) were based on at least 10 predictions for common targets.

The 40% / 60% ratio was chosen to avoid bias from methods for which too few predictions have been collected.
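The eligibility rule can be sketched in a few lines of Python. This is only an illustration: the names `min_valid_comparisons` and `is_ranked` are hypothetical, and rounding 40% of 53 down to 21 is an assumption inferred from the figure quoted above, not taken from CompaRNA's own code.

```python
MIN_COMMON_PREDICTIONS = 10   # a comparison is valid with at least 10 common targets
MIN_VALID_PERCENT = 40        # share of valid comparisons required to receive a rank


def min_valid_comparisons(num_methods):
    """Smallest number of valid comparisons a method needs in order to be ranked."""
    comparisons = num_methods - 1  # every method is compared once with each other method
    # Integer arithmetic; rounding down is an assumption that reproduces 21 of 53.
    return comparisons * MIN_VALID_PERCENT // 100


def is_ranked(common_counts, num_methods):
    """common_counts[i] is the number of common targets shared with the i-th other method."""
    valid = sum(1 for n in common_counts if n >= MIN_COMMON_PREDICTIONS)
    return valid >= min_valid_comparisons(num_methods)
```

With 54 methods, a method with 21 valid comparisons is ranked and one with 20 is not.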


Summary of pairwise comparisons

54 methods predicting RNA secondary structure were compared with each other. Each method therefore has 53 comparisons with other methods: the sum of the values in columns "Wins", "Losses", "=" and "?" in each row.

Legend:

= draw - assigned when both methods have generated >= 10 predictions for common targets and either:
  1) the accuracies of their results are statistically indistinguishable (P-value greater than 0.001),
  or
  2) the numbers of base pairs classified as True Positives (TP), True Negatives (TN), False Positives (FP, including 3 subcategories) and False Negatives (FN) are the same for both methods.
? two methods cannot be compared ("no winner") - not enough predictions for a given pair of methods (minimum is 10)
N/A a method for which more than 32 (60%) of the comparisons with other methods are invalid (see column "?").
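The legend amounts to a small decision procedure, sketched below. The function `outcome` and the `Counts` tuple are hypothetical names. Collapsing the three FP subcategories into a single count is a simplification, and the case of a P-value exactly equal to 0.001 is not defined by the legend (here it is not treated as a draw).

```python
from typing import NamedTuple

MIN_COMMON_PREDICTIONS = 10
P_VALUE_THRESHOLD = 0.001


class Counts(NamedTuple):
    """Base-pair counts for one method; FP subcategories are merged (simplification)."""
    tp: int
    tn: int
    fp: int
    fn: int


def outcome(n_common, p_value, mcc_a, mcc_b, counts_a, counts_b):
    """Result of comparing method A with method B, from A's point of view."""
    if n_common < MIN_COMMON_PREDICTIONS:
        return "?"          # not enough predictions for common targets
    if counts_a == counts_b or p_value > P_VALUE_THRESHOLD:
        return "="          # identical counts or statistically indistinguishable
    return "win" if mcc_a > mcc_b else "loss"
```

For example, `outcome(50, 0.0001, 0.75, 0.60, Counts(9, 90, 1, 2), Counts(7, 90, 3, 4))` yields `"win"`, while the same call with `n_common=9` yields `"?"`.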

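| Rank | Method name | Trained | Wins | Losses | = | ? | Predictions attempted | Predictions generated | MCC |
|---|---|---|---|---|---|---|---|---|---|
| 1 | ContextFold | yes | 53 | 0 | 0 | 0 | 1242 | 1242 | 0.754 |
| 2 | PETfold_pre2.0(seed) | no | 52 | 1 | 0 | 0 | 1242 | 1224 | 0.737 |
| 3 | CentroidAlifold(seed) | yes | 51 | 2 | 0 | 0 | 1242 | 706 | 0.611 |
| 4 | IPknot | yes | 50 | 3 | 0 | 0 | 1242 | 1242 | 0.607 |
| 5 | PETfold_pre2.0(20) | no | 49 | 4 | 0 | 0 | 1242 | 639 | 0.575 |
| 6 | CentroidFold | yes | 47 | 5 | 1 | 0 | 1242 | 1241 | 0.571 |
| 6 | CentroidAlifold(20) | yes | 47 | 5 | 1 | 0 | 1242 | 642 | 0.573 |
| 8 | CentroidHomfold‑LAST | yes | 46 | 7 | 0 | 0 | 1242 | 1146 | 0.560 |
| 9 | Contrafold | yes | 45 | 8 | 0 | 0 | 1242 | 1242 | 0.556 |
| 10 | MXScarna(seed) | no | 44 | 9 | 0 | 0 | 1242 | 728 | 0.536 |
| 11 | RNAalifold(20) | no | 43 | 10 | 0 | 0 | 1242 | 642 | 0.532 |
| 12 | Sfold | no | 42 | 11 | 0 | 0 | 1242 | 1242 | 0.527 |
| 13 | RNAalifold(seed) | no | 41 | 12 | 0 | 0 | 1242 | 700 | 0.523 |
| 14 | MaxExpect | yes | 40 | 13 | 0 | 0 | 1242 | 1242 | 0.516 |
| 15 | ProbKnot | yes | 39 | 14 | 0 | 0 | 1242 | 1242 | 0.512 |
| 16 | MXScarna(20) | no | 38 | 15 | 0 | 0 | 1242 | 642 | 0.500 |
| 17 | Fold | no | 36 | 16 | 1 | 0 | 1242 | 1242 | 0.493 |
| 17 | UNAFold | no | 36 | 16 | 1 | 0 | 1242 | 1242 | 0.494 |
| 19 | RNAfold | no | 35 | 18 | 0 | 0 | 1242 | 1242 | 0.485 |
| 20 | PknotsRG | no | 34 | 19 | 0 | 0 | 1242 | 1211 | 0.474 |
| 21 | CRWrnafold | yes | 33 | 20 | 0 | 0 | 1242 | 1188 | 0.430 |
| 22 | RNAsubopt | no | 32 | 21 | 0 | 0 | 1242 | 1146 | 0.422 |
| 23 | TurboFold(20) | no | 31 | 22 | 0 | 0 | 1242 | 565 | 0.419 |
| 24 | PPfold(20) | yes | 30 | 23 | 0 | 0 | 1242 | 558 | 0.416 |
| 25 | Afold | no | 29 | 24 | 0 | 0 | 1242 | 716 | 0.408 |
| 26 | Carnac(20) | no | 28 | 25 | 0 | 0 | 1242 | 642 | 0.401 |
| 27 | McQFold | yes | 27 | 26 | 0 | 0 | 1242 | 1159 | 0.389 |
| 28 | RSpredict(20) | no | 26 | 27 | 0 | 0 | 1242 | 642 | 0.377 |
| 29 | RNAshapes | no | 25 | 28 | 0 | 0 | 1242 | 1113 | 0.367 |
| 30 | RNASLOpt | no | 24 | 29 | 0 | 0 | 1242 | 1077 | 0.364 |
| 31 | Multilign(20) | no | 23 | 30 | 0 | 0 | 1242 | 534 | 0.350 |
| 32 | Murlet(20) | yes | 22 | 31 | 0 | 0 | 1242 | 504 | 0.321 |
| 33 | RNAwolf | yes | 21 | 32 | 0 | 0 | 1242 | 1159 | 0.310 |
| 34 | RSpredict(seed) | no | 20 | 33 | 0 | 0 | 1242 | 702 | 0.295 |
| 35 | CMfinder(20) | yes | 19 | 34 | 0 | 0 | 1242 | 561 | 0.292 |
| 36 | Vsfold4 | no | 18 | 35 | 0 | 0 | 1242 | 1131 | 0.270 |
| 37 | Vsfold5 | no | 17 | 36 | 0 | 0 | 1242 | 1124 | 0.260 |
| 38 | RNASampler(20) | no | 16 | 37 | 0 | 0 | 1242 | 393 | 0.246 |
| 39 | HotKnots | no | 15 | 38 | 0 | 0 | 1242 | 498 | 0.212 |
| 40 | Mastr(20) | yes | 14 | 39 | 0 | 0 | 1242 | 592 | 0.187 |
| 41 | Cylofold | no | 13 | 40 | 0 | 0 | 1242 | 466 | 0.178 |
| 42 | Pknots | no | 12 | 41 | 0 | 0 | 1242 | 361 | 0.165 |
| 43 | Alterna | no | 11 | 42 | 0 | 0 | 1242 | 360 | 0.145 |
| 44 | MCFold | yes | 9 | 43 | 1 | 0 | 1242 | 410 | 0.138 |
| 44 | RDfolder | no | 9 | 43 | 1 | 0 | 1242 | 360 | 0.138 |
| 46 | CMfinder(seed) | yes | 7 | 45 | 1 | 0 | 1242 | 129 | 0.111 |
| 46 | Murlet(seed) | yes | 7 | 45 | 1 | 0 | 1242 | 123 | 0.110 |
| 48 | TurboFold(seed) | no | 6 | 47 | 0 | 0 | 1242 | 51 | 0.106 |
| 49 | PPfold(seed) | yes | 5 | 48 | 0 | 0 | 1242 | 63 | 0.097 |
| 50 | Carnac(seed) | no | 4 | 49 | 0 | 0 | 1242 | 269 | 0.093 |
| 51 | RNASampler(seed) | no | 3 | 50 | 0 | 0 | 1242 | 98 | 0.088 |
| 52 | Mastr(seed) | yes | 2 | 51 | 0 | 0 | 1242 | 644 | 0.072 |
| 53 | NanoFolder | yes | 1 | 52 | 0 | 0 | 1242 | 192 | 0.068 |
| 54 | Multilign(seed) | no | 0 | 53 | 0 | 0 | 1242 | 22 | 0.059 |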
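As a quick consistency check on the summary, the sketch below parses a few rows (copied from the table as whitespace-separated fields) and verifies that wins, losses, draws and "?" add up to 53. It also computes the share of attempted predictions that were actually generated. The field names and helper functions are hypothetical, and treating generated / attempted as a coverage percentage is an assumption about how the two prediction columns relate.

```python
# Four rows from the summary table, one method per line.
ROWS = """\
1 ContextFold yes 53 0 0 0 1242 1242 0.754
3 CentroidAlifold(seed) yes 51 2 0 0 1242 706 0.611
6 CentroidFold yes 47 5 1 0 1242 1241 0.571
54 Multilign(seed) no 0 53 0 0 1242 22 0.059
"""

FIELDS = ("rank", "method", "trained", "wins", "losses", "draws",
          "not_comparable", "attempted", "generated", "mcc")
INT_FIELDS = ("rank", "wins", "losses", "draws",
              "not_comparable", "attempted", "generated")


def parse_row(line):
    """Turn one whitespace-separated table row into a dict with typed values."""
    row = dict(zip(FIELDS, line.split()))
    for key in INT_FIELDS:
        row[key] = int(row[key])
    row["mcc"] = float(row["mcc"])
    row["trained"] = row["trained"] == "yes"
    return row


def total_comparisons(row):
    """Wins + losses + draws + '?' should equal the number of other methods."""
    return row["wins"] + row["losses"] + row["draws"] + row["not_comparable"]


def coverage_percent(row):
    """Percentage of attempted predictions that were generated."""
    return 100.0 * row["generated"] / row["attempted"]


rows = [parse_row(line) for line in ROWS.splitlines()]
```

Every parsed row gives `total_comparisons(row) == 53`, matching the 53 comparisons per method stated above.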


Detailed results of pairwise comparisons between methods

Legend:

+ method on the left scored higher in this pairwise comparison
- method on the left scored lower in this pairwise comparison
= draw - assigned when both methods have generated >= 10 predictions for common targets and either:
  1) the accuracies of their results are statistically indistinguishable (P-value greater than 0.001),
  or
  2) the numbers of base pairs classified as True Positives (TP), True Negatives (TN), False Positives (FP, including 3 subcategories) and False Negatives (FN) are the same for both methods.
? two methods cannot be compared ("no winner") - not enough predictions for a given pair of methods (minimum is 10)

P-values were obtained using the Wilcoxon signed-rank test on two sets of 40 MCC values, one set per method, computed on 40 random subsets of 90% of the dataset for each pair of methods. If the P-value is lower than 0.001 and the benchmark was performed on at least 10 sequences, the difference between the performance of the two methods is considered statistically significant.
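The resampling and the test can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the site's implementation: `subset_scores` and `wilcoxon_signed_rank_p` are hypothetical names, the scoring callables stand in for "MCC of a method on a subset", and the P-value uses the normal approximation to the Wilcoxon signed-rank statistic without tie or continuity correction.

```python
import math
import random


def subset_scores(targets, score_a, score_b, n_subsets=40, fraction=0.9, seed=0):
    """Score two methods on the same random subsets (40 subsets of 90% by default)."""
    rng = random.Random(seed)
    size = round(fraction * len(targets))
    a_vals, b_vals = [], []
    for _ in range(n_subsets):
        subset = rng.sample(targets, size)   # drawn without replacement
        a_vals.append(score_a(subset))       # hypothetical: MCC of method A on subset
        b_vals.append(score_b(subset))       # hypothetical: MCC of method B on subset
    return a_vals, b_vals


def wilcoxon_signed_rank_p(xs, ys):
    """Two-sided P-value for paired samples, normal approximation."""
    diffs = [x - y for x, y in zip(xs, ys) if x != y]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        return 1.0
    # Rank absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    return math.erfc(abs(z) / math.sqrt(2))
```

A pair of methods would then be called different when `wilcoxon_signed_rank_p(a_vals, b_vals) < 0.001`. For production use, a library routine with exact small-sample handling would be preferable to this approximation.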


[Pairwise comparison matrix of the 54 methods listed in the summary above; cell values not available.]

Matthews Correlation Coefficients (MCC) plotted for all methods in the ranking. MCCs were calculated by taking into account all reference RNA structures and all structures predicted by a given method in the entire ranking. The plot includes only methods for which at least 40% of the comparisons with other methods are valid (i.e. at least 40% of the comparisons were based on at least 10 predictions for common targets). The number to the right of each bar is the percentage of RNAs for which a given method generated a prediction within 24 hours when run on a single processor (2.4 GHz).
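For reference, the standard definition of MCC from the four base-pair counts can be written as below. Pooling the counts over all structures before computing a single coefficient is one reading of "taking into account all reference and predicted RNA structures" and is an assumption. The function names are hypothetical, and the three FP subcategories mentioned in the legend are not modelled.

```python
import math


def mcc(tp, tn, fp, fn):
    """Standard Matthews Correlation Coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0   # common convention when a marginal total is zero
    return (tp * tn - fp * fn) / denominator


def pooled_mcc(per_structure_counts):
    """Sum (tp, tn, fp, fn) over all structures, then compute one MCC (assumed pooling)."""
    tp = sum(c[0] for c in per_structure_counts)
    tn = sum(c[1] for c in per_structure_counts)
    fp = sum(c[2] for c in per_structure_counts)
    fn = sum(c[3] for c in per_structure_counts)
    return mcc(tp, tn, fp, fn)
```

A perfect prediction gives 1.0 and a completely inverted one gives -1.0, which is why MCC is convenient for comparing methods with very different numbers of predicted base pairs.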