Article ID | Journal ID | Publication year | English article | Full-text version |
---|---|---|---|---|
5907776 | 1160871 | 2013 | 8-page PDF | Free download |
- A novel feature selection algorithm combining random forest and support vector machine is proposed.
- The context sequences provide important information for prediction of active siRNAs.
- Prediction accuracy improves significantly when the context effect is considered.
For a successful RNA interference (RNAi) experiment, the critical step is selecting small interfering RNA (siRNA) candidates that maximize the knockdown of the given gene. Although various computational approaches have been attempted, the design of efficient siRNA candidates remains far from satisfactory. In this study, we proposed a novel feature selection algorithm that combines random forest and support vector machine to predict active siRNAs. Using a publicly available dataset, we demonstrated that predictive accuracy improves markedly when context sequence features outside the target site are included. The Pearson correlation coefficient for regression reaches 0.721, compared with 0.671, 0.668, 0.680, and 0.645 for Biopredsi, i-score, ThermoComposition21, and DSIR, respectively. This result indicates that siRNA-target interaction requires appropriate sequence context not only in the target site but also in a broad region flanking the target site.
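To make the two ideas in the abstract concrete, the following is a minimal Python sketch of (a) building a sequence feature vector that covers the target site plus its flanking context and (b) computing the Pearson correlation coefficient used to score regression predictions. It is an illustration only, not the authors' pipeline: the function names, the 19-nucleotide site length, the 10-nucleotide flank, the `N` padding, and the one-hot encoding are all assumptions, and the random forest and support vector machine feature selection step is not shown.

```python
import math

BASES = "ACGU"  # assumed RNA alphabet; any other symbol encodes as all zeros


def extract_window(mrna, site_start, site_len=19, flank=10):
    """Return the target site plus `flank` nucleotides on each side.

    site_len and flank are hypothetical defaults, not values from the paper.
    Positions that fall outside the transcript are padded with 'N'.
    """
    left = site_start - flank
    right = site_start + site_len + flank
    return "".join(
        mrna[i] if 0 <= i < len(mrna) else "N" for i in range(left, right)
    )


def one_hot(seq):
    """Encode each nucleotide as four 0/1 features (one per base in BASES)."""
    features = []
    for base in seq:
        features.extend(1.0 if base == b else 0.0 for b in BASES)
    return features


def pearson_r(xs, ys):
    """Pearson correlation between predicted and measured values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Under these assumptions, a regression model would be trained on `one_hot(extract_window(...))` vectors, and `pearson_r(predicted, measured)` would give the kind of score the abstract reports for each method.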
Journal: Genomics - Volume 102, Issue 4, October 2013, Pages 215-222