Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
2825839 | Trends in Plant Science | 2014 | 11 Pages |
• Use Big Data technology to assist basic and translational research in plants.
• Basic concepts and procedures, pitfalls, and remedies of using machine learning.
• Demonstration of using machine learning for data-driven discovery of stress genes.
Rapid advances in high-throughput genomic technology have enabled biology to enter the era of ‘Big Data’ (large datasets). The plant science community not only needs to build its own Big-Data-compatible parallel computing and data management infrastructures, but also to seek novel analytical paradigms to extract information from the overwhelming amounts of data. Machine learning offers promising computational and analytical solutions for the integrative analysis of large, heterogeneous and unstructured datasets on the Big-Data scale, and is gradually gaining popularity in biology. This review introduces the basic concepts and procedures of machine-learning applications and envisages how machine learning could interface with Big Data technology to facilitate basic research and biotechnology in the plant sciences.