Article ID: 6269245
Journal: Journal of Neuroscience Methods
Published Year: 2012
Pages: 9 Pages
File Type: PDF
Abstract

Modern neuroscientific research stands on the shoulders of countless giants. PubMed alone contains more than 21 million peer-reviewed articles, with 40,000 to 50,000 more published every month. Understanding the human brain, cognition, and disease will require integrating facts from dozens of scientific fields spread among millions of studies locked away in static documents, making any such integration daunting at best. The future of scientific progress will be aided by bridging the gap between the millions of published research articles and modern databases such as the Allen Brain Atlas (ABA). To that end, we have analyzed the text of over 3.5 million scientific abstracts to find associations between neuroscientific concepts. From the literature alone, we show that we can blindly and algorithmically extract a “cognome”: relationships between brain structure, function, and disease. We demonstrate the potential of data mining and cross-platform data integration with the ABA by introducing two methods for semi-automated hypothesis generation. By analyzing statistical “holes” and discrepancies in the literature, we can find understudied or overlooked research paths. That is, we have added a layer of semi-automation to a part of the scientific process itself. This is an important step toward fundamentally incorporating data-mining algorithms into the scientific method in a manner that is generalizable to any scientific or medical field.

► Understanding the human brain, cognition, and disease will require integrating millions of facts from dozens of fields.
► The peer-reviewed neuroscientific literature contains millions of articles, making any such integration daunting.
► Text-mining the peer-reviewed literature allows us to automatically and statistically identify relationships between neuroscientific concepts.
► We introduce an algorithm that identifies possible new hypotheses.
► We have added a layer of semi-automation to a part of the scientific process itself by finding statistical anomalies in the peer-reviewed literature.
► We combined our data with the massive gene expression database made public with the Allen Brain Atlas to uncover biases in neuroscientific research.
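
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes associations are scored by a Jaccard index over abstract-level co-occurrence, and it treats a "hole" as a pair of terms that are each strongly linked to a shared third term but rarely mentioned together. The function names, the thresholds `strong` and `weak`, and the toy abstracts are all hypothetical and chosen purely for illustration; they are not data or findings from the article.

```python
from collections import Counter
from itertools import combinations


def count_mentions(abstracts, terms):
    """Count how many abstracts mention each term and each pair of terms."""
    term_counts, pair_counts = Counter(), Counter()
    for text in abstracts:
        lowered = text.lower()
        # Naive substring matching on lowercase terms (an assumption);
        # a real pipeline would need a curated lexicon and synonyms.
        present = sorted(t for t in terms if t in lowered)
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))
    return term_counts, pair_counts


def association(a, b, term_counts, pair_counts):
    """Jaccard index: abstracts with both terms / abstracts with either."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = term_counts[a] + term_counts[b] - both
    return both / either if either else 0.0


def candidate_holes(terms, term_counts, pair_counts, strong=0.5, weak=0.1):
    """Return (a, c, b) triples where a-b and b-c are strong but a-c is weak."""
    def score(x, y):
        return association(x, y, term_counts, pair_counts)

    holes = []
    for a, c in combinations(sorted(terms), 2):
        if score(a, c) > weak:
            continue  # a and c already appear together often enough
        for b in sorted(terms):
            if b not in (a, c) and score(a, b) >= strong and score(b, c) >= strong:
                holes.append((a, c, b))
    return holes


# Invented toy strings standing in for abstracts (not real literature).
toy_abstracts = [
    "Serotonin signalling in the striatum.",
    "Striatum recordings after serotonin depletion.",
    "Migraine patients and serotonin agonists.",
    "Serotonin and migraine: a review.",
]
toy_terms = ["serotonin", "striatum", "migraine"]

term_counts, pair_counts = count_mentions(toy_abstracts, toy_terms)
holes = candidate_holes(toy_terms, term_counts, pair_counts)
```

On this toy input, `holes` contains a single triple flagging the two terms that never co-occur but share a strongly associated intermediate. Any such candidate is only a prompt for a human to investigate, which matches the "semi-automated" framing above.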

Related Topics
Life Sciences > Neuroscience > Neuroscience (General)