Discriminant document embeddings with an extreme learning machine for classifying clinical narratives

Article ID	Journal	Published Year	Pages	File Type
6864735	Neurocomputing	2018	32 Pages	PDF

Abstract

The unstructured nature of clinical narratives makes automatic information extraction from them difficult. Feature learning is an important precursor to document classification, a sub-discipline of natural language processing (NLP). In NLP, word and document embeddings are an effective way to represent words and documents as vectors in a low-dimensional space. This paper combines skip-gram and the distributed bag-of-words variant of paragraph vectors (PV-DBOW) with multiple discriminant analysis (MDA) to arrive at discriminant document embeddings. A kernel-based extreme learning machine (ELM) then maps the clinical texts to their medical codes. Experimental results on clinical texts indicate an overall improvement, especially for the minority classes.
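The classification step can be illustrated with a minimal sketch of a kernel-based ELM. It is not the authors' implementation: it assumes an RBF kernel and the commonly used closed-form solution beta = (I/C + K)^{-1} T, where K is the training kernel matrix and T holds one-hot class targets. The class name `KernelELM`, the values of `C` and `gamma`, and the synthetic, imbalanced Gaussian vectors standing in for document embeddings are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KernelELM:
    """Illustrative kernel ELM classifier (assumed RBF kernel, hypothetical API)."""

    def __init__(self, C=10.0, gamma=0.05):
        self.C = C          # regularization strength (assumed value)
        self.gamma = gamma  # RBF kernel width (assumed value)

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: beta = (I/C + K)^{-1} T.
        self.beta = np.linalg.solve(np.eye(n) / self.C + K, T)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        # Pick the class whose output score is largest.
        return self.classes_[np.argmax(K @ self.beta, axis=1)]


# Synthetic stand-in for document embeddings: three imbalanced classes in 5 dimensions.
rng = np.random.default_rng(0)
sizes = [60, 30, 10]
X = np.vstack([rng.normal(loc=4.0 * k, scale=1.0, size=(n, 5))
               for k, n in enumerate(sizes)])
y = np.repeat(np.arange(3), sizes)

model = KernelELM(C=10.0, gamma=0.05).fit(X, y)
train_acc = float((model.predict(X) == y).mean())
print(f"training accuracy: {train_acc:.2f}")
```

In a real pipeline the rows of `X` would be the discriminant document embeddings rather than random vectors, and accuracy would be measured on held-out documents, with per-class metrics to check the minority classes.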

Keywords

Clinical narratives; Multiple discriminant analysis; Word embeddings; Document classification; Extreme learning machines; Feature learning