Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
10356085 | Journal of Biomedical Informatics | 2012 | 11 Pages |
Abstract
► Annotated documents are necessary for NLP machine learning, modeling and testing.
► We create a method to determine a required sample size for the annotation set.
► The probability of word capture from a corpus provides the basis for the method.
► Dictation letters from a pain management medical practice are used as an example.
► We also demonstrate steps for creating a representative sample of dictations.
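The abstract does not state the formula behind the word-capture criterion, so the following is only a rough illustration of the general idea, not the authors' method. It assumes tokens are drawn independently and that a word has a fixed per-token probability `p`; under that assumption the chance of seeing the word at least once in `n` tokens is 1 - (1 - p)^n. The function names `capture_probability` and `required_sample_size` are hypothetical.

```python
import math


def capture_probability(p: float, n: int) -> float:
    """Probability that a word with per-token probability p appears
    at least once among n independently drawn tokens (an assumption)."""
    return 1.0 - (1.0 - p) ** n


def required_sample_size(p: float, target: float) -> int:
    """Smallest number of tokens n for which the capture probability
    reaches the target, under the same independence assumption."""
    if not (0.0 < p < 1.0) or not (0.0 < target < 1.0):
        raise ValueError("p and target must lie strictly between 0 and 1")
    # Solve 1 - (1 - p)^n >= target for n and round up.
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))


if __name__ == "__main__":
    # Hypothetical values: a word occurring once per 1000 tokens,
    # to be captured with 95% probability.
    n = required_sample_size(p=0.001, target=0.95)
    print(n, capture_probability(0.001, n))
```

Real corpora violate the independence assumption (words cluster within documents), so a figure obtained this way is better read as a lower bound on the sample size than as a prescription.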
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Science Applications
Authors
David Juckett,