Article ID: 6901420
Journal: Procedia Computer Science
Published Year: 2017
Pages: 8
File Type: PDF
Abstract
The Human Phenotype Ontology (HPO) project aims to provide a standardized vocabulary of phenotypic abnormalities encountered in human disease. Each term in the HPO describes a phenotypic abnormality, such as atrial septal defect. The HPO is currently being developed using the medical literature, Orphanet, DECIPHER, and OMIM. One of the project goals is to translate the HPO terms into other languages. Unfortunately, this process is slow and expensive, especially because the FACIT translation methodology is preferred. In this research, we propose accelerating translation with a machine translation system that we trained ourselves. We show that using freely available machine translation engines and parallel textual resources makes it possible to greatly reduce translation costs and time requirements, even though our resulting machine translation system did not perform particularly well. Its BLEU score was 26.17, which suggests that the translations should be understandable but are far from publishable quality. This was anticipated, as HPO terms are difficult and highly specialized, with problematic vocabulary that was barely present in the training data. Nonetheless, the translation quality was not so poor that the output could be ignored completely. We therefore asked a translation company for its opinion. In summary, we reduced the monetary and time costs of the HPO translation process by factors of 2 and 3, respectively.
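For readers unfamiliar with the metric, the following is a minimal sketch of how a corpus-level BLEU score on a 0 to 100 scale can be computed. The abstract does not state which BLEU implementation, tokenization, or number of references was used, so everything here is an assumption for illustration: whitespace tokenization, one reference per sentence, n-grams up to length 4, and no smoothing. The function names `ngrams` and `corpus_bleu` and the example sentences are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale.

    Assumptions (not from the paper): whitespace tokenization,
    exactly one reference per hypothesis, no smoothing.
    """
    matches = [0] * max_n  # clipped n-gram matches, per n-gram order
    totals = [0] * max_n   # n-grams produced by the system, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Counter intersection clips each n-gram to its reference count.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    # Without smoothing, any order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the modified n-gram precisions.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # Brevity penalty punishes output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    # Made-up sentences, used only to exercise the function.
    system_output = ["the patient has an atrial septal defect"]
    reference = ["the patient shows an atrial septal defect"]
    print(round(corpus_bleu(system_output, reference), 2))
```

Because the score is a geometric mean of n-gram precisions, rare domain vocabulary that the system mistranslates lowers every n-gram order it appears in, which is consistent with the abstract's remark that specialized HPO vocabulary was hard for the system.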
Related Topics
Physical Sciences and Engineering > Computer Science > Computer Science (General)