| Article ID | Journal | Published Year | Pages |
|---|---|---|---|
| 532993 | Pattern Recognition | 2005 | 11 Pages |

Abstract
This paper is devoted to techniques for clustering texts based on comparing their N-gram vocabularies. In contrast to the regular N-gram approach, the proposed method counts imperfect occurrences of N-grams in a text, allowing up to a given number of mismatches. We demonstrate that this approach substantially improves the resolving capacity of the N-gram method for DNA texts. Additionally, we discuss a scheme for jointly using different types of clustering techniques to verify partition quality.
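
To make the idea of imperfect N-gram occurrences concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a "mismatch" means a differing character position (Hamming distance) and that the alphabet is the four DNA letters. The function names `ngram_counts`, `hamming`, and `imperfect_counts` and the parameter `max_mismatches` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def ngram_counts(text, n):
    """Count exact occurrences of every length-n window in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def imperfect_counts(text, n, max_mismatches, alphabet="ACGT"):
    """For each possible N-gram over the alphabet, count the windows of
    text that match it with at most max_mismatches differing positions.

    Assumption: mismatches are measured by Hamming distance; the paper
    may define imperfect occurrences differently.
    """
    exact = ngram_counts(text, n)
    result = {}
    for word in map("".join, product(alphabet, repeat=n)):
        total = sum(count for gram, count in exact.items()
                    if hamming(word, gram) <= max_mismatches)
        if total:
            result[word] = total
    return result


if __name__ == "__main__":
    vocabulary = imperfect_counts("ACGTACGA", n=3, max_mismatches=1)
    for word in sorted(vocabulary):
        print(word, vocabulary[word])
```

With `max_mismatches=0` the sketch reduces to ordinary N-gram counting. The brute-force enumeration of all `len(alphabet) ** n` words is only practical for small N; the resulting count vectors could then be compared between texts with whatever distance the chosen clustering method requires.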
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Vision and Pattern Recognition
Authors
Z. Volkovich, V. Kirzhner, A. Bolshoy, E. Nevo, A. Korol