کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
431157 688287 2007 11 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Character sets of strings
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر نظریه محاسباتی و ریاضیات
پیش نمایش صفحه اول مقاله
Character sets of strings
چکیده انگلیسی

Given a string S over a finite alphabet Σ, the character set (also called the fingerprint  ) of a substring S′S′ of S   is the subset C⊆ΣC⊆Σ of the symbols occurring in S′S′. The study of the character sets of all the substrings of a given string (or a given collection of strings) appears in several domains such as rule induction for natural language processing or comparative genomics. Several computational problems concerning the character sets of a string arise from these applications, especially:(1)Output all the maximal locations of substrings having a given character set.(2)Output for each character set C occurring in a given string (or a given collection of strings) all the maximal locations of C.Denoting by n   the total length of the considered string or collection of strings, we solve the first problem in Θ(n)Θ(n) time using Θ(n)Θ(n) space. We present two algorithms solving the second problem. The first one runs in Θ(n2)Θ(n2) time using Θ(n)Θ(n) space. The second algorithm has Θ(n|Σ|log|Σ|)Θ(n|Σ|log|Σ|) time and Θ(n)Θ(n) space complexity and is an adaptation of an algorithm by Amir et al. [A. Amir, A. Apostolico, G.M. Landau, G. Satta, Efficient text fingerprinting via Parikh mapping, J. Discrete Algorithms 26 (2003) 1–13].

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Discrete Algorithms - Volume 5, Issue 2, June 2007, Pages 330–340
نویسندگان
, , , ,