Article code: 536152
Journal code: 870473
Publication year: 2016
English article: 6-page PDF
Full-text version: free download
English title of the ISI article
Using a novel clumpiness measure to unite data with metadata: Finding common sequence patterns in immune receptor germline V genes
Keywords
Hierarchical clustering; Aggregation; Tree analysis; Immune receptor regulator; Adaptive immunity; Multi-scale analysis
Related topics
Engineering and Basic Sciences; Computer Engineering; Computer Vision and Pattern Recognition
Abstract


• We present a new method for finding relationships of metadata labels using other data.
• We present a novel measure of aggregation within a hierarchical container.
• Our clumpiness measure is stable across tree size, label size, and numbers of labels.
• We quantify relationships of immune receptor V genes from their sequence fragments.

When finding relationships in biological systems, we often describe hierarchies based on one facet of the data. However, when using this hierarchy to elucidate relationships between metadata, the distribution of metadata labels within the hierarchy may exhibit different levels of aggregation—uniform, random, or clumped. As of now, there exists no measure for finding the level of aggregation, or “clumpiness”, between labels distributed among the leaves of a hierarchical container. We propose a clumpiness measure to aid in the quantification of relationships between metadata. We validated our measure with random trees and found that the measure is resistant to changes in the tree size, label size, and the number of types of labels, compared to the closest alternative measures. We used our clumpiness measure to quantify the relationships between light and heavy chains in human and mouse B cell and T cell receptor V genes based on their motifs. We found that the B cell heavy chains were the most aggregated while the T cell chains were the least aggregated and that the IGL chain was clumped the most with the T cell chains out of all of the B cell chains.

Publisher
Database: Elsevier - ScienceDirect
Journal: Pattern Recognition Letters - Volume 74, 15 April 2016, Pages 24–29