کد مقاله کد نشریه سال انتشار مقاله انگلیسی نسخه تمام متن
458329 696134 2006 17 صفحه PDF دانلود رایگان
عنوان انگلیسی مقاله ISI
Circulating shared-registers for multiprocessor systems
موضوعات مرتبط
مهندسی و علوم پایه مهندسی کامپیوتر شبکه های کامپیوتری و ارتباطات
پیش نمایش صفحه اول مقاله
Circulating shared-registers for multiprocessor systems
چکیده انگلیسی

The techniques for fine-grained data sharing are generally available only on specialized architectures, usually involving a shared-bus. The CIRculating Common-Update Sharing (CIRCUS) mechanism has low latency user-level contention-free access to a set of shared circulating data registers. The local access latency is near zero for both read and write operations. These operations can be mapped into more complex operations, such as arithmetic, logical, or data reduction operations such as minimum or sum to be performed by the circulating register hardware (CRH) on the circulating copy of a register. The CRH can also be used to perform atomic operations, such as fetch&add or swap. For a two-dimensional hierarchy of N processing elements (PEs), the write-latency (until the circulating register is updated with a new value) and the update-latency (when all CRH modules can see the updated value) have an optimum cluster size proportional to (N · I/D)1/2, where I is the intercluster time and D is the inter-PE time, including the time between and through one node. The latencies, when optimally clustered, are proportional to (N · I · D)1/2. Sub-microsecond write-latency is expected for up to 15,255 PEs or 660 workstations. For higher levels of hierarchy, the expected write-latency is shown to be proportional to the sum of the latencies of all loop hierarchies. CIRCUS is applicable to a wide variety of system architectures and topologies.

ناشر
Database: Elsevier - ScienceDirect (ساینس دایرکت)
Journal: Journal of Systems Architecture - Volume 52, Issue 3, March 2006, Pages 152–168
نویسندگان
, , ,