Article ID Journal Published Year Pages File Type
542926 Integration, the VLSI Journal 2008 11 Pages PDF
Abstract

Sorting, which is widely used in different areas such as database systems, IP routing, bio informatics, and cognitive-processing-based applications, imposes considerable overhead on computing resources. Therefore, an efficient on-chip sorting accelerator may significantly enhance real-time decision-making in such applications. In this paper we introduce a novel pipelined and parallel sorting algorithm with streaming I/O, with the time, logic, and memory complexity of O(n)O(n), O(n), and O(n)O(n), respectively. We present a formal analysis to prove the correctness of this algorithm. We then model, verify, and synthesize this unconditional algorithm (in the TSMC 0.13 micron technology) for 4k-word clusters as an ASIC accelerating engine. More specifically, our implementation with 3969-word multiple-bank memory, 63 word-size comparators, 64 word-size multiplexers, and 63 word-size registers only requires some 8k clock cycles to sort an arbitrary 3969-word long array of random data, which arrive at the sorter and also depart it one item at a time.

Related Topics
Physical Sciences and Engineering Computer Science Hardware and Architecture
Authors
, ,