





IFAC-PapersOnLine 49-25 (2016) 061-067

## A Custom dual-processor System for Real-time Neural Signal Processing

Paolo Meloni<sup>\*</sup> Claudio Rubattu<sup>\*</sup> Giuseppe Tuveri<sup>\*</sup> Luigi Raffo<sup>\*</sup>

\* Dipartimento Ingegneria Elettrica ed Elettronica Università degli Studi di Cagliari Cagliari, Italy 09123 (e-mail: name.surname@diee.unica.it)

**Abstract:** This paper presents a custom dual-processor SoC architecture, studied and customized to support information extraction from signals acquired from the Peripheral Neural System, for prosthetic applications. The main tasks performed on the computing platform are noise removal and identification of neural spikes. On-board execution of these tasks makes it possible to identify which samples actually contain useful information. This reduces the required input/output bandwidth, so that the connection to the external environment can be implemented using a Bluetooth Low Energy device. The overall SoC architecture has a power consumption compliant with implant-related constraints, with a battery lifetime of around one day.

© 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.

Keywords: System-on-chip, digital signal processing, neural signals, ASIP, Bluetooth LE

#### 1. INTRODUCTION

Among the different solutions proposed in the literature and in the biomedical device market, neuroprosthetic systems connected to the patient's Peripheral Neural System (PNS) are receiving ever-increasing attention from the community. Such systems rely on the acquisition of neural signal activity by means of adequate electrodes implanted near the amputation region and, by exploiting the natural pathways of motor control, are more easily and finely managed by amputees. The design of embedded systems for such neuroprosthetic applications requires extraction of the information encoded in neural signals, aimed at identifying the patient's motion intention. An important challenge in implementing such a task is the development of prospectively implantable processing platforms capable of performing the initial steps of the target decoding algorithm. The designed system must respect a tight power/energy budget to improve battery lifetime. Moreover, the platform must communicate with the external environment using low-power wireless connections, which typically provide limited data rates. Thus, several computing tasks must be executed to pre-process the acquired neural signals and to reduce the amount of information to be sent to the output. Finally, research on the best analysis and decoding methods is still evolving very fast, so a flexible solution is required, to enable improvement of the functionality through re-programming or re-configuration. An interesting architectural possibility that combines flexibility, significant processing power and power efficiency is provided by application-specific processor-based systems (Jozwiak et al., 2012, 2013), which are customized at design time on the basis of the software application that must be executed but can nevertheless be programmed using standard programming languages.

In this work, we present a low-power dual-processor system, composed of a general-purpose host processor and a custom Application Specific Instruction-set Processor (ASIP), designed to process the signals acquired by implanted electrodes. Most of the decoding algorithms proposed for PNS prostheses are based on the analysis of typical bursts of neural electrical activity, usually referred to as spikes, representing the meaningful electrical activity of motor neurons. The processing system presented in this work executes all the processing steps needed to identify which samples acquired by the analog front-end are part of a spike and must be sent to the actual decoding phase, executed on an external non-implanted computing facility. Thus, on-board execution of the *spike detection* process significantly reduces the data rate required for input-output communication.
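
To make the data-rate argument concrete, the following is a minimal sketch of how forwarding only spike-related samples shrinks the output stream. It is not the detection algorithm adopted in this paper (Citi et al., 2008): it assumes a plain amplitude threshold, and the function names, the window sizes `pre` and `post`, the threshold value and the synthetic test signal are all hypothetical choices made for illustration.

```python
import math
import random


def detect_spike_samples(signal, threshold, pre=8, post=24):
    """Mark the samples that fall inside a window around a threshold crossing.

    Assumption: a simple amplitude threshold stands in for the real detector.
    `pre` and `post` are hypothetical window sizes, in samples.
    """
    keep = [False] * len(signal)
    for i, x in enumerate(signal):
        if abs(x) > threshold:
            start = max(0, i - pre)
            stop = min(len(signal), i + post + 1)
            for j in range(start, stop):
                keep[j] = True
    return keep


def transmitted_fraction(keep):
    """Fraction of acquired samples that would be sent over the wireless link."""
    return sum(keep) / len(keep) if keep else 0.0


def synthetic_recording(n_samples=12000, spike_every=3000, seed=0):
    """Unit-variance Gaussian noise with sparse bumps standing in for spikes.

    Purely synthetic data; it does not model a real PNS recording.
    """
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    for centre in range(spike_every // 2, n_samples, spike_every):
        for k in range(-5, 6):
            signal[centre + k] += 12.0 * math.exp(-(k * k) / 4.0)
    return signal


signal = synthetic_recording()
keep = detect_spike_samples(signal, threshold=6.0)
print(f"fraction of samples to transmit: {transmitted_fraction(keep):.3f}")
```

Because spikes are sparse in time, only a small share of the acquired samples is marked for transmission in this toy setting. The actual reduction depends on the firing rate, the window length and the detector used.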

The remainder of this paper is organized as follows. Section 2 provides a brief analysis of previous work. The adopted state-of-the-art spike detection algorithm (Citi et al., 2008) is described in Section 3. The architectural template of the reference processing platform is presented in Section 4. Section 5 presents the features and capabilities of the proposed system architecture, and Section 6 concludes with some final remarks.

#### 2. RELATED WORK

Information extraction from spike-related neural activity relies on the recognition of the firing activity of the different neurons (Lewicki, 1998). Several works on neural signal processing have been presented so far, most of them proposing FPGAs (Zhang et al., 2012; Yu et al., 2011) as the implementation target, to guarantee more flexibility than ASICs (Chen et al., 2009; Perelman and Ginosar, 2007) with more parallelism than general-purpose processors.

2405-8963 © 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved. Peer review under responsibility of International Federation of Automatic Control. 10.1016/j.ifacol.2016.12.011

Fig. 1. Target application overview

In Zhang et al. (2012), a fully implantable programmable neuroprocessor mappable on a low-power nano-FPGA is presented. It manages data acquisition and reduction through dedicated compression techniques, in order to minimize the output bitrate by exploiting the sparse representation of the neural signals. In this way, it is possible to overcome the limited bandwidth of wireless telemetry by transmitting only the samples associated with the detected spikes to an external device for cortically-controlled Brain-Machine Interfaces. The device has been tested on raw extracellular signals recorded through microelectrode arrays chronically implanted in the brain of sedated rats. The presented device, however, cannot be programmed in software, which reduces its adaptability to new versions of the algorithm.

The feasibility of a similar approach in terms of power has been investigated on standard CMOS VLSI (Zumsteg et al., 2005). In this approach, the computational complexity is shifted downstream of the implantable device, where the decoding can be performed on many-core platforms (Chen et al., 2011) or FPGA-accelerated solutions (Gibson et al., 2013).

Coarse-grained reconfigurable approaches have been presented in order to accelerate some computationally intensive kernels using a reduced set of hardware resources while keeping some degree of adaptability of the processing platform, as in Carta et al. (2013). In this case, a trade-off between hardware reuse maximization and latency minimization must be carefully considered in order to fulfill the relatively strict timing constraints.

The approach to neural signal decoding based on the PNS seems to be the most attractive for the time being (Micera et al., 2010); however, there is a lack of studies on architectures able to cope with the application constraints. In Pani et al. (2011), the same algorithm taken as the starting point for this work (Citi et al., 2008) has been partially ported to a complex VLIW floating-point processor by Texas Instruments. The claimed real-time results have been obtained on a 300 MHz processor: such an architecture, at such an operating frequency, leads to a dynamic power consumption that is too high for implantable solutions. The methodological aspects related to power-efficient and effective multiprocessor architectures, aimed at implementing state-of-the-art neural signal decoding algorithms in real time, are lacking in the scientific literature.

In Carta et al. (2014), a homogeneous MPSoC architecture, designed using custom process networks (Meloni et al., 2012; Cannella et al., 2012), has been used to preliminarily test the porting of a neural signal decoding algorithm to parallel processing platforms. Results have shown that real-time constraints can be satisfied by clocking the system at a reasonable frequency and taking advantage of the parallelism to reduce power consumption using a programmable clock-gating manager. The application code has been parallelized effectively using an approach based on software pipelining.

In Meloni et al. (2016), the authors present a custom architecture that performs, using a customizable number of ASIP processors, several steps of the decoding algorithm, including spike sorting. With respect to that work, we use a similar approach, but we limit the processing to the first steps of the decoding chain, down to spike detection. This allows a smaller processing platform to be used, which is prospectively easier to program and less power-hungry. As a further point of novelty, in this paper we evaluate the integration of a complete system-level platform, based on a widely used SoC template, accounting for the power consumption of peripherals. Moreover, in this work we introduce the evaluation of the performance and of the power consumption contribution of a standard wireless connection interface.

#### 3. TARGET APPLICATION AND CONSTRAINTS

As depicted in Figure 1, the target application is in charge of reading and analyzing the neural signal samples, acquired by an adequate analog front-end. It has to integrate several steps of the decoding chain and to be executable on a low-power miniaturized embedded system, prospectively implantable. The target decoding algorithm, taken as reference, involves two successive processing steps: Wavelet Denoising and Spike Detection. The samples identified as part of a spike are then sent out, through a wireless network interface, to be analyzed and associated with a motion intention. Such an analysis, which usually involves spike sorting (i.e. identification of the morphological shape of a spike, aimed at identifying which neuron has fired it) and classification, can be performed on the prosthetic hand, by a non-implantable controller, with more relaxed low-power requirements. In this work we have considered a prospective implementation of the device, using a Bluetooth LE connection to implement communication between the implantable and the non-implantable computing platforms. Such a solution pro
