Article ID: 567957
Journal: Advances in Engineering Software
Published Year: 2015
Pages: 7
File Type: PDF
Abstract

• Data partitioning and network partitioning parallelization methods are combined.
• Large performance gains are found on small networks.
• The network is parallelized to the level of a single node dimension.
• Parallelization is maximized to allow robustness for future hardware.

High-dimensional data is pervasive in fields such as engineering, geospatial analysis, and medicine. It is a constant challenge to build tools that help people in these fields understand the underlying complexities of their data. Many techniques perform dimensionality reduction or other “compression” to show views of data in two or three dimensions, leaving the data analyst to infer relationships with the remaining independent and dependent variables. Contextual self-organizing maps offer a way to represent and interact with all dimensions of a data set simultaneously. However, the computation time needed to generate these representations limits their feasibility in realistic industry settings. Batch self-organizing maps provide a data-independent method that allows the training process to be parallelized and therefore sped up, saving the time and money involved in processing data prior to analysis. This research parallelizes the batch self-organizing map by combining network partitioning and data partitioning methods with CUDA on the graphics processing unit to achieve significant training time reductions. Training times were reduced by up to a factor of twenty-five at map sizes where other implementations have shown weakness. The reduced training times open up the contextual self-organizing map as a viable option for engineering data visualization.
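
As an illustration of why the batch self-organizing map lends itself to data partitioning, the following is a minimal sequential sketch in plain Python. It is not the authors' CUDA implementation. The function names (`bmu`, `partial_sums`, `batch_epoch`), the Gaussian neighborhood function, and the way the data is split into chunks are all assumptions made for this example. Each chunk contributes partial numerator and denominator sums that do not depend on the other chunks, so the chunks could be processed in parallel and then reduced. The per-node loop inside `partial_sums` is likewise independent across nodes, which is the property that network partitioning would exploit.

```python
import math

def bmu(weights, x):
    """Index of the node whose weight vector is closest to sample x."""
    return min(range(len(weights)),
               key=lambda j: sum((w - v) ** 2 for w, v in zip(weights[j], x)))

def partial_sums(weights, coords, chunk, sigma):
    """Accumulate batch-update sums for one data partition (hypothetical helper).

    Returns (num, den), where num[j] is the neighborhood-weighted sum of
    samples for node j and den[j] is the sum of neighborhood values.
    """
    n, d = len(weights), len(weights[0])
    num = [[0.0] * d for _ in range(n)]
    den = [0.0] * n
    for x in chunk:
        c = bmu(weights, x)
        for j in range(n):  # independent per node
            dist2 = sum((a - b) ** 2 for a, b in zip(coords[c], coords[j]))
            h = math.exp(-dist2 / (2.0 * sigma ** 2))  # assumed Gaussian neighborhood
            den[j] += h
            for k in range(d):
                num[j][k] += h * x[k]
    return num, den

def batch_epoch(weights, coords, data, sigma, n_parts=2):
    """One batch SOM epoch: per-partition sums, then a reduction."""
    chunks = [data[i::n_parts] for i in range(n_parts)]
    parts = [partial_sums(weights, coords, ch, sigma) for ch in chunks]
    n, d = len(weights), len(weights[0])
    new_weights = []
    for j in range(n):
        den = sum(p[1][j] for p in parts)
        if den > 0.0:
            new_weights.append(
                [sum(p[0][j][k] for p in parts) / den for k in range(d)])
        else:
            new_weights.append(list(weights[j]))  # node saw no influence; keep it
    return new_weights
```

Because the weights are held fixed during an epoch, the number of partitions affects only the order of summation, not the result (up to floating-point rounding).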
