Article code | Journal code | Publication year | English article | Full-text version |
---|---|---|---|---|
424856 | 685650 | 2016 | 12-page PDF | Free download |
• New distributed data processing paradigm that describes transformations as automata.
• Automata provide a schema of data processing and also facilitate P2P routing.
• Protocol for data processing with automata, data, code and state as part of the packet.
• Data packet control (parallel, coalesce, backlog) is based on compute predictions.
Data processing complexity, partitionability, locality, and provenance play a crucial role in the effectiveness of distributed data processing. The dynamics of data processing necessitate effective modeling that supports understanding and reasoning about the fluidity of data processing. Through virtualization, resources have become scattered, heterogeneous, and dynamic in performance and networking. In this paper, we propose a new distributed data processing model based on automata, in which data processing is modeled as state transformations. This approach falls within a category of declarative concurrent paradigms that differ fundamentally from imperative approaches in that communication and function order are not explicitly modeled. This allows an abstraction of concurrency and is thus suited to distributed systems. Automata give us a way to formally describe data processing independently of the underlying processes, while also providing routing information to route data based on its current state in a P2P fashion around networks of distributed processing nodes. Through an implementation of the model, named Pumpkin, we capture the automata schema and routing table in a data processing protocol and show how globally distributed resources can be brought together collaboratively to form a processing plane on which data objects are self-routable.
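The idea of routing a data object by its current automaton state can be illustrated with a minimal sketch. This is not Pumpkin's actual API or protocol: the names `TRANSITIONS`, `Packet`, `Node`, `route`, and `process`, the state labels, and the two example functions are all hypothetical, and network transport is replaced by an in-process loop. The sketch assumes a deterministic automaton in which each non-final state has exactly one outgoing transition.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical automaton schema: current state -> (function name, next state).
# It describes the processing independently of which node runs each function.
TRANSITIONS: Dict[str, Tuple[str, str]] = {
    "RAW": ("tokenize", "TOKENS"),
    "TOKENS": ("count", "COUNTED"),
}
FINAL_STATES = {"COUNTED"}


@dataclass
class Packet:
    """A data object that carries its own state (illustrative only)."""
    data: object
    state: str = "RAW"
    trace: List[str] = field(default_factory=list)  # simple provenance record


@dataclass
class Node:
    """A processing node hosting a subset of the functions (illustrative only)."""
    name: str
    functions: Dict[str, Callable[[object], object]]


def route(packet: Packet, nodes: List[Node]) -> Node:
    """Pick the first node hosting the function required by the packet's state."""
    fn_name, _ = TRANSITIONS[packet.state]
    for node in nodes:
        if fn_name in node.functions:
            return node
    raise LookupError(f"no node hosts function {fn_name!r}")


def process(packet: Packet, nodes: List[Node]) -> Packet:
    """Move the packet from node to node until it reaches a final state."""
    while packet.state not in FINAL_STATES:
        node = route(packet, nodes)
        fn_name, next_state = TRANSITIONS[packet.state]
        packet.data = node.functions[fn_name](packet.data)
        packet.trace.append(f"{node.name}:{fn_name}")
        packet.state = next_state
    return packet


# Two hypothetical nodes, each hosting one transformation.
nodes = [
    Node("node-a", {"tokenize": lambda text: text.split()}),
    Node("node-b", {"count": lambda tokens: len(tokens)}),
]
result = process(Packet("data objects are self-routable"), nodes)
print(result.state, result.data, result.trace)
```

In this sketch the caller never specifies the order of functions or the nodes to visit; both follow from the automaton and the packet's state, which is the declarative property the abstract describes.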
Journal: Future Generation Computer Systems - Volume 59, June 2016, Pages 21–32