Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
4956504 | Journal of Systems and Software | 2017 | 42 Pages |
Abstract
Log files are generated in many different formats by a plethora of devices and software. The proper analysis of these files can lead to useful information about various aspects of each system. Cloud computing appears to be suitable for this type of analysis, as it is capable of managing the high production rate, the large size and the diversity of log files. In this paper we investigated log file analysis with the cloud computational frameworks Apache™ Hadoop® and Apache Spark™. We developed realistic log file analysis applications in both frameworks and we performed SQL-type queries on real Apache Web Server log files. Various experiments were performed with different parameters in order to study and compare the performance of the two frameworks.
Related Topics
Physical Sciences and Engineering
Computer Science
Computer Networks and Communications
Authors
Ilias Mavridis, Helen Karatza