Article ID Journal Published Year Pages File Type
433420 Science of Computer Programming 2013 16 Pages PDF
Abstract

Software repository mining research extracts and analyses data originating from multiple software repositories to understand the historical development of software systems, and to propose better ways to evolve such systems in the future. Of particular interest is the study of the activities and interactions between the persons involved in the software development process. The main challenge with such studies lies in the ability to determine the identities (e.g., logins or e-mail accounts) in software repositories that represent the same physical person. To achieve this, different identity merge algorithms have been proposed in the past. This article provides an objective comparison of identity merge algorithms, including some improvements over existing algorithms. The results are validated on a selection of large ongoing open source software projects.

Related Topics
Physical Sciences and Engineering Computer Science Computational Theory and Mathematics