Article ID Journal Published Year Pages File Type
551694 Information and Software Technology 2013 13 Pages PDF
Abstract

ContextAlthough many papers have been published on software defect prediction techniques, machine learning approaches have yet to be fully explored.ObjectiveIn this paper we suggest using a descriptive approach for defect prediction rather than the precise classification techniques that are usually adopted. This allows us to characterise defective modules with simple rules that can easily be applied by practitioners and deliver a practical (or engineering) approach rather than a highly accurate result.MethodWe describe two well-known subgroup discovery algorithms, the SD algorithm and the CN2-SD algorithm to obtain rules that identify defect prone modules. The empirical work is performed with publicly available datasets from the Promise repository and object-oriented metrics from an Eclipse repository related to defect prediction. Subgroup discovery algorithms mitigate against characteristics of datasets that hinder the applicability of classification algorithms and so remove the need for preprocessing techniques.ResultsThe results show that the generated rules can be used to guide testing effort in order to improve the quality of software development projects. Such rules can indicate metrics, their threshold values and relationships between metrics of defective modules.ConclusionsThe induced rules are simple to use and easy to understand as they provide a description rather than a complete classification of the whole dataset. Thus this paper represents an engineering approach to defect prediction, i.e., an approach which is useful in practice, easily understandable and can be applied by practitioners.

Related Topics
Physical Sciences and Engineering Computer Science Human-Computer Interaction
Authors
, , , ,