Classification of Software Defects Using Orthogonal Defect Classification

Sushil Kumar, SK Muttoo, V. B. Singh
Copyright: © 2022 | Pages: 16
DOI: 10.4018/IJOSSP.300749

Abstract

Classifying software defects by type is an important task. It helps to prioritize defects and to understand their causes, so that appropriate action can be taken to improve the software defect management process. In this paper, we evaluate the performance of five machine learning algorithms (naïve Bayes, support vector machine, k-nearest neighbor, random forest, and decision tree) in classifying software defects according to orthogonal defect classification, selecting the relevant features using the chi-square score. The standard metrics accuracy, precision, and recall are calculated separately for the Cassandra, HBase, and MongoDB datasets. The proposed method improves on the existing approach in terms of accuracy by 5%, 20%, 6%, 27%, and 26% for activity, defect impact, target, type, and qualifier, respectively.

1. Introduction

Software defect classification is a crucial task that supports the software defect management process. The size and number of software systems are increasing day by day, and the users of these systems report defects. Classifying defects into categories helps software developers assign priorities to defects, resolve them faster, analyze defect-prone modules, etc. (Endres 1975; Shepherd 1993; Wagner 2008). Classifying these defects manually is a time-consuming process for software developers. In recent years, supervised learning methods have been used to automate defect classification based on orthogonal defect classification.

Orthogonal defect classification (ODC) (Chillarege et al. 1992, 1996) was developed by IBM in the 1990s to measure the software process by extracting valuable information from defects. It acts as a bridge between defect modeling and causal analysis. It groups defects based on their impact, trigger, activity, target, type, source, qualifier, and age. The ODC defect impact attribute specifies what the user experiences when a defect occurs. Defect impact is further classified into usability, reliability, standard, install-ability, security, maintenance, etc. In recent years, many works have focused on orthogonal defect classification and have tried to categorize defects based on ODC attributes. ODC has been used successfully by many organizations to improve their software development process (Butcher 2002; Soylemez and Tarhan 2013; Bridge and Miller 1998; Mays et al. 1990; Lutz and Mikulski 2004; Zheng et al. 2006).

In this paper, we evaluate five classifiers, namely Naïve Bayes, Support Vector Machine, K Nearest Neighbor, Random Forest, and Decision Tree.

In brief, the main contributions of this paper are:

  • Defect categorization from the unstructured text in the description field of defect reports, for 4096 defects from three datasets: MongoDB, Cassandra, and HBase

  • Selection of the most relevant features using the chi-square score

  • Evaluation of the performance of Naïve Bayes, Support Vector Machine, K Nearest Neighbor, Random Forest, and Decision Tree on each dataset and on the whole data
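
To make the feature selection step concrete, the following is a minimal sketch of chi-square scoring for terms in defect descriptions. It uses the textbook 2x2 contingency-table form of the statistic. The exact formulation, tokenization, and number of selected features used in the paper are not stated in this preview, so the function names (chi_square_score, top_terms), the toy reports, and their labels are all hypothetical and serve only as an illustration.

```python
def chi_square_score(docs, labels, term, target):
    """Chi-square statistic for one term against one class.

    Builds a 2x2 table over documents:
      a = in class, has term      b = not in class, has term
      c = in class, lacks term    d = not in class, lacks term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        # Term or class is constant across the corpus: no dependence measurable.
        return 0.0
    return n * (a * d - c * b) ** 2 / denom


def top_terms(docs, labels, target, k):
    """Return the k terms with the highest chi-square score for `target`."""
    vocab = sorted({t for tokens in docs for t in tokens})
    scored = [(chi_square_score(docs, labels, t, target), t) for t in vocab]
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:k]]


# Hypothetical toy defect descriptions with made-up ODC impact labels.
reports = [
    "server crash on startup",
    "crash when node restarts",
    "confusing error message",
    "help message unclear",
]
labels = ["reliability", "reliability", "usability", "usability"]
docs = [set(r.split()) for r in reports]

print(chi_square_score(docs, labels, "crash", "reliability"))
print(top_terms(docs, labels, "reliability", 2))
```

Note that the statistic measures dependence, not direction: in this toy data, a term that appears only in usability reports scores as high for the reliability class as a term that appears only in reliability reports. The selected terms would then form the feature set passed to each of the five classifiers.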

The rest of this paper is organized as follows. Section 2 presents background, definitions, and related work on software defect categorization using orthogonal defect classification. Section 3 discusses the dataset. In Section 4, we define the problem and explain our proposed approach. Section 5 discusses the results in comparison with an existing approach. Section 6 discusses threats to the validity of our work. The last section concludes the paper and outlines future scope.
