
A Hybrid Approach to Identify Code Smell Using Machine Learning Algorithms

Archana Patnaik, Neelamdhab Padhy

Source Title: International Journal of Open Source Software and Processes (IJOSSP) 12(2)

DOI: 10.4018/IJOSSP.2021040102


Abstract

Code smell detection is the task of identifying design problems introduced during software development. The significant causes of code smell are complexity in code, violation of programming rules, poor modelling, and lack of unit-level testing by the developer. Different open source systems like JEdit, Eclipse, and ArgoUML are evaluated in this work. After collecting the data, the best features are selected using recursive feature elimination (RFE). In this paper, the authors have used different anomaly detection algorithms for efficient recognition of dirty code. The average accuracy values of k-means, GMM, autoencoder, PCA, and Bayesian networks are 98%, 94%, 96%, 89%, and 93%, respectively. The k-means clustering algorithm is the most suitable algorithm for code smell detection. Experimentally, the authors show that the ArgoUML project performs better than the Eclipse and JEdit projects.


1. Introduction

The primary causes of code complexity are time constraints, mismanagement, unclean shortcuts during the software development process, lack of testing, documentation issues, lack of understanding, communication issues, lack of teamwork, monitoring issues, workload, and late refactoring. Lack of cooperation and coordination often causes these problems. Poorly written code can even harm an entire project when the project changes hands. Code smell refers to a deeper issue inside a program's source code. A code smell may not affect the program's result, but it still harms the performance of the source code. Violating the basics of software development decreases code quality and increases technical debt, which motivates identifying code smells automatically. WekaNose is a tool that detects code smells in source code using the Weka software. Other code smell detection tools are PMD, iPlasma, JDeodorant, Decoder, Checkstyle, etc.

Figure 1. Dirty code

Figure 1 illustrates dirty code containing a data clump, a form of code complexity in which groups of variables are combined to form objects at the class level. It increases the execution time of the program by allocating data values to the variables. In the figure, data members such as ccno, expmonth, expyear, and amt hold random data values, which adds to code complexity. This can be avoided by deleting the assigned values.
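Since the figure itself is not reproduced here, the following is only a hypothetical sketch of what such a data clump can look like and how it might be refactored. The field names ccno, expmonth, expyear, and amt come from the text; the function name, the class name `CardPayment`, and the validation rule are assumptions made for illustration. The remedy shown is the standard extract-class refactoring, not necessarily the change the authors applied.

```python
from dataclasses import dataclass


# Smelly version: the same four values travel together through every call
# (a "data clump"). The function name and the check are hypothetical.
def charge_smelly(ccno: str, expmonth: int, expyear: int, amt: float) -> str:
    if not (1 <= expmonth <= 12):
        raise ValueError("invalid expiry month")
    return f"charged {amt:.2f} to card ending {ccno[-4:]} (exp {expmonth:02d}/{expyear})"


# Refactored version: the clump is extracted into one small class, so the
# values are validated once and passed around as a single object.
@dataclass(frozen=True)
class CardPayment:
    ccno: str
    expmonth: int
    expyear: int
    amt: float

    def __post_init__(self) -> None:
        if not (1 <= self.expmonth <= 12):
            raise ValueError("invalid expiry month")

    def charge(self) -> str:
        return (f"charged {self.amt:.2f} to card ending {self.ccno[-4:]} "
                f"(exp {self.expmonth:02d}/{self.expyear})")
```

Both versions produce the same message for the same input; the difference is that callers of the refactored version handle one `CardPayment` value instead of four loose parameters.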

Feature selection is the automatic or manual selection of relevant features from the large amount of data used to construct a model. It improves the accuracy of a model while reducing its complexity by choosing a subset of the best features before any general-purpose algorithm is applied. Parameters involved in feature selection include correlation, entropy, and mutual information. Feature selection methods include Recursive Feature Elimination, the chi-squared test, and feature evaluation.

Machine learning enables a machine to learn from data and make predictions without being explicitly programmed. We have used different supervised, unsupervised, and anomaly detection algorithms to identify smelly data in real-time datasets. In our research, the prime focus is on code smell detection through the identification of outliers. We have used unsupervised anomaly detection methods, namely PCA, GMM, autoencoder, k-means clustering, and Bayesian network, to identify outliers in the dirty code. We have also assessed the performance of the system by comparing accuracy.

Software quality is defined as the robustness or fitness of a software product. It is analyzed through the following parameters: reusability, correctness, portability, and maintainability. Software quality assurance produces high-quality software while saving time and cost. Code smell affects source code by violating good program design principles, which has a negative impact on software quality. The primary solution to this problem is to refactor the code. Refactoring changes the internal structure of code without altering its external functionality. Refactoring techniques include replace parameter, inline method, and extract class.

RQ1: Which type of feature selection method is used for analyzing the open-source projects?

In this work, feature selection is used to reduce complexity and increase the efficiency of the proposed model. Recursive Feature Elimination (RFE) selects the relevant data by removing the weakest features of the dataset.
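The idea behind RFE can be sketched in a few lines: fit a model, rank the features by the magnitude of their weights, drop the weakest one, and repeat. The sketch below is a minimal illustration, not the authors' implementation. It assumes a plain logistic regression as the ranking model, and the metric names (`loc`, `wmc`, `noise`) and all data values are invented for the example.

```python
import math


def standardize(X):
    """Scale every column to zero mean and unit variance."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]


def fit_logistic(X, y, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent; return the weights."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, target in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            err = 1.0 / (1.0 + math.exp(-z)) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w


def rfe(X, y, names, n_keep):
    """Repeatedly refit the model and drop the feature with the smallest |weight|."""
    X = standardize(X)
    kept = list(range(len(names)))
    while len(kept) > n_keep:
        sub = [[row[j] for j in kept] for row in X]
        w = fit_logistic(sub, y)
        weakest = min(range(len(kept)), key=lambda i: abs(w[i]))
        del kept[weakest]
    return [names[j] for j in kept]


# Invented toy data: two metrics that separate smelly (1) from clean (0)
# classes, plus one column that carries no signal.
names = ["loc", "wmc", "noise"]
X = [[0.9, 0.8, 0.5], [0.8, 0.9, 0.4], [0.7, 0.9, 0.6], [0.9, 0.7, 0.5],
     [0.2, 0.1, 0.5], [0.1, 0.2, 0.6], [0.3, 0.1, 0.4], [0.2, 0.3, 0.5]]
y = [1, 1, 1, 1, 0, 0, 0, 0]
selected = rfe(X, y, names, n_keep=2)
print(selected)
```

Standardizing the columns first matters here: weight magnitudes are only comparable as importance scores when the features share a scale.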

RQ2: Which type of anomaly detection is preferable for analyzing code smell?

This work illustrates an anomaly detection technique that identifies outliers by comparing dirty code with clean code. We have used five different algorithms to identify the extreme code points that deviate from the original data samples. Cluster-based anomaly detection methods give the best results for code smell detection.
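A cluster-based detector of this kind can be sketched as follows. The preview does not say how cluster assignments are turned into an outlier decision, so the scoring rule here (distance to the nearest centroid, flagged when it exceeds three times the median distance) is an assumption, as are the function names and the two-dimensional toy data. For reproducibility the sketch initializes the centroids from the first k points rather than at random.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means, initialized from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster; keep the old
        # centroid if a cluster ends up empty.
        centroids = [
            [sum(col) / len(c) for col in zip(*c)] if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


def find_outliers(points, k=2, factor=3.0):
    """Flag points lying farther than factor x the median distance from their nearest centroid."""
    centroids = kmeans(points, k)
    dists = [min(math.dist(p, c) for c in centroids) for p in points]
    threshold = factor * sorted(dists)[len(dists) // 2]
    return [p for p, d in zip(points, dists) if d > threshold]


# Invented toy data: two dense groups of (metric_1, metric_2) values and one
# point that belongs to neither group.
points = [
    [1.0, 1.0], [5.0, 5.0],
    [1.2, 0.9], [0.9, 1.1], [1.1, 1.2], [0.8, 0.9],
    [5.1, 4.9], [4.9, 5.2], [5.2, 5.1], [4.8, 4.9],
    [9.0, 1.0],
]
outliers = find_outliers(points)
print(outliers)
```

One known limitation of this approach is that an extreme point can capture a centroid of its own, in which case its distance is zero and it goes undetected; the choice of k and of the initialization therefore affects what is flagged.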

RQ3: What are the most commonly found code smells, and which refactoring approaches are suitable for developing clean code?
