
Improving Test Case Selection by Handling Class and Attribute Noise

EasyChair Preprint 6605

44 pages
Date: September 13, 2021

Abstract

Big data and machine learning models have been increasingly used to support software engineering practices. One example is the use of machine learning models to improve test case selection in continuous integration. However, one of the challenges in building such models is the large volume of noise present in the data, which impedes their predictive performance. In this paper, we address this issue by studying the effect of two types of noise (class and attribute) on the predictive performance of a test selection model. To this end, we analyze the effect of class noise using an approach that relies on domain knowledge to relabel contradictory entries and remove duplicate ones. Thereafter, an existing approach from the literature is used to experimentally study the effect of attribute noise removal on learning. The results show that the best learning performance is achieved when training a model on data cleaned of class noise only, irrespective of attribute noise. Specifically, the model achieved 81% precision, 87% recall, and an 84% F-score, compared with 44% precision, 17% recall, and a 25% F-score for a model built on uncleaned data. Finally, no causal relationship could be established between attribute noise removal and the learning performance of a model for test case selection.
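As a quick consistency check, the reported F-score follows from precision and recall as F = 2PR / (P + R): with P = 0.81 and R = 0.87, this gives 2 × 0.81 × 0.87 / (0.81 + 0.87) ≈ 0.84, matching the reported 84%.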
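The class-noise handling described above can be illustrated with a minimal sketch. The snippet below is not the authors' implementation; it assumes a tabular dataset with hypothetical feature columns and a label column, and uses a simple majority vote to stand in for the domain-knowledge relabeling rule:

import pandas as pd

def clean_class_noise(df: pd.DataFrame, feature_cols: list, label_col: str = "label") -> pd.DataFrame:
    # Step 1: remove exact duplicates (identical features and label).
    df = df.drop_duplicates(subset=feature_cols + [label_col])

    # Step 2: relabel contradictory entries (identical features but
    # different labels). A majority vote stands in here for the
    # domain-knowledge rule described in the paper.
    majority = df.groupby(feature_cols)[label_col].transform(lambda s: s.mode().iloc[0])
    df = df.assign(**{label_col: majority})

    # Step 3: drop the duplicates that relabeling just created.
    return df.drop_duplicates(subset=feature_cols + [label_col])

After this cleaning, contradictory rows collapse to a single consistently labeled row per feature combination, which is the property a class-noise-cleaned training set relies on.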

Keyphrases: attribute noise, class noise, machine learning, regression testing, test case selection

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@booklet{EasyChair:6605,
  author       = {Khaled Al-Sabbagh and Miroslaw Staron and Regina Hebig},
  title        = {Improving Test Case Selection by Handling Class and Attribute Noise},
  howpublished = {EasyChair Preprint 6605},
  year         = {EasyChair, 2021}}