Quality Metrics of Automated Tests in the Application of Machine Learning
Abstract
Machine learning (ML) methods are increasingly integrated into automated testing, but the absence of a unified system of quality metrics makes their effectiveness difficult to assess. Given the rising cost of defect resolution and the growing share of unstable (flaky) tests, the study of quality metrics for automated tests that take ML approaches into account is highly relevant. This work classifies existing metrics, applies them to the particular requirements of testing ML systems, and proposes an economic model for justifying decisions in CI processes. Its novelty lies in unifying classical, ML-oriented, and financial metrics and in evaluating their impact on practical infrastructure costs, as demonstrated through industrial cases from Facebook/Meta, Netflix, and Slack. ML approaches to automated testing based on Predictive Test Selection were found to yield significant savings in CPU hours (Gradle, Netflix) and to reduce the percentage of flaky tests, a result that data from Facebook/Meta and Slack also confirm. PR-AUC and other precision–recall-based metrics are shown to reflect real-world imbalanced defect classes more faithfully than ROC curves. The economic model, with coefficients C_FP and C_FN, makes it possible to compute the optimal tradeoff between test execution speed and defect detection rate, supporting decisions in Continuous Integration/Continuous Deployment (CI/CD). The quantitative data give rise to integral metrics that combine coverage type, build stability, and economic components. The article will be of particular interest to quality assurance engineers, project managers, and researchers in software testing and machine learning.
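The abstract does not state the form of the economic model, so the following is only an illustrative sketch under explicit assumptions: C_FP is read as the cost of running a test that would have passed, C_FN as the cost of skipping a test that would have failed, and total cost is taken to be linear in the two error counts. The function names, scores, and labels are hypothetical and are not taken from the article.

```python
from typing import List, Tuple


def expected_cost(scores: List[float], labels: List[int],
                  threshold: float, c_fp: float, c_fn: float) -> float:
    """Assumed linear cost: c_fp * FP + c_fn * FN.

    A test is selected to run when its predicted failure score >= threshold.
    FP: selected test that would have passed (wasted compute).
    FN: skipped test that would have failed (missed defect).
    """
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return c_fp * fp + c_fn * fn


def best_threshold(scores: List[float], labels: List[int],
                   c_fp: float, c_fn: float) -> Tuple[float, float]:
    """Return the (threshold, cost) pair with the lowest assumed cost."""
    # Candidate thresholds: every observed score, plus "run nothing".
    candidates = sorted(set(scores)) + [float("inf")]
    costs = [(t, expected_cost(scores, labels, t, c_fp, c_fn))
             for t in candidates]
    return min(costs, key=lambda pair: pair[1])


# Hypothetical data: predicted failure scores and true outcomes (1 = fails).
scores = [0.05, 0.10, 0.20, 0.35, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]

# A missed defect is 20x costlier than a wasted run: select tests liberally.
print(best_threshold(scores, labels, c_fp=1, c_fn=20))

# A missed defect is cheaper than a wasted run: select far fewer tests.
print(best_threshold(scores, labels, c_fp=1, c_fn=0.5))
```

Under these assumptions, raising C_FN relative to C_FP lowers the selection threshold, so more tests run and fewer defects escape, while lowering it favors execution speed.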
License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.