Summary: | Regression testing is the process of confirming that a code change has not introduced any test failures into the current build. A commonly used regression testing technique is regression test selection (RTS): the process of identifying all tests affected by a code change, typically by building a dependency graph of the project, and then executing only those tests. The purpose of RTS is to shorten development time by reducing the time spent on testing. Recent studies have used machine learning as a test selection tool, with promising results. Machine learning was used together with an RTS tool to further reduce the number of tests selected. The features are extracted primarily from the dependency graph produced by the RTS tool. The model then estimates the probability that each test will fail, and tests are selected based on that probability. However, training a machine learning model requires a large amount of data, including faulty code changes: code defects must be run through the RTS tool while data is extracted from the test runs. For open source projects, obtaining a large number of historical code defects is challenging. This paper presents EALRTS, a predictive regression test selection tool. EALRTS uses mutation generation instead of historical code defects. The data for the machine learning model is obtained with the help of STARTS, a static RTS tool. The extracted data comes mainly from two sources: (1) the dependency graph that STARTS creates and (2) the test result reports. The data is then used to train a Random Forest model whose goal is to predict which tests to select. EALRTS reduced the number of tests selected by 60.3% while finding 95% of all failed tests. The recall rate is interpreted as the number of individual test failures found in a test class. 
The results show a trade-off between the number of individual test failures found and the number of tests selected. The trade-off suggests that a machine learning model can drastically lower the number of tests selected at the cost of a slight reduction in recall. The results for EALRTS are based on a single case study: 725 test runs on a project consisting of 808 Java files. === Regression testing is the process of confirming that a code change has not introduced any test failure into the project. A commonly used regression testing technique is Regression Test Selection, or RTS. It is the process of identifying all tests affected by a code change. The purpose of RTS is to shorten development time by reducing the time spent on testing. Machine learning has been used as a test selection tool in recent studies and has shown promising results. Machine learning was used together with an RTS tool to further reduce the number of tests selected. However, training a machine learning model requires a large amount of data and code changes that have introduced test failures. For open source projects, finding a large number of code changes that cause test failures is challenging. This study presents EALRTS, a testing tool that can predict which tests need to be run. EALRTS uses mutation generation instead of existing faulty code changes. EALRTS reduced the number of tests selected by 60.3% while finding 95% of all failed tests. The result suggests that a machine learning model can lower the number of tests selected at the cost of a small reduction in the number of failing tests found. The results for EALRTS are based on a single case study: 725 test runs on a project consisting of 808 Java files.
|
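The selection pipeline summarized above can be illustrated with a minimal Python sketch. It is not the EALRTS or STARTS implementation: the function names, the toy dependency graph, and the failure probabilities are hypothetical, and the probabilities stand in for the output of a trained model such as a Random Forest. The sketch first finds the tests affected by a change through the dependency graph, then keeps only those whose predicted failure probability reaches a threshold, and finally reports the selection reduction and the recall, counting individual test failures per test class as the summary describes.

```python
from collections import deque


def affected_tests(dependents, changed, tests):
    """Static RTS step: tests that transitively depend on a changed class.

    `dependents` maps a class to the classes that depend on it directly.
    """
    seen = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for dep in dependents.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return sorted(t for t in tests if t in seen)


def select_by_probability(probabilities, threshold):
    """Predictive step: keep tests whose failure probability meets the threshold."""
    return sorted(t for t, p in probabilities.items() if p >= threshold)


def selection_metrics(candidates, selected, failures_per_class):
    """Return (reduction, recall) for a selection.

    reduction: fraction of candidate test classes that were not selected.
    recall: fraction of individual test failures located in selected classes.
    """
    reduction = 1 - len(selected) / len(candidates)
    total = sum(failures_per_class.values())
    found = sum(n for t, n in failures_per_class.items() if t in set(selected))
    recall = found / total if total else 1.0
    return reduction, recall


if __name__ == "__main__":
    # Hypothetical project: class A was changed.
    dependents = {"A": ["B", "ATest"], "B": ["BTest", "D"], "D": ["DTest"], "C": ["CTest"]}
    tests = ["ATest", "BTest", "CTest", "DTest"]
    candidates = affected_tests(dependents, ["A"], tests)

    # Hypothetical model output (failure probability per affected test class).
    probabilities = {"ATest": 0.8, "BTest": 0.05, "DTest": 0.3}
    selected = select_by_probability(probabilities, threshold=0.25)

    # Hypothetical ground truth: individual failing tests per test class.
    failures = {"ATest": 3, "BTest": 1}
    reduction, recall = selection_metrics(candidates, selected, failures)
    print(candidates, selected, round(reduction, 3), recall)
```

Raising the threshold in this sketch selects fewer test classes but risks missing failures in classes with low predicted probability, which is the trade-off the summary reports.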