Software companies invest a large amount of time in testing software. One of the problems they face during this process is "test flakiness": some tests, instead of failing because of a bug in the software, behave nondeterministically and can fail for external reasons (e.g., randomness in the software, concurrency, network latency, etc.). When run multiple times on the same version of the software, such tests may pass or fail, so developers cannot rely on their outcome to identify bugs. Developers who are asked to fix a potential bug reported by a flaky test may simply waste their time, since the bug is not there. The current approach to detecting flaky tests (and then treating them differently) is to rerun them: if they produce different outcomes in different runs, they are declared flaky. This strategy is costly at industrial scale (e.g., Google runs millions of tests every day, out of which up to 16% are flaky).
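The rerun strategy described above can be sketched as follows. This is only a minimal illustration, not the tooling any particular company uses: the function name `classify_by_rerun`, the default of 10 runs, and the modelling of a test as a callable that returns `True` on pass are all assumptions made for the example.

```python
import random
from typing import Callable


def classify_by_rerun(test: Callable[[], bool], runs: int = 10) -> str:
    """Rerun `test` on the same code version and classify it.

    `test` is a hypothetical stand-in for a real test case: it returns
    True when the test passes and False when it fails.
    """
    if runs < 2:
        raise ValueError("at least two runs are needed to observe flakiness")

    # Collect the distinct outcomes observed across all reruns.
    outcomes = {test() for _ in range(runs)}

    # Different outcomes on identical code mean the test is flaky.
    if len(outcomes) > 1:
        return "flaky"
    return "pass" if outcomes == {True} else "fail"


if __name__ == "__main__":
    # Hypothetical test whose result depends on randomness.
    def random_dependent_test() -> bool:
        return random.random() < 0.5

    print(classify_by_rerun(lambda: True))         # always passes
    print(classify_by_rerun(lambda: False))        # always fails
    print(classify_by_rerun(random_dependent_test))
```

The sketch also makes the cost visible: every test is executed `runs` times, so the total work grows linearly with the number of reruns, and a flaky test that happens to produce the same outcome in every run is still misclassified.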
Cruciani et al. proposed a fast, static approach to detecting flaky tests, based on machine learning techniques, that showed high precision in our preliminary experiments.
The full list of winning projects and some more details on the award can be found here.