Skip to content

Test Regressions

If a test with a perfect execution history suddenly fails, it sends a strong signal that the code change is breaking the system.

Flakiness.io detects such situations and reports them by introducing fifth test status - regression:

All 5 test statuses All 5 test statuses

The regression window is a configurable time period (in days) that defines how far back Flakiness.io looks for a test’s perfect execution history when determining if a failure should be classified as a regression.

This setting can be customized per project in your project settings, allowing you to adjust the sensitivity of regression detection based on your team’s needs:

  • A shorter window (e.g., 7 days) makes regression detection more sensitive
  • A longer window (e.g., 30 days) requires a longer period of stability before flagging regressions

To compute regressions for any timeline, Flakiness.io:

  1. Computes test statuses for all commits
  2. Classifies commit failures with perfect history within the regression window as regressions
  3. Classifies failed runs inside Regressed commits as regressions
  4. Classifies failed days with regressed commits inside as regressions

To illustrate the logic, let’s say we have a testNervousSquirrel test which hasn’t been failing since forever, but suddenly failed yesterday in Commit X on May 25, 2025, and was failing since then:

All 5 test statuses All 5 test statuses

Flakiness.io classifies commit-level failures as regressions if the test has a perfect record for the previous regression window period.

So in our example:

  • Since Commit X failed, and the test had perfect record within the regression window, the failure in Commit X will be classified as regression
  • The test continued failing in both Commit Y and Commit Z. However, these failures are not new, so they will not be classified as regressions.
All 5 test statuses All 5 test statuses

Flakiness.io classifies run-level failure as regression if this failure is classified as regression on the commit level.

So if Commit X had, for example, 2 failing runs, then these run failures would be classified as regressions:

All 5 test statuses All 5 test statuses

Flakiness.io classifies “failed” status for a day as “regression” if the regression happened during this day.

So in our example:

  • Failure of the May 25, 2025 will be classified as a regression, since the test regressed during this day
  • The failure on the following day of May 26, 2025, will not be classified as regression.
All 5 test statuses All 5 test statuses