Flaky tests

What's a flaky test?

It's a test that fails intermittently but eventually passes if you retry it enough times.

Quarantined tests

When a test frequently fails in master, a ~"master:broken" issue should be created. If the test cannot be fixed in a timely fashion, it hurts the productivity of all developers, so it should be placed in quarantine by assigning it the :quarantine metadata.

This means it will be skipped unless run with --tag quarantine:

bin/rspec --tag quarantine

Before putting a test in quarantine, you should make sure that a ~"master:broken" issue exists for it so it won't stay in quarantine forever.

Once a test is in quarantine, there are three options to consider:

Should the test be fixed (i.e. get rid of its flakiness)?
Should the test be moved to a lower level of testing?
Should the test be removed entirely (e.g. because there's already a lower-level test, or it's duplicating another same-level test, or it's testing too much etc.)?

Quarantined tests on the CI

Quarantined tests are run on the CI in dedicated jobs that are allowed to fail:

rspec-pg-quarantine and rspec-mysql-quarantine (CE & EE)
rspec-pg-quarantine-ee and rspec-mysql-quarantine-ee (EE only)

Automatic retries and flaky test detection

On our CI, we use rspec-retry to automatically retry a failing example a few times (see spec/spec_helper.rb for the exact retry count).
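The idea behind retrying can be sketched in plain Ruby. This is not rspec-retry's implementation or its API: run_with_retries is a hypothetical helper, and the attempt count of 4 is an arbitrary value chosen for the illustration.

```ruby
# Hypothetical helper, not part of rspec-retry: run a block up to `attempts`
# times and report whether it passed and how many tries that took.
def run_with_retries(attempts:)
  tries = 0
  begin
    tries += 1
    yield tries
    { passed: true, tries: tries }
  rescue StandardError => error
    # Re-run the `begin` block until the attempt budget is used up.
    retry if tries < attempts
    { passed: false, tries: tries, error: error.message }
  end
end

# Stand-in for a flaky example: fails on the first two tries, then passes.
result = run_with_retries(attempts: 4) do |try|
  raise 'transient failure' if try < 3
end

# An example that passes only after more than one try is a flaky one.
flaky = result[:passed] && result[:tries] > 1
```

Retrying keeps the pipeline green, but it also hides the flakiness, which is why passing on a later try is worth recording rather than ignoring.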

We also use a custom RspecFlaky::Listener listener, which records flaky examples in a JSON report file on master (retrieve-tests-metadata and update-tests-metadata jobs) and warns when a new flaky example is detected in any other branch (flaky-examples-check job). In the future, the flaky-examples-check job will not be allowed to fail.

This was originally implemented in: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/13021.
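The record-and-compare flow described above can be sketched in plain Ruby. This is not the RspecFlaky::Listener code: the FlakyReport class, its methods, the report layout, and the example ids are all hypothetical and only illustrate the idea of comparing a branch run against the report kept on master.

```ruby
require 'json'

# Hypothetical sketch, not the real RspecFlaky::Listener.
class FlakyReport
  # `known` maps example ids to data, as loaded from the report kept on master.
  def initialize(known = {})
    @known = known
    @seen = {}
  end

  def self.load(json)
    new(JSON.parse(json))
  end

  # An example that passed only after more than one attempt counts as flaky.
  def record(id, passed:, attempts:)
    return unless passed && attempts > 1

    @seen[id] = { 'attempts' => attempts }
  end

  # Flaky examples seen in this run that the master report does not know yet.
  def new_flaky_examples
    @seen.keys - @known.keys
  end

  # Updated report: everything already known plus what this run found.
  def to_json(*args)
    @known.merge(@seen).to_json(*args)
  end
end

report = FlakyReport.load('{"spec/a_spec.rb[1:1]": {"attempts": 2}}')
report.record('spec/a_spec.rb[1:1]', passed: true, attempts: 2) # already known
report.record('spec/b_spec.rb[1:2]', passed: true, attempts: 3) # new flaky example
report.record('spec/c_spec.rb[1:1]', passed: true, attempts: 1) # stable, ignored
new_flaky = report.new_flaky_examples
```

On master the updated report would be written back out, while on any other branch a non-empty new_flaky list is what would trigger the warning.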

Problems we had in the past at GitLab

rspec-retry is biting us when some API specs fail: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/9825
Sporadic RSpec failures due to PG::UniqueViolation: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/9846
- Follow-up: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10688
- Capybara.reset_session! should be called before requests are blocked: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/12224
FFaker generates funky data that tests are not ready to handle (and tests should be predictable so that's bad!):

Time-sensitive flaky tests

Array order expectation

https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10148
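A common cause of this class of failure, offered here as an assumption rather than a summary of the merge request above, is asserting on the order of a collection whose order is not guaranteed, such as rows fetched without an explicit ordering. The plain-Ruby sketch below uses shuffle as a stand-in for that unspecified order; the values are made up.

```ruby
# `shuffle` stands in for a result whose order is not guaranteed.
expected = %w[alpha beta gamma]
actual = expected.shuffle

# Order-sensitive comparison: true or false depending on the shuffle.
order_sensitive = (actual == expected)

# Order-insensitive comparison: holds for every possible ordering.
order_insensitive = (actual.sort == expected.sort)
```

In a spec, RSpec's match_array matcher expresses the order-insensitive check directly. When the order itself matters, make it explicit in the code under test.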

Feature tests

Capybara viewport size related issues

Transient failure of spec/features/issues/filtered_search/filter_issues_spec.rb: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/10411

Capybara JS driver related issues

PhantomJS / WebKit related issues

Memory is through the roof! (TL;DR: Load images but block images requests!): https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/12003

Resources
