A Better GitLab Code Quality - Part 1 - Goodbye CodeClimate

August 27, 2024

GitLab's Code Quality analysis template can be a convenient tool, but the default CodeClimate-based solution has some significant drawbacks, and was recently deprecated. This post, part 1 of a series, explores the security, performance, and usability issues with the default Code Quality template, and why its deprecation is not a significant loss. Future parts of this series will explore a more flexible, more adaptable, and more capable approach.

GitLab CodeClimate-based Code Quality is deprecated

GitLab 17.3 announced the official deprecation of GitLab's CodeClimate-based Code Quality analysis, with removal planned for GitLab 18.0. In the interim, only limited updates are planned, in accordance with GitLab's support policy for deprecated capabilities. There is an open epic looking at future solutions, although it's notional and aspirational at this point. GitLab plans to keep Code Quality reports and the existing UI capabilities in Pipelines and Merge Requests, so custom Code Quality solutions should keep working as expected (which is the direction this series is going).

The remainder of this post was originally written before this announcement, so it's included here to provide additional context on why this is not a great loss.

Problems with GitLab's CodeClimate-based Code Quality

GitLab's CodeClimate-based Code Quality template can be a convenient and effective tool for analyzing code quality, but it does have some significant drawbacks including security concerns, performance issues, usability issues, and analysis limitations.

Runner security

CodeClimate requires one of two general configurations to perform the analysis since it spawns containers as needed based on the analysis requirements. These configurations expose potentially significant security concerns.

- The default template assumes Docker-in-Docker is used. This is a well understood technique, but requires runners running in privileged mode. Privileged mode gives the container all the root capabilities of the host. From the GitLab docs: "When privileged mode is enabled, a user running a CI/CD job could gain full root access to the runner’s host system, permission to mount and unmount volumes, and run nested containers. By enabling privileged mode, you are effectively disabling all the container’s security mechanisms and exposing your host to privilege escalation, which can lead to container breakout."
- The alternative configuration for private runners requires exposing the Docker daemon socket to the container (required to launch new containers). By default, this is owned by the root user, again exposing root access to the host.
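
To make the first configuration concrete, a Docker-in-Docker job generally has the following shape. This is a minimal sketch, not the contents of GitLab's template: the job name `dind_sketch` is hypothetical, the image tags are the ones listed in the size tables later in this post, and the variables assume TLS is disabled between the job and the service. A job like this only works on a runner registered with privileged mode enabled, which is exactly the concern.

```yaml
# Hypothetical job illustrating the Docker-in-Docker pattern (not GitLab's template).
dind_sketch:
  image: docker:20.10.12
  services:
    # The dind service starts a nested Docker daemon, which is why the
    # runner must allow privileged containers.
    - docker:20.10.12-dind
  variables:
    # Assumption: TLS is disabled between the job and the dind service.
    DOCKER_HOST: tcp://docker:2375
    DOCKER_TLS_CERTDIR: ""
  script:
    # Any docker command here talks to the nested daemon.
    - docker info
```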

These settings represent attack vectors that could lead to the compromise of any CI job run in the container, as well as compromise of the host running the container and anything running on it. There are methods to limit the potential impact (not running parallel CI jobs on a VM, running on ephemeral infrastructure, running Docker in rootless mode, etc.), but there's no way to completely mitigate the risk; each option just represents a different level of it. There are also questions about the viability of some solutions, like rootless mode, as this issue suggests.

Even limiting access to known users can lead to an exploit if they bring in compromised external dependencies (and what application doesn't bring in external dependencies in modern development?). Because of that, the recommendation to not allow configurations with these potential attack vectors is well documented in container security best practice guides from across the industry, including Anchore, Aqua Security, OWASP Foundation, NIST, GitLab's own Docker runner security risks, and certainly others.

Even Docker Hub's Docker page recommends reviewing this article to understand the risks before running Docker-in-Docker. That article has since been updated with a more secure alternative, which has not been investigated thoroughly as part of this evaluation. It may be a viable solution, although it would not resolve the other issues identified here.

Speed

The GitLab Code Quality job is slow, with several contributing factors:

- The Docker-in-Docker startup cost (reduced if running GitLab's alternative configuration).
- The overhead of launching the various analysis containers.
- The time to download the various analysis containers.

The job can take several minutes even for a small project (e.g. < 500 lines of JavaScript), which can quickly make it the longest running job in a pipeline. There are some mitigations to reduce these impacts. The first two performance items in the preceding list are somewhat constrained by the compute and memory allocated to the runner, so allocating more resources can reduce execution time. For gitlab.com shared SaaS runners, this would mean moving to more capable runners (medium, large, x-large, 2x-large), although this does come at increased usage cost. Even then, there are limits to the effectiveness, and a single job running multiple analyses is typically slower than running multiple parallel jobs.
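
As a rough illustration of the runner-size mitigation, the template's job can be overridden to request a larger hosted runner. This is a sketch under assumptions: `code_quality` is taken to be the job name defined by the template, and `saas-linux-medium-amd64` is assumed to be the gitlab.com tag for a medium Linux runner as of this writing, so check the current hosted runner documentation before relying on it.

```yaml
include:
  - template: Jobs/Code-Quality.gitlab-ci.yml

# Override only the runner selection; everything else comes from the template.
code_quality:
  tags:
    - saas-linux-medium-amd64  # assumed tag name for a medium gitlab.com runner
```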

The time to download can be harder to improve since some of these containers are large. To illustrate the magnitude of the problem, the following table shows the container images and sizes needed to run Code Quality analysis configured for a JavaScript-based web project (with csslint, duplication, eslint, and fixme enabled, which is the same as the default configuration but with coffeelint and rubocop disabled).

| Image | Size |
| --- | --- |
| `docker:20.10.12` | 65 MB |
| `docker:20.10.12-dind` | 72 MB |
| `registry.gitlab.com/gitlab-org/ci-cd/codequality:0.96.0` | 110 MB |
| `codeclimate/codeclimate-structure:latest` | 2.3 GB |
| `codeclimate/codeclimate-csslint:latest` | 60 MB |
| `codeclimate/codeclimate-duplication:latest` | 2.34 GB |
| `codeclimate/codeclimate-eslint:latest` | 170 MB |
| `codeclimate/codeclimate-fixme:latest` | 21 MB |
| Total | 5.14 GB |

The following table shows the container images and size to run Code Quality analysis on a Go-based project with an updated configuration enabling all of the available Go plugins.

| Image | Size |
| --- | --- |
| `docker:20.10.12` | 65 MB |
| `docker:20.10.12-dind` | 72 MB |
| `registry.gitlab.com/gitlab-org/ci-cd/codequality:0.96.0` | 110 MB |
| `codeclimate/codeclimate-structure:latest` | 2.3 GB |
| `codeclimate/codeclimate-duplication:latest` | 2.34 GB |
| `codeclimate/codeclimate-fixme:latest` | 21 MB |
| `codeclimate/codeclimate-golint:latest` | 127 MB |
| `codeclimate/codeclimate-gofmt:latest` | 101 MB |
| `codeclimate/codeclimate-govet:latest` | 125 MB |
| Total | 5.26 GB |
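
For reference, a configuration along these lines would pull the plugin images in the preceding table. It's a sketch rather than the exact file used for these measurements: it assumes the version 2 `.codeclimate.yml` format placed at the project root, and that the `structure` image is pulled by CodeClimate's built-in maintainability checks rather than by a plugin entry. Whether the duplication plugin needs extra settings to analyze Go is not covered here.

```yaml
# .codeclimate.yml (sketch, assumed to match the Go setup described above)
version: "2"
plugins:
  duplication:
    enabled: true
  fixme:
    enabled: true
  gofmt:
    enabled: true
  golint:
    enabled: true
  govet:
    enabled: true
```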

Each of these two configurations pulls over 5 GB of container images.

There are various container image caching strategies that may help in some instances, but they may not be available depending on your runner configuration.

Ease of use

The GitLab docs give the impression that enabling their CodeClimate-based Code Quality analysis is as simple as including the following in the project's .gitlab-ci.yml file:

```yaml
include:
  - template: Jobs/Code-Quality.gitlab-ci.yml
```

The reality is that this is only partially true. The default configuration is used in this case, which runs the following plugins:

- csslint (CSS)
- coffeelint (CoffeeScript)
- duplication (only for Ruby, JavaScript, Python, PHP)
- eslint (JavaScript/TypeScript)
- fixme
- rubocop (Ruby)

Some of these may not be applicable, for example the coffeelint plugin is probably past its useful life. There are also numerous other analysis plugins, but these need to be specifically configured with a .codeclimate.yml file. CodeClimate is not adaptable enough to check file types in the project and run the applicable plugins or jobs as is done with other GitLab templates, for example sast or dependency_scanning (if using GitLab Ultimate).
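
A `.codeclimate.yml` for the JavaScript web project configuration measured earlier might look like the following. This is a minimal sketch assuming the version 2 configuration format; the plugin names are those listed above, and any plugin-specific settings are left out.

```yaml
# .codeclimate.yml (sketch): the default plugin set minus coffeelint and rubocop
version: "2"
plugins:
  csslint:
    enabled: true
  duplication:
    enabled: true
  eslint:
    enabled: true
  fixme:
    enabled: true
  coffeelint:
    enabled: false
  rubocop:
    enabled: false
```

Every project has to carry and maintain a file like this, because nothing in the template detects which plugins apply.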

Limited analysis engines

The complete list of CodeClimate analysis plugins can be found here. The officially supported list of plugins is quite limited, and even with the community supported plugins it's far from comprehensive coverage of modern development languages. Even in cases where languages are covered, the analysis engines (all integrating other open source tools) are outdated. A couple of examples:

- There's no question that eslint is the solution for linting JavaScript, but CodeClimate only supports up to v8.50.0 (released September 2023), even though the latest v8 is v8.57.0 (released February 2024), and v9 is out (released April 2024) and has 12 releases up to v9.9.1 (as of this writing in August 2024). Given this release cadence with valuable fixes and rule updates, the lag to integrate into CodeClimate is intolerable (in this case a year behind).
- The available Go plugins are limited to gofmt, govet, and golint. While the first two are still used, golint has been archived for years (and even the CodeClimate plugin is deprecated). There's no support for modern engines like golangci-lint, which itself has dozens of linters covering almost anything you could want to lint (including gofmt and govet).

Summary

This post has examined the various issues with GitLab's CodeClimate-based Code Quality analysis, and why its deprecation is not a significant loss. Future parts of this series will explore a more flexible, more adaptable, and more capable approach.