Contributing#

Thank you for your interest in contributing to skore! We welcome contributions from everyone and appreciate you taking the time to get involved.

This project is hosted on probabl-ai/skore.

Below are some guidelines to help you get started.

Preamble#

Our values#

We aspire to treat everybody equally, and value their contributions. We are particularly seeking people from underrepresented backgrounds in Open Source Software to participate and contribute their expertise and experience.

Decisions are made based on technical merit, consensus, and roadmap priorities.

Code is not the only way to help the project. Reviewing pull requests, answering questions to help others on mailing lists or issues, organizing and teaching tutorials, working on the website, improving the documentation, are all priceless contributions.

We abide by the principles of openness, respect, and consideration of others of the Python Software Foundation: https://www.python.org/psf/codeofconduct/

Automated contributions policy#

Please refrain from submitting issues or pull requests generated by fully-automated tools. Maintainers reserve the right, at their sole discretion, to close such submissions and to block any account responsible for them.

Ideally, contributions should follow from a human-to-human discussion in the form of an issue.

Signing commits#

You have to sign your commits before submitting a pull request. For a pull request to be accepted, all the commits inside of it must be signed.

If you haven’t set up commit signing yet, GitHub supports signing using GPG, SSH, or S/MIME. Signed commits are marked as “Verified” on GitHub, providing confidence in the origin of your changes. For setup instructions and more details, please refer to GitHub’s guide on signing commits.
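Before opening a pull request, it can help to check locally that none of your commits is missing a signature. The following is a minimal sketch, not part of the skore tooling: it relies on git's `%G?` log placeholder, which prints `N` for a commit without a signature, and it assumes that the main repository is available as the `upstream` remote. The function names are hypothetical. Local verification depends on your own GPG or SSH configuration, so the “Verified” badge on GitHub remains the reference.

```python
import subprocess


def unsigned_commits(log_output: str) -> list[str]:
    """Return the hashes of commits that carry no signature.

    Each line of ``log_output`` is expected to look like ``<hash> <status>``,
    as produced by ``git log --format="%H %G?"``.
    """
    missing = []
    for line in log_output.splitlines():
        if not line.strip():
            continue
        commit_hash, status = line.split()
        # "N" is git's status code for "no signature".
        if status == "N":
            missing.append(commit_hash)
    return missing


def check_branch(base: str = "upstream/main") -> list[str]:
    """List unsigned commits on the current branch that are not in ``base``.

    The default value of ``base`` is an assumption about your remote setup.
    """
    result = subprocess.run(
        ["git", "log", "--format=%H %G?", f"{base}..HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return unsigned_commits(result.stdout)
```

Calling `check_branch()` from the root of your clone returns an empty list when every commit on your branch carries a signature.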

Questions, bugs, and feature requests#

If you have any questions, feel free to reach out:

When filing an issue:

Development#

Quick start#

You will need Python >=3.11.

Setting up your development environment#

Fork the repository on GitHub, clone your fork locally, and add a git remote pointing to the main skore repository. The commands below can be run in your terminal; they use an HTTPS connection as an example.

# Clone your fork of the repo
git clone https://github.com/YOUR_USERNAME/skore.git

# Navigate to the newly cloned directory
cd skore

# Add the original repository as a remote
git remote add upstream https://github.com/probabl-ai/skore.git

# Create a new branch for your issue
git checkout -b issue-NAME_OF_ISSUE

You can set up your development environment in two ways:

Using pixi#

Our continuous integration relies on pixi to manage and lock the development environments. The full matrix of environments (Python and scikit-learn versions, documentation, linting, …) is declared in the pixi.toml file at the root of the repository, and locked in pixi.lock. Using pixi is therefore the most reliable way to reproduce the environment used by the CI.

First, install pixi by following the official installation instructions. On Linux and macOS, this is usually:

curl -fsSL https://pixi.sh/install.sh | sh

pixi creates an isolated environment for you (you do not need to create a virtual environment yourself) and downloads the dependencies on first use. The environments are named in the [environments] table of pixi.toml. The default environment targets the latest supported Python/scikit-learn stack and is used when no environment is given. The most common commands are:

# Run the test suite in the default environment
pixi run tests

# Run the test suite in a specific environment from the matrix
pixi run -e py311-sklearn16 tests

# Run the linters (ruff, pre-commit hooks, ...)
pixi run -e lint lint

# Build the documentation
pixi run -e sphinx docs

The sphinx environment also installs graphviz (which provides the dot command used to render some diagrams), so you do not have to install it yourself when building the documentation with pixi.

Using a virtual environment and pip#

If you prefer not to use pixi, you can work in a standard virtual environment. You will need Python >=3.11.

Create and activate a virtual environment, for instance with the built-in venv module:

# Create a virtual environment in the `.venv` directory
python -m venv .venv

# Activate it (Linux/macOS)
source .venv/bin/activate

# Activate it (Windows, PowerShell)
.venv\Scripts\Activate.ps1

Once your environment is activated, install the development dependencies and set up pre-commit with:

python -m pip install --upgrade --editable './skore[test,sphinx,dev]'
pre-commit install

On older CPU architectures, use the following extras instead to get polars support:

python -m pip install --upgrade --editable './skore[test-lts-cpu,sphinx-lts-cpu,dev]'
pre-commit install

Consider re-running this command each time you rebase your branch onto main, as dependencies can change.

To build the documentation in this setup, you also need the dot command from Graphviz, which is not installed by pip. Install it with your system package manager, for example:

# Debian/Ubuntu
sudo apt-get install graphviz

# macOS (Homebrew)
brew install graphviz

Reproducing the CI environment with pip#

The dependency versions used by the CI are locked only in pixi’s own format (pixi.lock). If you do not use pixi but still want a pinned, cross-package-manager requirements.txt derived from skore/pyproject.toml, you can generate one on demand with the dedicated pixi task:

pixi run -e export export-requirements

This resolves the dependencies (including the test extra) with uv for the Python version pinned by the export environment and writes the result to requirements.txt at the root of the repository. The generated file is intentionally not committed; you can then install it with any pip-compatible tool:

pip install -r requirements.txt

Choosing an issue#

If you are new to open source, you can start with an issue tagged “good first issue”.

Some issues do not describe their implementation in much detail. You can either propose a solution, or pick only issues with a “Ready” status.

To be assigned to an issue, the contributor has to comment on it first (this is a GitHub limitation); a maintainer will then assign them. The assignment signals that the issue is being worked on and avoids overlapping work. If someone is already assigned to the issue, maintainers will not assign someone else unless the PR has been stalled for weeks. The contributor must then link the PR to the issue.

Pull requests#

Quick start:

  • Create a branch for your changes.

  • Commit your changes.

  • Push to your fork.

  • Submit a pull request to the main branch.

    • Link your PR to its corresponding issue (if any).

    • You can mark your PR as a draft if it is not ready to be reviewed by maintainers. You can also use a draft PR to get help on the code if needed.

We use the conventional commits format, and we automatically check that the PR title fits this format:

  • In particular, commits are “sentence case”, meaning that the fix: Fix issue title passes, while fix: fix issue does not.

  • Generally, the description of a commit should start with a verb in the imperative voice, so that it would properly complete the sentence: When applied, this commit will [...].

  • Examples of correct PR titles: docs: Update the docstrings or feat: Remove CrossValidationAggregationItem.
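The title rules above can be approximated with a small regular expression, which is handy for checking a title before pushing. This is only a sketch of the idea: the list of accepted types is an assumption based on the usual conventional-commits set, the function name is hypothetical, and the check actually run on pull requests may be stricter or looser.

```python
import re

# Assumed list of types; the set enforced by the skore CI may differ.
TYPES = (
    "build", "chore", "ci", "docs", "feat", "fix",
    "perf", "refactor", "revert", "style", "test",
)

TITLE_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"  # commit type
    r"(?:\((?P<scope>[^)]+)\))?"            # optional scope, e.g. (skore)
    r"!?: "                                 # optional breaking-change marker
    r"(?P<subject>[A-Z].*)$"                # sentence case: starts uppercase
)


def is_valid_title(title: str) -> bool:
    """Return True if ``title`` looks like a sentence-case conventional commit."""
    return TITLE_PATTERN.match(title) is not None
```

With this sketch, `is_valid_title("fix: Fix issue")` is accepted while `is_valid_title("fix: fix issue")` is rejected, matching the examples given above.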

Skore is a company-driven project. To meet internal deadlines, we might provide extensive help to get a PR merged. In such cases, we will let you know in the PR.

Tests#

When adding a new feature to skore, please make sure to:

  1. Include unit tests

    Add tests to verify that your feature has as few bugs as possible. Tests are in the tests/ directory.

  2. Verify existing examples

    Check that your changes do not break existing examples.

    You can run all examples with:

    cd sphinx && make html
    

    Alternatively, you can run individual examples with:

    python <example_file>
    
  3. Update or add examples if needed

    • For a minor feature, adjust one existing example to demonstrate your change. Avoid creating many short example files.

    • For a major feature, add a single, concise example under examples/ (or update the gallery) that highlights the new capability.

To run the tests locally, you may run:

# With pixi (uses the default environment)
pixi run tests

# Or, in a plain virtual environment, from the `skore` directory
cd skore && pytest
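For contributors who have not written tests for pytest before, a unit test can be as small as the sketch below. The `format_score` helper is hypothetical and is defined inline only to keep the sketch runnable; in a real contribution the tested function lives in the package and the test functions live in a file under tests/.

```python
def format_score(value: float, n_digits: int = 2) -> str:
    """Hypothetical helper, defined here only so that the sketch is runnable."""
    if not 0.0 <= value <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {value}")
    return f"{value:.{n_digits}f}"


def test_format_score_rounds_to_requested_digits():
    # One behavior per test keeps failures easy to interpret.
    assert format_score(0.98765, n_digits=3) == "0.988"


def test_format_score_rejects_out_of_range_values():
    # Check the error path as well as the nominal path.
    try:
        format_score(1.5)
    except ValueError as error:
        assert "score must be in [0, 1]" in str(error)
    else:
        raise AssertionError("expected a ValueError")
```

pytest collects functions whose names start with `test_`. In an actual test file, the `try`/`except` block is usually written more compactly with `pytest.raises(ValueError, match=...)`.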

Linting#

We use the linter ruff to make sure that the code is formatted correctly:

# With pixi
pixi run -e lint lint

# Or, in a plain virtual environment
pre-commit run --all-files

Pre-commit Hooks#

We use pre-commit hooks to ensure code quality before changes are committed. These hooks were installed during setup, but you can manually run them with:

pre-commit run --all-files

Documentation#

Setup#

Our documentation uses the PyData Sphinx Theme.

Building the documentation requires the dot command from Graphviz. It is bundled with the sphinx pixi environment; if you build the docs in a plain virtual environment, make sure Graphviz is installed on your system (see Using a virtual environment and pip).

To build the docs:

cd sphinx
make html

Alternatively, you can build them through pixi from the repository root, which also provides dot:

pixi run -e sphinx docs

You can access the local build at:

open build/html/index.html

A bot will automatically comment on your PR with a link to a documentation preview. Use this link to verify that your changes render correctly.

Skipping examples when building the docs#

The examples can take a long time to build, so if you are not working on them you can instead run the following to avoid building them altogether:

make html-noplot

If you are working on an example and wish to only build that one, you can do so by temporarily editing sphinx/conf.py. Follow the sphinx-gallery documentation for more information. By default, the examples that are built are Python files that start with plot_.

Note that by default, if an example has not changed since the last time you built it, it will not be re-built.
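As an illustration of the temporary edit mentioned above, sphinx-gallery decides which example files to execute through the `filename_pattern` option of `sphinx_gallery_conf`. The sketch below assumes a hypothetical example named `plot_my_example.py`; in sphinx/conf.py the key would be added to the existing dictionary rather than to a new one, and the change should be reverted before committing.

```python
import re

# Hypothetical name of the example you are working on.
EXAMPLE_FILE = "plot_my_example.py"

# In sphinx/conf.py, add this key to the existing `sphinx_gallery_conf`
# dictionary instead of defining a new one.
sphinx_gallery_conf = {
    # sphinx-gallery only executes the files whose path matches this regex.
    "filename_pattern": re.escape(EXAMPLE_FILE),
}
```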

Contributing to the docstrings#

When writing documentation, whether it be online, docstrings or help messages in the CLI and in the UI, we strive to follow some conventions that are listed below. These might be updated as time goes on.

  1. We follow the scikit-learn documentation conventions for docstrings and narrative documentation.

  2. Docstrings are compiled with Sphinx and numpydoc, so use RST (reStructuredText) for bold, URLs, etc.

  3. Argument descriptions should be written so that the following sentence makes sense: Argument <argument> designates <argument description>

  4. Argument descriptions start with lower case, and do not end with a period or other punctuation

  5. Argument descriptions start with “the” where relevant, and “whether” for booleans

  6. Text is written in US English (use “visualize” rather than “visualise”)

  7. In the CLI, positional arguments are written in snake case (snake_case), keyword arguments in kebab case (kebab-case)

  8. When there is a default argument, it should be shown in the help message, typically with (default: <default value>) at the end of the message

  9. Use clear, concise language (e.g. that can be understood by non-native English speakers)
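To make the conventions on argument descriptions and CLI arguments more concrete, here is a minimal sketch using the standard `argparse` module. The program name, argument names, and default values are all hypothetical and do not describe the actual skore command-line interface.

```python
import argparse

# Hypothetical program name, used for illustration only.
parser = argparse.ArgumentParser(prog="example-cli")

# Positional argument in snake_case; the description starts with "the",
# uses lower case, and does not end with a period.
parser.add_argument("project_name", help="the name of the project to open")

# Keyword arguments in kebab-case; the default is shown at the end of the help.
parser.add_argument(
    "--port",
    type=int,
    default=8000,
    help="the port to listen on (default: %(default)s)",
)
# Boolean argument: the description starts with "whether".
parser.add_argument(
    "--open-browser",
    action="store_true",
    help="whether to open a browser window on startup (default: %(default)s)",
)

args = parser.parse_args(["my_project", "--port", "9000"])
```

Each description completes the sentence “Argument <argument> designates <argument description>”, for example “Argument port designates the port to listen on”. Note that argparse exposes kebab-case options with underscores, so `--open-browser` is read as `args.open_browser`.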

Contributing to the examples#

The examples are stored in the examples folder:

  • They are organized in subcategories.

  • They should be written as Python scripts (.py), with cells delimited by # %% to separate code cells from markdown cells, as they will be rendered as notebooks (.ipynb).

  • The file should start with a docstring giving the example title.

  • No example should require large files to be stored in this repository. For instance, datasets should not be stored in the repository; they should be downloaded by the script.

  • When built (using make html for example), these examples will automatically be converted into RST files in the sphinx/auto_examples subfolder. This subfolder is listed in .gitignore, so it is not committed.

  • If you read the examples in the online documentation and notice typos or things that could be improved, make sure you are viewing the dev version of the documentation, which is the latest one (e.g. to check that the typo has not already been fixed).

  • New examples should use datasets that are sufficiently interesting yet reasonably sized (avoid synthetic datasets with near-perfect scores). As examples are executed during the documentation build, their runtime must remain short (ideally under a few minutes).
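The file layout described in this list can be sketched as follows. This is only an illustration of the structure (title docstring, then cells separated by # %%), following the sphinx-gallery convention of an underlined title in the docstring. The content is a placeholder: a real example would download a realistic dataset and demonstrate a skore feature instead of the hard-coded numbers used here.

```python
"""
Title of the example
====================

One or two sentences describing what the example demonstrates.
"""

# %%
# A comment block that directly follows a cell marker is rendered as text.
# Here, a few hard-coded numbers stand in for a real, downloaded dataset.
import statistics

scores = [0.81, 0.79, 0.84, 0.80]

# %%
# The following cell is rendered as code, together with its output.
mean_score = statistics.mean(scores)
print(f"Mean score: {mean_score:.2f}")
```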

Guidelines for creating effective examples:

  1. Types of examples:

    • Doctests: Use these in API documentation for demonstrating simple usage patterns.

    • User guide examples: Create comprehensive examples that demonstrate functionality in real-world contexts.

  2. Small features: For minor features, for instance when extending existing assets or easing a supported use case, don’t create standalone examples. Instead, incorporate them into existing relevant documentation where they make sense contextually.

  3. Example content: Focus on demonstrating the core concept rather than exhaustively listing all possible arguments. Show the global idea of how to use the feature effectively.

  4. Dataset selection:

    • Use meaningful, realistic datasets (not synthetic data with artificially high scores)

    • Ensure examples execute efficiently (under a few minutes)

    • Prefer built-in or easily downloadable datasets

    • If downloading data, include clear code for this process
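When no built-in loader is available, the download step can stay short and explicit. The following is a minimal sketch using only the standard library, under the assumption that the dataset is a single file reachable at a URL; the function name and the caching behavior are hypothetical choices, and existing examples may rely on dedicated dataset loaders instead.

```python
import shutil
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch_dataset(url: str, cache_dir: Path) -> Path:
    """Download ``url`` into ``cache_dir`` once and return the local path."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Name the local file after the last component of the URL path.
    target = cache_dir / Path(urlparse(url).path).name
    # Skip the download when the file is already cached, so that rebuilding
    # the documentation does not fetch the data again.
    if not target.exists():
        with urlopen(url) as response, target.open("wb") as file:
            shutil.copyfileobj(response, file)
    return target
```

Keeping the download in one small function makes it clear to readers where the data comes from, and the cache keeps the runtime of repeated builds short.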

Contributing to the README#

The README.md file can be modified and is part of the documentation (although it is not included in the online documentation). It is the page displayed on PyPI.

Pull Request Checklist#

Before marking your pull request as ready for review, ensure you have:

  1. Created or updated unit tests for your changes

  2. Run all tests locally and verified they pass

  3. Updated documentation if necessary

  4. Verified that the documentation builds without warnings or failures

  5. Run pre-commit hooks on your code

  6. Signed all your commits

This checklist helps maintain code quality and ensures a smooth review process.