Effective Caching with GitHub Actions

GitHub Actions is great as a CI/CD platform. However, to be really efficient, workflows need to leverage some optimization techniques, such as caching or running tasks in parallel. In this note, I am sharing some thoughts on how to use cache effectively, with respect to multiple paths and sudo-installed APT packages. The discussion will touch on a few non-trivial aspects that, in my opinion, are not well-explained in other web materials.

My use case was simple: speed-up building a university course in Sphinx, to be hosted on GitHub pages. Installing Sphinx dependencies required multiple Python and Linux APT downloads, which took quite a long. The caching solution indeed fixed a problem, and here are the key takeaways:

This repository demonstrates the working solution. The cache size (Python and APT packages) ais about 120MB. And this is how the job looks like:

name: docs

on: [push, pull_request, workflow_dispatch]

jobs:
  docs:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
      - name: prepare virtual environment
        run: |
          python -m venv .venv
          mkdir .apt
      - name: cache dependencies
        id: cache_deps
        uses: actions/cache@v3
        env:
            cache-name: cache-dependencies
        with:
          path: |
            .venv 
            .apt
          key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('.github/workflows/*') }}
      - name: Install python dependencies
        if: ${{ steps.cache_deps.outputs.cache-hit != 'true' }}
        run: |
          source .venv/bin/activate
          pip install jupyter-book
          pip install sphinxcontrib-plantuml
      - name: Install sudo dependencies
        run: |
          sudo apt-get -o Dir::Cache=".apt" update
          sudo apt-get -o Dir::Cache=".apt" install plantuml
      - run: |
          apt-config dump | grep Dir::Cache
      - name: Compile Docs
        run: |
          source .venv/bin/activate
          jupyter-book build docs
      - name: Deploy to gh-pages
        uses: peaceiris/actions-gh-pages@v3
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_branch: gh-pages
          publish_dir: ./docs/_build/html

Published by mskorski

Scientist, Consultant, Learning Enthusiast

Leave a comment

Your email address will not be published. Required fields are marked *