The working directory was not cleaned up upon completion of a LXC job because rc.stopJobContainer() -> rc.cleanUpJobContainer() -> rc.JobContainer.Remove() was never called for LXC containers.
- stopContainer() and closeContainer() must not call
rc.stopHostEnvironment(ctx) for LXC containers because
- it will needlessly be called twice
- it intercepts the call to
- rc.stopJobContainer()
- rc.JobContainer.Close()
- rc.stopHostEnvironment(ctx) must be called in rc.cleanUpJobContainer which is indirectly called by rc.stopJobContainer()
- since rc.JobContainer.Close() is a noop, not calling it for LXC containers had no consequence
Resolvesforgejo/runner#442
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/1003): <!--number 1003 --><!--line 0 --><!--description Zml4OiByZW1vdmUgTFhDIHdvcmtpbmcgZGlyZWN0b3J5IHdoZW4gaXQgY29tcGxldGVz-->fix: remove LXC working directory when it completes<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/1003
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.code.forgejo.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
Fixes#994.
First commit ensures that the interpolateResults method is invoked before data is sent to the reporter. Second commit changes how data is sent to the reporter to include both the result and the job outputs.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/995): <!--number 995 --><!--line 0 --><!--description Zml4OiBzZW5kIGpvYiBvdXRwdXRzICYgam9iIHJlc3VsdCB0byBGb3JnZWpvIGluIHN5bmMgd2l0aCBlYWNoIG90aGVy-->fix: send job outputs & job result to Forgejo in sync with each other<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/995
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
- enforce timeout-minutes timeout for jobs in a way similar to how it is done for steps
- minimal refactor of evaluateStepTimeout evaluateTimeout so it can be used by jobs as well, with additional debug information and error logging if parsing fails
- add integration tests for both step and job timeout-minutes, verifying expressions are allowed and evaluated
Resolvesforgejo/runner#979
---
Manually verified to work as expected https://v13.next.forgejo.org/earl-warren/testtimeout-minutes/actions/runs/3/jobs/0/attempt/1
```yaml
on: [push]
jobs:
test:
runs-on: docker
timeout-minutes: 1
steps:
- run: |
set -x
while : ; do
sleep 30
done
```

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/982): <!--number 982 --><!--line 0 --><!--description Zml4OiBlbmZvcmNlIGpvYi48am9iLWlkPi50aW1lb3V0LW1pbnV0ZXM=-->fix: enforce job.<job-id>.timeout-minutes<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/982
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.code.forgejo.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
- create the caches interface and matching cachesImpl
- move the cache logic out of handler
- openDB
- readCache
- useCache
- gcCache
- access to the storage struct
- serve
- commit
- exist
- write
- add getCaches / setCaches to the handler interface so it can be
used by tests. The caches test should be implemented independently
in the future but this is a different kind of cleanup.
- no functional change, minimal refactor
- responseFatalJSON(w, r, err) replaces responseJSON(w, r, 500, err)
and calls fatal() when the following fail because they are
not recoverable. There may be other non-recoverable errors but
it is difficult to be 100% sure they cannot be engineered by the
caller of the API for DoS purposes.
- openDB
- findCache
- cache.Repo != repo
- wrap errors in
- openDB() - it was missing
- readCache() - it was missing
- useCache() - it was missing
- findCache() - some had identical messages
- in gc
- replace logger.Warnf with h.fatal
- differentiate errors that have identical messages
- call fatal if openDB fails instead of returning
in case of an error that is not recoverable (e.g. failing to open the
bolthold database), the cache can call fatal() to log the error and
send a TERM signal that will gracefully shutdown the daemon.
the license change from MIT to GPLv3+ is a breaking change
Refs forgejo/runner#773
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/940): <!--number 940 --><!--line 0 --><!--description Y2hvcmU6IGJ1bXAgdmVyc2lvbiB0byB2MTE=-->chore: bump version to v11<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/940
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
It was raised during embargo review of #925 that there are two implementations of `computeMac`; this PR fixes that.
As all the tests for `computeMac` were in the `artifactcache` package, it made more sense to keep the method there. That required reversing the dependency `artifactcache->cacheproxy` package dependency -- it makes more sense to me for the proxy to depend on the cache, rather than vice-versa.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/936): <!--number 936 --><!--line 0 --><!--description cmVmYWN0b3I6IHJlbW92ZSBkdXBsaWNhdGUgY29tcHV0ZU1hYyBmdW5jdGlvbg==-->refactor: remove duplicate computeMac function<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/936
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
- the Handler struct becomes handler (lowercase)
- the Handler interface is defined to be the existing methods
- isClosed() is added and used only in tests
- setgcAt() is added and used only in tests
---
This is to allow mocking the Handler interface for testing.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/934): <!--number 934 --><!--line 0 --><!--description Y2hvcmU6IHJlZmFjdG9yIGFjdC9hcnRpZmFjdGNhY2hlIEhhbmRsZXIgdG8gYW4gaW50ZXJmYWNl-->chore: refactor act/artifactcache Handler to an interface<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/934
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.code.forgejo.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
Container images built by the runner are tagged with a unique name:
- based on the specified `uses` URL for remote actions.
- random for local actions.
In the case of local actions, this will create new tags for each run
but the images (and their layers) will be shared and not be
duplicated. The least recently used tags can be garbage collected by
tools such as https://github.com/stepchowfun/docuum.
Using a different method for creating the tag name for the remote
actions is to help with maintenance by establishing a direct relation
with the `uses` field. It was instead relying on a name transformed
multiple times which makes it more difficult to verify name collision
are not accidentally made possible by one of those transformations.
Without this fix, when a workflow ran a local [docker action](https://forgejo.org/docs/next/user/actions/actions/#docker-actions)
(e.g. the [example in the end-to-end
tests](8f920b4b7a/actions/example-force-rebuild/.forgejo/workflows/test.yml)),
it used an image tag that could collide with other workflows that
happen to use the same name.
The workaround for older runner versions is to set
[`[container].force_rebuild: true`](https://forgejo.org/docs/next/admin/actions/runner-installation/#configuration)
in the runner configuration file.
If the tag name collide (e.g. v9.1.1), it will fail with:
```
[push.yml/test] [DEBUG] Working directory '/home/earl-warren/software/runner/act/runner/testdata/local-action-dockerfile-tag/example2'
[push.yml/test] ❌ Failure - Main [[ "example1 SOMEONE" == "example2 SOMEONE" ]]
```
The functionality provided by this package is also provided by the
standard library.
`fmt.Errorf` for dynamically generated errors.
`errors.new` for static errors.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/873): <!--number 873 --><!--line 0 --><!--description Y2hvcmU6IHJlbW92ZSBgZ2l0aHViLmNvbS9wa2cvZXJyb3JzYA==-->chore: remove `github.com/pkg/errors`<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/873
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Gusted <postmaster@gusted.xyz>
Co-committed-by: Gusted <postmaster@gusted.xyz>
It does not assert anything useful and te associated function is otherwise heavily used in many tests. It may benefit from unit testing but this test would need to be done very differently to achieve that.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/900): <!--number 900 --><!--line 0 --><!--description Y2hvcmU6IHJlbW92ZSBUZXN0UnVuQ29udGV4dF9HZXRHaXRIdWJDb250ZXh0-->chore: remove TestRunContext_GetGitHubContext<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/900
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
Data race detection identified that `getWorkflowSecrets` is mutating `rc.caller.runContext.Config.Secrets` while interpolating values, in the case where secrets are inherited by a reusable workflow. This map is also mutated earlier in evaluation by `(*RunContext).handleCredentials`. It's possible that multiple goroutines performing mutation to this shared map could cause runtime panics (not observed).
The issue is addressed creating a separate map to store interpolated secrets in `getWorkflowSecrets`, which was already the behavior in the non-inherited secret case.
Automated testing for this issue will be provided by #861 when all data races are resolved.
```
==================
WARNING: DATA RACE
Read at 0x00c0003a9620 by goroutine 2546:
runtime.mapaccess1_faststr()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/internal/runtime/maps/runtime_faststr_swiss.go:103 +0x0
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).handleCredentials()
/.../forgejo-runner/act/runner/run_context.go:1395 +0xab
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).prepareJobContainer()
/.../forgejo-runner/act/runner/run_context.go:460 +0x2de
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1.(*RunContext).startJobContainer.2()
/.../forgejo-runner/act/runner/run_context.go:610 +0x5e
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1()
/.../forgejo-runner/act/runner/run_context.go:853 +0xf3
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.NewPipelineExecutor.Executor.Then.func22()
/.../forgejo-runner/act/common/executor.go:136 +0x57
...snip...
Previous write at 0x00c0003a9620 by goroutine 2440:
runtime.mapassign_faststr()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/internal/runtime/maps/runtime_faststr_swiss.go:263 +0x0
code.forgejo.org/forgejo/runner/v9/act/runner.getWorkflowSecrets()
/.../forgejo-runner/act/runner/expression.go:578 +0x547
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).NewExpressionEvaluatorWithEnv()
/.../forgejo-runner/act/runner/expression.go:85 +0x3fc
code.forgejo.org/forgejo/runner/v9/act/common/git.FindGitRevision()
/.../forgejo-runner/act/common/git/git.go:70 +0xe4
github.com/go-git/go-git/v5.PlainOpenWithOptions()
/home/mfenniak/go/pkg/mod/github.com/go-git/go-git/v5@v5.16.2/repository.go:332 +0x7a6
code.forgejo.org/forgejo/runner/v9/act/common/git.FindGitRevision()
/.../forgejo-runner/act/common/git/git.go:58 +0xc4
...snip...
==================
```
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/875
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
Uses the `Repo` field as an index during searches of the cache database. Removes unused indexes.
To measure the performance of this change, I created a synthetic test which wrote 10,000 records into the artifact cache DB. Of course, all benchmarks are lies that can't be generalized to real-world usage, but it seems clear from the magnitude of improvement that this fixes a flawed implementation, even if it's not perfect.
- Unmodified performance:
- Write: 196 records/second
- Read: 1 record/second
- With `Repo` index being used for reads, and other indexes being removed:
- Write: 347 records/second
- Read: 22,398 records/second
`Repo` is, I think, the only index that made sense to remain, with an eye on workflow run performance:
- `Key` -- can't be used for index because `findCache` searches for key *prefixes*, not equal values.
- `Version` -- isn't very distinct for different workflow runs (https://code.forgejo.org/actions/cache#cache-version)
- `Complete` - significant portion of the cache DB will be complete, making it the least selective possible index
- `UsedAt` & `CreatedAt` - only used in GC operation, so could remain, but this isn't a performance-sensitive codepath
Closes#874.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/878): <!--number 878 --><!--line 0 --><!--description Zml4OiBhcnRpZmFjdCBjYWNoZSBEQiBub3QgdXNpbmcgaW5kZXhlcyBmb3Igc2VhcmNoaW5n-->fix: artifact cache DB not using indexes for searching<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/878
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
Refs forgejo/runner#881
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/884): <!--number 884 --><!--line 0 --><!--description Zml4OiBSdW5zT24gaW4gam9icGFyc2VyIGlzIG5vdCB1c2VkIGJ5IHRoZSBydW5uZXIgYnV0IGl0IGlzIHVzZWQgYnkgRm9yZ2VqbyBbc2tpcCBjYXNjYWRlXQ==-->fix: RunsOn in jobparser is not used by the runner but it is used by Forgejo [skip cascade]<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/884
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
Just fixes the build, not sure if this actually enables any functionality (yet) on FreeBSD. However, it does seem to at least start:
```
time="2025-08-18T01:02:58-04:00" level=info msg="Starting runner daemon"
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/882): <!--number 882 --><!--line 0 --><!--description Zml4OiBmaXhlcyB0aGUgYnVpbGQgb24gRnJlZUJTRCBbc2tpcCBjYXNjYWRlXQ==-->fix: fixes the build on FreeBSD [skip cascade]<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/882
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Daniel Morante <daniel@morante.net>
Co-committed-by: Daniel Morante <daniel@morante.net>
A job with a `runs-on` that references matrix variables will not run with the expected labels. eg.
```
jobs:
matrix-runs-on:
strategy:
matrix:
os: [ubuntu-latest, ubuntu-20.04]
runs-on: ${{ matrix.os }}
steps:
...
```
Due to shared mutated state, both jobs that this generates will (w/ a race condition) either run with the `ubuntu-latest` or `ubuntu-20.04`, but rarely (never observed) with the expected outcome of running on both labels.
`EvaluateYamlNode` is used to evaluate expressions in the `runs-on` field in the context of the current running job, but mutating an object shared between multiple concurrent jobs (in matrix evaluation). This results in the evaluation results from one job spilling into another and corrupting their `runs-on` labels.
```
==================
WARNING: DATA RACE
Write at 0x00c00047e0b0 by goroutine 1739:
reflect.typedmemmove()
/.../go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/runtime/mbarrier.go:213 +0x0
reflect.Value.Set()
/.../go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/reflect/value.go:2062 +0x184
gopkg.in/yaml%2ev3.(*decoder).unmarshal()
/.../go/pkg/mod/gopkg.in/yaml.v3@v3.0.1/decode.go:493 +0x7b4
gopkg.in/yaml%2ev3.(*Node).Decode()
/.../go/pkg/mod/gopkg.in/yaml.v3@v3.0.1/yaml.go:149 +0x355
code.forgejo.org/forgejo/runner/v9/act/runner.expressionEvaluator.EvaluateYamlNode()
/.../forgejo-runner/act/runner/expression.go:372 +0x7a
code.forgejo.org/forgejo/runner/v9/act/runner.(*expressionEvaluator).EvaluateYamlNode()
<autogenerated>:1 +0x6b
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).runsOnPlatformNames()
/.../forgejo-runner/act/runner/run_context.go:1019 +0x2af
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).runsOnImage()
/.../forgejo-runner/act/runner/run_context.go:1002 +0x772
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).platformImage()
/.../forgejo-runner/act/runner/run_context.go:1032 +0x77
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).isEnabled()
/.../forgejo-runner/act/runner/run_context.go:1069 +0x3c7
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).Executor.func1()
/.../forgejo-runner/act/runner/run_context.go:964 +0x4b
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.1()
/.../forgejo-runner/act/runner/runner.go:223 +0x271
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
Previous read at 0x00c00047e0b0 by goroutine 1742:
code.forgejo.org/forgejo/runner/v9/act/model.(*Job).RunsOn()
/.../forgejo-runner/act/model/workflow.go:361 +0x3c4
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).runsOnImage()
/.../forgejo-runner/act/runner/run_context.go:991 +0x57a
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).platformImage()
/.../forgejo-runner/act/runner/run_context.go:1032 +0x77
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).isEnabled()
/.../forgejo-runner/act/runner/run_context.go:1069 +0x3c7
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).Executor.func1()
/.../forgejo-runner/act/runner/run_context.go:964 +0x4b
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.1()
/.../forgejo-runner/act/runner/runner.go:223 +0x271
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
...
==================
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/871): <!--number 871 --><!--line 0 --><!--description Zml4OiBkYXRhIHJhY2UgaW4gJ3J1bnMtb24nIGV4cHJlc3Npb25zIGNhdXNlcyBpbmNvcnJlY3Qgam9iIGxhYmVscyBkdXJpbmcgZXhlY3V0aW9u-->fix: data race in 'runs-on' expressions causes incorrect job labels during execution<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/871
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
The `setupShell` function would update the shell stored on the `Step` object, setting it to either a default value from the job, an expression evaluated in the context of the job, a default from the workflow, or finally falling back to bash or powershell defaults. Typically this would be fine -- although it would trigger the data race detector because the `Step` is a shared object between multiple concurrent matrix evaluations for the job.
In the *really quite unlikely* case that the `shell` field on a step or job referenced a matrix variable, this data race would actually trigger the shared step's `Shell` value to end up as "whichever one was evaluated last", causing the wrong shell to be used. The new `matrix-shell` test triggers this behavior, and fails without the associated code fix.
As a fix, the `Shell` field in `Step` is never mutated; instead only the value on non-shared `stepRun` instance is updated from `setupShellCommand`. `Shell` was renamed to `RawShell` as part of verifying all references were updated and it seemed to make sense to keep that name since it is a pre-evaluator value.
```
==================
WARNING: DATA RACE
Write at 0x00c00013e9b0 by goroutine 1470:
code.forgejo.org/forgejo/runner/v9/act/runner.(*stepRun).setupShell()
/.../forgejo-runner/act/runner/step_run.go:210 +0x8f2
code.forgejo.org/forgejo/runner/v9/act/common/git.FindGitRevision()
/.../forgejo-runner/act/common/git/git.go:58 +0xc4
code.forgejo.org/forgejo/runner/v9/act/model.(*GithubContext).SetSha()
/.../forgejo-runner/act/model/github_context.go:161 +0x6b5
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).getGithubContext()
/.../forgejo-runner/act/runner/run_context.go:1228 +0x26ca
...
Previous write at 0x00c00013e9b0 by goroutine 1469:
code.forgejo.org/forgejo/runner/v9/act/runner.(*stepRun).setupShell()
/.../forgejo-runner/act/runner/step_run.go:210 +0x8f2
code.forgejo.org/forgejo/runner/v9/act/common/git.FindGitRevision()
/.../forgejo-runner/act/common/git/git.go:58 +0xc4
code.forgejo.org/forgejo/runner/v9/act/model.(*GithubContext).SetSha()
/.../forgejo-runner/act/model/github_context.go:161 +0x6b5
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).getGithubContext()
/.../forgejo-runner/act/runner/run_context.go:1228 +0x26ca
...
==================
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/865): <!--number 865 --><!--line 0 --><!--description Zml4OiBkYXRhIHJhY2UgY29uZGl0aW9uIGNhdXNpbmcgaW5jb3JyZWN0IGBzaGVsbGAgb24gYSB0YXNrIHN0ZXAgaWYgaXQgcmVmZXJlbmNlZCBhIG1hdHJpeCB2YXJpYWJsZQ==-->fix: data race condition causing incorrect `shell` on a task step if it referenced a matrix variable<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/865
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
In `setJobResult` there is no coordination between multiple jobs that are completing, leading to a possible condition where `jobResult` can be read from the matrix job as `"success"` by a job, marked as `"failed"` by another job, and then marked as `"success"` by other jobs.
To my knowledge, the race condition has not been observed in a real-world case, but has been reproduced in a unit test.
```
==================
WARNING: DATA RACE
Read at 0x00c0006d08a0 by goroutine 29232:
code.forgejo.org/forgejo/runner/v9/act/runner.setJobResult()
/.../forgejo-runner/act/runner/job_executor.go:173
+0x359
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func6()
/.../forgejo-runner/act/runner/job_executor.go:118
+0x15d
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func14()
/.../forgejo-runner/act/common/executor.go:183 +0x86
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func7()
/.../forgejo-runner/act/runner/job_executor.go:161
+0x191
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func16()
/.../forgejo-runner/act/common/executor.go:183 +0x86
...
Previous write at 0x00c0006d08a0 by goroutine 29234:
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).result()
/.../forgejo-runner/act/runner/run_context.go:897
+0x271
code.forgejo.org/forgejo/runner/v9/act/runner.setJobResult()
/.../forgejo-runner/act/runner/job_executor.go:181
+0x66e
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func6()
/.../forgejo-runner/act/runner/job_executor.go:118
+0x15d
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func14()
/.../forgejo-runner/act/common/executor.go:183 +0x86
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func7()
/.../forgejo-runner/act/runner/job_executor.go:161
+0x191
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func16()
/.../forgejo-runner/act/common/executor.go:183 +0x86
...
==================
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/862): <!--number 862 --><!--line 0 --><!--description Zml4OiByYWNlIGNvbmRpdGlvbiBpbiBtYXRyaXggam9iIHJlc3VsdCBzdGF0ZSBtYXkgcmVzdWx0IGluIGZhaWxlZCBqb2JzIGJlaW5nIG1hcmtlZCBhcyBzdWNjZXNzZnVs-->fix: race condition in matrix job result state may result in failed jobs being marked as successful<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/862
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
Tests were using `count` and similar variables without any concurrency safety and have been updated to use atomic operations. This may have caused rare miscounts in tests as operations like `count++` are not thread-safe, but to my knowledge these have never been observed.
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/860
Reviewed-by: Gusted <gusted@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
`ContainerDaemonSocket` is stored on a shared struct and was mutated to a default value when empty, which trips the data race detector as a mutation of shared state without any synchronization. However as all codepaths would be setting it to the same value in the mutation, here's no functional bug. This commit prevents the "false positive", but it also centralizes the default value for a slightly better programming practice.
```
==================
WARNING: DATA RACE
Read at 0x00c00027f9e0 by goroutine 1104:
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).GetBindsAndMounts()
/.../forgejo-runner/act/runner/run_context.go:130 +0x87
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).prepareJobContainer()
/.../forgejo-runner/act/runner/run_context.go:449 +0xad1
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1.(*RunContext).startJobContainer.2()
/.../forgejo-runner/act/runner/run_context.go:587 +0x5e
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1()
/.../forgejo-runner/act/runner/run_context.go:836 +0xf3
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.NewPipelineExecutor.Executor.Then.func21()
/.../forgejo-runner/act/common/executor.go:136 +0x57
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).Executor.func1()
/.../forgejo-runner/act/runner/run_context.go:929 +0x68
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.1()
/.../forgejo-runner/act/runner/runner.go:218 +0x271
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
Previous write at 0x00c00027f9e0 by goroutine 1103:
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).GetBindsAndMounts()
/.../forgejo-runner/act/runner/run_context.go:131 +0xc7
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).prepareJobContainer()
/.../forgejo-runner/act/runner/run_context.go:449 +0xad1
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1.(*RunContext).startJobContainer.2()
/.../forgejo-runner/act/runner/run_context.go:587 +0x5e
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).startContainer.func1()
/.../forgejo-runner/act/runner/run_context.go:836 +0xf3
code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.NewPipelineExecutor.Executor.Then.func21()
/.../forgejo-runner/act/common/executor.go:136 +0x57
code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).Executor.func1()
/.../forgejo-runner/act/runner/run_context.go:929 +0x68
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.1()
/.../forgejo-runner/act/runner/runner.go:218 +0x271
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
Goroutine 1104 (running) created at:
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2()
/.../forgejo-runner/act/common/executor.go:105 +0x144
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.3.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.3.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
Goroutine 1103 (running) created at:
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.2()
/.../forgejo-runner/act/common/executor.go:105 +0x144
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.3.1()
/.../forgejo-runner/act/common/executor.go:107 +0x61
code.forgejo.org/forgejo/runner/v9/act/runner.(*runnerImpl).NewPlanExecutor.func1.NewParallelExecutor.3.gowrap1()
/.../forgejo-runner/act/common/executor.go:109 +0x4f
==================
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/859): <!--number 859 --><!--line 0 --><!--description Y2hvcmU6IHByZXZlbnQgImZhbHNlIHBvc2l0aXZlIiBkYXRhIHJhY2UgZGV0ZWN0aW9uIHcvIENvbnRhaW5lckRhZW1vblNvY2tldA==-->chore: prevent "false positive" data race detection w/ ContainerDaemonSocket<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/859
Reviewed-by: Gusted <gusted@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
Data race is being flagged because a goroutine is currently writing tar data into `atar`, and `assert.NotEmpty(atar)` performs internal structure reflection into the returned `io.PipeWriter` and violates its internal synchronization primitives. This assertion doesn't seem to add any value to the test compared to just reading the pipe, so it has been removed.
Data race details (abbreviated):
```
==================
WARNING: DATA RACE
Write at 0x00c00041fbc0 by goroutine 55:
sync/atomic.CompareAndSwapInt32()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/runtime/race_amd64.s:361
+0xb
sync/atomic.CompareAndSwapInt32()
<autogenerated>:1 +0x18
sync.(*Mutex).Lock()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/sync/mutex.go:46
+0x28
io.(*pipe).write()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/io/pipe.go:81
+0x9e
io.(*PipeWriter).Write()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/io/pipe.go:161
+0x46
archive/tar.(*Writer).Flush()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/archive/tar/writer.go:59
+0xcc
archive/tar.(*Writer).WriteHeader()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/archive/tar/writer.go:71
+0x46
code.forgejo.org/forgejo/runner/v9/act/runner.actionCacheCopyFileOrDir()
/home/mfenniak/Dev/forgejo-runner/act/runner/action_cache.go:203
+0xabd
code.forgejo.org/forgejo/runner/v9/act/runner.GoGitActionCache.GetTarArchive.func2.1()
/home/mfenniak/Dev/forgejo-runner/act/runner/action_cache.go:154
+0xa5
...
Previous read at 0x00c00041fbc0 by goroutine 9:
reflect.typedmemmove()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/runtime/mbarrier.go:213
+0x0
reflect.packEface()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/reflect/value.go:136
+0xc5
reflect.valueInterface()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/reflect/value.go:1513
+0x179
reflect.Value.Interface()
/home/mfenniak/go/pkg/mod/golang.org/toolchain@v0.0.1-go1.24.6.linux-amd64/src/reflect/value.go:1484
+0x106
github.com/stretchr/testify/assert.isEmpty()
/home/mfenniak/go/pkg/mod/github.com/stretchr/testify@v1.10.0/assert/assertions.go:735
+0x7f
github.com/stretchr/testify/assert.NotEmpty()
/home/mfenniak/go/pkg/mod/github.com/stretchr/testify@v1.10.0/assert/assertions.go:769
+0x56
github.com/stretchr/testify/assert.(*Assertions).NotEmpty()
/home/mfenniak/go/pkg/mod/github.com/stretchr/testify@v1.10.0/assert/assertion_forward.go:1175
+0xb1
code.forgejo.org/forgejo/runner/v9/act/runner.TestActionCache.func1()
/home/mfenniak/Dev/forgejo-runner/act/runner/action_cache_test.go:60
+0x21e
==================
```
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/858): <!--number 858 --><!--line 0 --><!--description dGVzdDogcHJldmVudCBkYXRhIHJhY2UgZGV0ZWN0aW9uIGluIFRlc3RBY3Rpb25DYWNoZSBbc2tpcCBjYXNjYWRlXQ==-->test: prevent data race detection in TestActionCache [skip cascade]<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/858
Reviewed-by: Gusted <gusted@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
When a reusable workflow is called twice in the same workflow in
parallel, it may require a dedicated network to run (for instance if
it spawns services) and will always require unique volumes to mount
the workdir and the env.
There really is no way to guarantee a unique name derived from the
job name etc. Instead, a random name is set and used as a base for
both the internal volumes and the dedicated network (if any).
Replace asserting hard coded names with assertions on how the services
and the job container relate. It slightly improves logic coverage and
makes the test insensitive to how network and volume names are created.
- compare the network names of the services and the job container to be
equal, demonstrating they can communicate
- verify the mounts and valid volumes of services to be empty
- verify the internal mounts of the job container to be valid volumes