mirror of https://code.forgejo.org/forgejo/runner.git synced 2025-09-15 18:57:01 +00:00

fix: race condition in matrix job result state may result in failed jobs being marked as successful (#862)

`setJobResult` does not coordinate between matrix jobs that complete concurrently. As a result, one job can read `jobResult` from the matrix job as `"success"`, another job can then mark it as `"failed"`, and the jobs that read the stale value can overwrite it with `"success"`.

To my knowledge, the race condition has not been observed in a real-world case, but has been reproduced in a unit test.

```
==================
WARNING: DATA RACE
Read at 0x00c0006d08a0 by goroutine 29232:
  code.forgejo.org/forgejo/runner/v9/act/runner.setJobResult()
      /.../forgejo-runner/act/runner/job_executor.go:173 +0x359
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func6()
      /.../forgejo-runner/act/runner/job_executor.go:118 +0x15d
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func14()
      /.../forgejo-runner/act/common/executor.go:183 +0x86
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func7()
      /.../forgejo-runner/act/runner/job_executor.go:161 +0x191
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func16()
      /.../forgejo-runner/act/common/executor.go:183 +0x86
  ...

Previous write at 0x00c0006d08a0 by goroutine 29234:
  code.forgejo.org/forgejo/runner/v9/act/runner.(*RunContext).result()
      /.../forgejo-runner/act/runner/run_context.go:897 +0x271
  code.forgejo.org/forgejo/runner/v9/act/runner.setJobResult()
      /.../forgejo-runner/act/runner/job_executor.go:181 +0x66e
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func6()
      /.../forgejo-runner/act/runner/job_executor.go:118 +0x15d
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func14()
      /.../forgejo-runner/act/common/executor.go:183 +0x86
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.func7()
      /.../forgejo-runner/act/runner/job_executor.go:161 +0x191
  code.forgejo.org/forgejo/runner/v9/act/runner.newJobExecutor.Executor.Finally.func16()
      /.../forgejo-runner/act/common/executor.go:183 +0x86
  ...
==================
```
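
The read-then-write pattern described above, and the effect of serializing it with a mutex, can be sketched in isolation. This is a minimal illustration and not the runner's code: the `matrixJob` type, the `setResult` and `runMatrix` helpers, and the `"success"`/`"failure"` strings are assumptions made for the example. Only the `ResultMutex` name comes from the change itself.

```go
package main

import (
	"fmt"
	"sync"
)

// matrixJob is a hypothetical stand-in for the job state shared by all
// matrix entries. Only the ResultMutex field name is taken from the diff.
type matrixJob struct {
	ResultMutex sync.Mutex
	Result      string
}

// setResult reads the shared result and then writes a merged value.
// Holding the mutex across both steps keeps another goroutine from
// changing Result between the read and the write.
func setResult(job *matrixJob, success bool) {
	job.ResultMutex.Lock()
	defer job.ResultMutex.Unlock()

	result := "success"
	if job.Result == "failure" {
		// Keep a failure recorded earlier by another matrix entry.
		result = "failure"
	}
	if !success {
		result = "failure"
	}
	job.Result = result
}

// runMatrix completes one goroutine per matrix entry and returns the
// combined result once all of them have finished.
func runMatrix(outcomes []bool) string {
	job := &matrixJob{}
	var wg sync.WaitGroup
	for _, ok := range outcomes {
		wg.Add(1)
		go func(ok bool) {
			defer wg.Done()
			setResult(job, ok)
		}(ok)
	}
	wg.Wait() // all writes happen before this read
	return job.Result
}

func main() {
	fmt.Println(runMatrix([]bool{true, false, true, true}))
}
```

Without the lock, a goroutine could read an empty or `"success"` result, be preempted while another records `"failure"`, and then write `"success"` over it. Running a variant without the mutex under `go test -race` or `go run -race` should produce a report similar in shape to the one above.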

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/862): <!--number 862 --><!--line 0 --><!--description Zml4OiByYWNlIGNvbmRpdGlvbiBpbiBtYXRyaXggam9iIHJlc3VsdCBzdGF0ZSBtYXkgcmVzdWx0IGluIGZhaWxlZCBqb2JzIGJlaW5nIG1hcmtlZCBhcyBzdWNjZXNzZnVs-->fix: race condition in matrix job result state may result in failed jobs being marked as successful<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/862
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
Mathieu Fenniak 2025-08-15 19:11:57 +00:00 committed by earl-warren
parent 03fbeab01b
commit 69a3adad21
GPG key ID: F128CBE6AB3A7201
3 changed files with 82 additions and 1 deletions


@@ -162,6 +162,11 @@ func newJobExecutor(info jobInfo, sf stepFactory, rc *RunContext) common.Executo
func setJobResult(ctx context.Context, info jobInfo, rc *RunContext, success bool) {
	logger := common.Logger(ctx)
	// As we're reading the matrix build's status (`rc.Run.Job().Result`), it's possible for it to change in another
	// goroutine running `setJobResult` and invoking `.result(...)`. Prevent concurrent execution of `setJobResult`...
	rc.Run.Job().ResultMutex.Lock()
	defer rc.Run.Job().ResultMutex.Unlock()
	jobResult := "success"
	// we have only one result for a whole matrix build, so we need
	// to keep an existing result state if we run a matrix