mirror of https://code.forgejo.org/forgejo/runner.git synced 2025-09-30 19:22:09 +00:00
33 commits

Author SHA1 Message Date
Mathieu Fenniak
d79d043696
fix: allow GC & cache operations to operate concurrently (#1040)
Fixes #1039.

Rather than opening and closing the Bolt DB instance constantly, the cache now maintains one open `*bolthold.Store` for its lifetime, allowing GC, cache read, and cache write operations to occur concurrently.

The major risk in this change is the question, "is it safe to use one Bolt instance across goroutines concurrently?"  [Bolt does document its concurrency requirements](https://github.com/boltdb/bolt?tab=readme-ov-file#transactions), and an analysis of our DB interactions suggests to me that it introduces very little risk.

Most of the cache operations perform multiple touches to the database; for example `useCache` performs a read to fetch a cache object, and then an update to set its `UsedAt` timestamp.  If we wanted to ensure consistency in these operations, they should use a Bolt ReadWrite transaction -- but concurrent access would just be setting the field to the same value anyway.

`gcCache` is the complex operation where a transaction might be warranted -- but using one would also cause the same bug that #1039 describes.  I believe it is safe to run without a transaction because it is protected by an application-level mutex (to prevent multiple concurrent GCs) and it is the only code that performs deletes from the database; together these should guarantee that all its delete attempts are successful.  In the event of an unexpected failure to do the DB write, `gcCache` deletes from the storage before deleting from the DB, so it should just attempt to clean up again on the next run.

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/1040): <!--number 1040 --><!--line 0 --><!--description Zml4OiBhbGxvdyBHQyAmIGNhY2hlIG9wZXJhdGlvbnMgdG8gb3BlcmF0ZSBjb25jdXJyZW50bHk=-->fix: allow GC & cache operations to operate concurrently<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/1040
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2025-09-30 19:12:45 +00:00
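
The ordering argument in the #1040 message above can be illustrated with a minimal sketch. It is not the runner's code: an in-memory map stands in for the long-lived `*bolthold.Store`, and `store`, `use`, `gc`, and `demo` are hypothetical names. It shows a two-step "read, then update `UsedAt`" running next to a GC that is serialized by an application-level mutex and deletes from storage before deleting the record.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is a hypothetical stand-in for a cache record.
type entry struct {
	ID     int
	UsedAt time.Time
}

// store stands in for one long-lived database handle plus blob storage.
type store struct {
	mu    sync.RWMutex   // guards the maps; a real DB does its own locking
	db    map[int]entry  // metadata records
	blobs map[int][]byte // stored payloads
	gcMu  sync.Mutex     // application-level lock: at most one GC at a time
}

// use reads a record and then bumps UsedAt in a second, separate step.
func (s *store) use(id int, now time.Time) bool {
	s.mu.RLock()
	e, ok := s.db[id]
	s.mu.RUnlock()
	if !ok {
		return false
	}
	e.UsedAt = now
	s.mu.Lock()
	defer s.mu.Unlock()
	if _, still := s.db[id]; !still {
		return false // the record disappeared between the two steps
	}
	s.db[id] = e
	return true
}

// gc removes entries unused since cutoff and returns how many it removed,
// or -1 if another GC is already running.
func (s *store) gc(cutoff time.Time) int {
	if !s.gcMu.TryLock() {
		return -1
	}
	defer s.gcMu.Unlock()

	s.mu.RLock()
	var stale []int
	for id, e := range s.db {
		if e.UsedAt.Before(cutoff) {
			stale = append(stale, id)
		}
	}
	s.mu.RUnlock()

	for _, id := range stale {
		s.mu.Lock()
		delete(s.blobs, id) // storage first
		delete(s.db, id)    // record second; a failure here would be retried by the next GC
		s.mu.Unlock()
	}
	return len(stale)
}

// demo runs a cache hit concurrently with a GC pass.
func demo() (removed int, freshUsable, staleUsable bool) {
	now := time.Now()
	s := &store{
		db: map[int]entry{
			1: {ID: 1, UsedAt: now.Add(-48 * time.Hour)},
			2: {ID: 2, UsedAt: now},
		},
		blobs: map[int][]byte{1: []byte("old"), 2: []byte("new")},
	}
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		freshUsable = s.use(2, now)
	}()
	removed = s.gc(now.Add(-24 * time.Hour))
	wg.Wait()
	staleUsable = s.use(1, now)
	return removed, freshUsable, staleUsable
}

func main() {
	fmt.Println(demo())
}
```

Because only `gc` deletes and only one `gc` runs at a time, each delete it attempts targets a record that still exists, which is the property the message relies on.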
Earl Warren
5f0b036e34
chore: cache: move findCacheWithIsolationKeyFallback out of handler.find 2025-09-05 17:30:08 +02:00
Earl Warren
c28a98082b
chore: cache: move repo != cache.Repo in readCache
- it is only used after calling readCache
- add unit test

it reduces the number of test cases to be considered in handler
2025-09-05 17:30:08 +02:00
Earl Warren
6c4e705f97
chore: cache: split caches implementation out of handler
- create the caches interface and matching cachesImpl
- move the cache logic out of handler
  - openDB
  - readCache
  - useCache
  - gcCache
  - access to the storage struct
    - serve
    - commit
    - exist
    - write
- add getCaches / setCaches to the handler interface so it can be
  used by tests. The caches test should be implemented independently
  in the future but this is a different kind of cleanup.
- no functional change, minimal refactor
2025-09-05 17:30:08 +02:00
Earl Warren
37f634fd31
fix: cache: call fatal() on errors that are not recoverable
- responseFatalJSON(w, r, err) replaces responseJSON(w, r, 500, err)
  and calls fatal() when the following fail because they are
  not recoverable. There may be other non-recoverable errors but
  it is difficult to be 100% sure they cannot be engineered by the
  caller of the API for DoS purposes.
  - openDB
  - findCache
  - cache.Repo != repo
- wrap errors in
  - openDB() - it was missing
  - readCache() - it was missing
  - useCache() - it was missing
  - findCache() - some had identical messages
- in gc
  - replace logger.Warnf with h.fatal
  - differentiate errors that have identical messages
  - call fatal if openDB fails instead of returning
2025-09-05 17:29:04 +02:00
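
As an illustration of the error-wrapping part of the commit above, here is a small sketch of how wrapping gives otherwise identical failures distinct messages while keeping the underlying cause inspectable. The functions and `errLocked` are hypothetical stand-ins that merely reuse the names from the message; they are not the runner's implementations.

```go
package main

import (
	"errors"
	"fmt"
)

// errLocked is a hypothetical low-level failure shared by both call sites.
var errLocked = errors.New("database is locked")

// openDB and readCache are stand-ins that always fail, each adding its own context.
func openDB() error          { return fmt.Errorf("openDB: %w", errLocked) }
func readCache(id int) error { return fmt.Errorf("readCache(%d): %w", id, errLocked) }

// demo returns both messages and whether the cause is still detectable.
func demo() (string, string, bool) {
	a, b := openDB(), readCache(7)
	return a.Error(), b.Error(), errors.Is(a, errLocked) && errors.Is(b, errLocked)
}

func main() {
	fmt.Println(demo())
}
```

With the `%w` verb the log line says where the failure happened, and `errors.Is` still matches the original cause.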
Earl Warren
36ca627f2e
feat: cache: fatal() helper to gracefully terminate the runner
in case of an error that is not recoverable (e.g. failing to open the
bolthold database), the cache can call fatal() to log the error and
send a TERM signal that will gracefully shut down the daemon.
2025-09-05 17:26:12 +02:00
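
A minimal sketch of the idea described in the commit above, assuming a Unix-like system: the helper logs the error and sends SIGTERM to its own process, and the daemon's existing signal handling takes care of the graceful shutdown. The signature of `fatal` and the `demo` function are assumptions made for illustration, not the runner's actual code.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// fatal logs err and asks the current process to terminate via SIGTERM.
func fatal(err error) {
	log.Printf("unrecoverable error: %v", err)
	p, findErr := os.FindProcess(os.Getpid())
	if findErr != nil {
		os.Exit(1) // cannot signal ourselves, so exit directly
	}
	if sigErr := p.Signal(syscall.SIGTERM); sigErr != nil {
		os.Exit(1)
	}
}

// demo installs a SIGTERM handler the way a daemon would, triggers fatal,
// and reports whether the signal arrived.
func demo() bool {
	term := make(chan os.Signal, 1)
	signal.Notify(term, syscall.SIGTERM)
	defer signal.Stop(term)

	fatal(errors.New("cannot open cache database"))

	select {
	case <-term:
		return true // a real daemon would begin its graceful shutdown here
	case <-time.After(2 * time.Second):
		return false
	}
}

func main() {
	fmt.Println("received SIGTERM:", demo())
}
```

Routing the failure through a signal means there is only one shutdown path to maintain, whether the stop request comes from an operator or from the cache itself.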
Earl Warren
8a7f760d3c
chore: bump version to v11 (#940)
the license change from MIT to GPLv3+ is a breaking change

Refs forgejo/runner#773

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/940): <!--number 940 --><!--line 0 --><!--description Y2hvcmU6IGJ1bXAgdmVyc2lvbiB0byB2MTE=-->chore: bump version to v11<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/940
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
2025-09-05 07:29:38 +00:00
Mathieu Fenniak
a3aedba3f1
refactor: remove duplicate computeMac function (#936)
It was raised during embargo review of #925 that there are two implementations of `computeMac`; this PR fixes that.

As all the tests for `computeMac` were in the `artifactcache` package, it made more sense to keep the method there.  That required reversing the `artifactcache->cacheproxy` package dependency -- it makes more sense to me for the proxy to depend on the cache, rather than vice-versa.

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/936): <!--number 936 --><!--line 0 --><!--description cmVmYWN0b3I6IHJlbW92ZSBkdXBsaWNhdGUgY29tcHV0ZU1hYyBmdW5jdGlvbg==-->refactor: remove duplicate computeMac function<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/936
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2025-09-05 06:01:49 +00:00
Earl Warren
69c6c70845
chore: refactor act/artifactcache Handler to an interface (#934)
- the Handler struct becomes handler (lowercase)
- the Handler interface is defined to be the existing methods
- isClosed() is added and used only in tests
- setgcAt() is added and used only in tests

---

This is to allow mocking the Handler interface for testing.

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/934): <!--number 934 --><!--line 0 --><!--description Y2hvcmU6IHJlZmFjdG9yIGFjdC9hcnRpZmFjdGNhY2hlIEhhbmRsZXIgdG8gYW4gaW50ZXJmYWNl-->chore: refactor act/artifactcache Handler to an interface<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/934
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.code.forgejo.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
2025-09-04 14:38:50 +00:00
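
The pattern in the commit above (an exported interface backed by an unexported struct, so tests can substitute a mock) can be sketched as follows. The method set is reduced and hypothetical; `mockHandler` and `shutdown` are names invented for the example.

```go
package main

import "fmt"

// Handler is a reduced, hypothetical version of the interface.
type Handler interface {
	ExternalURL() string
	Close() error
}

// handler is the unexported concrete implementation.
type handler struct {
	url    string
	closed bool
}

func (h *handler) ExternalURL() string { return h.url }
func (h *handler) Close() error        { h.closed = true; return nil }

// mockHandler records calls so a test can assert on them.
type mockHandler struct{ closeCalls int }

func (m *mockHandler) ExternalURL() string { return "http://mock.invalid" }
func (m *mockHandler) Close() error        { m.closeCalls++; return nil }

// shutdown is hypothetical caller code that depends only on the interface.
func shutdown(h Handler) error { return h.Close() }

// demo passes both implementations through the same caller code.
func demo() int {
	m := &mockHandler{}
	_ = shutdown(m)
	_ = shutdown(&handler{url: "http://127.0.0.1:0"})
	return m.closeCalls
}

func main() {
	fmt.Println("mock Close calls:", demo())
}
```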
Mathieu Fenniak
da7ef7c2a1
fix: PRs cache artifacts separate from other runs 2025-09-01 13:45:43 +02:00
Mathieu Fenniak
022d5ad3e7
fix: artifact cache DB not using indexes for searching (#878)
Uses the `Repo` field as an index during searches of the cache database.  Removes unused indexes.

To measure the performance of this change, I created a synthetic test which wrote 10,000 records into the artifact cache DB.  Of course, all benchmarks are lies that can't be generalized to real-world usage, but it seems clear from the magnitude of improvement that this fixes a flawed implementation, even if it's not perfect.
- Unmodified performance:
    - Write: 196 records/second
    - Read: 1 record/second
- With `Repo` index being used for reads, and other indexes being removed:
    - Write: 347 records/second
    - Read: 22,398 records/second

`Repo` is, I think, the only index that made sense to remain, with an eye on workflow run performance:
- `Key` -- can't be used for index because `findCache` searches for key *prefixes*, not equal values.
- `Version` -- isn't very distinct for different workflow runs (https://code.forgejo.org/actions/cache#cache-version)
- `Complete` - significant portion of the cache DB will be complete, making it the least selective possible index
- `UsedAt` & `CreatedAt` - only used in GC operation, so could remain, but this isn't a performance-sensitive codepath

Closes #874.

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/878): <!--number 878 --><!--line 0 --><!--description Zml4OiBhcnRpZmFjdCBjYWNoZSBEQiBub3QgdXNpbmcgaW5kZXhlcyBmb3Igc2VhcmNoaW5n-->fix: artifact cache DB not using indexes for searching<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/878
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
2025-08-19 20:19:23 +00:00
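
To illustrate the reasoning in #878 above, here is a sketch of why an equality index on `Repo` helps even though `Key` must still be matched by prefix: the index narrows the candidate set, and the prefix filter only scans that subset. It uses a plain map rather than bolthold, and `index`, `add`, and `find` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// cache is a reduced, hypothetical record.
type cache struct {
	Repo, Key string
}

// index groups records by Repo so a lookup never scans other repositories.
type index map[string][]cache

func (ix index) add(c cache) { ix[c.Repo] = append(ix[c.Repo], c) }

// find narrows by exact Repo (index lookup), then filters by key prefix (scan).
func (ix index) find(repo, prefix string) (matches []string, scanned int) {
	for _, c := range ix[repo] {
		scanned++
		if strings.HasPrefix(c.Key, prefix) {
			matches = append(matches, c.Key)
		}
	}
	return matches, scanned
}

// demo stores 1000 records spread over 100 repositories and runs one lookup.
func demo() (int, int) {
	ix := index{}
	for i := 0; i < 1000; i++ {
		ix.add(cache{
			Repo: fmt.Sprintf("owner/repo-%d", i%100),
			Key:  fmt.Sprintf("go-mod-%d", i),
		})
	}
	m, scanned := ix.find("owner/repo-7", "go-mod-10")
	return len(m), scanned
}

func main() {
	matches, scanned := demo()
	fmt.Printf("matches=%d scanned=%d of 1000\n", matches, scanned)
}
```

In this toy data set the lookup inspects 10 records instead of 1000; the numbers are for illustration only and say nothing about the measurements quoted in the commit.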
Earl Warren
ec99579451
chore: to allow the runner to be imported, v9 needs to be in the go module (#777)
It will be imported by Forgejo.

<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
  - [PR](https://code.forgejo.org/forgejo/runner/pulls/777): <!--number 777 --><!--line 0 --><!--description Y2hvcmU6IHRvIGFsbG93IHRoZSBydW5uZXIgdG8gYmUgaW1wb3J0ZWQsIHY5IG5lZWRzIHRvIGJlIGluIHRoZSBnbyBtb2R1bGU=-->chore: to allow the runner to be imported, v9 needs to be in the go module<!--description-->
<!--end release-notes-assistant-->

Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/777
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
2025-07-31 10:35:11 +00:00
Earl Warren
ebc7758c1f
chore: s|github.com/nektos/act/pkg|code.forgejo.org/forgejo/runner/act| 2025-07-28 19:23:07 +02:00
Earl Warren
21f71e5cdc Revert "fix: docker buildx cache restore not working" (#173)
This reverts commit f147e45da3.

https://code.forgejo.org/forgejo/act/pulls/122/commits/f147e45da3b29e555527cd178a5c07f1240aeb62

is not the same as

https://github.com/nektos/act/pull/2236/files

Refs: https://code.forgejo.org/forgejo/act/pulls/122

Reviewed-on: https://code.forgejo.org/forgejo/act/pulls/173
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
2025-07-07 11:06:04 +00:00
Christoph Heiss
92b7df3da7 fix: artifacts: format IP:port pair using net.JoinHostPort()
This ensures that brackets are added for IPv6 addresses.
Without this, it could result in addresses like "2001:db8::1:3456",
which - obviously - would break further down and prevent the server from
starting.

Signed-off-by: Christoph Heiss <christoph@c8h4.io>
2025-06-01 12:10:47 +02:00
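
The difference described in the commit above is easy to reproduce with the standard library. In this sketch `naive` and `joined` are hypothetical helper names; `net.JoinHostPort` is the call the commit refers to.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// naive concatenates host and port, which is ambiguous for IPv6 literals.
func naive(ip string, port int) string { return fmt.Sprintf("%s:%d", ip, port) }

// joined lets the standard library add brackets where they are needed.
func joined(ip string, port int) string { return net.JoinHostPort(ip, strconv.Itoa(port)) }

func main() {
	fmt.Println(naive("2001:db8::1", 3456))  // 2001:db8::1:3456
	fmt.Println(joined("2001:db8::1", 3456)) // [2001:db8::1]:3456
	fmt.Println(joined("192.0.2.10", 3456))  // 192.0.2.10:3456
}
```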
ChristopherHX
f147e45da3 fix: docker buildx cache restore not working 2025-04-24 09:00:51 +00:00
Kwonunn
835a9d2068 fix: move reading cache to separate function 2025-03-24 10:48:28 +01:00
Kwonunn
639b83c26c fix: do not immediately close the db after opening it 2025-03-24 10:17:04 +01:00
Earl Warren
187e1df52c fix: reduce the time during which the database stays open
* During get/upload, close the database while reading/writing so
  it does not stay open for longer than necessary. This may be helpful
  when uploads run in parallel.
* Be more informative when returning error 500
* Make useCache handle errors
* Return 500 immediately when writing the cache fails instead of falling
  through to 200

Refs: https://code.forgejo.org/forgejo/runner/issues/509
2025-03-23 23:25:09 +01:00
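
The last bullet of the commit above (return 500 immediately when the write fails) corresponds to a common handler shape, sketched here with the standard library only. The `upload` handler, the `write` callback, and the request path are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// upload is a hypothetical handler; write stands in for the storage write.
func upload(write func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := write(); err != nil {
			// Include the cause so the 500 is informative.
			http.Error(w, fmt.Sprintf("write cache: %v", err), http.StatusInternalServerError)
			return // stop here so the success path below never runs
		}
		w.WriteHeader(http.StatusOK)
	}
}

// status runs the handler once and returns the recorded status code.
func status(write func() error) int {
	rec := httptest.NewRecorder()
	upload(write)(rec, httptest.NewRequest(http.MethodPatch, "/caches/1", nil))
	return rec.Code
}

// demo exercises the success and failure paths.
func demo() (int, int) {
	ok := status(func() error { return nil })
	failed := status(func() error { return errors.New("disk full") })
	return ok, failed
}

func main() {
	fmt.Println(demo())
}
```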
Kwonunn
9150081892 return 404 when not found 2025-03-21 13:45:51 +00:00
Kwonunn
7a21d64333 review: discard params in clean 2025-03-21 13:45:51 +00:00
Kwonunn
11062e4d6a return 403 instead of 500 when not authorized correctly 2025-03-21 13:45:51 +00:00
Kwonunn
e3adb49c50 functional save and restore through proxy 2025-03-21 13:45:51 +00:00
Kwonunn
95e754c06b integrate the new cache proxy with the server viceice set up 2025-03-21 13:45:51 +00:00
Michael Kriese
1082b31367 fix: partial secure cache 2025-03-21 13:45:51 +00:00
Michael Kriese
d8376ed890 fix(cache-server): use consistent uint64 2024-11-22 01:01:12 +01:00
ChristopherHX
017db5edae fix: cache adjust restore order of exact key matches (#2267)
* wip: adjust restore order

* fixup

* add tests

* cleanup

* fix typo

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
(cherry picked from commit 88ac88c5413a430b41d767a942c8e70e778e1d61)
2024-11-07 16:25:40 +01:00
Jason Song
b0c54edc78 Support overwriting caches (#2265)
* feat: support overwrite caches

* test: fix case

* test: fix get_with_multiple_keys

* chore: use atomic.Bool

* test: improve get_with_multiple_keys

* chore: use ping to improve path

* fix: wrong CompareAndSwap

* test: TestHandler_gcCache

* chore: lint code

* chore: lint code

(cherry picked from commit 087b28afc56351b93dd68d7e59a2c8740f6c0e44)
2024-11-07 16:25:26 +01:00
ChristopherHX
4b2554db86 fix: docker buildx cache restore not working (#2236)
* To take effect artifacts v4 pr is needed with adjusted claims

(cherry picked from commit c606759e8c0c2d5036c5bb15d7ec87beca1150cf)
2024-11-07 16:25:15 +01:00
Kristoffer
a25c37e83c fix: match cache restore-keys in creation reverse order (#2153)
* Match cache restore-keys in creation reverse order

* Match full prefix when selecting cache

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
2024-01-20 12:11:50 +00:00
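
One way to read the two bullets of #2153 above is sketched below: the requested keys are tried in order, each is matched as a full prefix, and among the matches the most recently created cache wins. This is an interpretation for illustration, not the project's implementation; the `cache` type and `find` function are hypothetical, and later refinements to the restore order are not modeled.

```go
package main

import (
	"fmt"
	"strings"
)

// cache is a reduced, hypothetical record.
type cache struct {
	Key       string
	CreatedAt int64 // Unix seconds
}

// find tries each requested key in order and, for the first key with any
// prefix match, returns the most recently created matching cache.
func find(caches []cache, keys []string) (cache, bool) {
	for _, k := range keys {
		var best cache
		found := false
		for _, c := range caches {
			if strings.HasPrefix(c.Key, k) && (!found || c.CreatedAt > best.CreatedAt) {
				best, found = c, true
			}
		}
		if found {
			return best, true
		}
	}
	return cache{}, false
}

// demo looks up a primary key that misses and a restore key that hits twice.
func demo() (string, bool) {
	caches := []cache{
		{Key: "linux-go-a", CreatedAt: 100},
		{Key: "linux-go-b", CreatedAt: 200},
		{Key: "linux-node-a", CreatedAt: 300},
	}
	c, ok := find(caches, []string{"linux-go-c", "linux-go-"})
	return c.Key, ok
}

func main() {
	fmt.Println(demo())
}
```

Here the primary key `linux-go-c` has no match, so the restore key `linux-go-` is used and the newer of its two matches, `linux-go-b`, is returned.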
Jason Song
ab0f270c64 fix: handle zero size (#1888) 2023-07-10 20:35:27 -07:00
ChristopherHX
4d3a674f12 refactor: open boltdb only while using it (#1879)
* refactor: open boltdb only while using it

* patch

* Update handler_test.go

* Update handler_test.go

* Update handler_test.go

* Update handler.go

* timeout * 10

* pr feedback

* fixup
2023-07-10 16:57:06 +00:00
Jason Song
b51f608660 Support cache (#1770)
* feat: port

* fix: use httprouter

* fix: WriteHeader

* fix: bolthold

* fix: bugs

* chore: one less file

* test: test handler

* fix: bug in id

* test: fix cases

* chore: tidy

* fix: use atomic.Int32

* fix: use atomic.Store

* feat: support close

* chore: lint

* fix: cache keys are case insensitive

* fix: options

* fix: use options

* fix: close

* fix: ignore close error

* Revert "fix: close"

This reverts commit d53ea7568ba03908eb153031c435008fd47e7ccb.

* fix: cacheUrlKey

* fix: nil close

* chore: lint code

* fix: test key

* test: case insensitive

* chore: lint
2023-04-28 15:57:40 +00:00