Fixes #1039.
Rather than opening and closing the Bolt DB instance constantly, the cache now maintains one open `*bolthold.Store` for its lifetime, allowing GC, cache read, and cache write operations to occur concurrently.
The major risk in this change is whether it is safe to use one Bolt instance across goroutines concurrently. [Bolt does document its concurrency requirements](https://github.com/boltdb/bolt?tab=readme-ov-file#transactions), and my analysis of our DB interactions suggests that it introduces very little risk.
Most of the cache operations perform multiple touches to the database; for example `useCache` performs a read to fetch a cache object, and then an update to set its `UsedAt` timestamp. If we wanted to ensure consistency in these operations, they should use a Bolt ReadWrite transaction -- but concurrent access would just be setting the field to the same value anyway.
`gcCache` is the complex operation where a transaction might be warranted -- but using one would also cause the same bug that #1039 describes. I believe it is safe to run without a transaction because it is protected by an application-level mutex (to prevent multiple concurrent GCs) and it is the only code that performs deletes from the database; together these should guarantee that all its delete attempts succeed. In the event of an unexpected failure to do the DB write, `gcCache` deletes from the storage before deleting from the DB, so it should just attempt to clean up again on the next run.
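The ordering argument above can be sketched in isolation. This is a minimal illustration, not the runner's code: the `store` type, its in-memory maps (standing in for file storage and the Bolt DB), the `failDB` failure switch, and the use of `sync.Mutex.TryLock` as the GC guard are all assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the cache: blobs plays the role of
// the file storage and records the role of the Bolt database.
type store struct {
	gcMu    sync.Mutex // application-level guard: at most one GC at a time
	mu      sync.Mutex // protects the maps below
	blobs   map[uint64]bool
	records map[uint64]bool
	failDB  map[uint64]bool // IDs whose DB delete should fail (simulation only)
}

func newStore(ids ...uint64) *store {
	s := &store{blobs: map[uint64]bool{}, records: map[uint64]bool{}, failDB: map[uint64]bool{}}
	for _, id := range ids {
		s.blobs[id], s.records[id] = true, true
	}
	return s
}

// deleteRecord simulates a single, non-transactional DB delete.
func (s *store) deleteRecord(id uint64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.failDB[id] {
		return errors.New("simulated DB write failure")
	}
	delete(s.records, id)
	return nil
}

// gc removes the expired IDs. It reports ran=false when another GC holds
// the guard, and returns the IDs whose DB record could not be deleted.
func (s *store) gc(expired []uint64) (ran bool, pending []uint64) {
	if !s.gcMu.TryLock() {
		return false, nil
	}
	defer s.gcMu.Unlock()
	for _, id := range expired {
		// Storage first: if the DB delete fails, the record survives and
		// a later GC run finds it again and retries the cleanup.
		s.mu.Lock()
		delete(s.blobs, id)
		s.mu.Unlock()
		if err := s.deleteRecord(id); err != nil {
			pending = append(pending, id)
		}
	}
	return true, pending
}

func main() {
	s := newStore(1, 2)
	s.failDB[2] = true
	_, pending := s.gc([]uint64{1, 2})
	fmt.Println("pending after first run:", pending)
	delete(s.failDB, 2)
	_, pending = s.gc(pending)
	fmt.Println("pending after retry:", pending)
}
```

Because the record outlives a failed DB delete, a later run still sees it as a GC candidate, which is what makes the missing transaction tolerable in this sketch.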
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/1040): <!--number 1040 --><!--line 0 --><!--description Zml4OiBhbGxvdyBHQyAmIGNhY2hlIG9wZXJhdGlvbnMgdG8gb3BlcmF0ZSBjb25jdXJyZW50bHk=-->fix: allow GC & cache operations to operate concurrently<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/1040
Reviewed-by: earl-warren <earl-warren@noreply.code.forgejo.org>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
- create the caches interface and matching cachesImpl
- move the cache logic out of handler
  - openDB
  - readCache
  - useCache
  - gcCache
  - access to the storage struct
    - serve
    - commit
    - exist
    - write
- add getCaches / setCaches to the handler interface so it can be
  used by tests. The caches test should be implemented independently
  in the future but this is a different kind of cleanup.
- no functional change, minimal refactor
- responseFatalJSON(w, r, err) replaces responseJSON(w, r, 500, err)
  and calls fatal() when the following fail because they are
  not recoverable. There may be other non-recoverable errors but
  it is difficult to be 100% sure they cannot be engineered by the
  caller of the API for DoS purposes.
  - openDB
  - findCache
  - cache.Repo != repo
- wrap errors in
  - openDB() - it was missing
  - readCache() - it was missing
  - useCache() - it was missing
  - findCache() - some had identical messages
- in gc
  - replace logger.Warnf with h.fatal
  - differentiate errors that have identical messages
  - call fatal if openDB fails instead of returning
in case of an error that is not recoverable (e.g. failing to open the
bolthold database), the cache can call fatal() to log the error and
send a TERM signal that will gracefully shutdown the daemon.
the license change from MIT to GPLv3+ is a breaking change
Refs forgejo/runner#773
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/940): <!--number 940 --><!--line 0 --><!--description Y2hvcmU6IGJ1bXAgdmVyc2lvbiB0byB2MTE=-->chore: bump version to v11<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/940
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
It was raised during embargo review of #925 that there are two implementations of `computeMac`; this PR fixes that.
As all the tests for `computeMac` were in the `artifactcache` package, it made more sense to keep the method there. That required reversing the `artifactcache->cacheproxy` package dependency -- it makes more sense to me for the proxy to depend on the cache, rather than vice versa.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/936): <!--number 936 --><!--line 0 --><!--description cmVmYWN0b3I6IHJlbW92ZSBkdXBsaWNhdGUgY29tcHV0ZU1hYyBmdW5jdGlvbg==-->refactor: remove duplicate computeMac function<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/936
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
- the Handler struct becomes handler (lowercase)
- the Handler interface is defined to be the existing methods
- isClosed() is added and used only in tests
- setgcAt() is added and used only in tests
---
This is to allow mocking the Handler interface for testing.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/934): <!--number 934 --><!--line 0 --><!--description Y2hvcmU6IHJlZmFjdG9yIGFjdC9hcnRpZmFjdGNhY2hlIEhhbmRsZXIgdG8gYW4gaW50ZXJmYWNl-->chore: refactor act/artifactcache Handler to an interface<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/934
Reviewed-by: Mathieu Fenniak <mfenniak@noreply.code.forgejo.org>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
Uses the `Repo` field as an index during searches of the cache database. Removes unused indexes.
To measure the performance of this change, I created a synthetic test which wrote 10,000 records into the artifact cache DB. Of course, all benchmarks are lies that can't be generalized to real-world usage, but it seems clear from the magnitude of improvement that this fixes a flawed implementation, even if it's not perfect.
- Unmodified performance:
  - Write: 196 records/second
  - Read: 1 record/second
- With `Repo` index being used for reads, and other indexes being removed:
  - Write: 347 records/second
  - Read: 22,398 records/second
`Repo` is, I think, the only index that made sense to remain, with an eye on workflow run performance:
- `Key` -- can't be used as an index because `findCache` searches for key *prefixes*, not equal values.
- `Version` -- isn't very distinct for different workflow runs (https://code.forgejo.org/actions/cache#cache-version)
- `Complete` -- a significant portion of the cache DB will be complete, making it the least selective possible index
- `UsedAt` & `CreatedAt` -- only used in the GC operation, so could remain, but this isn't a performance-sensitive codepath
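
The reasoning about `Repo` versus `Key` can be illustrated without the database. This sketch uses a plain map as a stand-in for a secondary index; the `Cache` struct is trimmed to the fields needed here, and `repoIndex`, `buildRepoIndex`, and `find` are hypothetical names rather than the runner's API.

```go
package main

import (
	"fmt"
	"strings"
)

// Cache holds only the fields relevant to this sketch.
type Cache struct {
	ID   uint64
	Repo string
	Key  string
}

// repoIndex stands in for a secondary index on Repo: an equality lookup
// returns only the records of one repository.
type repoIndex map[string][]Cache

func buildRepoIndex(all []Cache) repoIndex {
	idx := repoIndex{}
	for _, c := range all {
		idx[c.Repo] = append(idx[c.Repo], c)
	}
	return idx
}

// find narrows by Repo through the index, then filters the remaining
// candidates by key prefix, which an equality index cannot answer.
func (idx repoIndex) find(repo, keyPrefix string) []Cache {
	var out []Cache
	for _, c := range idx[repo] {
		if strings.HasPrefix(c.Key, keyPrefix) {
			out = append(out, c)
		}
	}
	return out
}

func main() {
	idx := buildRepoIndex([]Cache{
		{ID: 1, Repo: "org/a", Key: "go-mod-abc"},
		{ID: 2, Repo: "org/a", Key: "npm-123"},
		{ID: 3, Repo: "org/b", Key: "go-mod-def"},
	})
	for _, c := range idx.find("org/a", "go-mod-") {
		fmt.Println(c.ID, c.Key)
	}
}
```

Only the records of one repository are scanned for the prefix, instead of the whole table.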
Closes #874.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- bug fixes
- [PR](https://code.forgejo.org/forgejo/runner/pulls/878): <!--number 878 --><!--line 0 --><!--description Zml4OiBhcnRpZmFjdCBjYWNoZSBEQiBub3QgdXNpbmcgaW5kZXhlcyBmb3Igc2VhcmNoaW5n-->fix: artifact cache DB not using indexes for searching<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/878
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Mathieu Fenniak <mathieu@fenniak.net>
Co-committed-by: Mathieu Fenniak <mathieu@fenniak.net>
It will be imported by Forgejo.
<!--start release-notes-assistant-->
<!--URL:https://code.forgejo.org/forgejo/runner-->
- other
- [PR](https://code.forgejo.org/forgejo/runner/pulls/777): <!--number 777 --><!--line 0 --><!--description Y2hvcmU6IHRvIGFsbG93IHRoZSBydW5uZXIgdG8gYmUgaW1wb3J0ZWQsIHY5IG5lZWRzIHRvIGJlIGluIHRoZSBnbyBtb2R1bGU=-->chore: to allow the runner to be imported, v9 needs to be in the go module<!--description-->
<!--end release-notes-assistant-->
Reviewed-on: https://code.forgejo.org/forgejo/runner/pulls/777
Reviewed-by: Michael Kriese <michael.kriese@gmx.de>
Co-authored-by: Earl Warren <contact@earl-warren.org>
Co-committed-by: Earl Warren <contact@earl-warren.org>
This ensures that brackets are added for IPv6 addresses.
Without this, the result could be an address like "2001:db8::1:3456",
which - obviously - would break further down and prevent the server from
starting.
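
For illustration, the standard library's `net.JoinHostPort` adds the brackets when the host contains a colon. The `listenAddr` helper below is a hypothetical wrapper for this sketch, not necessarily how the change itself is written:

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// listenAddr builds a host:port string, bracketing IPv6 literals.
func listenAddr(host string, port int) string {
	return net.JoinHostPort(host, strconv.Itoa(port))
}

func main() {
	fmt.Println(listenAddr("2001:db8::1", 3456)) // [2001:db8::1]:3456
	fmt.Println(listenAddr("192.0.2.10", 3456))  // 192.0.2.10:3456
}
```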
Signed-off-by: Christoph Heiss <christoph@c8h4.io>
* During get/upload, close the database while reading/writing so
it does not stay open for longer than necessary. This may be helpful
when uploads run in parallel.
* Be more informative when returning error 500
* Make useCache handle errors
* Return 500 immediately when writing the cache fails instead of falling
through to 200
Refs: https://code.forgejo.org/forgejo/runner/issues/509
* Match cache restore-keys in creation reverse order
* Match full prefix when selecting cache
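
One way to read these two rules, sketched under assumptions: keys are tried in the order given, a cache matches only if its key starts with the whole requested key, and among the matches the most recently created one wins. `Cache`, `CreatedAt` as a Unix timestamp, and `selectCache` are hypothetical names for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// Cache holds only the fields relevant to this sketch.
type Cache struct {
	Key       string
	CreatedAt int64 // Unix seconds
}

// selectCache tries each key in order and returns the newest cache whose
// key has that key as a full prefix.
func selectCache(caches []Cache, keys []string) (Cache, bool) {
	for _, key := range keys {
		var best Cache
		found := false
		for _, c := range caches {
			if !strings.HasPrefix(c.Key, key) {
				continue
			}
			if !found || c.CreatedAt > best.CreatedAt {
				best, found = c, true
			}
		}
		if found {
			return best, true
		}
	}
	return Cache{}, false
}

func main() {
	caches := []Cache{
		{Key: "deps-linux-aaa", CreatedAt: 100},
		{Key: "deps-linux-bbb", CreatedAt: 200},
		{Key: "deps-mac-ccc", CreatedAt: 300},
	}
	c, ok := selectCache(caches, []string{"deps-linux-zzz", "deps-linux-"})
	fmt.Println(c.Key, ok)
}
```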
---------
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>