There is no need to use SHA256 everywhere, especially on small inputs where we
don't care about its cryptographic properties. We're using FNV as it's the
faster available hash in go's standard library, and we're picking its "a"
version as it's slightly better avalanche characteristics, which are
relevant for small inputs.
This commit has the side-effect of invalidating all favicons saved in the
database, which is desirable to benefit from the resize process implemented in
777d0dd2, as it didn't apply retro-actively.
We're also making use of hex.EncodeToString instead of fmt.Sprintf, as it's
marginally faster.
Note that we can't change the usage of sha256 for feed.Hash as it's used to
deduplicate entries in the database.
Go 1.24 provides the helpful rand.Text() function, returning a base32-encoded
string containing at least 128 bits of randomness. We should make use of it
everywhere it makes sense to do so, if only to not having to think about much
entropy do we need for each cases, and just trust the go crypto team.
Also, rand.Read() can't fail, so no need to check its return value:
https://pkg.go.dev/crypto/rand#Read This behaviour is consistent with go's
standard library itself.
- TLS 1.2 is used as MinVersion by default
- With regard to CipherSuites, in Go 1.22 RSA key exchange based cipher suites
were removed from the default list, and in Go 1.23 3DES cipher suites were
removed as well. Ciphers for TLS1.3 aren't configurable.
- No need to specify CurveP25, as the servers will likely disable the weird
ones like CurveP384 and CurveP521. Removing the explicit specification also
enables the post-quantum X25519MLKEM768, wow!
I trust the go team to make better choices on the long term than us keeping
miniflux up to date with the latest TLS trend.
- Factorize some conditions
- Remove useless `default` case and move the return at the end of the functions
- Use strings.CutPrefix instead of strings.HasPrefix + strings.TrimPrefix
- Use switch-case constructs instead of slices.Contains, as this reduces the
complexity of the functions and allows them to be inlined, as well as helping
the compiler to optimize them, as it sucks at interprocedural optimizations.
The previous regex was using the [ABC..D]*[ABC] pattern, resulting in a lot of
backtracking. The new regex is stopping the matching at the first space or end
of text (and removes the trailing `.` should one be present).
The backtracking was taking around 50% of the CPU time spent in atom.Parse
Previously, url.Parse(baseUrl) was called on every self-closing tags, and on
most opening tags, accounting for around 15% of the CPU time spent in
processor.ProcessFeedEntries
The `sanitizer.StripTags` function is calling `html.NewTokenizer`, which is
allocating a 4096 bytes buffer on the heap, as well a running a complex state
machine to tokenize html. There is no need to do all of this for empty strings.
This commit also fixes a TrimSpace/StripTags call inversion.
A bit more than 10% of processor.ProcessFeedEntries' CPU time is spent in
urlcleaner.RemoveTrackingParameters, specifically calling url.Parse, so let's
extract this operation outside of it, and do it once before calling
urlcleaner.RemoveTrackingParameters multiple times.
Co-authored-by: Frédéric Guillot <f@miniflux.net>
Calls to urllib.AbsoluteURL take a bit less than 10% of the time spent in
parser.ParseFeed, completely parsing an url only to check if it's absolute, and
if not, to make it so.
Checking if it starts with `https://` or `http://` is usually enough to find if
an url is absolute, and if is doesn't, it's always possible to fall back to
urllib.AbsoluteURL.
This also comes with the advantage of reducing heap allocations, as most of the
time spent in urllib.AbsoluteURL is heap-related (de)allocations.
- There is no need to have groups as we're only using this regex for
`MatchString`.
- Since the only place where this regex is used is already calling
strings.ToLower, there is no need to check for `A-Z`.
Display the article's external URL directly in the single entry view.
Rationale: On mobile devices, users couldn't see where a link pointed before tapping it.
Previously, the only way to view the external URL was by hovering - an action not available on touch devices.
Instead of using bytes.Map which is returning a copy of the provided []byte,
use a custom in-place implementation, as the bytes.Map call is taking around
25% of rss.Parse
io.ReadAll is growing the underlying buffer progressively, while
io.Copy is able to allocate it in one go, which is significantly faster.
io.ReadAll is currently accounting for around 10% of the CPU time of rss.Parse
The call to fmt.Sprintf in WithFeedID accounts for more than 20% of the time
spent in GetFeed. Use strconv.Itoa instead, as it's much much faster.
Also change WithCategoryID in the same way, for consistency's sake.
Rationale: Opening links in the current tab is the default browser behavior.
Using `target="_blank"` on external links can lead to accessibility issues and override user preferences. It may also interfere with assistive technologies and expected browser behavior.
To maintain backward compatibility, this option is enabled by default (`true`), which adds `target="_blank"` to links.