- Replace a completely overkill regex
- Use `.Remove()` instead of a hand-rolled loop
- Use a strings.Builder instead of a bytes.NewBufferString
- Replace a call to Fprintf with string concatenation, as the latter is much
faster
- Remove a superfluous cast
- Delay some computations
- Add some tests
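As a rough illustration of the `strings.Builder` and concatenation points above, here is a minimal sketch. The `renderAttributes` function and its output format are hypothetical and are not taken from the changed code.

```go
package main

import (
	"fmt"
	"strings"
)

// renderAttributes is a hypothetical helper: it builds its output with a
// strings.Builder and plain concatenation instead of a bytes.Buffer
// created by bytes.NewBufferString and filled through fmt.Fprintf.
func renderAttributes(attrs [][2]string) string {
	var b strings.Builder
	for _, attr := range attrs {
		// Same output as fmt.Fprintf(&b, ` %s="%s"`, attr[0], attr[1]),
		// without parsing a format string on every iteration.
		b.WriteString(` ` + attr[0] + `="` + attr[1] + `"`)
	}
	return b.String()
}

func main() {
	fmt.Println(renderAttributes([][2]string{{"href", "/feed"}, {"rel", "alternate"}}))
}
```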
As mentioned in goquery's documentation (https://pkg.go.dev/github.com/PuerkitoBio/goquery#Single):
> By default, Selection.Find and other functions that accept a selector string
> to select nodes will use all matches corresponding to that selector. By using
> the Matcher returned by Single, at most the first match will be selected.
>
> The one using Single is optimized to be potentially much faster on large documents.
In internal/reader/handler/handler.go:RefreshFeed, there is a call to
store.UserByID pretty early, whose result is mostly used for
originalFeed.WithTranslatedErrorMessage(localizedError.Translate(user.Language)).
Its only other usage is in processor.ProcessFeedEntries(store, originalFeed,
user, forceRefresh), which is pretty late in RefreshFeed, and only called if
there are new items in the feed. It makes sense to only fetch the user's
language if the error localization function is used.
Calls to `store.UserByID` take around 10% of the CPU time of RefreshFeed in my
profiling.
This commit also makes `processor.ProcessFeedEntries` take a `userID` instead
of a `user`, to make the code a bit more concise.
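The idea of delaying the lookup can be sketched as follows. `User`, `Store`, `refreshFeed`, and `errUnreachable` are simplified stand-ins invented for this example; they are not the real miniflux types or signatures.

```go
package main

import (
	"errors"
	"fmt"
)

// User and Store are simplified stand-ins, not the real miniflux types.
type User struct {
	ID       int64
	Language string
}

type Store struct {
	lookups int // counts UserByID calls, for illustration only
}

func (s *Store) UserByID(id int64) (*User, error) {
	s.lookups++
	return &User{ID: id, Language: "fr_FR"}, nil
}

// errUnreachable is a hypothetical error used to exercise the error path.
var errUnreachable = errors.New("feed unreachable")

// refreshFeed fetches the user only on the error path, where the
// language is needed to localize the message.
func refreshFeed(store *Store, userID int64, fetch func() error) string {
	if err := fetch(); err != nil {
		user, lookupErr := store.UserByID(userID)
		if lookupErr != nil {
			return err.Error()
		}
		return "[" + user.Language + "] " + err.Error()
	}
	return ""
}

func main() {
	store := &Store{}
	refreshFeed(store, 1, func() error { return nil })
	fmt.Println("lookups after success:", store.lookups)
	refreshFeed(store, 1, func() error { return errUnreachable })
	fmt.Println("lookups after failure:", store.lookups)
}
```

In this sketch the success path never touches the store, which is the saving the commit is after.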
This should close #2984
While doing some profiling for #2900, I noticed that
`miniflux.app/v2/internal/locale.LoadCatalogMessages` is responsible for more
than 10% of the consumed memory. As most miniflux instances won't have enough
diverse users to use all the available translations at the same time, it
makes sense to load them on demand.
The overhead is a single function call and a check in a map, per call to
translation-related functions.
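A minimal sketch of on-demand loading, assuming a map keyed by language. The `catalog` type, `loadCatalog`, `translate`, and the mutex are all assumptions made for this example and do not mirror the actual `locale` package.

```go
package main

import (
	"fmt"
	"sync"
)

// catalog maps message keys to translated strings (hypothetical type).
type catalog map[string]string

var (
	mu        sync.Mutex
	loaded    = map[string]catalog{}
	loadCount int // number of catalogs actually loaded, for illustration
)

// loadCatalog stands in for parsing an embedded translation file.
func loadCatalog(language string) catalog {
	loadCount++
	return catalog{"hello": "hello (" + language + ")"}
}

// translate loads the catalog for a language on first use only; later
// calls pay for a single map lookup.
func translate(language, key string) string {
	mu.Lock()
	defer mu.Unlock()
	c, ok := loaded[language]
	if !ok {
		c = loadCatalog(language)
		loaded[language] = c
	}
	if msg, ok := c[key]; ok {
		return msg
	}
	return key // fall back to the key when no translation exists
}

func main() {
	fmt.Println(translate("fr_FR", "hello"))
	fmt.Println(translate("fr_FR", "hello"))
	fmt.Println("catalogs loaded:", loadCount)
}
```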
- Use string concatenation instead of `Sprintf`, as this is much faster, and the
call to `Sprintf` is responsible for 30% of the CPU time of the function
- Anchor the YouTube regex, to allow it to bail early, as it also accounts for
another 30% of the CPU time. It might be worth chaining calls to `TrimPrefix`
and checking whether the string has been trimmed instead of using a regex, to
speed things up even more, but this needs to be benchmarked properly.
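The `TrimPrefix` idea mentioned above could look roughly like this. The URL pattern and both function names are assumptions made for the example, not the regex used in the code, and no benchmark is implied.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// youtubeRegex is a hypothetical anchored pattern; the leading ^ lets the
// engine reject non-matching input without scanning the whole string.
var youtubeRegex = regexp.MustCompile(`^(?:https://)?(?:www\.)?youtube\.com/watch\?v=(.*)$`)

// videoIDRegex extracts the video ID with the anchored regex.
func videoIDRegex(url string) string {
	m := youtubeRegex.FindStringSubmatch(url)
	if m == nil {
		return ""
	}
	return m[1]
}

// videoIDTrim chains TrimPrefix calls and checks whether the last,
// mandatory prefix was actually removed.
func videoIDTrim(url string) string {
	rest := strings.TrimPrefix(url, "https://")
	rest = strings.TrimPrefix(rest, "www.")
	id := strings.TrimPrefix(rest, "youtube.com/watch?v=")
	if id == rest {
		return "" // nothing was trimmed: not a watch URL
	}
	return id
}

func main() {
	url := "https://www.youtube.com/watch?v=abc123"
	fmt.Println(videoIDRegex(url), videoIDTrim(url))
}
```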
It was added in 2022 by #1513, to support blog.laravel.com, which has
since switched to HTML. The Atom 0.3/1.0, RSS 1.0/2.0, RDF, and JSON formats
don't support markdown in their spec, and any website serving it there should
be considered as buggy and fixed.
This shaves off 2MB from the miniflux binary, which is quite steep for a
feature that nobody is/should be using, and removes a dependency, which is
always a good thing.
- Use `token.String()` instead of `html.EscapeString(token.Data)`
- Refactor conditions to highlight their similarity, enabling further
refactoring
This refactoring brings forth at least one bug: `tagStack` is never emptied.
The `isAnchor` function's first parameter was always `a`, instead of being
passed `tagName`. As this function is a single line and was only called in a
single place, it can be inlined.
- Use `[^"]` instead of `.`, to help the regex engine determine boundaries,
instead of having it brute-force its way to find them
- Use `+` instead of `*`, as empty rules don't make sense
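To make the two regex changes concrete, here is a small comparison. The `rewrite("..."|"...")` rule syntax and the names `loosePattern`, `strictPattern`, and `parseRule` are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns: the loose one uses `.*`, the strict one uses
// `[^"]+` so each capture stops at the next quote and cannot be empty.
var (
	loosePattern  = regexp.MustCompile(`rewrite\("(.*)"\|"(.*)"\)`)
	strictPattern = regexp.MustCompile(`rewrite\("([^"]+)"\|"([^"]+)"\)`)
)

// parseRule returns the two arguments of a rule, or ok == false when the
// rule does not match the strict pattern.
func parseRule(rule string) (search, replace string, ok bool) {
	m := strictPattern.FindStringSubmatch(rule)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	fmt.Println(parseRule(`rewrite("foo"|"bar")`))
	// The loose pattern accepts empty arguments, the strict one does not.
	fmt.Println(loosePattern.MatchString(`rewrite(""|"")`), strictPattern.MatchString(`rewrite(""|"")`))
}
```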
This function takes around 1.5% of the total CPU time on my instance, and most
of it is spent in `mapassign_faststr` to initialize the `map`. This is replaced
with a switch-case construct, which should be both significantly faster and
pretty dull in terms of memory consumption.
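The map-versus-switch trade-off can be sketched like this. The tag list and both function names are hypothetical; the point is only that the first variant allocates and fills a map on every call, while the second does not.

```go
package main

import "fmt"

// isVoidTagMap builds a map on every call, which is where time goes to
// map assignment in a version like this (hypothetical example).
func isVoidTagMap(tag string) bool {
	voidTags := map[string]bool{
		"br":  true,
		"hr":  true,
		"img": true,
	}
	return voidTags[tag]
}

// isVoidTagSwitch gives the same answer with a switch: no allocation and
// no per-call initialization.
func isVoidTagSwitch(tag string) bool {
	switch tag {
	case "br", "hr", "img":
		return true
	}
	return false
}

func main() {
	for _, tag := range []string{"br", "div"} {
		fmt.Println(tag, isVoidTagMap(tag), isVoidTagSwitch(tag))
	}
}
```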
- Pre-allocate a slice
- Inline a local variable
- Remove a superfluous call to `strings.TrimSpace`
- Simplify some conditions via a switch-case construct
The `pg_timezone_names` view was added in PostgreSQL 8.2.
It should be equivalent to the function query.
See: https://pgpedia.info/p/pg_timezone_names.html
This small change allows `miniflux` to run on postgres-compatible
databases like CockroachDB, which don't have this function.