The implementation is equivalent to
`cases.Title(language.English).String(strings.ToLower(…))`,
and this is the only place in miniflux where
"golang.org/x/text/cases" and "golang.org/x/text/language"
are (directly) used.
This reduces the binary size from 27015590 to
26686112 bytes on my machine.
Kudos to https://gsa.zxilly.dev for making it straightforward to catch things
like this.
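
For illustration, a minimal stdlib-only sketch of this kind of replacement. The name `titleCase` is hypothetical, and the function is not claimed to match `cases.Title` on every Unicode input (it only splits words on whitespace):

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// titleCase (hypothetical name) lowercases s, then uppercases the first
// character of every whitespace-separated word when that character is a letter.
func titleCase(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	startOfWord := true
	for _, r := range strings.ToLower(s) {
		if startOfWord && unicode.IsLetter(r) {
			r = unicode.ToUpper(r)
		}
		// The next rune starts a word only if this one is whitespace.
		startOfWord = unicode.IsSpace(r)
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(titleCase("hello WORLD")) // Hello World
}
```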
- Use chained strings.Contains instead of a regex for
blacklistCandidatesRegexp, as this is a bit faster
- Simplify a Find.Each.Remove to Find.Remove
- Don't concatenate id and class for removeUnlikelyCandidates, as it makes no
sense to match on overlaps. It might also marginally improve performance, as
the regexes now run on two shorter strings separately, instead of on their
concatenation.
- Add a small benchmark
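
A sketch of the chained `strings.Contains` approach from the first item. The function name and the three substrings are assumptions made for the example, not necessarily the patterns used by blacklistCandidatesRegexp:

```go
package main

import (
	"fmt"
	"strings"
)

// isBlacklistedCandidate (hypothetical name) reports whether an id or class
// attribute contains one of a few fixed substrings. The substrings below are
// illustrative assumptions.
func isBlacklistedCandidate(attr string) bool {
	return strings.Contains(attr, "popupbody") ||
		strings.Contains(attr, "-ad") ||
		strings.Contains(attr, "g-plus")
}

func main() {
	fmt.Println(isBlacklistedCandidate("sidebar-ad"))   // true
	fmt.Println(isBlacklistedCandidate("article-body")) // false
}
```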
Some websites are using images of O(10kB), if not O(100kB), for their
favicons. As miniflux only displays them with a 16x16 resolution, let's do our
best to resize them before storing them in the database. This should make
miniflux consume less bandwidth when serving pages, for the joy of mobile users
on a small data plan.
Of course, images that already are 16x16 aren't resized.
No need for brittle regex when matching plain strings or domain names.
This should save some negligible amount of heap memory as well as
tremendously speed up the matching.
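
As an illustration, matching a domain name with plain string operations could look like the sketch below. The helper names and `example.org` are assumptions made for the example:

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// matchesDomain (hypothetical helper) reports whether host is domain itself
// or one of its subdomains.
func matchesDomain(host, domain string) bool {
	return host == domain || strings.HasSuffix(host, "."+domain)
}

// urlMatchesDomain parses rawURL and checks its hostname against domain.
func urlMatchesDomain(rawURL, domain string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	return matchesDomain(u.Hostname(), domain)
}

func main() {
	fmt.Println(urlMatchesDomain("https://blog.example.org/post", "example.org")) // true
	fmt.Println(urlMatchesDomain("https://notexample.org/", "example.org"))       // false
}
```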
- Replace a completely overkill regex
- Use `.Remove()` instead of a hand-rolled loop
- Use a strings.Builder instead of a bytes.NewBufferString
- Replace a call to Fprintf with string concatenation, as the latter is much
faster
- Remove a superfluous cast
- Delay some computations
- Add some tests
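
A small sketch of the `strings.Builder` plus concatenation pattern. `renderLinks` is a hypothetical function used only to show the shape of the change:

```go
package main

import (
	"fmt"
	"html"
	"strings"
)

// renderLinks (hypothetical) builds anchor tags with a strings.Builder and
// plain concatenation instead of fmt.Fprintf into a bytes.Buffer.
func renderLinks(urls []string) string {
	var b strings.Builder
	for _, u := range urls {
		escaped := html.EscapeString(u)
		b.WriteString(`<a href="` + escaped + `">` + escaped + "</a>")
	}
	return b.String()
}

func main() {
	fmt.Println(renderLinks([]string{"https://example.org/?a=1&b=2"}))
}
```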
As mentioned in goquery's documentation (https://pkg.go.dev/github.com/PuerkitoBio/goquery#Single):
> By default, Selection.Find and other functions that accept a selector string
to select nodes will use all matches corresponding to that selector. By using
the Matcher returned by Single, at most the first match will be selected.
>
> The one using Single is optimized to be potentially much faster on large documents.
In internal/reader/handler/handler.go:RefreshFeed, there is a call to
store.UserByID pretty early, which is only used for
originalFeed.WithTranslatedErrorMessage(localizedError.Translate(user.Language)).
Its only other usage is in processor.ProcessFeedEntries(store, originalFeed,
user, forceRefresh), which is pretty late in RefreshFeed, and only called if
there are new items in the feed. It makes sense to only fetch the user's
language if the error localization function is used.
Calls to `store.UserByID` take around 10% of the CPU time of RefreshFeed in my
profiling.
This commit also makes `processor.ProcessFeedEntries` take a `userID` instead
of a `user`, to make the code a bit more concise.
This should close #2984
- Use string concatenation instead of `Sprintf`, as this is much faster, and the
call to `Sprintf` is responsible for 30% of the CPU time of the function
- Anchor the youtube regex, to allow it to bail early, as this also accounts for
another 30% of the CPU time. It might be worth chaining calls to `TrimPrefix`
and checking whether the string has been trimmed instead of using a regex, to
speed things up even more, but this needs to be benchmarked properly.
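
The `TrimPrefix` idea could be sketched as follows. This is an untested-for-speed illustration: the URL prefixes and function names are assumptions, and no performance claim is made without a benchmark:

```go
package main

import (
	"fmt"
	"strings"
)

// trimmed strips prefix from s and reports whether anything was removed.
func trimmed(s, prefix string) (string, bool) {
	t := strings.TrimPrefix(s, prefix)
	return t, len(t) != len(s)
}

// youtubeVideoID (hypothetical) extracts the video ID from a watch URL by
// chaining prefix trims. The accepted prefixes are assumptions.
func youtubeVideoID(rawURL string) (string, bool) {
	rest, ok := trimmed(rawURL, "https://")
	if !ok {
		if rest, ok = trimmed(rawURL, "http://"); !ok {
			return "", false
		}
	}
	rest = strings.TrimPrefix(rest, "www.")
	id, ok := trimmed(rest, "youtube.com/watch?v=")
	if !ok {
		return "", false
	}
	// Drop any extra query parameters after the ID.
	if i := strings.IndexByte(id, '&'); i >= 0 {
		id = id[:i]
	}
	return id, id != ""
}

func main() {
	id, ok := youtubeVideoID("https://www.youtube.com/watch?v=abc123&t=10")
	fmt.Println(id, ok) // abc123 true
}
```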
It was added in 2022 by #1513, to support blog.laravel.com, which has
since switched to HTML. The Atom 0.3/1.0, RSS 1.0/2.0, RDF, and JSON formats
don't support markdown in their spec, and any website serving it there should
be considered as buggy and fixed.
This shaves off 2MB from the miniflux binary, which is quite steep for a
feature that nobody is/should be using, and removes a dependency, which is
always a good thing.
- Use `token.String()` instead of `html.EscapeString(token.Data)`
- Refactor conditions to highlight their similarity, enabling further
refactoring
This refactoring brings forth at least one bug: `tagStack` is never emptied.
The `isAnchor` function's first parameter was always `a`, instead of being
passed `tagName`. As this function is a single line and was only called in a
single place, it can be inlined.
- Use `[^"]` instead of `.`, to help the regex engine determine boundaries,
instead of having it brute-force its way to find them
- Use `+` instead of `*`, as empty rules don't make sense
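
To illustrate both points, here is a sketch with a hypothetical `rules="…"` attribute; the attribute name and the helper are assumptions, not the actual pattern touched by this change:

```go
package main

import (
	"fmt"
	"regexp"
)

// ruleRegex uses [^"]+ so a match stops at the closing quote and is never
// empty. The attribute name is an illustrative assumption.
var ruleRegex = regexp.MustCompile(`rules="([^"]+)"`)

// extractRules returns the attribute value, or false when it is missing or
// empty.
func extractRules(s string) (string, bool) {
	m := ruleRegex.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	fmt.Println(extractRules(`<a rules="x,y" id="z">`)) // x,y true
	fmt.Println(extractRules(`<a rules="">`))
}
```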
This function takes around 1.5% of the total CPU time on my instance, and most
of it is spent in `mapassign_faststr` to initialize the `map`. This is replaced
with a switch-case construct that should be both significantly faster and
pretty dull in terms of memory consumption.
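
A sketch of the map-to-switch transformation. The function name and the tag names are made up for the example; only the shape of the change is the point:

```go
package main

import "fmt"

// isBlockedTag (hypothetical) replaces a lookup in a map[string]struct{} that
// was rebuilt on every call with a switch over constant strings, which needs
// no allocation.
func isBlockedTag(tagName string) bool {
	switch tagName {
	case "script", "style", "noscript":
		return true
	default:
		return false
	}
}

func main() {
	fmt.Println(isBlockedTag("script")) // true
	fmt.Println(isBlockedTag("p"))      // false
}
```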
- Pre-allocate a slice
- Inline a local variable
- Remove a superfluous call to `strings.TrimSpace`
- Simplify some conditions via a switch-case construct