- Factorize some conditions
- Remove useless `default` case and move the return at the end of the functions
- Use strings.CutPrefix instead of strings.HasPrefix + strings.TrimPrefix
- Use switch-case constructs instead of slices.Contains. This reduces the
complexity of the functions, allows them to be inlined, and helps the
compiler optimize them, since it is poor at interprocedural optimizations.
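A minimal sketch of two of the patterns listed above. The helper names, the tag list, and the `data-` prefix are hypothetical and are not taken from the actual code; `strings.CutPrefix` requires Go 1.20 or later.

```go
package main

import (
	"fmt"
	"strings"
)

// isBlockedTag is a hypothetical predicate. The switch has no `default`
// case and the fallback return sits at the end of the function, which
// keeps the function small enough to be an inlining candidate.
func isBlockedTag(tagName string) bool {
	switch tagName {
	case "script", "style", "noscript":
		return true
	}
	return false
}

// dataAttrName is a hypothetical helper. strings.CutPrefix tests for the
// prefix and trims it in a single call, replacing the
// strings.HasPrefix + strings.TrimPrefix pair.
func dataAttrName(attr string) (string, bool) {
	return strings.CutPrefix(attr, "data-")
}

func main() {
	fmt.Println(isBlockedTag("script"), isBlockedTag("p")) // true false
	name, ok := dataAttrName("data-src")
	fmt.Println(name, ok) // src true
}
```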
Previously, url.Parse(baseUrl) was called on every self-closing tag, and on
most opening tags, accounting for around 15% of the CPU time spent in
processor.ProcessFeedEntries.
A bit more than 10% of processor.ProcessFeedEntries' CPU time is spent in
urlcleaner.RemoveTrackingParameters, specifically calling url.Parse, so let's
extract this operation outside of it, and do it once before calling
urlcleaner.RemoveTrackingParameters multiple times.
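The general idea of hoisting `url.Parse` out of the hot path can be sketched as follows. This is only an illustration: `cleanLinks`, `stripTracking`, and the parameter list are hypothetical and do not reflect the real signature of `urlcleaner.RemoveTrackingParameters`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isTrackingParam uses a small, made-up list of tracking parameters.
func isTrackingParam(name string) bool {
	switch name {
	case "fbclid", "gclid":
		return true
	}
	return strings.HasPrefix(name, "utm_")
}

// stripTracking works on an already-parsed URL and modifies it in place.
func stripTracking(u *url.URL) {
	q := u.Query()
	for name := range q {
		if isTrackingParam(name) {
			q.Del(name)
		}
	}
	u.RawQuery = q.Encode()
}

// cleanLinks parses the base URL once, then reuses it for every link.
func cleanLinks(baseURL string, raws []string) ([]string, error) {
	base, err := url.Parse(baseURL) // parsed once, outside the loop
	if err != nil {
		return nil, err
	}
	out := make([]string, 0, len(raws))
	for _, raw := range raws {
		u, err := base.Parse(raw) // resolves raw relative to base
		if err != nil {
			continue // skip malformed links in this sketch
		}
		stripTracking(u)
		out = append(out, u.String())
	}
	return out, nil
}

func main() {
	links, err := cleanLinks("https://example.org/feed/",
		[]string{"/post?id=42&utm_source=rss", "img.png?fbclid=abc"})
	if err != nil {
		panic(err)
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```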
Co-authored-by: Frédéric Guillot <f@miniflux.net>
Rationale: Opening links in the current tab is the default browser behavior.
Using `target="_blank"` on external links can lead to accessibility issues and override user preferences. It may also interfere with assistive technologies and expected browser behavior.
To maintain backward compatibility, this option is enabled by default (`true`), which adds `target="_blank"` to links.
[strings.Fields](https://pkg.go.dev/strings#Fields) already treats `'\t', '\n',
'\v', '\f', '\r', ' ', U+0085 (NEL), U+00A0 (NBSP)` as whitespace, so there is
no need to remove them beforehand.
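A small illustration of that behavior, using a hypothetical `normalizeSpaces` helper that is not part of the actual code:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSpaces is a hypothetical helper: strings.Fields splits on any
// run of Unicode whitespace (including NEL and NBSP), so no prior
// replacement or trimming pass is required.
func normalizeSpaces(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	in := "  a\t\nb\u00a0c\u0085d \r\n"
	fmt.Printf("%q\n", normalizeSpaces(in)) // "a b c d"
}
```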
This is a continuation of f2f60a8f73
- Use string concatenation instead of `Sprintf`, as this is much faster, and the
call to `Sprintf` is responsible for 30% of the CPU time of the function
- Anchor the youtube regex, to allow it to bail early, as this also accounts for
another 30% of the CPU time. It might be worth chaining calls to `TrimPrefix`
and checking whether the string has been trimmed instead of using a regex, to
speed things up even more, but this needs to be benchmarked properly.
- Use `token.String()` instead of `html.EscapeString(token.Data)`
- Refactor conditions to highlight their similarity, enabling further
refactoring
This refactoring brings to light at least one bug: `tagStack` is never emptied.
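The string concatenation and anchored-regex points above can be sketched as follows. The pattern, the embed URL, and the function names are assumptions made for illustration; they are not the project's actual regex or output format. `videoIDCut` shows the prefix-trimming alternative mentioned above and, as noted there, would need benchmarking before being preferred.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Hypothetical pattern. The leading ^ lets the matcher reject
// non-matching input at position 0 instead of scanning the whole string.
var youtubeWatch = regexp.MustCompile(`^https?://(?:www\.)?youtube\.com/watch\?v=([\w-]+)`)

// embedURL builds the result with plain concatenation rather than
// fmt.Sprintf, avoiding format-string parsing and interface boxing.
func embedURL(videoID string) string {
	return "https://www.youtube-nocookie.com/embed/" + videoID
}

// videoIDRegex extracts the video ID with the anchored regex.
func videoIDRegex(link string) (string, bool) {
	m := youtubeWatch.FindStringSubmatch(link)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// videoIDCut is the regex-free alternative. Unlike the regex, it returns
// everything after "v=", so it is not a drop-in replacement.
func videoIDCut(link string) (string, bool) {
	rest, ok := strings.CutPrefix(link, "https://")
	if !ok {
		rest, ok = strings.CutPrefix(link, "http://")
	}
	if !ok {
		return "", false
	}
	rest, _ = strings.CutPrefix(rest, "www.")
	id, ok := strings.CutPrefix(rest, "youtube.com/watch?v=")
	if !ok || id == "" {
		return "", false
	}
	return id, true
}

func main() {
	for _, link := range []string{
		"https://www.youtube.com/watch?v=abc123",
		"see https://youtube.com/watch?v=abc123",
	} {
		id, ok := videoIDRegex(link)
		fmt.Println(ok, embedURL(id))
	}
}
```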
The `isAnchor` function's first parameter was always `a`, instead of being
passed `tagName`. As this function is a single line and was only called in a
single place, it can be inlined.
This function takes around 1.5% of the total CPU time on my instance, and most
of it is spent in `mapassign_faststr` to initialize the `map`. This is replaced
with a switch-case construct, which should be significantly faster and use
next to no memory.
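One way to check a claim like this locally is a quick comparison with `testing.Benchmark`. The two functions below are hypothetical stand-ins, not the function from the actual code, and the numbers will vary by machine and Go version, so none are quoted here.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the compiler from discarding the calls under test.
var sink bool

// isVoidMap rebuilds its lookup table on every call, which is the pattern
// that shows up as mapassign_faststr in a CPU profile.
func isVoidMap(tag string) bool {
	void := map[string]bool{"br": true, "hr": true, "img": true, "input": true}
	return void[tag]
}

// isVoidSwitch performs the same lookup without any allocation.
func isVoidSwitch(tag string) bool {
	switch tag {
	case "br", "hr", "img", "input":
		return true
	}
	return false
}

func main() {
	m := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = isVoidMap("img")
		}
	})
	s := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			sink = isVoidSwitch("img")
		}
	})
	fmt.Println("map:   ", m.String(), m.MemString())
	fmt.Println("switch:", s.String(), s.MemString())
}
```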
- Pre-allocate a slice
- Inline a local variable
- Remove a superfluous call to `strings.TrimSpace`
- Simplify some conditions via a switch-case construct
- allow YouTube URLs to start with `www`
- use `strings.Builder` instead of a `bytes.Buffer`
- use a `strings.NewReader` instead of a `bytes.NewBufferString`
- sprinkle a couple of `continue` statements to make the code-flow more obvious
- inline calls to `inList`, and put their parameters in the right order
- simplify isPixelTracker
- simplify `isValidIframeSource`, by extracting the hostname and comparing it
directly, instead of using the full url and checking if it starts with
multiple variations of the same one (`//`, `http:`, `https://`, each with and
without `www.`)
- add a benchmark
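A rough sketch of the hostname-based check described in the `isValidIframeSource` item. The function name `isAllowedIframeHost` and the allow-list are hypothetical, and the real function may apply additional checks (for example on the scheme) that are not shown here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isAllowedIframeHost compares the parsed hostname against an allow-list,
// so `//`, `http://` and `https://` sources, with or without `www.`, all
// go through the same comparison.
func isAllowedIframeHost(src string) bool {
	u, err := url.Parse(src)
	if err != nil {
		return false
	}
	host := strings.TrimPrefix(u.Hostname(), "www.")
	switch host {
	case "youtube.com", "player.vimeo.com": // made-up allow-list
		return true
	}
	return false
}

func main() {
	for _, src := range []string{
		"//www.youtube.com/embed/x",
		"https://youtube.com/embed/x",
		"https://youtube.com.evil.example/embed/x",
	} {
		fmt.Println(src, isAllowedIframeHost(src))
	}
}
```

Comparing the hostname also rejects look-alike sources such as `youtube.com.evil.example`, which a check on the allowed domain name alone, without its trailing `/`, would let through.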