miniflux-v2

mirror of https://github.com/miniflux/v2.git synced 2025-06-27 16:36:00 +00:00

Author	SHA1	Message	Date
jvoisin	44c48d109f	perf(sanitizer): extract a call to url.Parse and make intensive use of it Previously, url.Parse(baseUrl) was called on every self-closing tag, and on most opening tags, accounting for around 15% of the CPU time spent in processor.ProcessFeedEntries	2025-06-13 17:05:17 -07:00
Frédéric Guillot	40727704c2	feat(rewrite): add support for YouTube Shorts video URL pattern	2025-06-12 21:02:46 -07:00
jvoisin	8a014c6abc	perf(readability): minor regex improvement - Improve the check for tags by matching only if its name is followed either by a space, a slash or a closing angle - Use an anonymous group	2025-06-12 19:13:58 -07:00
jvoisin	60ad19c427	perf(rss): early return when looking for an item's author The `sanitizer.StripTags` function is calling `html.NewTokenizer`, which is allocating a 4096-byte buffer on the heap, as well as running a complex state machine to tokenize html. There is no need to do all of this for empty strings. This commit also fixes a TrimSpace/StripTags call inversion.	2025-06-11 19:06:15 -07:00
jvoisin	f40c1e7f63	fix(reader): fix a crash introduced by `d59990f1` And add a fuzzer and a testcase as well to validate that nothing breaks.	2025-06-11 19:04:46 -07:00
Frédéric Guillot	a4d16cc5c1	refactor(rewrite): rename `Rewriter` function to `ApplyContentRewriteRules`	2025-06-10 20:28:15 -07:00
jvoisin	7c857bdc72	perf(reader): optimize RemoveTrackingParameters A bit more than 10% of processor.ProcessFeedEntries' CPU time is spent in urlcleaner.RemoveTrackingParameters, specifically calling url.Parse, so let's extract this operation outside of it, and do it once before calling urlcleaner.RemoveTrackingParameters multiple times. Co-authored-by: Frédéric Guillot <f@miniflux.net>	2025-06-10 19:29:25 -07:00
jvoisin	0caadf82f2	perf(rss): optimize a bit BuildFeed Calls to urllib.AbsoluteURL take a bit less than 10% of the time spent in parser.ParseFeed, completely parsing an url only to check if it's absolute, and if not, to make it so. Checking if it starts with `https://` or `http://` is usually enough to find if an url is absolute, and if it doesn't, it's always possible to fall back to urllib.AbsoluteURL. This also comes with the advantage of reducing heap allocations, as most of the time spent in urllib.AbsoluteURL is heap-related (de)allocations.	2025-06-10 19:23:16 -07:00
Frédéric Guillot	cecc18420d	feat(sanitizer): add validation for empty width and height attributes in img tags	2025-06-09 20:38:17 -07:00
Frédéric Guillot	d53fd17e10	feat(sanitizer): validate MathML XML namespace	2025-06-09 20:28:54 -07:00
Frédéric Guillot	21d22d7f0b	feat(sanitizer): add support for fetchpriority and decoding attributes in img tags	2025-06-09 20:12:15 -07:00
jvoisin	d59990f1dd	perf(xml): optimize xml filtering Instead of using bytes.Map which is returning a copy of the provided []byte, use a custom in-place implementation, as the bytes.Map call is taking around 25% of rss.Parse	2025-06-09 13:49:10 -07:00
jvoisin	49085daefe	perf(xml): optimized NewXMLDecoder io.ReadAll is growing the underlying buffer progressively, while io.Copy is able to allocate it in one go, which is significantly faster. io.ReadAll is currently accounting for around 10% of the CPU time of rss.Parse	2025-06-09 13:49:10 -07:00
Frédéric Guillot	8db637cb39	feat(ui): add user setting to control `target="_blank"` on links Rationale: Opening links in the current tab is the default browser behavior. Using `target="_blank"` on external links can lead to accessibility issues and override user preferences. It may also interfere with assistive technologies and expected browser behavior. To maintain backward compatibility, this option is enabled by default (`true`), which adds `target="_blank"` to links.	2025-06-08 21:07:11 -07:00
Frédéric Guillot	8142268799	feat: populate feed description automatically	2025-05-24 21:15:52 -07:00
Anton Larionov	553c578f2e	feat(rssbridge): support auth token for RSS-Bridge	2025-05-19 20:47:12 -07:00
Frédéric Guillot	828a4334db	fix(sanitizer): MathML tags are not fully supported by `golang.org/x/net/html` See https://github.com/golang/net/blob/master/html/atom/gen.go and https://github.com/golang/net/blob/master/html/atom/table.go	2025-05-06 21:18:19 -07:00
jvoisin	d1dc369bb2	feat(sanitizer): add MathML tags to the sanitizer This was found by reading the article pointed by https://lobste.rs/s/nobvmp/how_prime_factorizations_govern_collatz	2025-05-06 20:19:56 -07:00
jvoisin	ff2dfe977b	feat: remove the `ref` parameter from url This is used by (at least) Ghost (https://forum.ghost.org/t/ref-parameter-being-added-to-links/38335) Examples: - https://blog.exploits.club/exploits-club-weekly-newsletter-66-mitigations-galore-dirtycow-revisited-program-analysis-for-uafs-and-more/ - https://labs.watchtowr.com/is-the-sofistication-in-the-room-with-us-x-forwarded-for-and-ivanti-connect-secure-cve-2025-22457/	2025-05-06 19:59:55 -07:00
NoelNegash	81c7669945	feat(sanitized): allow Spotify iframes	2025-05-02 16:25:17 -07:00
Frédéric Guillot	d33e305af9	fix(api): `hide_globally` categories field should be a boolean	2025-04-21 19:43:25 -07:00
Frédéric Guillot	c87c93d85f	feat(config): add `SCHEDULER_ROUND_ROBIN_MAX_INTERVAL` option Add option to cap maximum refresh interval when RSS TTL, Retry-After, Cache-Control, or Expires headers specify excessively high values.	2025-04-11 15:40:32 -07:00
Frédéric Guillot	ef22e95f8b	feat: implement proxy URL per feed	2025-04-06 21:05:19 -07:00
Frédéric Guillot	c45b51d1f8	feat: use `Cache-Control` max-age and `Expires` headers to calculate next check	2025-04-06 16:24:00 -07:00
Frédéric Guillot	0af1a6e121	refactor: avoid logging twice the feed errors in the background worker	2025-04-06 15:39:40 -07:00
Frédéric Guillot	535fd050b7	feat: add proxy rotation functionality	2025-04-06 14:59:00 -07:00
Frédéric Guillot	51560f191f	fix(subscription): add `/rss/feed.xml` to the list of known feed URLs	2025-03-28 16:59:06 -07:00
Frédéric Guillot	e342a4f143	fix: address minor issues detected by Go linters	2025-03-24 20:48:46 -07:00
Frédéric Guillot	315e72c412	fix(rewrite): remove obsolete rule for webtoons.com	2025-03-06 20:11:03 -08:00
jvoisin	f916373f55	fix: allow the `<b>` tag	2025-03-06 19:27:30 -08:00
jvoisin	5353211206	fix: allow the `<u>` tag in feeds	2025-03-06 19:26:26 -08:00
AiraNadih	ad02f21d04	refactor(rewrite): reorganize referer rules and remove obsolete mapping	2025-03-02 19:40:52 -08:00
Maytham Alsudany	f01ff067a5	fix(processor): add missing quotation marks to import comments	2025-02-24 16:34:26 -08:00
jvoisin	117d711d7d	feat(urlcleaner): add more Google Analytics parameters	2025-02-22 17:07:59 -08:00
jvoisin	4a77e937af	perf(sanitizer): remove two useless calls to strings.ReplaceAll The [strings.Fields](https://pkg.go.dev/strings#Fields) considers `'\t', '\n', '\v', '\f', '\r', ' ', U+0085 (NEL), U+00A0 (NBSP).` as spaces, so no need to remove them beforehand. This is a continuation of `f2f60a8f73`	2025-02-18 19:42:39 -08:00
Frédéric Guillot	462ba8d7f7	feat(sanitizer): allow `img` tags with only a `srcset` and no `src` attribute	2025-02-15 18:03:36 -08:00
Frédéric Guillot	6eedf4111f	fix(scraper): avoid encoding issue if charset meta tag is after 1024 bytes	2025-02-15 17:05:14 -08:00
Frédéric Guillot	af1f966250	test(encoding): add unit tests for CharsetReader function	2025-02-15 15:40:07 -08:00
Frédéric Guillot	7f54b27079	fix(rss): handle item title with CDATA content correctly Fix regression introduced in commit `a3ce03cc`	2025-02-15 14:51:27 -08:00
Frédéric Guillot	a3ce03cc9d	feat(rss): add workaround for RSS item title with HTML content	2025-02-14 21:21:49 -08:00
Frédéric Guillot	f2f60a8f73	feat(sanitizer): improve text truncation with better space handling	2025-02-06 21:21:49 -08:00
Frédéric Guillot	e777f12490	fix(sanitizer): correct HTML tag name from `tfooter` to `tfoot`	2025-02-06 21:16:29 -08:00
Julien Voisin	7eb1d15315	refactor(date): use an else-if instead of two if statements	2025-02-06 19:44:12 -08:00
Julien Voisin	b193bc212a	refactor(xml): improve the performances of `NewXMLDecoder` - Invert a condition to make the code more readable - Extract the encoding directly from the slice of bytes instead of converting it to string first.	2025-01-30 19:37:06 -08:00
Julien Voisin	7275bc808a	feat(urlcleaner): add trackers to the blocklist	2025-01-29 19:32:19 -08:00
Frédéric Guillot	369054b02d	feat(processor): fetch YouTube watch time in bulk using the API	2025-01-24 15:16:23 -08:00
Frédéric Guillot	c3c42b0c37	fix(scraper): update TechCrunch scraper rule	2025-01-23 19:29:32 -08:00
jvoisin	2e57e3351b	Remove superfluous parenthesis	2025-01-23 19:20:13 -08:00
jvoisin	a412cde3b3	Don't define receivers on both values and pointer And use `o` instead of `outline` as done everywhere else.	2025-01-23 19:20:13 -08:00
jvoisin	abfd9306a4	Guard against a potential null dereference	2025-01-23 19:20:13 -08:00
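
The fast path described in 0caadf82f2 (check for an `http://` or `https://` prefix, fall back to a full parse otherwise) can be sketched as below. This is an illustration only, not the code from the commit: `absoluteURL` is a hypothetical stand-in for `urllib.AbsoluteURL`, whose real signature is not shown in this log, and `fastAbsoluteURL` is a made-up name.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// absoluteURL is a hypothetical stand-in for urllib.AbsoluteURL: it fully
// parses the input and resolves it against base when it is relative.
func absoluteURL(base, input string) (string, error) {
	u, err := url.Parse(input)
	if err != nil {
		return "", err
	}
	if u.IsAbs() {
		return u.String(), nil
	}
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(u).String(), nil
}

// fastAbsoluteURL returns the input unchanged when it already starts with an
// http(s) scheme, and only pays for the full parse in the remaining cases.
func fastAbsoluteURL(base, input string) (string, error) {
	if strings.HasPrefix(input, "https://") || strings.HasPrefix(input, "http://") {
		return input, nil
	}
	return absoluteURL(base, input)
}

func main() {
	for _, in := range []string{"https://example.org/a", "/b/c", "d.html"} {
		out, err := fastAbsoluteURL("https://example.org/feed/", in)
		fmt.Println(out, err)
	}
}
```

The prefix check performs no parsing and no allocation, which matches the allocation savings the commit message mentions; anything that does not match still goes through the slower, complete path.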

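
Commit d59990f1dd replaces `bytes.Map`, which returns a copy, with an in-place filter. A minimal sketch of that idea follows, assuming the goal is to drop characters that XML 1.0 does not allow; `isValidXMLChar` and `filterInPlace` are hypothetical names, and the actual implementation in the repository may differ.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isValidXMLChar reports whether r matches the XML 1.0 Char production.
func isValidXMLChar(r rune) bool {
	return r == 0x09 || r == 0x0A || r == 0x0D ||
		(r >= 0x20 && r <= 0xD7FF) ||
		(r >= 0xE000 && r <= 0xFFFD) ||
		(r >= 0x10000 && r <= 0x10FFFF)
}

// filterInPlace removes invalid XML characters by compacting data onto
// itself. The write index never passes the read index, so no second buffer
// is needed. Bytes that are not valid UTF-8 are copied through unchanged
// (assumption of this sketch), so the output can never outgrow the input.
func filterInPlace(data []byte) []byte {
	w := 0
	for r := 0; r < len(data); {
		c, size := utf8.DecodeRune(data[r:])
		if isValidXMLChar(c) {
			copy(data[w:], data[r:r+size])
			w += size
		}
		r += size
	}
	return data[:w]
}

func main() {
	in := []byte("a\x00b\x0bc")
	fmt.Printf("%q\n", filterInPlace(in))
}
```

The returned slice shares its backing array with the input, so callers must not keep using the original slice afterwards.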

229 commits
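
The reasoning in 4a77e937af relies on `strings.Fields` already splitting on tabs, newlines, NEL and NBSP, so a prior `strings.ReplaceAll` pass is redundant. The sketch below shows that behavior; `collapseSpaces` is a hypothetical helper, not a function from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseSpaces splits on any run of Unicode white space (as defined by
// unicode.IsSpace, which strings.Fields uses) and rejoins the words with a
// single ASCII space.
func collapseSpaces(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	// Tab, newline, NBSP (U+00A0) and NEL (U+0085) are all treated as
	// separators without any replacement beforehand.
	fmt.Printf("%q\n", collapseSpaces("a\t b\n\u00a0c\u0085 d"))
}
```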