miniflux-v2

mirror of https://github.com/miniflux/v2.git synced 2025-06-27 16:36:00 +00:00

Author	SHA1	Message	Date
Frédéric Guillot	db49e41acf	refactor(processor): move FilterEntryMaxAgeDays filter to filter package	2025-06-19 17:56:45 -07:00
Frédéric Guillot	cb59944d6b	refactor(processor): move `RewriteEntryURL` function to `rewrite` package	2025-06-19 13:22:29 -07:00
Frédéric Guillot	bc6ab44ff2	fix(filter): skip invalid rules instead of exiting the loop	2025-06-19 12:36:35 -07:00
Frédéric Guillot	6282ac1f38	refactor(processor): move filters to a `filter` package	2025-06-19 12:06:30 -07:00
jvoisin	96c0ef4efd	refactor(processor): massive refactoring of filters.go - Use proper variable names for `key=value` strings parts - Explicitly assign false to the `match` boolean - Use an explicit `len(parts) == 2` assertion to help the compiler remove `isSliceInBounds` calls. - Refactor identical code into a containsRegexPattern function. - Early exit when parsing the first date fails when using the `Between` operator, instead of trying to parse the second one.	2025-06-19 11:43:47 -07:00
jvoisin	b139ac4a2c	refactor(youtube): Remove a regex and make use of fetchWatchTime	2025-06-19 11:43:47 -07:00
jvoisin	c818d5bbb8	refactor(youtube): initialize two maps to the proper length	2025-06-19 11:43:47 -07:00
jvoisin	e366710529	refactor(processor): remove a useless type declaration	2025-06-19 11:43:47 -07:00
jvoisin	5cff4d7117	refactor(processor): remove a duplicate function call As youtubeVideoID is assigned to getVideoIDFromYouTubeURL(entry.URL), there is no need to call the latter again when we can simply use youtubeVideoID instead.	2025-06-19 11:43:47 -07:00
jvoisin	f31a784eaa	refactor(processor): refactor common code into a fetchWatchTime function Both nebula and odysee were using the same function to parse time.	2025-06-19 11:43:47 -07:00
jvoisin	7edfcc3cf7	refactor(processor): remove a useless type declaration	2025-06-19 11:43:47 -07:00
jvoisin	fe4b00b9f8	refactor(processor): extract some functions into a utils.go file	2025-06-19 11:43:47 -07:00
jvoisin	46b159ac58	refactor(processor): simplify bilibili processing - Use strings.Contains instead of a regex - Use strings concatenation instead of a call to fmt.Sprintf - Use `any` instead of `interface{}`	2025-06-19 11:43:47 -07:00
jvoisin	72486b9bd1	refactor(processor): minor simplification of a loop This makes the code a tad clearer.	2025-06-17 17:30:13 -07:00
Frédéric Guillot	a4d16cc5c1	refactor(rewrite): rename `Rewriter` function to `ApplyContentRewriteRules`	2025-06-10 20:28:15 -07:00
jvoisin	7c857bdc72	perf(reader): optimize RemoveTrackingParameters A bit more than 10% of processor.ProcessFeedEntries' CPU time is spent in urlcleaner.RemoveTrackingParameters, specifically calling url.Parse, so let's extract this operation outside of it, and do it once before calling urlcleaner.RemoveTrackingParameters multiple times. Co-authored-by: Frédéric Guillot <f@miniflux.net>	2025-06-10 19:29:25 -07:00
Frédéric Guillot	8db637cb39	feat(ui): add user setting to control `target="_blank"` on links Rationale: Opening links in the current tab is the default browser behavior. Using `target="_blank"` on external links can lead to accessibility issues and override user preferences. It may also interfere with assistive technologies and expected browser behavior. To maintain backward compatibility, this option is enabled by default (`true`), which adds `target="_blank"` to links.	2025-06-08 21:07:11 -07:00
jvoisin	ff2dfe977b	feat: remove the `ref` parameter from url This is used by (at least) Ghost (https://forum.ghost.org/t/ref-parameter-being-added-to-links/38335) Examples: - https://blog.exploits.club/exploits-club-weekly-newsletter-66-mitigations-galore-dirtycow-revisited-program-analysis-for-uafs-and-more/ - https://labs.watchtowr.com/is-the-sofistication-in-the-room-with-us-x-forwarded-for-and-ivanti-connect-secure-cve-2025-22457/	2025-05-06 19:59:55 -07:00
Frédéric Guillot	ef22e95f8b	feat: implement proxy URL per feed	2025-04-06 21:05:19 -07:00
Frédéric Guillot	535fd050b7	feat: add proxy rotation functionality	2025-04-06 14:59:00 -07:00
Maytham Alsudany	f01ff067a5	fix(processor): add missing quotation marks to import comments	2025-02-24 16:34:26 -08:00
Frédéric Guillot	369054b02d	feat(processor): fetch YouTube watch time in bulk using the API	2025-01-24 15:16:23 -08:00
jvoisin	2e57e3351b	Remove superfluous parenthesis	2025-01-23 19:20:13 -08:00
Sevi.C	bca9bea676	feat: add date-based entry filtering rules	2024-12-16 20:38:20 -08:00
Julien Voisin	1b0b8b9c42	refactor: use a better construct than `doc.Find(…).First()` As mentioned in goquery's documentation (https://pkg.go.dev/github.com/PuerkitoBio/goquery#Single): > By default, Selection.Find and other functions that accept a selector string to select nodes will use all matches corresponding to that selector. By using the Matcher returned by Single, at most the first match will be selected. > > The one using Single is optimized to be potentially much faster on large documents.	2024-12-11 19:40:55 -08:00
Julien Voisin	3caa16ac31	refactor(processor): use URL parsing instead of a regex	2024-12-11 19:30:59 -08:00
Julien Voisin	637fb85de0	refactor(handler): delay `store.UserByID` as much as possible In internal/reader/handler/handler.go:RefreshFeed, there is a call to store.UserByID pretty early, which is only used for originalFeed.WithTranslatedErrorMessage(localizedError.Translate(user.Language) Its only other usage is in processor.ProcessFeedEntries(store, originalFeed, user, forceRefresh), which is pretty late in RefreshFeed, and only called if there are new items in the feed. It makes sense to only fetch the user's language if the error localization function is used. Calls to `store.UserByID` take around 10% of the CPU time of RefreshFeed in my profiling. This commit also makes `processor.ProcessFeedEntries` take a `userID` instead of a `user`, to make the code a bit more concise. This should close #2984	2024-12-09 19:32:59 -08:00
Julien Voisin	fefbf2c935	refactor(processor): improve the `rewrite` URL rule regex - Use `[^"]` instead of `.`, to help the regex engine to determine boundaries, instead of having it bruteforce its way to find them - Use `+` instead of `*`, as empty rules don't make sense	2024-12-07 16:35:51 -08:00
telnet23	7e2b50efee	feat: optionally fetch watch time from YouTube API instead of website	2024-12-07 16:00:35 -08:00
July	86c0cc61ba	feat: set entry URL to rewritten URL if a rewrite rule is defined	2024-10-13 21:21:28 -07:00
Frédéric Guillot	cfe410f202	refactor: split processor package into smaller files	2024-09-22 18:54:19 -07:00
Qeynos	c2ac2bfb83	feat: use Bilibili API instead of web scraping to get video watch time	2024-09-22 18:05:43 -07:00
Qeynos	bcbf9f4025	feat: add `FETCH_BILIBILI_WATCH_TIME` config option	2024-08-01 19:52:31 -07:00
Frédéric Guillot	29387f2d60	feat: implement base element handling in content scraper	2024-07-25 20:36:56 -07:00
Frédéric Guillot	c0f6e32a99	feat: remove well-known URL parameter trackers	2024-07-19 21:35:47 -07:00
privatmamtora	1a81866bb9	Add global block and keep filters	2024-07-02 21:03:49 -07:00
Ankit Pandey	b68b05c64c	reader/processor: error out for improper rewrite regexp It's possible to specify a rewrite regex that validates but doesn't compile such as: rewrite("(((unmatched-capture-group"|"rewrite)))") In case we encounter one, exit early instead of letting the server panic.	2024-06-01 10:37:02 -07:00
fin444	a631bd527d	options: add FETCH_NEBULA_WATCH_TIME	2024-05-02 16:30:01 -07:00
Frédéric Guillot	fb075b60b5	reader/processor: minifier is breaking HTML entry content	2024-04-23 20:31:52 -07:00
jvoisin	b205b5aad0	reader/processor: minimize the feed's entries html Compress the html of feed entries before storing it. This should reduce the size of the database a bit, but more importantly, reduce the amount of data sent to clients. minify being [stupidly fast](https://github.com/tdewolff/minify/?tab=readme-ov-file#performance), the performance impact should be in the noise level.	2024-04-10 19:48:48 -07:00
Frédéric Guillot	38b80d96ea	storage: change GetReadTime() function to use entries_feed_id_hash_key index	2024-04-09 20:37:30 -07:00
Frédéric Guillot	fdd1b3f18e	database: entry URLs can exceed btree index size limit	2024-04-04 20:22:23 -07:00
Jean Khawand	a78d1c79da	Add `FILTER_ENTRY_MAX_AGE_DAYS` config option to limit fetching all feed items	2024-03-20 02:58:53 +00:00
Frédéric Guillot	b1e73fafdf	Enable go-critic linter and fix various issues detected	2024-03-17 13:52:34 -07:00
jvoisin	02a074ed26	Compile block/keep regex only once per feed No need to compile them once for matching on the url, once per tag, once per title, once per author, … one time is enough. It also simplifies error handling, since while regexp compilation can fail, matching can't.	2024-03-17 12:08:03 -07:00
jvoisin	31ac62f410	Don't compute reading-time when unused If the user doesn't display reading times, there is no need to compute them. This should speed things up a bit, since `whatlanggo.Detect` is abysmally slow.	2024-02-29 19:14:17 -08:00
Frédéric Guillot	c493f8921e	Add missing regex anchor detected by CodeQL	2024-02-28 20:50:17 -08:00
Frédéric Guillot	eae4cb1417	Add feed option to disable HTTP/2 to avoid fingerprinting	2024-02-24 22:30:26 -08:00
jvoisin	b48ad6dbfb	Make use of go≥1.21 slices package instead of hand-rolled loops This makes the code a tad smaller, moderner, and maybe even marginally faster, yay!	2024-02-24 20:22:53 -08:00
Matt Stobo	4a50ca9122	Allow filtering feeds on entry.Author	2024-01-31 19:42:07 -08:00

59 commits