mirror of https://github.com/miniflux/v2.git synced 2025-06-27 16:36:00 +00:00
Commit graph

127 commits

Author SHA1 Message Date
Frédéric Guillot
f16735fd6d feat: update feed icon during force refresh 2024-10-04 20:51:40 -07:00
Scott Leggett
562a7b79a5 fix: update Last-Modified if it changes in a 304 response
When a server returns a 304 response with a strong validator, any other
stored fields must be updated if they are also present in the response.

This behaviour is described in RFC9111, sections 3.2 and 4.3.4.
2024-10-04 17:47:48 -07:00
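The 304 handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not Miniflux's actual code: the `validators` type and the `refreshOn304` function are hypothetical names, and only the Last-Modified field is handled.

```go
package main

import (
	"fmt"
	"net/http"
)

// validators holds the cache validators stored for a feed.
// The type and its field names are hypothetical.
type validators struct {
	ETag         string
	LastModified string
}

// refreshOn304 returns the stored validators, updated with the
// Last-Modified value of a 304 Not Modified response when one is present.
func refreshOn304(stored validators, header http.Header) validators {
	if lm := header.Get("Last-Modified"); lm != "" {
		stored.LastModified = lm
	}
	return stored
}

func main() {
	stored := validators{ETag: `"abc"`, LastModified: "Mon, 01 Jan 2024 00:00:00 GMT"}

	header := http.Header{}
	header.Set("Last-Modified", "Tue, 02 Jan 2024 00:00:00 GMT")

	fmt.Println(refreshOn304(stored, header).LastModified)
}
```

A 304 response without a Last-Modified header leaves the stored value untouched.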
Scott Leggett
cb610230d9 chore: update test case comment
The updated comment reflects a better understanding of the RFCs.
2024-10-04 17:47:48 -07:00
Frédéric Guillot
cfe410f202 refactor: split processor package into smaller files 2024-09-22 18:54:19 -07:00
Qeynos
c2ac2bfb83 feat: use Bilibili API instead of web scraping to get video watch time 2024-09-22 18:05:43 -07:00
Pontus Jensen Karlsson
ade412f453 fix: Honor hide_globally when creating a new feed through the API
TestGetGlobalEntriesEndpoint was failing because CreateFeed ignored HideGlobally; this commit fixes that.
2024-08-12 20:20:44 -07:00
Qeynos
bcbf9f4025 feat: add FETCH_BILIBILI_WATCH_TIME config option 2024-08-01 19:52:31 -07:00
Frédéric Guillot
37309adbc0 fix: do not alter the original URL if there is no tracker parameter 2024-07-25 22:10:28 -07:00
Frédéric Guillot
92f3dc26e4 feat: add support for aside HTML element in entry content 2024-07-25 21:11:37 -07:00
Frédéric Guillot
f6dc952551 feat: add support for base element when discovering feeds 2024-07-25 20:54:51 -07:00
Frédéric Guillot
29387f2d60 feat: implement base element handling in content scraper 2024-07-25 20:36:56 -07:00
Frédéric Guillot
c0f6e32a99 feat: remove well-known URL parameter trackers 2024-07-19 21:35:47 -07:00
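Tracker removal that leaves the original URL untouched when no tracker parameter is present could look like the sketch below. The `removeTrackers` function and the short `trackers` list are assumptions made for illustration; the list Miniflux actually uses is not shown in this log.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// trackers is an illustrative, incomplete set of tracking parameters.
var trackers = map[string]bool{
	"utm_source":   true,
	"utm_medium":   true,
	"utm_campaign": true,
	"fbclid":       true,
	"gclid":        true,
}

// removeTrackers strips known tracking parameters from rawURL.
// If none are present, the input string is returned unchanged, so the
// query string is not re-encoded or reordered.
func removeTrackers(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}

	query := u.Query()
	removed := false
	for name := range query {
		if trackers[strings.ToLower(name)] {
			query.Del(name)
			removed = true
		}
	}
	if !removed {
		return rawURL, nil
	}

	u.RawQuery = query.Encode()
	return u.String(), nil
}

func main() {
	cleaned, err := removeTrackers("https://example.org/post?utm_source=feed&id=7")
	fmt.Println(cleaned, err)
}
```

Returning the input string early matters because `url.Values.Encode` sorts parameters by key, which would otherwise rewrite URLs that contain no tracker at all.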
Frédéric Guillot
36c25e7689 refactor: simplify YouTube feeds discovery 2024-07-13 12:17:13 -07:00
Frédéric Guillot
cb97d4a1a8 feat: remove YouTube video page subscription finder because meta[itemprop="channelId"] no longer exists 2024-07-13 11:11:50 -07:00
Frédéric Guillot
79ea9e28b5 fix: panic during YouTube channel feed discovery
Regression introduced in commit e54825b
2024-07-13 10:18:15 -07:00
Scott Leggett
bf1c851093 fetcher: use ETag as a stronger validator than Last-Modified
As per the MDN article on HTTP caching:

  During cache revalidation, if both If-Modified-Since and If-None-Match
  are present, then If-None-Match takes precedence for the validator.

  https://developer.mozilla.org/en-US/docs/Web/HTTP/Caching

Previously Miniflux would consider a resource unmodified if the
Last-Modified header had not changed, even if the ETag had changed.

With this commit, Miniflux will consider a resource modified if the ETag
header has changed, even if Last-Modified has not.

This fixes Bug 1 in https://rachelbythebay.com/w/2024/06/11/fsr/
2024-07-02 22:05:49 -07:00
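The validator precedence described above can be sketched as follows. The `isModified` function here is a hypothetical stand-in with an assumed signature, not the fetcher's actual IsModified implementation.

```go
package main

import "fmt"

// isModified reports whether a fetched resource should be treated as
// changed, comparing the stored validators with those of the new response.
// The signature is an assumption made for this sketch.
func isModified(status int, storedETag, storedLastModified, newETag, newLastModified string) bool {
	if status == 304 {
		return false
	}
	// ETag is the stronger validator: when the response carries one,
	// it alone decides, regardless of Last-Modified.
	if newETag != "" {
		return newETag != storedETag
	}
	if newLastModified != "" {
		return newLastModified != storedLastModified
	}
	// Nothing to compare against: assume the resource changed.
	return true
}

func main() {
	// ETag changed while Last-Modified stayed the same.
	fmt.Println(isModified(200, `"v1"`, "Mon, 01 Jan 2024 00:00:00 GMT", `"v2"`, "Mon, 01 Jan 2024 00:00:00 GMT"))
}
```

In this sketch a changed ETag with an unchanged Last-Modified counts as modified, which is the case the commit message describes.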
Scott Leggett
c787bb5b48 fetcher: add tests for IsModified behaviour
In particular, add a failing test for the case where ETag changes but
Last-Modified does not.
2024-07-02 22:05:49 -07:00
privatmamtora
1a81866bb9 Add global block and keep filters 2024-07-02 21:03:49 -07:00
JohnnyJayJay
ee5e18ea9f sanitizer: add support for HTML hidden attribute
This commit adjusts the `Sanitize` function to skip tags with the
`hidden` attribute, similar to how it skips blocked tags and their
contents.
2024-06-21 14:00:40 -07:00
Ztec
9f3a8e7f1b Request builder: Allow the use of insecure TLS ciphers when Allow self-signed or invalid certificates is used
Some servers in the wild are badly configured, either by mistake or
through lack of maintenance. Which ciphers are considered safe changes
over time as new weaknesses are discovered.

This change enables ciphers that are considered unsafe when `Allow self-signed or invalid certificates` is used.
It could be a separate option, but I felt it fits here.

fix #2671
2024-06-13 20:23:37 -07:00
Ztec
e54825bf02 Improve YouTube page feed detection
To be more resilient to variations in YouTube URLs and
to address this feature request: https://github.com/miniflux/v2/issues/2628
I've reworked the way the YouTube feed extraction is done.

I've kept all the `FindSubscriptionsFromYouTube*` functions in order
to keep all the existing unit tests as-is, ensuring little to no
regressions. As a result, I had to call `youtubeURLIDExtractor` twice.
A small performance penalty for peace of mind, in my opinion.

`youtubeURLIDExtractor` is written so that only one kind
of page can be detected at a time. This means I can
solve the "video in a playlist" feature request
by prioritizing the playlist ID over the video ID.

Also, using `url.Parse()` to get the IDs is more robust
against URL mangling and variations. The most common variation
is the `t=42` parameter that starts the playback
at a given position. Previously, this kind of URL
would not be detected as a "YouTube URL".

I deliberately ignored the URL parsing error
to keep the previous behavior (skip the YouTube analysis and continue with the other analyses).

I also tried to keep the debug logs the same as before as much as I could.

I manually tested all the YouTube cases (video, channel, playlist)
and they all work as expected except for the video. But this one
does not work on main either. The `meta` HTML tag that was searched for
does not seem to exist anymore.

fix: #2628
2024-06-13 20:18:47 -07:00
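The extraction approach described in this commit (parse the URL, detect exactly one kind of page, and prefer the playlist ID over the video ID) can be sketched as below. The `youtubeTarget` function and its return values are hypothetical; this is not the actual `youtubeURLIDExtractor`.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// youtubeTarget returns the kind of YouTube page ("playlist", "video" or
// "channel") and its ID, or two empty strings when rawURL is not a
// recognised YouTube URL. Parse errors are deliberately treated as
// "not a YouTube URL" so the caller can fall through to other finders.
func youtubeTarget(rawURL string) (kind, id string) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", ""
	}
	if strings.TrimPrefix(u.Hostname(), "www.") != "youtube.com" {
		return "", ""
	}

	query := u.Query()
	// A video opened inside a playlist carries both "v" and "list":
	// the playlist wins.
	if list := query.Get("list"); list != "" {
		return "playlist", list
	}
	if v := query.Get("v"); u.Path == "/watch" && v != "" {
		return "video", v
	}
	if channelID, ok := strings.CutPrefix(u.Path, "/channel/"); ok && channelID != "" {
		return "channel", channelID
	}
	return "", ""
}

func main() {
	fmt.Println(youtubeTarget("https://www.youtube.com/watch?v=abc123&t=42"))
	fmt.Println(youtubeTarget("https://www.youtube.com/watch?v=abc123&list=PL42"))
}
```

Because the query string is parsed rather than pattern-matched, extra parameters such as `t=42` do not prevent detection.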
x
839fc3843a Add pitchfork.com scraping rule 2024-06-10 21:08:59 -07:00
x
0bab8fac8e Update theverge.com rewrite rule: fix duplicate image
See: https://github.com/miniflux/v2/issues/1979
2024-06-10 21:08:59 -07:00
Ankit Pandey
b68b05c64c reader/processor: error out for improper rewrite regexp
It's possible to specify a rewrite regex that validates but doesn't compile such
as:

    rewrite("(((unmatched-capture-group"|"rewrite)))")

In case we encounter one, exit early instead of letting the server panic.
2024-06-01 10:37:02 -07:00
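The difference between panicking and exiting early comes down to using `regexp.Compile`, which returns an error, rather than `regexp.MustCompile`, which panics on an invalid pattern. A minimal sketch, where `applyRewrite` is a hypothetical helper and not the processor's actual function:

```go
package main

import (
	"fmt"
	"regexp"
)

// applyRewrite replaces every match of pattern in content with replacement.
// An invalid pattern yields an error instead of a panic.
func applyRewrite(content, pattern, replacement string) (string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return content, fmt.Errorf("invalid rewrite pattern %q: %w", pattern, err)
	}
	return re.ReplaceAllString(content, replacement), nil
}

func main() {
	// The pattern from the commit message: three unclosed groups.
	_, err := applyRewrite("some content", "(((unmatched-capture-group", "rewrite")
	fmt.Println(err != nil)
}
```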
Zhizhen He
ae432bc9c6 reader/readingtime: fix incorrect package name 2024-05-21 18:12:24 -07:00
Jan-Lukas Else
a33b1adf13 Add description field to feed settings
This adds a new "description" field to the feed settings, which stores a
custom description of a feed. The field is also exported and imported as
"description" in OPML.
2024-05-06 15:40:36 -07:00
fin444
a631bd527d options: add FETCH_NEBULA_WATCH_TIME 2024-05-02 16:30:01 -07:00
Frédéric Guillot
fb075b60b5 reader/processor: minifier is breaking HTML entry content 2024-04-23 20:31:52 -07:00
Frédéric Guillot
2c4c845cd2 http/response: add brotli compression support 2024-04-19 12:16:49 -07:00
Frédéric Guillot
771f9d2b5f reader/fetcher: add brotli content encoding support 2024-04-19 10:50:46 -07:00
jvoisin
b205b5aad0 reader/processor: minimize the feed's entries html
Compress the html of feed entries before storing it. This should reduce the
size of the database a bit, but more importantly, reduce the amount of data
sent to clients

minify being [stupidly fast](https://github.com/tdewolff/minify/?tab=readme-ov-file#performance), the performance impact should be in the noise level.
2024-04-10 19:48:48 -07:00
Frédéric Guillot
38b80d96ea storage: change GetReadTime() function to use entries_feed_id_hash_key index 2024-04-09 20:37:30 -07:00
Frédéric Guillot
fdd1b3f18e database: entry URLs can exceed btree index size limit 2024-04-04 20:22:23 -07:00
Evan Elias Young
1b8c45d162 finder: Find feed from YouTube playlist
The feed from a YouTube playlist page is derived in practically the same way as a feed from a YouTube channel page.
2024-04-01 21:16:32 -07:00
jvoisin
19ce519836 reader/rewrite: add a rule for oglaf.com
By default, Oglaf shows a disclaimer/warning about its content, which
doesn't play well with RSS readers, so let's rewrite it to show the actual
comic instead of a placeholder.
2024-04-01 21:05:01 -07:00
jvoisin
f109e3207c reader/rss: don't add empty tags to RSS items
This commit adds a bunch of checks to prevent reader/rss from adding empty tags
to rss items, as well as some minor refactors like nested conditions and loops
unrolling.
2024-03-24 19:46:56 -07:00
Frédéric Guillot
ad1d349a0c rss: use Channel tags only if there are no Item tags 2024-03-23 13:46:48 -07:00
jvoisin
fc4bdf3ab0 Inline a one-liner function
No need to expose a symbol for this.
2024-03-20 17:21:30 -07:00
Frédéric Guillot
08640b27d5 Ensure enclosure URLs are always absolute 2024-03-19 21:57:46 -07:00
jvoisin
4be993e055 Minor refactoring of internal/reader/atom/atom_10_adapter.go
- Move the population of the feed's entries into a new function, to make
  `BuildFeed` easier to understand/separate concerns/implementation details
- Use `sort+compact` instead of `compact+sort` to remove duplicates
- Change `if !a { a = } if !a {a = }` constructs into `if !a { a = ; if !a {a = }}`.
  This reduces the number of comparisons and also slightly improves
  control-flow readability.
2024-03-19 20:41:44 -07:00
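The order matters because `slices.Compact` only removes consecutive duplicates, so the slice has to be sorted first. A minimal sketch with a hypothetical `uniqueSorted` helper (requires Go 1.21 or later for the `slices` package):

```go
package main

import (
	"fmt"
	"slices"
)

// uniqueSorted returns a sorted copy of values with duplicates removed.
// Sorting first places equal values next to each other, which is what
// slices.Compact needs in order to drop all of them.
func uniqueSorted(values []string) []string {
	out := slices.Clone(values)
	slices.Sort(out)
	return slices.Compact(out)
}

func main() {
	fmt.Println(uniqueSorted([]string{"go", "rss", "go", "atom", "rss"}))
}
```

With the reverse order, compacting the unsorted input above would remove nothing, since no two equal values are adjacent.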
Jean Khawand
a78d1c79da Add FILTER_ENTRY_MAX_AGE_DAYS config option to limit fetching all feed items 2024-03-20 02:58:53 +00:00
Frédéric Guillot
fa9697b972 Remove trailing space in SiteURL and FeedURL 2024-03-18 17:51:06 -07:00
jvoisin
91f5522ce0 Minor simplification of internal/reader/media/media.go
- Simplify a switch-case by moving a common condition above it.
- Remove a superfluous error-check: `strconv.ParseInt` returns `0` when passed
  an empty string.
2024-03-18 16:09:32 -07:00
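The second point relies on `strconv.ParseInt` returning `0` together with a syntax error for an empty string, so a separate emptiness check before parsing adds nothing when `0` is the desired fallback. A small sketch, where `parseSize` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseSize converts s to an int64 and falls back to 0 when s is empty
// or not a number. The error is dropped on purpose: for syntax errors
// ParseInt already returns 0.
func parseSize(s string) int64 {
	n, _ := strconv.ParseInt(s, 10, 64)
	return n
}

func main() {
	fmt.Println(parseSize(""), parseSize("1024"))
}
```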
Frédéric Guillot
8212f16aa2 atom: avoid debug message when the date is empty 2024-03-17 15:29:50 -07:00
Frédéric Guillot
b1e73fafdf Enable go-critic linter and fix various issues detected 2024-03-17 13:52:34 -07:00
jvoisin
c29ca0e313 Minor simplifications of the rewriter
- Inline some one-line functions
- Transform a free-standing function into a method
- Massively simplify `removeClickbait`
- Use a proper constant instead of a magic number in `applyFuncOnTextContent`
2024-03-17 12:15:46 -07:00
jvoisin
02a074ed26 Compile block/keep regex only once per feed
No need to compile them once for matching on the url,
once per tag, once per title, once per author, … one time is enough.
It also simplifies error handling: regexp compilation can fail,
but matching can't.
2024-03-17 12:08:03 -07:00
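Compiling the rule once and reusing it for every field can be sketched as below. The `entry` type and the `dropBlocked` and `matchesAny` functions are hypothetical and only loosely modelled on the description above.

```go
package main

import (
	"fmt"
	"regexp"
)

// entry is a simplified stand-in for a feed entry.
type entry struct {
	URL, Title, Author string
	Tags               []string
}

// matchesAny reports whether re matches the URL, title, author or any tag.
// Matching cannot fail, so there is no error to handle here.
func matchesAny(re *regexp.Regexp, e entry) bool {
	if re.MatchString(e.URL) || re.MatchString(e.Title) || re.MatchString(e.Author) {
		return true
	}
	for _, tag := range e.Tags {
		if re.MatchString(tag) {
			return true
		}
	}
	return false
}

// dropBlocked compiles blockRule once, then filters out every matching
// entry. The only possible error comes from the single compilation step.
func dropBlocked(blockRule string, entries []entry) ([]entry, error) {
	if blockRule == "" {
		return entries, nil
	}
	re, err := regexp.Compile(blockRule)
	if err != nil {
		return nil, fmt.Errorf("invalid block rule %q: %w", blockRule, err)
	}

	kept := make([]entry, 0, len(entries))
	for _, e := range entries {
		if !matchesAny(re, e) {
			kept = append(kept, e)
		}
	}
	return kept, nil
}

func main() {
	entries := []entry{
		{URL: "https://example.org/1", Title: "Sponsored: buy now"},
		{URL: "https://example.org/2", Title: "Release notes", Tags: []string{"news"}},
	}
	kept, err := dropBlocked("(?i)sponsored", entries)
	fmt.Println(len(kept), err)
}
```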
Frédéric Guillot
309fdbb9fc Fix force refresh 2024-03-15 19:42:09 -07:00
Frédéric Guillot
4834e934f2 Remove some duplicated code in RSS parser 2024-03-15 18:40:06 -07:00
Frédéric Guillot
dd4fb660c1 Refactor Atom parser to use an adapter 2024-03-15 17:27:16 -07:00