mirror of https://github.com/miniflux/v2.git synced 2025-08-01 17:38:37 +00:00

Return outer HTML when scraping elements

cinput 2019-12-21 21:18:31 -08:00 committed by Frédéric Guillot
parent 30f22fbd78
commit 8e1ed8bef3
8 changed files with 73 additions and 8 deletions

reader/scraper/testdata/iframe.html (new file)

@@ -0,0 +1,12 @@
<!DOCTYPE html>
<html lang="en-US">
<body>
<article>
<iframe id="1" src="about:blank"></iframe>
<iframe id="2" src="about:blank"></iframe>
<iframe id="3" src="about:blank"></iframe>
<iframe id="4" src="about:blank"></iframe>
<iframe id="5" src="about:blank"></iframe>
</article>
</body>
</html>

@@ -0,0 +1 @@
<iframe id="1" src="about:blank"></iframe><iframe id="2" src="about:blank"></iframe><iframe id="3" src="about:blank"></iframe><iframe id="4" src="about:blank"></iframe><iframe id="5" src="about:blank"></iframe>

reader/scraper/testdata/img.html (new file)

@@ -0,0 +1,12 @@
<!DOCTYPE html>
<html lang="en-US">
<body>
<article>
<img id="1" src="#" alt="" />
<img id="2" src="#" alt="" />
<img id="3" src="#" alt="" />
<img id="4" src="#" alt="" />
<img id="5" src="#" alt="" />
</article>
</body>
</html>

@@ -0,0 +1 @@
<img id="1" src="#" alt=""/><img id="2" src="#" alt=""/><img id="3" src="#" alt=""/><img id="4" src="#" alt=""/><img id="5" src="#" alt=""/>

reader/scraper/testdata/p.html (new file)

@@ -0,0 +1,10 @@
<!DOCTYPE html>
<html lang="en-US">
<body>
<article>
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing ept.</p>
<p>Apquam tincidunt mauris eu risus.</p>
<p>Vestibulum auctor dapibus neque.</p>
</article>
</body>
</html>

reader/scraper/testdata/p.html-result (new file)

@@ -0,0 +1 @@
<p>Lorem ipsum dolor sit amet, consectetuer adipiscing ept.</p><p>Apquam tincidunt mauris eu risus.</p><p>Vestibulum auctor dapibus neque.</p>
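
The fixtures above pair an input page with the string a scraper rule is expected to return: the matched elements themselves, serialized back to back with no separator. The sketch below illustrates why outer HTML matters for that expectation. It is not the code from this commit; `node`, `attr`, `innerHTML`, `outerHTML`, and `scrape` are hypothetical names, and the tiny element model stands in for a real HTML parser. It also assumes attribute values need no escaping, which holds for these fixtures.

```go
package main

import (
	"fmt"
	"strings"
)

// attr is a single HTML attribute (hypothetical helper type).
type attr struct{ key, val string }

// node is a deliberately tiny, hypothetical element model. It is not the
// parser type used by the project; it only carries what this sketch needs.
type node struct {
	tag   string
	attrs []attr
	text  string
	void  bool // void elements such as <img> have no content and no end tag
}

// innerHTML renders only what sits between the start and end tags.
// For a void element there is nothing to render.
func innerHTML(n node) string {
	if n.void {
		return ""
	}
	return n.text
}

// outerHTML renders the element itself together with its content.
// Assumption: attribute values and text contain nothing that needs escaping.
func outerHTML(n node) string {
	var b strings.Builder
	b.WriteString("<" + n.tag)
	for _, a := range n.attrs {
		fmt.Fprintf(&b, ` %s="%s"`, a.key, a.val)
	}
	if n.void {
		b.WriteString("/>")
		return b.String()
	}
	b.WriteString(">" + n.text + "</" + n.tag + ">")
	return b.String()
}

// scrape concatenates the chosen rendering of every matched node with no
// separator, mirroring the single-line *-result fixtures.
func scrape(matches []node, render func(node) string) string {
	var b strings.Builder
	for _, n := range matches {
		b.WriteString(render(n))
	}
	return b.String()
}

func main() {
	imgs := []node{
		{tag: "img", attrs: []attr{{"id", "1"}, {"src", "#"}, {"alt", ""}}, void: true},
		{tag: "img", attrs: []attr{{"id", "2"}, {"src", "#"}, {"alt", ""}}, void: true},
	}
	fmt.Println("inner:", scrape(imgs, innerHTML))
	fmt.Println("outer:", scrape(imgs, outerHTML))
}
```

With inner HTML, a rule that selects `img` elements yields an empty string, because a void element has no content. With outer HTML, the same selection yields the tags themselves, in the compact `<img .../>` form seen in the expected result for `img.html`. For `p` elements, outer HTML keeps the `<p>` wrappers that appear in `p.html-result`.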