diff --git a/Creating-a-config-file-for-well-parsing-a-website.md b/Creating-a-config-file-for-well-parsing-a-website.md
new file mode 100644
index 0000000..c142566
--- /dev/null
+++ b/Creating-a-config-file-for-well-parsing-a-website.md
@@ -0,0 +1,20 @@
+If wallabag cannot fetch an article correctly, you can create a config file for the website that causes trouble.
+
+Here is an example:
+
+bfmtv.com needs a specific file. Create a `bfmtv.com.txt` file in `/inc/3rdparty/site_config/custom` with this content:
+
+```
+title: //title
+body: //h2 | //span[@class='masque'] | //article[@class='corps_article_right']
+prune: no
+tidy: no
+
+test_url: http://www.bfmtv.com/societe/cigarette-electronique-dangers-588622.html
+```
+
+The `title` and `body` parameters use [XPath](http://en.wikipedia.org/wiki/XPath) syntax.
+
+You can also try the [Visual content block selector](http://siteconfig.fivefilters.org/).
+
+You can find the files already created for specific websites here: https://github.com/wallabag/wallabag/tree/master/inc/3rdparty/site_config/standard
\ No newline at end of file
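
As a rough illustration of how the `title` and `body` XPath expressions from the bfmtv.com example select content, the sketch below evaluates them against a made-up page with Python's standard library. The page markup and the `select_text` helper are assumptions for illustration only; this is not how wallabag evaluates site configs. ElementTree supports only a subset of XPath and has no `|` union operator, so each alternative is evaluated separately.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for an article page (not real bfmtv.com markup).
SAMPLE_PAGE = """
<html>
  <head><title>Sample article</title></head>
  <body>
    <h2>Lead paragraph</h2>
    <span class="masque">Hidden teaser</span>
    <div class="sidebar">Unrelated links</div>
    <article class="corps_article_right">Main text</article>
  </body>
</html>
"""

# The same expressions as in the site config example.
TITLE_RULE = "//title"
BODY_RULE = "//h2 | //span[@class='masque'] | //article[@class='corps_article_right']"


def select_text(root, rule):
    """Return the text of every node matched by any alternative in `rule`.

    ElementTree rejects absolute paths on an element and lacks `|`,
    so each alternative is run on its own as a relative path (".//h2", ...).
    """
    matches = []
    for alternative in rule.split("|"):
        path = "." + alternative.strip()
        matches.extend(node.text for node in root.findall(path))
    return matches


root = ET.fromstring(SAMPLE_PAGE.strip())
title = select_text(root, TITLE_RULE)
body = select_text(root, BODY_RULE)
print(title)  # ['Sample article']
print(body)   # ['Lead paragraph', 'Hidden teaser', 'Main text']
```

The sidebar `div` is skipped because no alternative in the `body` rule matches it. A full XPath engine would return the union in document order, whereas this sketch groups results by alternative; the two orders happen to coincide for this sample page.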