path: root/doc/README.siteconf
author: Tatsuya Kinoshita <tats@debian.org> 2021-01-02 00:20:37 +0000
committer: Tatsuya Kinoshita <tats@debian.org> 2021-01-02 00:20:37 +0000
commit: 1d0ba25a660483da1272a31dd077ed94441e3d9f
tree: 1d8dee52cd1e3d340fe178a8193dc96c4496db84 /doc/README.siteconf
parent: Merge branch 'cvstrunk' into upstream
New upstream version 0.5.3+git20210102 (tag: upstream/0.5.3+git20210102, branch: upstream)
Diffstat (limited to 'doc/README.siteconf')
-rw-r--r-- doc/README.siteconf 73
1 files changed, 73 insertions, 0 deletions
diff --git a/doc/README.siteconf b/doc/README.siteconf
new file mode 100644
index 0000000..8514edf
--- /dev/null
+++ b/doc/README.siteconf
@@ -0,0 +1,73 @@
+The siteconf: Site-specific preferences
+
+The siteconf consists of URL patterns and preferences associated with them.
+You can improve the "decode_url" feature by giving the charsets of URLs site by site,
+or bypass Google's redirector for performance and your privacy.
+
+The siteconf is read from ~/.w3m/siteconf by default.
+
+===== The syntax =====
+
+url <url>|/<re-url>/|m@<re-url>@i [exact]
+substitute_url "<destination-url>"
+url_charset <charset>
+no_referer_from on|off
+no_referer_to on|off
+user_agent "string"
+
+The last match wins.
+
+===== Examples =====
+
+url m!^https?://([a-z]+\.)?twitter\.com/!
+substitute_url "https://nitter.net/"
+
+This forwards twitter.com to the alternative site.
+
+url "http://your.bookmark.net/"
+no_referer_from on
+
+This prevents HTTP referers from being sent when you follow links
+at your.bookmark.net.
+
+url "http://www.google.com/url?" exact
+substitute_url "file:///cgi-bin/your-redirector.cgi?"
+
+This forwards Google's redirector to your local CGI.
+
+url /^http:\/\/[a-z]*\.wikipedia\.org\//
+url_charset utf-8
+
+When combined with the "decode_url" option turned on, links to
+Wikipedia will be human-readable.
+
+url m@^https?://(.*\.)google\.com/@
+user_agent "Lynx/2.8.8dev.3 libwww-FM/2.14 SSL-MM/1.4.1"
+
+Tell Google we're actually Lynx. (So they send us a text-browser friendly
+results page.)
+
+url m!^https?://([a-z]+\.)?twitter\.com/!
+user_agent "Googlebot/2.1"
+
+Tell Twitter we're actually Googlebot. (So they send us a page that does not
+reject JavaScript-disabled browsers.)
+
+===== Regular expression notes =====
+
+The following expressions are all equivalent:
+
+/http:\/\/www\.example\.com\//
+m/http:\/\/www\.example\.com\//
+m@http://www\.example\.com/@
+m!http://www\.example\.com/!
+
+With a trailing 'i' modifier, you can specify a case-insensitive match.
+For example, m@^http://www\.example\.com/abc/@i matches:
+
+http://www.example.com/abc/
+http://www.example.com/Abc/
+http://www.example.com/ABC/
+
+Hostnames, however, are always converted to lowercase before being compared.
+
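
The regular expression notes above can be illustrated with a small Python
sketch. It is not w3m's parser: the function names are hypothetical, Python's
"re" module stands in for w3m's own regex engine, and the pattern is assumed
to be searched anywhere in the URL unless it is anchored with "^". It only
shows how the /re/ and m<delim>re<delim> forms can describe the same
expression, what the trailing 'i' changes, and why lowercasing the hostname
first lets a case-sensitive pattern match.

```python
import re
from urllib.parse import urlsplit


def parse_siteconf_regex(text):
    """Split /re/ or m<delim>re<delim>[i] into (regex source, ignore_case)."""
    if text.startswith("/"):
        delim, start = "/", 1
    elif text.startswith("m") and len(text) > 2:
        delim, start = text[1], 2
    else:
        raise ValueError("not a regular-expression pattern: %r" % text)

    i = start
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            i += 2          # skip an escaped character, e.g. \/ or \.
        elif text[i] == delim:
            break           # first unescaped delimiter closes the pattern
        else:
            i += 1
    else:
        raise ValueError("missing closing delimiter in %r" % text)

    source, modifiers = text[start:i], text[i + 1:]
    if modifiers not in ("", "i"):
        raise ValueError("unknown modifier %r" % modifiers)
    return source, modifiers == "i"


def compile_siteconf_regex(text):
    """Compile a siteconf-style pattern with Python's re (an assumption)."""
    source, ignore_case = parse_siteconf_regex(text)
    return re.compile(source, re.IGNORECASE if ignore_case else 0)


def lowercase_host(url):
    """Lowercase only the host part of a URL, leaving the path untouched."""
    parts = urlsplit(url)
    return parts._replace(netloc=parts.netloc.lower()).geturl()


if __name__ == "__main__":
    case_sensitive = compile_siteconf_regex(r"m@^http://www\.example\.com/abc/@")
    url = lowercase_host("http://WWW.Example.COM/abc/")
    print(url, bool(case_sensitive.search(url)))
```

The escaped delimiters (\/) are left in the regex source because Python treats
an escaped punctuation character as that literal character, so no separate
unescaping step is needed in this sketch.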