author | Kyle J. McKay <mackyle@gmail.com> | 2019-11-08 22:53:20 +0000 |
---|---|---|
committer | Kyle J. McKay <mackyle@gmail.com> | 2019-11-08 22:53:20 +0000 |
commit | e8948ec3a397d6e78e64902b99bb2f7a16facd18 (patch) | |
tree | 3df5fc7e8f04ef3b9a0316731952db54043daf57 /indep.c | |
parent | Update ChangeLog | |
entities: support &apos; entity
The XHTML standard encompasses the XML standard.
From the beginning, the XML standard [1] has required support for
five character entities:
1. the ampersand (&) as &amp;
2. the left angle bracket (<) as &lt;
3. the right angle bracket (>) as &gt;
4. the double-quote character (") as &quot;
5. the apostrophe or single-quote character (') as &apos;
See section "2.4 Character Data and Markup" of the XML standard [1]
for further details.
Add support for the single-quote character entity (&apos;)
in order to fully support XHTML pages.
[1]: https://www.w3.org/TR/REC-xml/
Signed-off-by: Kyle J. McKay <mackyle@gmail.com>
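
The five entity names listed in the message can be resolved with a small exact-match lookup. The following is a minimal sketch for illustration only: `predefined_entity` is a hypothetical helper and is not part of w3m, whose `getescapechar` takes a different approach.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of w3m: map the name of one of the five
 * XML predefined entities (given without '&' and ';') to its character.
 * Returns 0 when the name is not one of the five. */
static int predefined_entity(const char *name, size_t len)
{
    static const struct { const char *name; char ch; } table[] = {
        { "amp",  '&'  },
        { "lt",   '<'  },
        { "gt",   '>'  },
        { "quot", '"'  },
        { "apos", '\'' },
    };

    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        /* Exact match: same length and same bytes. */
        if (strlen(table[i].name) == len &&
            memcmp(table[i].name, name, len) == 0)
            return (unsigned char)table[i].ch;
    }
    return 0;
}
```

Calling `predefined_entity("apos", 4)` yields the apostrophe, while a name outside the five, such as `nbsp`, yields 0.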
Diffstat (limited to 'indep.c')
-rw-r--r-- | indep.c | 6 |
1 files changed, 3 insertions, 3 deletions
@@ -19,7 +19,7 @@ unsigned char QUOTE_MAP[0x100] = {
     /* DLE DC1 DC2 DC3 DC4 NAK SYN ETB CAN EM SUB ESC FS GS RS US */
     24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24,
     /* SPC ! " # $ % & ' ( ) * + , - . / */
-    24, 72, 76, 40, 8, 40, 41, 72, 72, 72, 72, 40, 72, 8, 0, 64,
+    24, 72, 76, 40, 8, 40, 41, 77, 72, 72, 72, 40, 72, 8, 0, 64,
     /* 0 1 2 3 4 5 6 7 8 9 : ; < = > ? */
     0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 32, 72, 74, 72, 75, 40,
     /* @ A B C D E F G H I J K L M N O */
@@ -47,7 +47,7 @@ char *HTML_QUOTE_MAP[] = {
     "&lt;",
     "&gt;",
     "&quot;",
-    NULL,
+    "&apos;",
     NULL,
     NULL,
 };
@@ -462,7 +462,7 @@ getescapechar(char **str)
     q = p;
     for (p++; IS_ALNUM(*p); p++) ;
     q = allocStr(q, p - q);
-    if (strcasestr("lt gt amp quot nbsp", q) && *p != '=') {
+    if (strcasestr("lt gt amp quot apos nbsp", q) && *p != '=') {
         /* a character entity MUST be terminated with ";". However,
          * there's MANY web pages which uses &lt , &gt or something
          * like them as &lt;, &gt;, etc. Therefore, we treat the most
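
The diff does not show how the two tables are consulted, so the following sketch rests on stated assumptions rather than on w3m's actual code. It assumes that the low three bits of a `QUOTE_MAP` value index `HTML_QUOTE_MAP`, which is consistent with the visible values: 74, 75, 76 and 77 give indices 2, 3, 4 and 5, matching `&lt;`, `&gt;`, `&quot;` and the new `&apos;`, while the old value 72 gives index 0. It also assumes an eight-entry table with `&amp;` at index 1 (41 & 7). The names `QUOTE_MASK`, `entity_map`, `quote_type` and `escape_html` are hypothetical.

```c
#include <stddef.h>

/* Assumed: the low three bits of a QUOTE_MAP value select the entity. */
#define QUOTE_MASK 0x07

/* Stand-in for HTML_QUOTE_MAP after the patch (layout is an assumption). */
static const char *const entity_map[8] = {
    NULL, "&amp;", "&lt;", "&gt;", "&quot;", "&apos;", NULL, NULL,
};

/* Stand-in for QUOTE_MAP: only the five relevant values from the diff.
 * Every other character is treated as "no entity" here. */
static unsigned char quote_type(unsigned char c)
{
    switch (c) {
    case '&':  return 41;
    case '<':  return 74;
    case '>':  return 75;
    case '"':  return 76;
    case '\'': return 77;   /* 72 before the patch, and 72 & 7 == 0 */
    default:   return 0;
    }
}

/* Escape src into dst (capacity cap, including the terminator).
 * Returns 0 on success, -1 if dst is too small. */
static int escape_html(const char *src, char *dst, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return -1;
    for (; *src; src++) {
        const char *rep =
            entity_map[quote_type((unsigned char)*src) & QUOTE_MASK];
        if (rep) {
            /* Copy the entity text in place of the character. */
            for (; *rep; rep++) {
                if (n + 1 >= cap)
                    return -1;
                dst[n++] = *rep;
            }
        } else {
            if (n + 1 >= cap)
                return -1;
            dst[n++] = *src;
        }
    }
    dst[n] = '\0';
    return 0;
}
```

Under these assumptions, changing the table value for `'` from 72 to 77 is what moves the apostrophe from "no entity" to the `&apos;` slot; the higher bits (72 & ~7 == 77 & ~7) are left unchanged.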