confi-emacs-inicial/elpa/ac-html-20151005.731/completion-data/html-attributes-short-docs/meta-charset
2020-08-02 01:57:21 -04:00

14 lines
2.2 KiB
Plaintext

charset [HTML5]
This attribute declares the character encoding used of the page. It can be locally overridden using the lang attribute on any element. This attribute is a literal string and must be one of the preferred MIME name for a character encoding as defined by the IANA. Though the standard doesn't request a specific character encoding, it gives some recommendations:
Authors are encouraged to use UTF-8.
Authors should not use ASCII-incompatible encodings (i.e. those that don't map the 8-bit code points 0x20 to 0x7E to the Unicode 0x0020 to 0x007E code points) as these represent a security risk: browsers not supporting them may interpret benign content as HTML Elements. This is the case of at least the following charsets: JIS_C6226-1983, JIS_X0212-1990, HZ-GB-2312, JOHAB, the ISO-2022 family, and the EBCDIC family.
Authors must not use CESU-8, UTF-7, BOCU-1 and SCSU, also falling in that category and not intended to be used on the web. Cross-scripting attacks with some of these encodings have been documented.
Authors should not use UTF-32 because not all HTML5 encoding algorithms can distinguish it from UTF-16.
Notes:
The declared character set must match the one of the page. There is no valid reason to declare an inaccurate character set.
This <meta> element must be inside the <head> element and within the 512 first bytes of the page, as some browsers only look at these first bytes before choosing a character set for the page.
This <meta> element is only a part of the algorithm to determine the character set of a page that browsers apply. The HTTP Content-Type header and any BOM elements have precedence over this element.
It is good practice, and strongly recommended, to define the character set using this attribute. If no character set is defined for a page, several cross-scripting techniques may be able to harm the page user, such as the UTF-7 fallback cross-scripting technique. Always setting this meta will protect against these risks.
This <meta> element is a synonym for the pre-HTML5 <meta http-equiv="Content-Type" content="text/html; charset=IANAcharset"> where IANAcharset corresponds of the value of the equivalent charset attribute. This syntax is still allowed, although obsolete and no more recommended.