HTML Standard

HTML
Living Standard, last updated 19 July 2020
12 The HTML syntax
12.1 Writing HTML documents
12.1.1 The DOCTYPE
12.1.2 Elements
12.1.2.1 Start tags
12.1.2.2 End tags
12.1.2.3 Attributes
12.1.2.4 Optional tags
12.1.2.5 Restrictions on content models
12.1.2.6 Restrictions on the contents of raw text and escapable raw text elements
12.1.3 Text
12.1.3.1 Newlines
12.1.4 Character references
12.1.5 CDATA sections
12.1.6 Comments
https://html.spec.whatwg.org/multipage/syntax.html#optional-start-and-end-tags
12 The HTML syntax §
Note
This section only describes the rules for resources labeled with an HTML MIME type. Rules for XML resources are discussed in the section below
entitled "The XML syntax".
12.1 Writing HTML documents §
This section only applies to documents, authoring tools, and markup generators. In particular, it does not apply to conformance checkers;
conformance checkers must use the requirements given in the next section ("parsing HTML documents").
Documents must consist of the following parts, in the given order:
1. Optionally, a single U+FEFF BYTE ORDER MARK (BOM) character.
2. Any number of comments and ASCII whitespace.
3. A DOCTYPE.
4. Any number of comments and ASCII whitespace.
5. The document element, in the form of an html element.
6. Any number of comments and ASCII whitespace.
The various types of content mentioned above are described in the next few sections.
In addition, there are some restrictions on how character encoding declarations are to be serialized, as discussed in the section on that topic.
Note
ASCII whitespace before the html element, at the start of the html element and before the head element, will be dropped when the document is
parsed; ASCII whitespace after the html element will be parsed as if it were at the end of the body element. Thus, ASCII whitespace around the
document element does not round-trip.
It is suggested that newlines be inserted after the DOCTYPE, after any comments that are before the document element, after the html element's
start tag (if it is not omitted), and after any comments that are inside the html element but before the head element.
Many strings in the HTML syntax (e.g. the names of elements and their attributes) are case-insensitive, but only for ASCII upper alphas and ASCII
lower alphas. For convenience, in this section this is just referred to as "case-insensitive".
12.1.1 The DOCTYPE §
A DOCTYPE is a required preamble.
Note
DOCTYPEs are required for legacy reasons. When omitted, browsers tend to use a different rendering mode that is incompatible with some
specifications. Including the DOCTYPE in a document ensures that the browser makes a best-effort attempt at following the relevant
specifications.
A DOCTYPE must consist of the following components, in this order:
1. A string that is an ASCII case-insensitive match for the string "<!DOCTYPE".
2. One or more ASCII whitespace.
3. A string that is an ASCII case-insensitive match for the string "html".
4. Optionally, a DOCTYPE legacy string.
5. Zero or more ASCII whitespace.
6. A U+003E GREATER-THAN SIGN character (>).
Note
In other words, <!DOCTYPE html>, case-insensitively.
For the purposes of HTML generators that cannot output HTML markup with the short DOCTYPE "<!DOCTYPE html>", a DOCTYPE legacy string may be
inserted into the DOCTYPE (in the position defined above). This string must consist of:
1. One or more ASCII whitespace.
2. A string that is an ASCII case-insensitive match for the string "SYSTEM".
3. One or more ASCII whitespace.
4. A U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (the quote mark).
5. The literal string "about:legacy-compat".
6. A matching U+0022 QUOTATION MARK or U+0027 APOSTROPHE character (i.e. the same character as in the earlier step labeled quote mark).
Note
In other words, <!DOCTYPE html SYSTEM "about:legacy-compat"> or <!DOCTYPE html SYSTEM 'about:legacy-compat'>, case-insensitively except for
the part in single or double quotes.
The DOCTYPE legacy string should not be used unless the document is generated from a system that cannot output the shorter string.
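The DOCTYPE components listed above map fairly directly onto a regular expression. The following Python sketch is illustrative only: the names DOCTYPE_RE and is_conforming_doctype are hypothetical, and it assumes that ASCII whitespace means tab, LF, FF, CR, and space. Scoped (?i:...) groups keep the quoted "about:legacy-compat" part case-sensitive while the keywords match case-insensitively.

```python
import re

# Assumption: ASCII whitespace is tab, LF, FF, CR, or space.
_WS = r"[\t\n\f\r ]"

# Hypothetical pattern mirroring the six DOCTYPE components in order.
DOCTYPE_RE = re.compile(
    r"(?i:<!DOCTYPE)"                      # 1. "<!DOCTYPE", case-insensitive
    + _WS + r"+"                           # 2. one or more ASCII whitespace
    + r"(?i:html)"                         # 3. "html", case-insensitive
    + r"(?:"                               # 4. optional DOCTYPE legacy string:
    + _WS + r"+(?i:SYSTEM)" + _WS + r"+"   #    whitespace, "SYSTEM", whitespace
    + r"(?P<q>[\"'])about:legacy-compat(?P=q)"  # quote, literal, same quote
    + r")?"
    + _WS + r"*"                           # 5. zero or more ASCII whitespace
    + r">"                                 # 6. ">"
)


def is_conforming_doctype(text: str) -> bool:
    """Return True if text, taken as a whole, has the DOCTYPE format."""
    return DOCTYPE_RE.fullmatch(text) is not None
```

For instance, is_conforming_doctype("<!doctype HTML >") is accepted under these assumptions, while a legacy string with mismatched quotes is rejected.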
12.1.2 Elements §
There are six different kinds of elements: void elements, the template element, raw text elements, escapable raw text elements, foreign
elements, and normal elements.
Void elements
area, base, br, col, embed, hr, img, input, link, meta, param, source, track, wbr
The template element
template
Raw text elements
script, style
Escapable raw text elements
textarea, title
Foreign elements
Elements from the MathML namespace and the SVG namespace.
Normal elements
All other allowed HTML elements are normal elements.
Tags are used to delimit the start and end of elements in the markup. Raw text, escapable raw text, and normal elements have a start tag to
indicate where they begin, and an end tag to indicate where they end. The start and end tags of certain normal elements can be omitted, as
described below in the section on optional tags. Those that cannot be omitted must not be omitted. Void elements only have a start tag; end tags
must not be specified for void elements. Foreign elements must either have a start tag and an end tag, or a start tag that is marked as
self-closing, in which case they must not have an end tag.
The contents of the element must be placed between just after the start tag (which might be implied, in certain cases) and just before the end
tag (which again, might be implied in certain cases). The exact allowed contents of each individual element depend on the content model of that
element, as described earlier in this specification. Elements must not contain content that their content model disallows. In addition to the
restrictions placed on the contents by those content models, however, the kinds of elements listed above have additional syntactic requirements.
Void elements can't have any contents (since there's no end tag, no content can be put between the start tag and the end tag).
The template element can have template contents, but such template contents are not children of the template element itself. Instead, they are
stored in a DocumentFragment associated with a different Document, without a browsing context, so as to avoid the template contents interfering
with the main Document. The markup for the template contents of a template element is placed just after the template element's start tag and just
before the template element's end tag (as with other elements), and may consist of any text, character references, elements, and comments, but
the text must not contain the character U+003C LESS-THAN SIGN (<) or an ambiguous ampersand.
Raw text elements can have text, though it has restrictions described below.
Escapable raw text elements can have text and character references, but the text must not contain an ambiguous ampersand. There are also further
restrictions described below.
Foreign elements whose start tag is marked as self-closing can't have any contents (since, again, as there's no end tag, no content can be put
between the start tag and the end tag). Foreign elements whose start tag is not marked as self-closing can have text, character references,
CDATA sections, other elements, and comments, but the text must not contain the character U+003C LESS-THAN SIGN (<) or an ambiguous ampersand.
Note
The HTML syntax does not support namespace declarations, even in foreign elements.
For instance, consider the following HTML fragment:
<p>
<svg>
<metadata>
<!-- this is invalid -->
<cdr:license xmlns:cdr="https://www.example.com/cdr/metadata" name="MIT"/>
</metadata>
</svg>
</p>
The innermost element, cdr:license, is actually in the SVG namespace, as the "xmlns:cdr" attribute has no effect (unlike in XML). In fact, as the
comment in the fragment above says, the fragment is actually non-conforming. This is because SVG 2 does not define any elements called
"cdr:license" in the SVG namespace.
Normal elements can have text, character references, other elements, and comments, but the text must not contain the character U+003C LESS-THAN
SIGN (<) or an ambiguous ampersand. Some normal elements also have yet more restrictions on what content they are allowed to hold, beyond the
restrictions imposed by the content model and those described in this paragraph. Those restrictions are described below.
Tags contain a tag name, giving the element's name. HTML elements all have names that only use ASCII alphanumerics. In the HTML syntax, tag
names, even those for foreign elements, may be written with any mix of lower- and uppercase letters that, when converted to all-lowercase,
matches the element's tag name; tag names are case-insensitive.
12.1.2.1 Start tags §
Start tags must have the following format:
1. The first character of a start tag must be a U+003C LESS-THAN SIGN character (<).
2. The next few characters of a start tag must be the element's tag name.
3. If there are to be any attributes in the next step, there must first be one or more ASCII whitespace.
4. Then, the start tag may have a number of attributes, the syntax for which is described below. Attributes must be separated from each other
by one or more ASCII whitespace.
5. After the attributes, or after the tag name if there are no attributes, there may be one or more ASCII whitespace. (Some attributes are
required to be followed by a space. See the attributes section below.)
6. Then, if the element is one of the void elements, or if the element is a foreign element, then there may be a single U+002F SOLIDUS
character (/). This character has no effect on void elements, but on foreign elements it marks the start tag as self-closing.
7. Finally, start tags must be closed by a U+003E GREATER-THAN SIGN character (>).
12.1.2.2 End tags §
End tags must have the following format:
1. The first character of an end tag must be a U+003C LESS-THAN SIGN character (<).
2. The second character of an end tag must be a U+002F SOLIDUS character (/).
3. The next few characters of an end tag must be the element's tag name.
4. After the tag name, there may be one or more ASCII whitespace.
5. Finally, end tags must be closed by a U+003E GREATER-THAN SIGN character (>).
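As a rough illustration of the end tag format, the five steps can be checked with one regular expression. This is a sketch, not part of the specification: is_end_tag_for is a hypothetical helper, and re.ASCII is combined with re.IGNORECASE so that only ASCII letters are compared case-insensitively.

```python
import re


def is_end_tag_for(text: str, tag_name: str) -> bool:
    """Check text against the end tag format for the given tag name."""
    pattern = (
        r"</"                    # steps 1 and 2: "<" then "/"
        + re.escape(tag_name)    # step 3: the element's tag name
        + r"[\t\n\f\r ]*"        # step 4: optional ASCII whitespace
        + r">"                   # step 5: closing ">"
    )
    # re.ASCII limits case folding to the ASCII letters A-Z and a-z.
    flags = re.IGNORECASE | re.ASCII
    return re.fullmatch(pattern, text, flags=flags) is not None
```

Note that whitespace is only permitted after the tag name, so "< /p>" does not match.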
12.1.2.3 Attributes §
Attributes for an element are expressed inside the element's start tag.
Attributes have a name and a value. Attribute names must consist of one or more characters other than controls, U+0020 SPACE, U+0022 ("),
U+0027 ('), U+003E (>), U+002F (/), U+003D (=), and noncharacters. In the HTML syntax, attribute names, even those for foreign elements, may be
written with any mix of ASCII lower and ASCII upper alphas.
Attribute values are a mixture of text and character references, except with the additional restriction that the text cannot contain an
ambiguous ampersand.
Attributes can be specified in four different ways:
Empty attribute syntax
Just the attribute name. The value is implicitly the empty string.
Example
In the following example, the disabled attribute is given with the empty attribute syntax:
<input disabled>
If an attribute using the empty attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating the two.
Unquoted attribute value syntax
The attribute name, followed by zero or more ASCII whitespace, followed by a single U+003D EQUALS SIGN character, followed by zero or more ASCII
whitespace, followed by the attribute value, which, in addition to the requirements given above for attribute values, must not contain any
literal ASCII whitespace, any U+0022 QUOTATION MARK characters ("), U+0027 APOSTROPHE characters ('), U+003D EQUALS SIGN characters (=), U+003C
LESS-THAN SIGN characters (<), U+003E GREATER-THAN SIGN characters (>), or U+0060 GRAVE ACCENT characters (`), and must not be the empty string.
Example
In the following example, the value attribute is given with the unquoted attribute value syntax:
<input value=yes>
If an attribute using the unquoted attribute syntax is to be followed by another attribute or by the optional U+002F SOLIDUS character (/)
allowed in step 6 of the start tag syntax above, then there must be ASCII whitespace separating the two.
Single-quoted attribute value syntax
The attribute name, followed by zero or more ASCII whitespace, followed by a single U+003D EQUALS SIGN character, followed by zero or more ASCII
whitespace, followed by a single U+0027 APOSTROPHE character ('), followed by the attribute value, which, in addition to the requirements given
above for attribute values, must not contain any literal U+0027 APOSTROPHE characters ('), and finally followed by a second single U+0027
APOSTROPHE character (').
Example
In the following example, the type attribute is given with the single-quoted attribute value syntax:
<input type='checkbox'>
If an attribute using the single-quoted attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating
the two.
Double-quoted attribute value syntax
The attribute name, followed by zero or more ASCII whitespace, followed by a single U+003D EQUALS SIGN character, followed by zero or more ASCII
whitespace, followed by a single U+0022 QUOTATION MARK character ("), followed by the attribute value, which, in addition to the requirements
given above for attribute values, must not contain any literal U+0022 QUOTATION MARK characters ("), and finally followed by a second single
U+0022 QUOTATION MARK character (").
Example
In the following example, the name attribute is given with the double-quoted attribute value syntax:
<input name="be evil">
If an attribute using the double-quoted attribute syntax is to be followed by another attribute, then there must be ASCII whitespace separating
the two.
There must never be two or more attributes on the same start tag whose names are an ASCII case-insensitive match for each other.
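One way a markup generator might choose among the four attribute syntaxes is sketched below in Python. The function name serialize_attribute and its fallback policy (preferring the shortest form, and using the &quot; character reference when both quote characters occur) are assumptions made for illustration; the text above only defines which forms are permitted. The sketch also assumes the value has already been checked for ambiguous ampersands.

```python
import re

# Characters forbidden in the unquoted attribute value syntax:
# ASCII whitespace, ", ', =, <, >, and `.
_UNQUOTED_FORBIDDEN = re.compile(r"[\t\n\f\r \"'=<>`]")


def serialize_attribute(name: str, value: str) -> str:
    """Serialize one attribute, choosing a syntax that is valid for value."""
    if value == "":
        return name                       # empty attribute syntax
    if not _UNQUOTED_FORBIDDEN.search(value):
        return f"{name}={value}"          # unquoted attribute value syntax
    if '"' not in value:
        return f'{name}="{value}"'        # double-quoted syntax
    if "'" not in value:
        return f"{name}='{value}'"        # single-quoted syntax
    # Both quote characters occur: replace " with a character reference.
    escaped = value.replace('"', "&quot;")
    return f'{name}="{escaped}"'
```

A caller would join the returned strings with ASCII whitespace, which also satisfies the requirement that an unquoted value be separated from whatever follows it.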
When a foreign element has one of the namespaced attributes given by the local name and namespace of the first and second cells of a row from the
following table, it must be written using the name given by the third cell from the same row.
Local name   Namespace         Attribute name
actuate      XLink namespace   xlink:actuate
arcrole      XLink namespace   xlink:arcrole
href         XLink namespace   xlink:href
role         XLink namespace   xlink:role
show         XLink namespace   xlink:show
title        XLink namespace   xlink:title
type         XLink namespace   xlink:type
lang         XML namespace     xml:lang
space        XML namespace     xml:space
xmlns        XMLNS namespace   xmlns
xlink        XMLNS namespace   xmlns:xlink
No other namespaced attribute can be expressed in the HTML syntax.
Note
Whether the attributes in the table above are conforming or not is defined by other specifications (e.g. SVG 2 and MathML); this section only
describes the syntax rules if the attributes are serialized using the HTML syntax.
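The table above can be treated as a lookup from (local name, namespace) to the name written in markup. The Python sketch below is only an illustration: html_syntax_name is a hypothetical helper, and the namespace URIs are the commonly used XLink, XML, and XMLNS identifiers, which this section itself does not spell out.

```python
# Assumed namespace URIs (not given in this section).
XLINK_NS = "http://www.w3.org/1999/xlink"
XML_NS = "http://www.w3.org/XML/1998/namespace"
XMLNS_NS = "http://www.w3.org/2000/xmlns/"

# One entry per table row: (local name, namespace) -> attribute name.
NAMESPACED_ATTRIBUTE_NAMES = {
    ("actuate", XLINK_NS): "xlink:actuate",
    ("arcrole", XLINK_NS): "xlink:arcrole",
    ("href", XLINK_NS): "xlink:href",
    ("role", XLINK_NS): "xlink:role",
    ("show", XLINK_NS): "xlink:show",
    ("title", XLINK_NS): "xlink:title",
    ("type", XLINK_NS): "xlink:type",
    ("lang", XML_NS): "xml:lang",
    ("space", XML_NS): "xml:space",
    ("xmlns", XMLNS_NS): "xmlns",
    ("xlink", XMLNS_NS): "xmlns:xlink",
}


def html_syntax_name(local_name, namespace):
    """Return the markup name for a namespaced attribute, or None if the
    attribute cannot be expressed in the HTML syntax."""
    return NAMESPACED_ATTRIBUTE_NAMES.get((local_name, namespace))
```

A None result corresponds to the rule that no other namespaced attribute can be expressed, as with the xmlns:cdr declaration in the earlier SVG fragment.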
12.1.2.4 Optional tags §
Certain tags can be omitted.
Note
Omitting an element's start tag in the situations described below does not mean the element is not present; it is implied, but it is still
there. For example, an HTML document always has a root html element, even if the string <html> doesn't appear anywhere in the markup.
An html element's start tag may be omitted if the first thing inside the html element is not a comment.
Example
For example, in the following case it's ok to remove the "<html>" tag:
<!DOCTYPE HTML>
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
Doing so would make the document look like this:
<!DOCTYPE HTML>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
This has the exact same DOM. In particular, note that whitespace around the document element is ignored by the parser. The following example
would also have the exact same DOM:
<!DOCTYPE HTML><head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
However, in the following example, removing the start tag moves the comment to before the html element:
<!DOCTYPE HTML>
<html>
<!-- where is this comment in the DOM? -->
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
With the tag removed, the document actually turns into the same as this:
<!DOCTYPE HTML>
<!-- where is this comment in the DOM? -->
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
This is why the tag can only be removed if it is not followed by a comment: removing the tag when there is a comment there changes the document's
resulting parse tree. Of course, if the position of the comment does not matter, then the tag can be omitted, as if the comment had been moved to
before the start tag in the first place.
An html element's end tag may be omitted if the html element is not immediately followed by a comment.
A head element's start tag may be omitted if the element is empty, or if the first thing inside the head element is an element.
A head element's end tag may be omitted if the head element is not immediately followed by ASCII whitespace or a comment.
A body element's start tag may be omitted if the element is empty, or if the first thing inside the body element is not ASCII whitespace or a comment,
except if the first thing inside the body element is a meta, link, script, style, or template element.
A body element's end tag may be omitted if the body element is not immediately followed by a comment.
Example
Note that in the example above, the head element start and end tags, and the body element start tag, can't be omitted, because they are surrounded
by whitespace:
<!DOCTYPE HTML>
<html>
<head>
<title>Hello</title>
</head>
<body>
<p>Welcome to this example.</p>
</body>
</html>
(The body and html element end tags could be omitted without trouble; any spaces after those get parsed into the body element anyway.)
Usually, however, whitespace isn't an issue. If we first remove the whitespace we don't care about:
<!DOCTYPE HTML><html><head><title>Hello</title></head><body><p>Welcome to this example.</p></body></html>
Then we can omit a number of tags without affecting the DOM:
<!DOCTYPE HTML><title>Hello</title><p>Welcome to this example.</p>
At that point, we can also add some whitespace back:
<!DOCTYPE HTML>
<title>Hello</title>
<p>Welcome to this example.</p>
This would be equivalent to this document, with the omitted tags shown in their parser-implied positions; the only whitespace text node that results
from this is the newline at the end of the head element:
<!DOCTYPE HTML>
<html><head><title>Hello</title>
</head><body><p>Welcome to this example.</p></body></html>
An li element's end tag may be omitted if the li element is immediately followed by another li element or if there is no more content in the parent
element.
A dt element's end tag may be omitted if the dt element is immediately followed by another dt element or a dd element.
A dd element's end tag may be omitted if the dd element is immediately followed by another dd element or a dt element, or if there is no more content in
the parent element.
A p element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, details, div, dl,
fieldset, figcaption, figure, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, menu, nav, ol, p, pre, section, table, or ul element,
or if there is no more content in the parent element and the parent element is an HTML element that is not an a, audio, del, ins, map, noscript, or
video element, or an autonomous custom element.
Example
We can thus simplify the earlier example further:
<!DOCTYPE HTML><title>Hello</title><p>Welcome to this example.
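The rule for omitting a p element's end tag can be written as a small predicate. The Python sketch below is illustrative: can_omit_p_end_tag and its parameters are hypothetical, the parent is assumed to be an HTML element identified by its tag name, and the caller is assumed to know whether that parent is an autonomous custom element.

```python
# Elements that allow a preceding p element's end tag to be omitted.
P_END_TAG_FOLLOWERS = frozenset(
    "address article aside blockquote details div dl fieldset figcaption "
    "figure footer form h1 h2 h3 h4 h5 h6 header hgroup hr main menu nav "
    "ol p pre section table ul".split()
)

# Parents for which running out of content does not allow omission.
P_EXCLUDED_PARENTS = frozenset("a audio del ins map noscript video".split())


def can_omit_p_end_tag(next_sibling, parent, parent_is_autonomous_custom=False):
    """next_sibling is the tag name of the element immediately following the
    p element, None if there is no more content in the parent, or any other
    marker (for example "#text") for text or a comment."""
    if next_sibling is None:
        return parent not in P_EXCLUDED_PARENTS and not parent_is_autonomous_custom
    return next_sibling in P_END_TAG_FOLLOWERS
```

Under this sketch, the final paragraph of a body element may drop its end tag, while a paragraph at the end of an a element may not.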
An rt element's end tag may be omitted if the rt element is immediately followed by an rt or rp element, or if there is no more content in the parent
element.
An rp element's end tag may be omitted if the rp element is immediately followed by an rt or rp element, or if there is no more content in the parent
element.
An optgroup element's end tag may be omitted if the optgroup element is immediately followed by another optgroup element, or if there is no more
content in the parent element.
An option element's end tag may be omitted if the option element is immediately followed by another option element, or if it is immediately followed by
an optgroup element, or if there is no more content in the parent element.
A colgroup element's start tag may be omitted if the first thing inside the colgroup element is a col element, and if the element is not immediately
preceded by another colgroup element whose end tag has been omitted. (It can't be omitted if the element is empty.)
A colgroup element's end tag may be omitted if the colgroup element is not immediately followed by ASCII whitespace or a comment.
A caption element's end tag may be omitted if the caption element is not immediately followed by ASCII whitespace or a comment.
A thead element's end tag may be omitted if the thead element is immediately followed by a tbody or tfoot element.
A tbody element's start tag may be omitted if the first thing inside the tbody element is a tr element, and if the element is not immediately preceded by a
tbody, thead, or tfoot element whose end tag has been omitted. (It can't be omitted if the element is empty.)
A tbody element's end tag may be omitted if the tbody element is immediately followed by a tbody or tfoot element, or if there is no more content in the
parent element.
A tfoot element's end tag may be omitted if there is no more content in the parent element.
A tr element's end tag may be omitted if the tr element is immediately followed by another tr element, or if there is no more content in the parent
element.
A td element's end tag may be omitted if the td element is immediately followed by a td or th element, or if there is no more content in the parent
element.
A th element's end tag may be omitted if the th element is immediately followed by a td or th element, or if there is no more content in the parent
element.
Example
The ability to omit all these table-related tags makes table markup much terser.
Take this example:
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)</caption>
<colgroup><col><col><col></colgroup>
<thead>
<tr>
<th>Function</th>
<th>Control Unit</th>
<th>Central Station</th>
</tr>
</thead>
<tbody>
<tr>
<td>Headlights</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Interior Lights</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Electric locomotive operating sounds</td>
<td>✔</td>
<td>✔</td>
</tr>
<tr>
<td>Engineer's cab lighting</td>
<td></td>
<td>✔</td>
</tr>
<tr>
<td>Station Announcements - Swiss</td>
<td></td>
<td>✔</td>
</tr>
</tbody>
</table>
The exact same table, modulo some whitespace differences, could be marked up as follows:
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
<colgroup><col><col><col>
<thead>
<tr>
<th>Function
<th>Control Unit
<th>Central Station
<tbody>
<tr>
<td>Headlights
<td>✔
<td>✔
<tr>
<td>Interior Lights
<td>✔
<td>✔
<tr>
<td>Electric locomotive operating sounds
<td>✔
<td>✔
<tr>
<td>Engineer's cab lighting
<td>
<td>✔
<tr>
<td>Station Announcements - Swiss
<td>
<td>✔
</table>
Since the cells take up much less room this way, this can be made even terser by having each row on one line:
<table>
<caption>37547 TEE Electric Powered Rail Car Train Functions (Abbreviated)
<colgroup><col><col><col>
<thead>
<tr> <th>Function <th>Control Unit <th>Central Station
<tbody>
<tr> <td>Headlights <td>✔ <td>✔
<tr> <td>Interior Lights <td>✔ <td>✔
<tr> <td>Electric locomotive operating sounds <td>✔ <td>✔
<tr> <td>Engineer's cab lighting <td> <td>✔
<tr> <td>Station Announcements - Swiss <td> <td>✔
</table>
The only differences between these tables, at the DOM level, is with the precise position of the (in any case semantically-neutral) whitespace.
However, a start tag must never be omitted if it has any attributes.
Example
Returning to the earlier example with all the whitespace removed and then all the optional tags removed:
<!DOCTYPE HTML><title>Hello</title><p>Welcome to this example.
If the body element in this example had to have a class attribute and the html element had to have a lang attribute, the markup would have to
become:
<!DOCTYPE HTML><html lang="en"><title>Hello</title><body class="demo"><p>Welcome to this example.
Note
This section assumes that the document is conforming, in particular, that there are no content model violations. Omitting tags in the fashion described
in this section in a document that does not conform to the content models described in this specification is likely to result in unexpected DOM
differences (this is, in part, what the content models are designed to avoid).
12.1.2.5 Restrictions on content models §
For historical reasons, certain elements have extra restrictions beyond even the restrictions given by their content model.
A table element must not contain tr elements, even though these elements are technically allowed inside table elements according to the content
models described in this specification. (If a tr element is put inside a table in the markup, it will in fact imply a tbody start tag before it.)
A single newline may be placed immediately after the start tag of pre and textarea elements. This does not affect the processing of the element. The
otherwise optional newline must be included if the element's contents themselves start with a newline (because otherwise the leading newline in the
contents would be treated like the optional newline, and ignored).
Example
The following two pre blocks are equivalent:
<pre>Hello</pre>
<pre>
Hello</pre>
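A serializer has to account for the leading-newline rule, because a newline immediately after the pre start tag is ignored. A minimal Python sketch follows; serialize_pre is a hypothetical helper that assumes the contents are already escaped and use U+000A LINE FEED for newlines.

```python
def serialize_pre(contents: str) -> str:
    """Wrap already-escaped contents in a pre element.

    If the contents themselves start with a newline, an extra newline is
    written first; otherwise the contents' own leading newline would be
    treated as the optional one and ignored.
    """
    leading = "\n" if contents.startswith("\n") else ""
    return "<pre>" + leading + contents + "</pre>"
```

The same approach would apply to textarea elements, which share this rule.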
12.1.2.6 Restrictions on the contents of raw text and escapable raw text elements §
The text in raw text and escapable raw text elements must not contain any occurrences of the string "</" (U+003C LESS-THAN SIGN, U+002F SOLIDUS)
followed by characters that case-insensitively match the tag name of the element followed by one of U+0009 CHARACTER TABULATION (tab), U+000A
LINE FEED (LF), U+000C FORM FEED (FF), U+000D CARRIAGE RETURN (CR), U+0020 SPACE, U+003E GREATER-THAN SIGN (>), or U+002F
SOLIDUS (/).
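The restriction above amounts to a substring search, sketched here in Python. The helper name raw_text_is_allowed is hypothetical; the character class lists exactly the seven characters named in the rule, and re.ASCII keeps the case-insensitive comparison limited to ASCII letters.

```python
import re


def raw_text_is_allowed(tag_name: str, text: str) -> bool:
    """Return True if text may appear inside a raw text or escapable raw
    text element with the given tag name."""
    pattern = (
        r"</"                   # the string "</"
        + re.escape(tag_name)   # the element's tag name, case-insensitively
        + r"[\t\n\f\r >/]"      # tab, LF, FF, CR, space, ">", or "/"
    )
    return re.search(pattern, text, flags=re.IGNORECASE | re.ASCII) is None
```

For example, script text containing "</SCRIPT>" inside a string literal is rejected, whereas "</scripts" is allowed because the character after the tag name is not one of the listed characters.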
12.1.3 Text §
Text is allowed inside elements, attribute values, and comments. Extra constraints are placed on what is and what is not allowed in text based on where
the text is to be put, as described in the other sections.
12.1.3.1 Newlines §
Newlines in HTML may be represented either as U+000D CARRIAGE RETURN (CR) characters, U+000A LINE FEED (LF) characters, or pairs of
U+000D CARRIAGE RETURN (CR), U+000A LINE FEED (LF) characters in that order.
Where character references are allowed, a character reference of a U+000A LINE FEED (LF) character (but not a U+000D CARRIAGE RETURN (CR)
character) also represents a newline.
12.1.4 Character references §
In certain cases described in other sections, text may be mixed with character references. These can be used to escape characters that couldn't
otherwise legally be included in text.
Character references must start with a U+0026 AMPERSAND character (&). Following this, there are three possible kinds of character references:
Named character references

The ampersand must be followed by one of the names given in the named character references section, using the same case. The name must be one
that is terminated by a U+003B SEMICOLON character (;).
Decimal numeric character reference

The ampersand must be followed by a U+0023 NUMBER SIGN character (#), followed by one or more ASCII digits, representing a base-ten integer
that corresponds to a code point that is allowed according to the definition below. The digits must then be followed by a U+003B SEMICOLON
character (;).
Hexadecimal numeric character reference

The ampersand must be followed by a U+0023 NUMBER SIGN character (#), which must be followed by either a U+0078 LATIN SMALL LETTER X
character (x) or a U+0058 LATIN CAPITAL LETTER X character (X), which must then be followed by one or more ASCII hex digits, representing a
hexadecimal integer that corresponds to a code point that is allowed according to the definition below. The digits must then be followed by a U+003B
SEMICOLON character (;).
The numeric character reference forms described above are allowed to reference any code point excluding U+000D CR, noncharacters, and controls other
than ASCII whitespace.
An ambiguous ampersand is a U+0026 AMPERSAND character (&) that is followed by one or more ASCII alphanumerics, followed by a U+003B
SEMICOLON character (;), where these characters do not match any of the names given in the named character references section.
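Two of the rules above lend themselves to short checks, sketched here in Python. Both function names are hypothetical. The code point test assumes that controls are U+0000 to U+001F and U+007F to U+009F, and that noncharacters are U+FDD0 to U+FDEF plus every code point ending in FFFE or FFFF; those definitions come from outside this section. The ambiguous ampersand search relies on the html5 table in Python's standard html.entities module as a stand-in for the named character references section.

```python
import re
from html.entities import html5  # keys look like "amp;" or "copy;"
from typing import List


def is_allowed_code_point(cp: int) -> bool:
    """Can a numeric character reference refer to code point cp?"""
    if not 0 <= cp <= 0x10FFFF:
        return False
    if cp == 0x0D:  # U+000D CR is excluded explicitly
        return False
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return False  # noncharacter (assumed definition)
    is_control = cp <= 0x1F or 0x7F <= cp <= 0x9F
    # Controls are excluded unless they are ASCII whitespace (tab, LF, FF).
    return not is_control or cp in (0x09, 0x0A, 0x0C)


_NAMED_LIKE = re.compile(r"&([0-9A-Za-z]+);")


def ambiguous_ampersands(text: str) -> List[str]:
    """List each "&name;" in text whose name is not a known named reference."""
    return [
        match.group(0)
        for match in _NAMED_LIKE.finditer(text)
        if match.group(1) + ";" not in html5
    ]
```

An ampersand that is not followed by alphanumerics and a semicolon, such as the one in "a & b", is not reported, since it does not meet the definition.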
12.1.5 CDATA sections §
CDATA sections must consist of the following components, in this order:
1. The string "<![CDATA[".
2. Optionally, text, with the additional restriction that the text must not contain the string "]]>".
3. The string "]]>".
Example
CDATA sections can only be used in foreign content (MathML or SVG). In this example, a CDATA section is used to escape the contents of a MathML
ms element:
<p>You can add a string to a number, but this stringifies the number:</p>
<math>
<ms><![CDATA[x<y]]></ms>
<mo>+</mo>
<mn>3</mn>
<mo>=</mo>
<ms><![CDATA[x<y3]]></ms>
</math>
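For generators that emit foreign content, the CDATA format can be enforced at construction time. The Python sketch below uses a hypothetical helper, make_cdata_section, that refuses text containing the terminator; how a generator should otherwise handle such text (for example by splitting it or using character references) is left open here.

```python
def make_cdata_section(text: str) -> str:
    """Build a CDATA section; text must not contain the string "]]>"."""
    if "]]>" in text:
        raise ValueError('CDATA section text must not contain "]]>"')
    return "<![CDATA[" + text + "]]>"
```

Applied to the MathML example, make_cdata_section("x<y") yields the contents of the first ms element.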
12.1.6 Comments §
Comments must have the following format:
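1. The string "<!--".
2. Optionally, text, with the additional restriction that the text must not start with the string ">", nor start with the string "->", nor
contain the strings "<!--", "-->", or "--!>", nor end with the string "<!-".
3. The string "-->".
Note
The text is allowed to end with the string "<!", as in <!--My favorite operators are > and <!-->.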
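A check for comment text can be sketched in Python as follows. The function name is hypothetical, and the restrictions encoded in the comments are those of the HTML Standard's comment format as understood here: the text must not start with ">" or "->", must not contain "<!--", "-->", or "--!>", and must not end with "<!-".

```python
def is_valid_comment_text(text: str) -> bool:
    """Return True if text may appear between "<!--" and "-->"."""
    # Must not start with ">" or "->".
    if text.startswith(">") or text.startswith("->"):
        return False
    # Must not contain "<!--", "-->", or "--!>".
    if any(s in text for s in ("<!--", "-->", "--!>")):
        return False
    # Must not end with "<!-" (ending with "<!" is fine).
    return not text.endswith("<!-")
```

Text ending in "<!" passes this check, matching the note above, while text ending in "<!-" does not.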