Copyright © 2025 World Wide Web Consortium. W3C® liability, trademark and permissive document license rules apply.
This section describes the status of this document at the time of its publication. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.
This document was published by the Web Applications Working Group as an Editor's Draft.
Publication as an Editor's Draft does not imply endorsement by W3C and its Members.
This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.
This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.
This document is governed by the 03 November 2023 W3C Process Document.
As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.
The IDL fragments in this specification must be interpreted as required for conforming IDL fragments, as described in the Web IDL specification. [WEBIDL]
Requirements phrased in the imperative as part of algorithms (such as "strip any leading space characters" or "return false and terminate these steps") are to be interpreted with the meaning of the key word ("must", "should", "may", etc.) used in introducing the algorithm.
Conformance requirements phrased as algorithms or specific steps may be implemented in any manner, so long as the end result is equivalent. (In particular, the algorithms defined in this specification are intended to be easy to follow, and not intended to be performant.)
User agents may impose implementation-specific limits on otherwise unconstrained inputs, e.g. to prevent denial of service attacks, to guard against running out of memory, or to work around platform-specific limitations.
When a method or an attribute is said to call another method or attribute, the user agent must invoke its internal API for that attribute or method so that e.g. the author can't change the behavior by overriding attributes or methods with custom properties or functions in ECMAScript. [ECMA-262]
Unless otherwise stated, string comparisons are done in a case-sensitive manner.
If an algorithm calls into another algorithm, any exception that is thrown by the latter (unless it is explicitly caught) must cause the former to terminate, and the exception to be propagated up to its caller.
Vendor-specific proprietary extensions to this specification are strongly discouraged. Authors must not use such extensions, as doing so reduces interoperability and fragments the user base, allowing only users of specific user agents to access the content in question.
If vendor-specific extensions are needed, the members should be prefixed by vendor-specific strings to prevent clashes with future versions of this specification. Extensions must be defined so that the use of extensions neither contradicts nor causes the non-conformance of functionality defined in the specification.
When vendor-neutral extensions to this specification are needed, either this specification can be updated accordingly, or an extension specification can be written that overrides the requirements in this specification. Such an extension specification becomes an applicable specification for the purposes of conformance requirements in this specification.
A document object model (DOM) is an in-memory representation of various types of Nodes where each Node is connected in a tree. The [HTML5] and [DOM4] specifications describe the DOM and its Nodes in greater detail.
Parsing is the term used for converting a string representation of a DOM into an actual DOM, and Serializing is the term used to transform a DOM back into a string. This specification concerns itself with defining various APIs for both parsing and serializing a DOM.
HTMLDivElement (nodeName: "div")
┃
┣━ HTMLSpanElement (nodeName: "span")
┃  ┃
┃  ┗━ Text (data: "some ")
┃
┗━ HTMLElement (nodeName: "em")
   ┃
   ┗━ Text (data: "text!")
Given the DOM tree above, with the HTMLDivElement node stored in a variable myDiv, to serialize myDiv's children simply get (read) the Element's innerHTML property (this triggers the serialization):
var serializedChildren = myDiv.innerHTML;
// serializedChildren has the value:
// "<span>some </span><em>text!</em>"
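The serialization step can be illustrated without a browser. The following is a minimal sketch, not the HTML serialization algorithm itself: the `text` and `element` helpers are hypothetical plain-object stand-ins for DOM nodes, attributes are ignored, and only the three characters shown are escaped.

```javascript
// Hypothetical plain-object stand-ins for DOM nodes (not real DOM interfaces).
const text = (data) => ({ kind: "text", data });
const element = (name, children = []) => ({ kind: "element", name, children });

// Escape the characters that would otherwise be read as markup in text content.
function escapeText(data) {
  return data.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize one node, recursing into element children in tree order.
function serializeNode(node) {
  if (node.kind === "text") return escapeText(node.data);
  return `<${node.name}>${serializeChildren(node)}</${node.name}>`;
}

// Mirrors reading innerHTML: the element itself is not part of the output.
function serializeChildren(node) {
  return node.children.map(serializeNode).join("");
}

const myDiv = element("div", [
  element("span", [text("some ")]),
  element("em", [text("text!")]),
]);

console.log(serializeChildren(myDiv));
// prints: <span>some </span><em>text!</em>
```

Only the children are walked, which is why the enclosing div does not appear in the result.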
To parse new children for myDiv from a string (replacing its existing children), simply set the innerHTML property (this triggers parsing of the assigned string):
myDiv.innerHTML = "<span>new</span><em>children!</em>";
This specification describes two flavors of parsing and serializing: HTML and XML (with XHTML being a type of XML). Each follows the rules of its respective markup language. The above example shows HTML parsing and serialization. The specific algorithms for HTML parsing and serializing are defined in the [HTML5] specification. This specification contains the algorithm for XML serializing. The grammar for XML parsing is described in the [XML10] specification.
Round-tripping a DOM means to serialize and then immediately parse the serialized string back into a DOM. Ideally, this process does not result in any data loss with respect to the identity and attributes of the Nodes in the DOM. Round-tripping is especially tricky for an XML serialization, which must be concerned with preserving the Node's namespace identity in the serialization (whereas namespaces are ignored in HTML).
Element (nodeName: "root")
┃
┗━ HTMLScriptElement (nodeName: "script")
   ┃
   ┗━ Text (data: "alert('hello world')")
An XML serialization must include the HTMLScriptElement Node's namespace in order to preserve the identity of the script element, and to allow the serialized string to round-trip through an XML parser. Assuming that root is in a variable named root:
var xmlSerialization = new XMLSerializer().serializeToString(root);
// xmlSerialization has the value:
// "<root><script xmlns="http://www.w3.org/1999/xhtml">alert('hello world')</script></root>"
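To make the role of the namespace declaration concrete, here is a minimal sketch of a serializer that emits a default namespace declaration whenever an element's namespace differs from the one currently in effect. It is an illustration under simplifying assumptions, not the algorithm defined in this specification: the plain-object nodes and the `serializeXml` function are hypothetical, and prefixes, attributes, and well-formedness checks are all left out.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical plain-object nodes: elements carry a namespace (or null).
const text = (data) => ({ kind: "text", data });
const element = (namespace, name, children = []) =>
  ({ kind: "element", namespace, name, children });

function escapeText(data) {
  return data.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// contextNamespace is the default namespace in effect at this point in the output.
function serializeXml(node, contextNamespace = null) {
  if (node.kind === "text") return escapeText(node.data);
  let markup = `<${node.name}`;
  let inherited = contextNamespace;
  if (node.namespace !== contextNamespace) {
    // Declare a new default namespace so a parser restores the same identity.
    markup += ` xmlns="${node.namespace ?? ""}"`;
    inherited = node.namespace;
  }
  const inner = node.children
    .map((child) => serializeXml(child, inherited))
    .join("");
  return `${markup}>${inner}</${node.name}>`;
}

const root = element(null, "root", [
  element(HTML_NS, "script", [text("alert('hello world')")]),
]);

console.log(serializeXml(root));
// prints: <root><script xmlns="http://www.w3.org/1999/xhtml">alert('hello world')</script></root>
```

Without the declaration, an XML parser would place script in no namespace, and the parsed node would no longer be an HTMLScriptElement.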
The term context object means the object on which the API being discussed was called.
The following terms are understood to represent their respective namespaces in this specification (which makes it easier to read):
The HTML namespace is http://www.w3.org/1999/xhtml
The XML namespace is http://www.w3.org/XML/1998/namespace
The XMLNS namespace is http://www.w3.org/2000/xmlns/
The definition of DOMParser has moved to the HTML Standard.
The definition of XMLSerializer has moved to the HTML Standard.
The definition of InnerHTML has moved to the HTML Standard.
Element interface
The definition of outerHTML has moved to the HTML Standard.
The definition of insertAdjacentHTML has moved to the HTML Standard.
Range interface
The definition of createContextualFragment has moved to the HTML Standard.
The definition of fragment parsing algorithm has moved to the HTML Standard.
The definition of fragment serializing algorithm has moved to the HTML Standard.
An XML serialization differs from an HTML serialization in the following ways:
The namespaceURI of elements and attributes is preserved. In some cases this means that an existing prefix, prefix declaration attribute or default namespace declaration attribute might be dropped, substituted or changed. An HTML serialization does not attempt to preserve the namespaceURI.
Otherwise, the algorithm for producing an XML serialization is designed to produce a serialization that is compatible with the HTML parser. For example, elements in the HTML namespace that contain no child nodes are serialized with an explicit begin and end tag rather than using the empty-element tag syntax.
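The choice of tag form for childless elements can be sketched as follows. The explicit begin and end tag for HTML-namespace elements comes from the paragraph above; the use of the empty-element tag syntax for elements in other namespaces is an assumption of this sketch, as is the hypothetical helper name `serializeEmptyElement`. Namespace declarations, attributes, and any special handling of HTML void elements are not covered.

```javascript
const HTML_NS = "http://www.w3.org/1999/xhtml";
const SVG_NS = "http://www.w3.org/2000/svg";

// Hypothetical helper: markup for an element with no child nodes and no attributes.
function serializeEmptyElement(namespace, name) {
  if (namespace === HTML_NS) {
    // Explicit begin and end tag, so an HTML parser also sees a closed element.
    return `<${name}></${name}>`;
  }
  // Assumption: elements outside the HTML namespace may use the short form.
  return `<${name}/>`;
}

console.log(serializeEmptyElement(HTML_NS, "div")); // prints: <div></div>
console.log(serializeEmptyElement(SVG_NS, "path")); // prints: <path/>
```

The distinction matters because an HTML parser ignores the trailing slash on a non-void element such as div, so `<div/>` would be read as an unclosed begin tag.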
Per [DOM4], Attr objects do not inherit from Node, and thus cannot be serialized by the XML serialization algorithm. An attempt to serialize an Attr object will result in an empty string.
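To produce an XML serialization of a Node node given a flag require well-formed, run the following steps:
Let the context namespace be null. The context namespace tracks the XML serialization algorithm's current default namespace. The context namespace is changed when either an Element Node has a default namespace declaration, or the algorithm generates a default namespace declaration for the Element Node to match its own namespace. The algorithm assumes no namespace (null) to start.
Let prefix map be a new namespace prefix map.
Add the XML namespace with prefix value "xml" to prefix map.
Let the generated namespace prefix index be 1. The generated namespace prefix index is used to generate a new unique prefix value when no suitable existing namespace prefix is available to serialize a node's namespaceURI (or the namespaceURI of one of node's attributes). See the generate a prefix algorithm.
Return the result of running the XML serialization algorithm on node, passing the context namespace, prefix map, the generated namespace prefix index, and the require well-formed flag. If an exception occurs during the execution of the algorithm, catch that exception and throw an "InvalidStateError" DOMException.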
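The setup described above can be sketched as a small wrapper. This is an illustration under stated assumptions rather than a conforming implementation: `produceXmlSerialization` and its `serializeNode` callback are hypothetical names, the prefix map is modeled as a Map from namespace to a list of prefixes, the prefix index is held in an object so nested calls can share and advance it, and a plain Error named "InvalidStateError" stands in for a real DOMException.

```javascript
const XML_NS = "http://www.w3.org/XML/1998/namespace";

// serializeNode is a hypothetical callback standing in for the per-node algorithm.
function produceXmlSerialization(node, requireWellFormed, serializeNode) {
  // No default namespace is assumed at the start.
  const contextNamespace = null;
  // The "xml" prefix is always available, so it is seeded into the map.
  const prefixMap = new Map([[XML_NS, ["xml"]]]);
  // Shared counter used when a fresh, unique prefix has to be generated.
  const prefixIndex = { value: 1 };
  try {
    return serializeNode(
      node, contextNamespace, prefixMap, prefixIndex, requireWellFormed);
  } catch (cause) {
    // Stand-in for throwing an "InvalidStateError" DOMException.
    const error = new Error("XML serialization failed");
    error.name = "InvalidStateError";
    error.cause = cause;
    throw error;
  }
}

// Usage with a stub callback that simply reports the state it was given.
const report = produceXmlSerialization(
  { kind: "text", data: "hi" },
  true,
  (node, ns, prefixMap, prefixIndex, wellFormed) =>
    `${node.data}|${ns}|${prefixMap.get(XML_NS)}|${prefixIndex.value}|${wellFormed}`
);
console.log(report); // prints: hi|null|xml|1|true
```

Wrapping the call in a single try block means every failure inside the nested algorithms surfaces to the caller as the same error type.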
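Each of the following algorithms for producing an XML serialization of a DOM node takes as input a node to serialize and the following arguments: a context namespace, a namespace prefix map, a generated namespace prefix index, and the require well-formed flag.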
The XML serialization algorithm produces an XML serialization of an arbitrary DOM node node based on the node's interface type. Each referenced algorithm is to be passed the arguments as they were received by the caller and is to return its result to the caller. Re-throw any exceptions. If node's interface is:
Element
Document
Comment
Text
DocumentFragment
DocumentType
ProcessingInstruction
Attr object
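The dispatch on interface type can be sketched as a lookup table. This is a hedged illustration: `dispatchSerialization`, the `interfaceName` property, and the handler table are hypothetical, only two stub handlers are supplied, and throwing a TypeError for an unrecognized type is an assumption of the sketch. The empty string for an Attr object follows the statement earlier in this section.

```javascript
// Hypothetical dispatcher: handlers maps an interface name to a serializer function.
function dispatchSerialization(node, handlers, ...args) {
  // Attr objects are not Nodes here and always serialize to an empty string.
  if (node.interfaceName === "Attr") return "";
  const handler = handlers.get(node.interfaceName);
  if (typeof handler !== "function") {
    // Assumption: anything without a handler is rejected.
    throw new TypeError(`Cannot serialize ${String(node.interfaceName)}`);
  }
  // Arguments are forwarded unchanged; exceptions propagate to the caller.
  return handler(node, ...args);
}

// Stub handlers for two of the listed interfaces.
const handlers = new Map([
  ["Text", (node) => node.data],
  ["Comment", (node) => `<!--${node.data}-->`],
]);

console.log(dispatchSerialization({ interfaceName: "Comment", data: " note " }, handlers));
// prints: <!-- note -->
console.log(JSON.stringify(dispatchSerialization({ interfaceName: "Attr" }, handlers)));
// prints: ""
```

Because the handler's return value is returned directly and nothing is caught, results and exceptions both reach the caller unchanged.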