- DOM tree
- The in-memory tree structure that an XML or HTML parser builds from a text document. Each element, attribute, and text node becomes a node in the tree, which the formatter traverses to produce indented output.
- Namespace
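- A mechanism for distinguishing XML elements and attributes from different vocabularies that might share the same name. A namespaced name is written as a prefix and a local name separated by a colon (soap:Body), and the prefix is bound to a namespace URI with an xmlns:prefix attribute such as xmlns:soap. A plain xmlns attribute, with no prefix, sets the default namespace for unprefixed elements.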
- CDATA section
- A block wrapped in <![CDATA[ ... ]]> that tells the XML parser to treat its content as raw character data rather than markup. It is used for embedding HTML or code snippets that contain characters like < and & without escaping them. The only sequence the content cannot contain is ]]>, which ends the section.
- Self-closing tag
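- An XML element with no content, written as <tagName /> instead of <tagName></tagName>. The two forms are semantically identical in XML, and a conforming XML parser accepts both. The choice still matters in practice: the XML specification recommends, for interoperability, using the self-closing form only for elements declared EMPTY, and HTML parsers treat the two forms differently.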
- Prolog
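- The part of an XML document that precedes the root element. It usually begins with the optional XML declaration, <?xml version="1.0" encoding="UTF-8"?>, and may also contain a DOCTYPE declaration, comments, and processing instructions. If the XML declaration is present, it must be the very first thing in the document, with nothing before it, not even whitespace; otherwise the document is not well-formed.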
- Well-formed XML
- XML that follows the basic structural rules of the XML specification: a single root element, all tags properly closed and nested, all attribute values quoted, and special characters like < and & escaped wherever they appear in text or attribute values. A document can be well-formed without being valid, that is, without conforming to any particular DTD or schema.
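
Several of the terms above can be seen in action with a few lines of Python. The following is a minimal sketch using only the standard library (`xml.etree.ElementTree` and `xml.dom.minidom`); the helper names `is_well_formed` and `pretty` are hypothetical and chosen for illustration, and this is not how any particular formatter is implemented.

```python
import xml.etree.ElementTree as ET
from xml.dom import minidom


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as well-formed XML.

    Hypothetical helper: it checks well-formedness only, not validity
    against a DTD or schema.
    """
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def pretty(text: str, indent: str = "  ") -> str:
    """Build a DOM tree from `text` and serialize it with indentation."""
    return minidom.parseString(text).toprettyxml(indent=indent)


# Well-formedness: unclosed tags and unescaped '<' are rejected.
print(is_well_formed("<a><b/></a>"))     # accepted
print(is_well_formed("<a><b></a>"))      # rejected: <b> is never closed
print(is_well_formed("<a>1 < 2</a>"))    # rejected: '<' must be escaped

# Prolog: whitespace before the XML declaration breaks well-formedness.
print(is_well_formed(' <?xml version="1.0"?><a/>'))

# Self-closing tag: both spellings parse to the same empty element.
same = ET.tostring(ET.fromstring("<a></a>")) == ET.tostring(ET.fromstring("<a/>"))
print(same)

# CDATA section: the parser hands back the raw characters as text.
code = ET.fromstring("<code><![CDATA[if (a < b && c) {}]]></code>")
print(code.text)

# Namespace: ElementTree expands the prefix to its URI in the tag name.
soap = ET.fromstring(
    '<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">'
    "<soap:Body/></soap:Envelope>"
)
print(soap.tag)

# DOM tree: the parsed tree is traversed to produce indented output.
print(pretty("<a><b/></a>"))
```

ElementTree is used for the checks because it raises a single exception type, `ParseError`, on malformed input, while `minidom` is used for the indented output because it exposes a DOM-style tree with a built-in `toprettyxml` method.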