PR-DOM-Level-1-19980818

1. Document Object Model (Core) Level 1

Editors: Mike Champion, ArborText (from November 20, 1997); Steve Byrne, JavaSoft (until November 19, 1997); Gavin Nicol, Inso EPS; Lauren Wood, SoftQuad, Inc.

1.1. Overview of the DOM Core Interfaces
1.2. Fundamental Interfaces
1.3. Extended Interfaces

1.1. Overview of the DOM Core Interfaces

This section defines a minimal set of objects and interfaces for accessing and manipulating document objects. The functionality specified in this section (the Core functionality) should be sufficient to allow software developers and web script authors to access and manipulate parsed HTML and XML content inside conforming products. The DOM Core API also allows population of a Document object using only DOM API calls; creating the skeleton Document and saving it persistently is left to the product that implements the DOM API.

1.1.1. The DOM Structure Model

The DOM presents documents as a hierarchy of "Node" objects that also implement other, more specialized interfaces. Some types of nodes may have child nodes of various types, and others are leaf nodes that cannot have anything below them in the document structure. The node types, and which node types they may have as children, are as follows:

Document -- Element (maximum of one), ProcessingInstruction, Comment, DocumentType
DocumentFragment -- Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
DocumentType -- Notation, Entity
EntityReference -- Element, ProcessingInstruction, Comment, Text, CDATASection, EntityReference
Element -- Element, Text, Comment, ProcessingInstruction, CDATASection, EntityReference
Attribute -- Text, EntityReference
ProcessingInstruction -- no other nodes
Comment -- no other nodes
Text -- no other nodes
CDATASection -- no other nodes
Entity -- no other nodes
Notation -- no other nodes
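
As an illustration of the containment rules listed above, the following sketch probes a few parent/child combinations. It assumes Python's standard `xml.dom.minidom` module as the DOM implementation, which is not one of the Working Group's bindings; the helper name `rejected` is hypothetical.

```python
import xml.dom
from xml.dom.minidom import parseString


def rejected(parent, child):
    """Return True if appending child to parent raises HIERARCHY_REQUEST_ERR."""
    try:
        parent.appendChild(child)
    except xml.dom.HierarchyRequestErr:
        return True
    return False


doc = parseString("<root/>")  # implementation-specific way to obtain a Document

# A Document may hold at most one Element child.
second_root = rejected(doc, doc.createElement("other"))
# Text is not a permitted child of Document.
text_in_doc = rejected(doc, doc.createTextNode("stray"))
# Text is a leaf node and accepts no children.
child_of_text = rejected(doc.createTextNode("leaf"), doc.createElement("x"))
# Element inside Element is allowed.
element_in_element = rejected(doc.documentElement, doc.createElement("ok"))
```

Other implementations may detect these conditions at different points, so the sketch shows one possible behaviour rather than a conformance test.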

The DOM also specifies a "NodeList" interface to handle ordered lists of Nodes, such as the children of a Node, or the elements returned by the Element:getElementsByTagName method, and also a NamedNodeMap interface to handle unordered sets of Nodes referenced by their name attribute, such as the Attributes of an Element. NodeLists and NamedNodeMaps in the DOM are "live", that is, changes to the underlying document structure are reflected in all relevant NodeLists and NamedNodeMaps. For example, if a DOM user gets a NodeList object containing the children of an Element, then subsequently adds more children to that element (or removes children, or modifies them), those changes are automatically reflected in the NodeList without further action on the user's part. Likewise changes to a Node in the tree are reflected in all references to that Node in NodeLists and NamedNodeMaps.
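
The "live" behaviour described above can be sketched as follows, assuming Python's standard `xml.dom.minidom` module as the implementation. The NodeList is obtained once and never refreshed. Only `childNodes` is exercised here; whether every list an implementation returns is live should be checked against that implementation.

```python
from xml.dom.minidom import parseString

doc = parseString("<list><item/><item/></list>")
root = doc.documentElement

children = root.childNodes            # obtain the NodeList once
before = children.length

root.appendChild(doc.createElement("item"))
after_append = children.length        # same object, queried again

root.removeChild(root.firstChild)
after_remove = children.length
```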

1.1.2. Memory Management

Most of the APIs defined by this specification are interfaces rather than classes. That means that an actual implementation need only expose methods with the defined names and specified operation, not actually implement classes that correspond directly to the interfaces. This allows the DOM APIs to be implemented as a thin veneer on top of legacy applications with their own data structures, or on top of newer applications with different class hierarchies. This also means that ordinary constructors (in the Java or C++ sense) cannot be used to create DOM objects, since the underlying objects to be constructed may have little relationship to the DOM interfaces. The conventional solution to this in object-oriented design is to define factory methods that create instances of objects that implement the various interfaces. In the DOM Level 1, objects implementing some interface "X" are created by a "createX()" method on the Document interface; this is because all DOM objects live in the context of a specific Document.

The DOM Level 1 API does not define a standard way to create DOMImplementation or Document objects; actual DOM implementations must provide some proprietary way of bootstrapping these DOM interfaces, and then all other objects can be built from the Create methods on Document (or by various other convenience methods).
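
A minimal sketch of the factory pattern described above, assuming Python's standard `xml.dom.minidom` module. The call to `parseString` stands in for the implementation-specific bootstrap step; every other object comes from a create method on the Document.

```python
from xml.dom.minidom import parseString

# Bootstrapping is implementation-specific; minidom offers parseString.
doc = parseString("<root/>")

# All further objects are produced by factory methods on the Document.
para = doc.createElement("para")
text = doc.createTextNode("hello")
comment = doc.createComment("note")

para.appendChild(text)
doc.documentElement.appendChild(para)
doc.documentElement.appendChild(comment)

serialized = doc.documentElement.toxml()
owners = [node.ownerDocument is doc for node in (para, text, comment)]
```

Note that no constructor is called directly; the objects are tied to the Document that created them.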

The Core DOM APIs are designed to be compatible with a wide range of languages, including both general-user scripting languages and the more challenging languages used mostly by professional programmers. Thus, the DOM APIs need to operate across a variety of memory management philosophies, from language platforms that do not expose memory management to the user at all, through those (notably Java) that provide explicit constructors but provide a garbage collection mechanism to automatically reclaim unused memory, to those (especially C/C++) that generally require the programmer to explicitly allocate object memory, track where it is used, and explicitly free it for re-use. To ensure a consistent API across these platforms, the DOM does not address memory management issues at all, but instead leaves these for the implementation. Neither of the explicit language bindings devised by the DOM Working Group (for ECMAScript and Java) requires any memory management methods, but DOM bindings for other languages (especially C or C++) probably will require such support. These extensions will be the responsibility of those adapting the DOM API to a specific language, not the DOM WG.

1.1.3. Naming Conventions

While it would be nice to have attribute and method names that are short, informative, internally consistent, and familiar to users of similar APIs, the names also should not clash with the names in legacy APIs supported by DOM implementations. Furthermore, both OMG IDL and ECMAScript have significant limitations in their ability to disambiguate names from different namespaces, which make it difficult to avoid naming conflicts with short, familiar names. So, DOM names tend to be long and quite descriptive in order to be unique across all environments.

The Working Group has also attempted to be internally consistent in its use of various terms, even though these may not be common distinctions in other APIs. For example, we use the method name "remove" when the method changes the structural model, and the method name "delete" when the method gets rid of something inside the structure model. The thing that is deleted is not returned. The thing that is removed may be returned, when it makes sense to return it.

1.1.4. Inheritance vs Flattened Views of the API

The DOM Core APIs present two somewhat different sets of interfaces to an XML/HTML document: one presenting an "object oriented" approach with a hierarchy of inheritance, and the other a "simplified" view that allows all manipulation to be done via the Node interface without requiring casts (in Java and other C-like languages) or query interface calls in COM environments. Casts and query interface calls are fairly expensive in Java and COM, and the DOM may be used in performance-critical environments, so we allow significant functionality using just the Node interface. Because many other users will find the inheritance hierarchy easier to understand than the "everything is a Node" approach to the DOM, we also support the full higher-level interfaces for those who prefer a more object-oriented API.

In practice, this means that there is a certain amount of redundancy in the API. The Working Group considers the "inheritance" approach the primary view of the API, and the full set of functionality on Node to be "extra" functionality that users may employ, but that does not eliminate the need for methods on other interfaces that an object-oriented analysis would dictate. (Of course, when the O-O analysis yields an attribute or method that is identical to one on the Node interface, we don't specify a completely redundant one). Thus, even though there is a generic nodeName attribute on the Node interface, there is still a tagName attribute on the Element interface; these two attributes must contain the same value, but the Working Group considers it worthwhile to support both, given the different constituencies the DOM API must satisfy.
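
The redundancy between the two views can be sketched as follows, assuming Python's standard `xml.dom.minidom` module. The generic Node attributes and the specialized attributes (`tagName` on Element, `data` on Text) are read side by side.

```python
from xml.dom.minidom import parseString

doc = parseString("<chapter>Intro</chapter>")
elem = doc.documentElement
text = elem.firstChild

# "Flattened" view: only attributes defined on Node are used.
generic = (elem.nodeName, text.nodeValue)

# "Inheritance" view: attributes of the specialized interfaces.
specialized = (elem.tagName, text.data)
```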

1.1.5. The `wstring` type

To ensure interoperability, the DOM specifies the wstring type as follows:

A wstring is a sequence of 16-bit quantities. This may be expressed in IDL terms as:
```
         typedef sequence<unsigned short> wstring;
```
Applications must encode wstring using UTF-16 (defined in Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]). The UTF-16 encoding was chosen because of its widespread industry practice. Please note that for both HTML and XML, the document character set (and therefore the notation of numeric character references) is based on UCS-4. A single numeric character reference in a source document may therefore in some cases correspond to two array positions in a wstring (a high surrogate and a low surrogate).

Note: Even though the DOM defines the name of the string type to be wstring, bindings may use different names. For example, for Java, wstring is bound to the String type because it also uses UTF-16 as its encoding.
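
The surrogate-pair case mentioned above can be made concrete with a small sketch. It uses plain Python rather than any DOM binding; the helper name `utf16_units` is hypothetical and simply lists the 16-bit quantities that a UTF-16 encoded wstring would hold.

```python
import struct


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of 16-bit integers."""
    raw = text.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(raw) // 2), raw))


# A character inside the Basic Multilingual Plane occupies one position.
bmp = utf16_units("A")

# U+1D11E (the reference &#x1D11E; in a source document) occupies two
# positions: a high surrogate followed by a low surrogate.
clef = utf16_units("\U0001D11E")
```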

Note: As of August 1998, the OMG IDL specification included a wstring type. However, that definition did not meet the interoperability criteria of the DOM API since it relied on encoding negotiation to decide the width of a character.

1.1.6. Case sensitivity in the DOM

The DOM has many interfaces that imply string matching. HTML processors generally assume an uppercase (less often, lowercase) normalization of names for such things as elements, while XML is explicitly case sensitive. For the purposes of the DOM, string matching takes place on a character code by character code basis, on the 16-bit value of a wstring. As such, the DOM assumes that any normalizations will take place in the processor, before the DOM structures are built.

This then raises the issue of exactly what normalizations occur. The W3C I18N working group is in the process of defining exactly which normalizations are necessary for applications implementing the DOM.
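
For an XML document, code-by-code matching means that names differing only in case are distinct. A small sketch, assuming Python's standard `xml.dom.minidom` module as the implementation:

```python
from xml.dom.minidom import parseString

# Three element types whose names differ only in case.
doc = parseString("<root><Item/><item/><ITEM/></root>")

# Matching is exact, so only one of the three elements is found.
matches = doc.getElementsByTagName("item")
count = matches.length
found = matches.item(0).tagName
```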

1.2. Fundamental Interfaces

The interfaces within this section are considered fundamental, and must be fully implemented by all conformant implementations of the DOM, including all HTML DOM implementations.

Enumeration ExceptionCode

An integer indicating the type of error generated.

Enumerator Values

INDEX_SIZE_ERR
If index or size is negative, or greater than the allowed value

WSTRING_SIZE_ERR
If the specified range of text does not fit into a wstring

HIERARCHY_REQUEST_ERR
If any node is inserted somewhere it doesn't belong

WRONG_DOCUMENT_ERR
If a node is used in a different document than the one that created it (that doesn't support it)

INVALID_NAME_ERR
If an invalid name is specified

NO_DATA_ALLOWED_ERR
If data is specified for a node which does not support data

NO_MODIFICATION_ALLOWED_ERR
If an attempt is made to modify an object where modifications are not allowed

NOT_FOUND_ERR
If an attempt was made to reference a node in a context where it does not exist

NOT_SUPPORTED_ERR
If the implementation does not support the type of object requested

INUSE_ATTRIBUTE_ERR
If an attempt is made to add an attribute that is already in use elsewhere

Exception DOMException

DOM operations only raise exceptions in "exceptional" circumstances, i.e., when an operation is impossible to perform (either for logical reasons, because data is lost, or because the implementation has become unstable). In general, DOM methods return specific error values in ordinary processing situations, such as out-of-bound errors when using NodeList.

Implementations may raise other exceptions under other circumstances. For example, implementations may raise an implementation-dependent exception if a null argument is passed.

Some languages and object systems do not support the concept of exceptions. For such systems, error conditions may be indicated using native error reporting mechanisms. For some bindings, for example, methods may return error codes similar to those listed in the corresponding method descriptions.

IDL Definition

exception DOMException {
   ExceptionCode   code;
};
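
How a binding surfaces the `code` member is binding-specific. As one illustration, assuming Python's standard `xml.dom` package, the exception object carries a `code` attribute that can be compared with the ExceptionCode constants:

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString("<root><a/></root>")
stranger = doc.createElement("b")     # created, but never attached to root

try:
    doc.documentElement.removeChild(stranger)
    code = None
except xml.dom.DOMException as exc:
    code = exc.code                   # the ExceptionCode for this failure
```

Catching the general `DOMException` and inspecting `code` mirrors the IDL definition; a binding may also offer more specific exception types.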

Interface DOMImplementation

The DOMImplementation interface provides a number of methods for performing operations that are independent of any particular instance of the document object model.

The DOM Level 1 does not specify a way of creating a document instance, and hence document creation is an operation specific to an implementation. Future Levels of the DOM specification are expected to provide methods for creating documents directly.

IDL Definition

interface DOMImplementation {
  boolean                   hasFeature(in wstring feature, 
                                       in wstring version);
};

Methods

hasFeature

Test if the DOM implementation implements a specific feature.

Parameters

feature		The package name of the feature to test. In Level 1, the legal values are "HTML" and "XML" (case-insensitive).
version		This is the version number of the package name to test. In Level 1, this is the string "1.0". If the version is not specified, supporting any version of the feature will cause the method to return `true`.

Return Values: true if the feature is implemented in the specified version, false otherwise.

This method raises no exceptions.
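
A short sketch of feature testing, assuming Python's standard `xml.dom.minidom` module. In that binding an unspecified version is passed as the empty string, and the results reflect what that particular implementation supports.

```python
from xml.dom.minidom import parseString

impl = parseString("<root/>").implementation

xml_1_0 = impl.hasFeature("XML", "1.0")    # feature name in upper case
xml_lower = impl.hasFeature("xml", "1.0")  # same feature, lower case
xml_any = impl.hasFeature("XML", "")       # version not specified
html_1_0 = impl.hasFeature("HTML", "1.0")  # minidom is an XML-only DOM
```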

Interface DocumentFragment

DocumentFragment is a "lightweight" or "minimal" Document object. It is very common to want to be able to extract a portion of a document's tree or to create a new fragment of a document. Imagine implementing a user command like cut or rearranging a document by moving fragments around. It is desirable to have an object which can hold such fragments and it is quite natural to use a Node for this purpose. While it is true that a Document object could fulfil this role, a Document object can potentially be a heavyweight object, depending on the underlying implementation. What is really needed for this is a very lightweight object. DocumentFragment is such an object.

Furthermore, various operations -- such as inserting nodes as children of another Node -- may take DocumentFragment objects as arguments; this results in all the child nodes of the DocumentFragment being moved to the child list of this node.

The children of a DocumentFragment node are zero or more nodes representing the tops of any sub-trees defining the structure of the document. DocumentFragment nodes do not need to be well-formed XML documents (although they do need to follow the rules imposed upon well-formed XML parsed entities, which can have multiple top nodes). For example, a DocumentFragment might have only one child and that child node could be a Text node. Such a structure model represents neither an HTML document nor a well-formed XML document.

When a DocumentFragment is inserted into a Document (or indeed any other Node that may take children) the children of the DocumentFragment and not the DocumentFragment itself are inserted into the Node. This makes the DocumentFragment very useful when the user wishes to create nodes that are siblings; the DocumentFragment acts as the parent of these nodes so that the user can use the standard methods from the Node interface, such as insertBefore() and appendChild().
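
The insertion behaviour described above can be sketched as follows, assuming Python's standard `xml.dom.minidom` module. Two sibling elements are assembled under a DocumentFragment and then appended in a single call.

```python
from xml.dom.minidom import parseString

doc = parseString("<list><item>a</item></list>")

# Build two siblings under a temporary DocumentFragment parent.
frag = doc.createDocumentFragment()
for label in ("b", "c"):
    item = doc.createElement("item")
    item.appendChild(doc.createTextNode(label))
    frag.appendChild(item)

# Appending the fragment moves its children, not the fragment itself.
doc.documentElement.appendChild(frag)

result = doc.documentElement.toxml()
fragment_is_empty = not frag.hasChildNodes()
```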

IDL Definition

interface DocumentFragment : Node {
};

Interface Document

The Document interface represents the entire HTML or XML document. Conceptually, it is the root of the document tree, and provides the primary access to the document's data.

Since elements, text nodes, comments, processing instructions, etc. cannot exist outside the context of a Document, the Document interface also contains the factory methods needed to create these objects. The Node objects created have an ownerDocument attribute which associates them with the Document within whose context they were created.

IDL Definition

interface Document : Node {
  readonly attribute  DocumentType         doctype;
  readonly attribute  DOMImplementation    implementation;
  readonly attribute  Element              documentElement;
  Element                   createElement(in wstring tagName)
                                          raises(DOMException);
  DocumentFragment          createDocumentFragment();
  Text                      createTextNode(in wstring data);
  Comment                   createComment(in wstring data);
  CDATASection              createCDATASection(in wstring data)
                                               raises(DOMException);
  ProcessingInstruction     createProcessingInstruction(in wstring target, 
                                                        in wstring data)
                                                        raises(DOMException);
  Attribute                 createAttribute(in wstring name)
                                            raises(DOMException);
  EntityReference           createEntityReference(in wstring name)
                                                  raises(DOMException);
  NodeList                  getElementsByTagName(in wstring tagname);
};

Attributes

doctype: For XML, this provides access to the Document Type Definition (see DocumentType) associated with this XML document. For HTML documents and XML documents without a document type definition this returns null.

implementation: A DOM application may use objects from multiple implementations. This provides access to the DOMImplementation object that handles this document.

documentElement: This is a convenience attribute that allows direct access to the child node that is the root element of the document. For HTML documents, this is the element with the tagName "HTML".
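
The three attributes can be read as in the following sketch, which assumes Python's standard `xml.dom.minidom` module and two small XML documents, one with and one without a document type declaration.

```python
from xml.dom.minidom import parseString

plain = parseString("<root/>")
typed = parseString("<!DOCTYPE root [<!ELEMENT root EMPTY>]><root/>")

no_doctype = plain.doctype                 # no DTD, so nothing to return
doctype_name = typed.doctype.name          # name given in the declaration
root_tag = typed.documentElement.tagName   # direct access to the root element
handles_xml = typed.implementation.hasFeature("XML", "1.0")
```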

Methods

createElement

Create an element of the type specified. Note that the instance returned implements the Element interface, so attributes can be specified directly on the returned object.

Parameters

tagName

The name of the element type to instantiate. For XML, this is case-sensitive. For HTML, the tagName parameter may be provided in any case, but it must be mapped to the canonical uppercase form by the DOM implementation.

Return Values: A new Element object.

Exceptions

DOMException

INVALID_NAME_ERR: Raised if an invalid name is specified.

createDocumentFragment

Create an empty DocumentFragment object.

Return Values: A new DocumentFragment.

This method has no parameters.
This method raises no exceptions.

createTextNode

Create a Text node given the specified string.

Parameters

data

The data for the node.

Return Values: The new Text object.

This method raises no exceptions.

createComment

Create a Comment node given the specified string.

Parameters

data

The data for the node.

Return Values: The new Comment object.

This method raises no exceptions.

createCDATASection

Create a CDATASection node whose value is the specified string.

Parameters

data

The data for the CDATASection contents.

Return Values: The new CDATASection object.

Exceptions

DOMException

NOT_SUPPORTED_ERR: Raised if this document is an HTML document.

createProcessingInstruction

Create a ProcessingInstruction node given the specified name and data strings.

Parameters

target		The target part of the processing instruction.
data		The data for the node.

Return Values: The new ProcessingInstruction object.

Exceptions

DOMException

INVALID_NAME_ERR: Raised if an invalid name is specified.

NOT_SUPPORTED_ERR: Raised if this document is an HTML document.

createAttribute

Create an Attribute of the given name. Note that the Attribute instance can then be set on an Element using the setAttributeNode method.

Parameters

name

The name of the attribute.

Return Values: A new Attribute object.

Exceptions

DOMException

INVALID_NAME_ERR: Raised if an invalid name is specified.

createEntityReference

Creates an EntityReference object.

Parameters

name

The name of the entity to reference.

Return Values: The new EntityReference object.

Exceptions

DOMException

NOT_SUPPORTED_ERR: Raised if this document is an HTML document.

getElementsByTagName

Returns a NodeList of all the Elements with a given tag name in the order in which they would be encountered in a preorder traversal of the Document tree.

Parameters

tagname

The name of the tag to match on. If the string "*" is given, this method returns all elements in the document.

Return Values: A new NodeList object containing all the Elements.

This method raises no exceptions.
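
The preorder ordering and the "*" wildcard can be sketched as follows, assuming Python's standard `xml.dom.minidom` module as the implementation:

```python
from xml.dom.minidom import parseString

doc = parseString("<a><b><c/></b><d><b/></d></a>")

# "*" matches every element; results follow a preorder traversal.
every = [e.tagName for e in doc.getElementsByTagName("*")]

# A specific name returns only the matching elements, in the same order.
only_b = doc.getElementsByTagName("b")
b_parents = [e.parentNode.nodeName for e in only_b]
```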

Interface Node

The Node object is the primary datatype for the entire Document Object Model. It represents a single node in the document tree. While all objects implementing the Node interface expose methods for dealing with children, not all objects implementing the Node interface may have children. For example, Text nodes may not have children, and adding children to such nodes results in a DOMException being raised.

The attributes nodeName, nodeValue and attributes are included as a mechanism to get at node information without casting down to the specific derived interface. In cases where there is no obvious mapping of these attributes for a specific nodeType (e.g., nodeValue for an Element or attributes for a Comment), these return null. Note that the specialized interfaces may contain additional and more convenient mechanisms to get and set the relevant information.

IDL Definition

interface Node {
  // NodeType
  const unsigned short      ELEMENT_NODE       = 1;
  const unsigned short      ATTRIBUTE_NODE     = 2;
  const unsigned short      TEXT_NODE          = 3;
  const unsigned short      CDATA_SECTION_NODE = 4;
  const unsigned short      ENTITY_REFERENCE_NODE = 5;
  const unsigned short      ENTITY_NODE        = 6;
  const unsigned short      PROCESSING_INSTRUCTION_NODE = 7;
  const unsigned short      COMMENT_NODE       = 8;
  const unsigned short      DOCUMENT_NODE      = 9;
  const unsigned short      DOCUMENT_TYPE_NODE = 10;
  const unsigned short      DOCUMENT_FRAGMENT_NODE = 11;
  const unsigned short      NOTATION_NODE      = 12;

  readonly attribute  wstring              nodeName;
           attribute  wstring              nodeValue;
  readonly attribute  unsigned short       nodeType;
  readonly attribute  Node                 parentNode;
  readonly attribute  NodeList             childNodes;
  readonly attribute  Node                 firstChild;
  readonly attribute  Node                 lastChild;
  readonly attribute  Node                 previousSibling;
  readonly attribute  Node                 nextSibling;
  readonly attribute  NamedNodeMap         attributes;
  readonly attribute  Document             ownerDocument;
  Node                      insertBefore(in Node newChild, 
                                         in Node refChild)
                                         raises(DOMException);
  Node                      replaceChild(in Node newChild, 
                                         in Node oldChild)
                                         raises(DOMException);
  Node                      removeChild(in Node oldChild)
                                        raises(DOMException);
  Node                      appendChild(in Node newChild)
                                        raises(DOMException);
  boolean                   hasChildNodes();
  Node                      cloneNode(in boolean deep);
};

Definition group NodeType

An integer indicating which type of node this is.

Defined Constants

ELEMENT_NODE	The node is an `Element`.
ATTRIBUTE_NODE	The node is an `Attribute`.
TEXT_NODE	The node is a `Text` node.
CDATA_SECTION_NODE	The node is a `CDATASection`.
ENTITY_REFERENCE_NODE	The node is an `EntityReference`.
ENTITY_NODE	The node is an `Entity`.
PROCESSING_INSTRUCTION_NODE	The node is a `ProcessingInstruction`.
COMMENT_NODE	The node is a `Comment`.
DOCUMENT_NODE	The node is a `Document`.
DOCUMENT_TYPE_NODE	The node is a `DocumentType`.
DOCUMENT_FRAGMENT_NODE	The node is a `DocumentFragment`.
NOTATION_NODE	The node is a `Notation`.

The values of nodeName, nodeValue, and attributes vary according to the node type as follows:

Node type	nodeName	nodeValue	attributes
Element	tagName	null	NamedNodeMap
Attribute	name of attribute	value of attribute	null
Text	#text	content of the text node	null
CDATASection	#cdata-section	content of the CDATA Section	null
EntityReference	name of entity referenced	null	null
Entity	entity name	null	null
ProcessingInstruction	target	entire content excluding the target	null
Comment	#comment	content of the comment	null
Document	#document	null	null
DocumentType	document type name	null	null
DocumentFragment	#document-fragment	null	null
Notation	notation name	null	null
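
A few rows of the table above can be checked with the following sketch, which assumes Python's standard `xml.dom.minidom` module; the null value of the IDL appears as `None` in that binding.

```python
from xml.dom.minidom import parseString

doc = parseString("<root><?target some data?>text<!--note--></root>")
root = doc.documentElement
pi, text, comment = root.childNodes

# (nodeName, nodeValue) pairs for five node types.
pairs = [(n.nodeName, n.nodeValue) for n in (doc, root, pi, text, comment)]

# Only the Element exposes a NamedNodeMap; the others report null.
without_attributes = [n.attributes is None for n in (doc, pi, text, comment)]
element_attribute_count = root.attributes.length
```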

Attributes

nodeName: The name of the node depends on its type; see the table above.

nodeValue: The value of a node depends on its type; see the table above. On setting, a NO_MODIFICATION_ALLOWED_ERR DOMException is raised when the node is readonly. On retrieval, a WSTRING_SIZE_ERR DOMException is raised when it would return more characters than fit in a wstring variable on the implementation platform.

nodeType: A code representing the type of the underlying object, as defined above.

parentNode: The parent of the given Node instance. All nodes, except Document, DocumentFragment, and Attribute, may have a parent. However, if a node has just been created and not yet added to the tree, or if it has been removed from the tree, this is null.

childNodes: A NodeList object that enumerates all children of this node. If there are no children, this is a NodeList containing no nodes. The content of the returned NodeList is "live" in the sense that, for instance, changes to the children of the node object that it was created from are immediately reflected in the nodes returned by the NodeList accessors; it is not a static snapshot of the content of the Node. This is true for every NodeList, including the ones returned by the getElementsByTagName method.

firstChild: The first child of a node. If there is no such node, this returns null.

lastChild: The last child of a node. If there is no such node, this returns null.

previousSibling: The node immediately preceding the current node. If there is no such node, null is returned.

nextSibling: The node immediately following the current node. If there is no such node, this returns null.

attributes: Provides access to a NamedNodeMap containing the node's attributes (if it is an Element) or null otherwise.

ownerDocument: Provides access to the Document object associated with this Node. This is also the Document object used to create new Nodes. When the Node is a Document this is null.

Methods

insertBefore

Inserts a child node newChild before the existing child node refChild. If refChild is null, insert newChild at the end of the list of children.

If newChild is a DocumentFragment object, all of its children are inserted, in the same order, before refChild. If the newChild is already in the tree, it is first removed.

Parameters

newChild		The node to insert
refChild		The reference node, i.e., the node before which the new node must be inserted.

Return Values: The node being inserted.

Exceptions

DOMException

HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the newChild node.

WRONG_DOCUMENT_ERR: Raised if newChild was created from a different document than the one that created this node.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

NOT_FOUND_ERR: Raised if refChild is not a child of this node.
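
The three cases of insertBefore (insertion before a reference child, a null reference child, and a newChild that is already in the tree) can be sketched as follows, assuming Python's standard `xml.dom.minidom` module, where null is written `None`:

```python
from xml.dom.minidom import parseString

doc = parseString("<list><a/><c/></list>")
root = doc.documentElement
a, c = root.childNodes

# Insert a new node before an existing child; the new node is returned.
b = doc.createElement("b")
returned = root.insertBefore(b, c)

# A null refChild appends at the end of the child list.
root.insertBefore(doc.createElement("d"), None)

# A node already in the tree is removed first, then re-inserted.
root.insertBefore(c, a)

order = [n.nodeName for n in root.childNodes]
```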

replaceChild

Replaces the child node oldChild with newChild in the set of children of the given node, and returns the oldChild node. If the newChild is already in the tree, it is first removed.

Parameters

newChild		The new node to put in the child list.
oldChild		The node being replaced in the list.

Return Values: The node replaced.

Exceptions

DOMException

HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the newChild node.

WRONG_DOCUMENT_ERR: Raised if newChild was created from a different document than the one that created this node.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

NOT_FOUND_ERR: Raised if oldChild is not a child of this node.

removeChild

Removes the child node indicated by oldChild from the list of children and returns it.

Parameters

oldChild

The node being removed

Return Values: The node removed.

Exceptions

DOMException

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

NOT_FOUND_ERR: Raised if oldChild is not a child of this node.
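
The return values of replaceChild and removeChild can be sketched as follows, assuming Python's standard `xml.dom.minidom` module. In both cases the node that leaves the tree is handed back to the caller.

```python
from xml.dom.minidom import parseString

doc = parseString("<root><old/></root>")
root = doc.documentElement
old = root.firstChild
new = doc.createElement("new")

# replaceChild returns the node that was replaced.
replaced = root.replaceChild(new, old)
after_replace = root.toxml()

# removeChild returns the node that was removed.
removed = root.removeChild(new)
after_remove = root.hasChildNodes()
```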

appendChild

Adds a child node to the end of the list of children for this node. If the newChild is already in the tree, it is first removed.

Parameters

newChild

The node to add.

If it is a DocumentFragment object, the entire contents of the document fragment are moved into the child list of this node

Return Values: The node added.

Exceptions

DOMException

HIERARCHY_REQUEST_ERR: Raised if this node is of a type that does not allow children of the type of the newChild node.

WRONG_DOCUMENT_ERR: Raised if newChild was created from a different document than the one that created this node.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

hasChildNodes

This is a convenience method to allow easy determination of whether a Node has children or not.

Return Values: true if the node has any children, false if the node has no children.

This method has no parameters.
This method raises no exceptions.

cloneNode

Returns a duplicate of the node, i.e., serves as a generic copy constructor for Nodes. The duplicate node has no parent (parentNode returns null).

Cloning an Element copies all attributes and their values, including those generated by the XML processor to represent defaulted attributes, but this method does not copy any text it contains unless it is a deep clone, since the text is contained in a child Text node. Cloning any other type of node simply returns a copy of this node.

Parameters

deep

If true, recursively clone the subtree under the specified node; if false, clone only the node itself (and its attributes, if it is an Element).

Return Values: The duplicate node.

This method raises no exceptions.
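
The difference between a shallow and a deep clone can be sketched as follows, assuming Python's standard `xml.dom.minidom` module as the implementation:

```python
from xml.dom.minidom import parseString

doc = parseString('<p id="x">text<b/></p>')
p = doc.documentElement

shallow = p.cloneNode(False)   # the element and its attributes only
deep = p.cloneNode(True)       # the element, attributes and whole subtree

shallow_xml = shallow.toxml()
deep_xml = deep.toxml()
detached = shallow.parentNode is None and deep.parentNode is None
children_copied = deep.firstChild is not p.firstChild
```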

Interface NodeList

The NodeList interface provides the abstraction of an ordered collection of nodes, without defining or constraining how this collection is implemented.

The items in the NodeList are accessible via an integral index, starting from 0.

IDL Definition

interface NodeList {
  Node                      item(in unsigned long index);
  readonly attribute  unsigned long        length;
};

Methods

item

Returns the indexth item in the collection. If index is greater than or equal to the number of nodes in the list, null is returned.

Parameters

index

Index into the collection

Return Values: The node at the index position in the NodeList, or null if that is not a valid index.

This method raises no exceptions.

Attributes

length: The number of nodes in the NodeList instance. The range of valid child node indices is 0 to length-1 inclusive.
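
Enumeration with item and length, including the out-of-range case, can be sketched as follows, assuming Python's standard `xml.dom.minidom` module:

```python
from xml.dom.minidom import parseString

doc = parseString("<r><a/><b/><c/></r>")
nodes = doc.documentElement.childNodes

# Valid indices run from 0 to length-1 inclusive.
names = [nodes.item(i).nodeName for i in range(nodes.length)]

# An index equal to length is out of range; no exception is raised.
past_end = nodes.item(nodes.length)
```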

Interface NamedNodeMap

Objects implementing the NamedNodeMap interface are used to represent collections of nodes that can be accessed by name. Note that NamedNodeMap does not inherit from NodeList; NamedNodeMaps are not maintained in any particular order. Objects contained in an object implementing NamedNodeMap may also be accessed by an ordinal index, but this is simply to allow convenient enumeration of the contents of a NamedNodeMap, and does not imply that the DOM specifies an order to these Nodes. DOM implementations should, when possible, preserve the ordering of objects in a NamedNodeMap in case the author of the source document assigned some meaning to this ordering that is not defined in the DOM, XML or HTML specifications.

IDL Definition

interface NamedNodeMap {
  Node                      getNamedItem(in wstring name);
  Node                      setNamedItem(in Node arg)
                                         raises(DOMException);
  Node                      removeNamedItem(in wstring name)
                                            raises(DOMException);
  Node                      item(in unsigned long index);
  readonly attribute  unsigned long        length;
};

Methods

getNamedItem

Retrieves a node from the map by name.

Parameters

name

Name of a node to retrieve.

Return Values: A Node (of any type) with the specified name, or null if the specified name did not identify any node in the list.

This method raises no exceptions.

setNamedItem

Add a node to a NamedNodeMap using the nodeName attribute of the node.

As the nodeName attribute is used to derive the name which the node must be stored under, multiple nodes of certain types (those that have a "special" string value) cannot be stored as the names would clash. This is seen as preferable to allowing nodes to be aliased.

Parameters

arg

A node to store in a named node list. The node will later be accessible using the value of the nodeName attribute of the node. If a node with that name is already present in the list, it is replaced by the new one.

Return Values: If the new Node replaces an existing node with the same name the previously existing Node is returned, otherwise null is returned.

Exceptions

DOMException

WRONG_DOCUMENT_ERR: Raised if arg was created from a different document than the one that created the NamedNodeMap.

NO_MODIFICATION_ALLOWED_ERR: Raised if this NamedNodeMap is readonly.

INUSE_ATTRIBUTE_ERR: Raised if arg is an Attribute that is already an attribute of another Element object. The DOM user must explicitly clone Attribute nodes to re-use them in other elements.

removeNamedItem

Remove a node identified by its name. If the removed node is an Attribute with a default value it is immediately replaced.

Parameters

name

The name of a node to remove

Return Values: The node removed from the list or null if no node with such a name exists.

Exceptions

DOMException

NOT_FOUND_ERR: Raised if there is no node named name in the list.

item

Returns the indexth item in the collection. If index is greater than or equal to the number of nodes in the list, null is returned.

Parameters

index

Index into the collection

Return Values: The node at the index position in the NamedNodeMap, or null if that is not a valid index.

This method raises no exceptions.

Attributes

length: The number of nodes in the NamedNodeMap instance. The range of valid child node indices is 0 to length-1 inclusive.
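The NamedNodeMap methods above can be sketched as follows. This is a minimal example assuming Python's xml.dom.minidom as the DOM implementation (an assumption of the example, not something this specification mandates); the element and attribute names are hypothetical. Because the map is unordered, the names are sorted before use.

```python
import xml.dom
from xml.dom.minidom import parseString

# Hypothetical element with two attributes.
doc = parseString('<e a="1" b="2"/>')
attrs = doc.documentElement.attributes  # a NamedNodeMap

value_of_a = attrs.getNamedItem("a").value
missing = attrs.getNamedItem("zz")  # no such name: null (None)

# item() allows enumeration, but the order carries no meaning, so sort.
names = sorted(attrs.item(i).nodeName for i in range(attrs.length))

# Storing a node under a name not yet present returns null (None).
new_attr = doc.createAttribute("c")
new_attr.value = "3"
replaced = attrs.setNamedItem(new_attr)

removed = attrs.removeNamedItem("a")

# In this implementation, removing an unknown name raises NOT_FOUND_ERR.
try:
    attrs.removeNamedItem("zz")
    not_found_raised = False
except xml.dom.NotFoundErr:
    not_found_raised = True
```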

Interface CharacterData

The CharacterData interface extends Node with a set of attributes and methods for accessing character data in the DOM. This set is defined here rather than on each object that uses these attributes and methods for clarity. No DOM objects correspond directly to CharacterData, though Text and others do inherit the interface from it. All offsets in this interface start from 0.

IDL Definition

interface CharacterData : Node {
           attribute  wstring              data;
  readonly attribute  unsigned long        length;
  wstring                   substringData(in unsigned long offset, 
                                          in unsigned long count)
                                          raises(DOMException);
  void                      appendData(in wstring arg)
                                       raises(DOMException);
  void                      insertData(in unsigned long offset, 
                                       in wstring arg)
                                       raises(DOMException);
  void                      deleteData(in unsigned long offset, 
                                       in unsigned long count)
                                       raises(DOMException);
  void                      replaceData(in unsigned long offset, 
                                        in unsigned long count, 
                                        in wstring arg)
                                        raises(DOMException);
};

Attributes

data: This provides access to the character data of a node that implements this interface. The DOM implementation may not put arbitrary limits on the amount of data that may be stored in a CharacterData node. However, implementation limits may mean that the entirety of a node's data may not fit into a single wstring. An attempt to retrieve data that does not fit in a single wstring causes a WSTRING_SIZE_ERR DOMException to be raised. In such cases, the user may call substringData to retrieve the data in appropriately sized pieces. In addition, on setting, a NO_MODIFICATION_ALLOWED_ERR DOMException is raised when the node is readonly.

length: This provides access to the number of characters that are available through data and the substringData method below. This may have the value zero, i.e., CharacterData nodes may be empty.

Methods

substringData

Extracts a range of data from an object implementing this interface.

Parameters

offset		Start offset of substring to extract
count		The number of characters to extract.

Return Values: This method returns the specified substring. If the sum of offset and count exceeds the length, then all characters to the end of the data are returned.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data, or if the specified count is negative.

WSTRING_SIZE_ERR: Raised if the specified range of text does not fit into a wstring.

appendData

Append the string to the end of the character data in the object implementing this interface. Upon success, data provides access to the concatenation of data and the wstring specified.

Parameters

arg

The wstring to append.

Exceptions

DOMException

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.

insertData

Insert a string at the specified character offset.

Parameters

offset		The character offset at which to insert
arg		The `wstring` to insert

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.

deleteData

Remove a range of characters from the node. Upon success, data and length reflect the change.

Parameters

offset		The offset from which to remove characters.
count		The number of characters to delete. If the sum of `offset` and `count` exceeds `length` then all characters from `offset` to the end of the data are deleted.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data, or if the specified count is negative.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.

replaceData

Replace the characters starting at the specified character offset with the specified string.

Parameters

offset		The offset from which to start replacing.
count		The number of characters to replace. If the sum of `offset` and `count` exceeds `length`, then all characters to the end of the data are replaced (i.e., the effect is the same as a `deleteData` method call with the same range, followed by an `appendData` method invocation).
arg		The `wstring` with which the range must be replaced.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data, or if the specified count is negative.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.
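The CharacterData methods can be exercised as in the sketch below. It assumes Python's xml.dom.minidom as the implementation (chosen for this example only) and a hypothetical string; the comments show the value of data after each call. Edge cases at the very end of the data may differ between implementations, so the example stays away from them.

```python
import xml.dom
from xml.dom.minidom import Document

doc = Document()
text = doc.createTextNode("Hello world")  # hypothetical sample data

head = text.substringData(0, 5)     # "Hello"
tail = text.substringData(6, 100)   # offset + count exceeds length: "world"

text.appendData("!")                # data is now "Hello world!"
text.insertData(5, ",")             # data is now "Hello, world!"
text.deleteData(5, 1)               # data is now "Hello world!"
text.replaceData(6, 5, "DOM")       # data is now "Hello DOM!"

# An offset far beyond the end of the data raises INDEX_SIZE_ERR.
try:
    text.substringData(99, 1)
    index_error_raised = False
except xml.dom.IndexSizeErr:
    index_error_raised = True
```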

Interface Attribute

The Attribute interface represents an attribute in an Element object. Typically the allowable values for the attribute are defined in a document type definition.

DOM Attribute objects inherit the Node interface, but since they are not actually child nodes of the element they describe, the DOM does not consider them part of the document tree. Thus, the Node attributes parentNode, previousSibling, and nextSibling have a null value for Attribute objects. The DOM takes the view that attributes are properties of elements rather than having an identity separate from the elements they are associated with; this should make it more efficient to implement such features as default attributes associated with all elements of a given type. Furthermore, Attribute nodes may not be immediate children of a DocumentFragment. However, they can be associated with element nodes contained within a DocumentFragment. In short, users and implementors of the DOM need to be aware that Attribute nodes have some things in common with other objects inheriting the Node interface, but they also are quite distinct.

The attribute's effective value is determined as follows: if this attribute has been explicitly assigned any value, that value is the attribute's effective value; otherwise, if there is a declaration for this attribute, and that declaration includes a default value, then that default value is the attribute's effective value; otherwise, the attribute does not exist on this element in the structure model until it has been explicitly added. Note that the nodeValue attribute on the Attribute instance can also be used to retrieve the string version of the attribute's value(s).

In XML, the value of an attribute is represented by the child nodes of an Attribute node, since the value can contain entity references. Thus, attributes which contain entity references will have a child list containing both Text nodes and EntityReference nodes. In addition, because the attribute type may be unknown, there are no tokenised attribute values.

IDL Definition

interface Attribute : Node {
  readonly attribute  wstring              name;
  readonly attribute  boolean              specified;
           attribute  wstring              value;
};

Attributes

name: Returns the name of this attribute.

specified

If this attribute was explicitly given a value in the original document, this is true; otherwise, it is false. Note that the implementation is in charge of this attribute, not the user. If the user changes the value of the attribute (even if it ends up having the same value as the default value) then the specified flag is automatically flipped to true. To re-specify the attribute as the default value from the DTD, the user must delete the attribute, and then the implementation will make a new attribute available with specified == false and the default value (if one exists).

In summary:

If the attribute has an assigned value in the document, then specified is true, and the value is the assigned value.
If the attribute has no assigned value in the document and has a default value in the DTD, then specified is false, and the value is the default value in the DTD.
If the attribute has no assigned value in the document and has a value of #IMPLIED in the DTD, then the attribute does not appear in the structure model of the document.

value

When used to get the value of an Attribute, returns the value of the attribute as a string. Character and general entity references are replaced with their values in the returned string.

When used to set the value of an Attribute, creates a Text node with the unparsed contents of the string.
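A short sketch of the Attribute interface, assuming Python's xml.dom.minidom as the implementation and a hypothetical element. It shows that the tree-navigation attributes are null for an Attribute and that references in the source are replaced in the returned value.

```python
from xml.dom.minidom import parseString

# Hypothetical element whose attribute value contains an entity reference.
doc = parseString('<e title="x &amp; y"/>')
attr = doc.documentElement.getAttributeNode("title")

name = attr.name
value = attr.value            # the reference is replaced by its value
node_value = attr.nodeValue   # same string, via the generic Node attribute

# Attributes are not part of the document tree.
links = (attr.parentNode, attr.previousSibling, attr.nextSibling)

# Setting value stores the string as-is; it is not parsed as markup.
attr.value = "plain"
```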

Interface Element

By far the vast majority (apart from text) of objects that authors encounter when traversing a document are Element nodes. Assume the following XML document:

<elementExample id="demo">
  <subelement1/>
  <subelement2><subsubelement/></subelement2>
</elementExample>

When represented using DOM, the top node is "elementExample", which contains two child Element nodes, one for "subelement1" and one for "subelement2". "subelement1" contains no child nodes.
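The structure described above can be inspected as in this sketch, which assumes Python's xml.dom.minidom as the parser and DOM implementation. Note one assumption-dependent detail: this parser keeps the whitespace between tags as Text nodes, so the example filters the child list down to Element nodes.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

source = """<elementExample id="demo">
  <subelement1/>
  <subelement2><subsubelement/></subelement2>
</elementExample>"""

top = parseString(source).documentElement

# Keep only Element children; whitespace Text nodes are skipped.
child_elements = [n for n in top.childNodes if n.nodeType == Node.ELEMENT_NODE]
child_names = [e.tagName for e in child_elements]

# "subelement1" contains no child nodes.
first_is_empty = not child_elements[0].hasChildNodes()
```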

Elements may have attributes associated with them; since the Element interface inherits from Node, the generic Node interface method getAttributes may be used to retrieve the set of all attributes for an element. There are methods on the Element interface to retrieve either an Attribute object by name or directly an Attribute value by name. Attribute objects should be retrieved in XML, where attributes may contain entity references, meaning that their values may be a fairly complex sub-tree. On the other hand, in HTML, where all attributes have simple string values, methods to directly access an Attribute value can safely be used as a convenience.

IDL Definition

interface Element : Node {
  readonly attribute  wstring              tagName;
  wstring                   getAttribute(in wstring name);
  void                      setAttribute(in wstring name, 
                                         in wstring value)
                                         raises(DOMException);
  void                      removeAttribute(in wstring name)
                                            raises(DOMException);
  Attribute                 getAttributeNode(in wstring name);
  Attribute                 setAttributeNode(in Attribute newAttr)
                                             raises(DOMException);
  Attribute                 removeAttributeNode(in Attribute oldAttr)
                                                raises(DOMException);
  NodeList                  getElementsByTagName(in wstring name);
  void                      normalize();
};

Attributes

tagName

This attribute contains the string that is the element's name. For example, in:

<elementExample id="demo"> 
        ... 
</elementExample>

tagName has the value "elementExample". Note that this is case-preserving in XML, as are all of the operations of the DOM. The HTML DOM returns the tagName of an HTML element in the canonical uppercase form, regardless of the case in the source HTML document.

Methods

getAttribute

Retrieves an Attribute value by name.

Parameters

name

The name of the attribute to retrieve

Return Values: The Attribute value as a string, or the empty string if that attribute does not have a specified or defaulted value.

This method raises no exceptions.

setAttribute

Adds a new attribute. If an attribute with that name is already present in the element, its value is changed to be that of the value parameter. This value is a simple string; it is not parsed as it is being set. Any markup (such as syntax to be recognized as an entity reference) is therefore treated as literal text, and needs to be appropriately escaped by the implementation when it is written out. In order to assign an attribute value that contains entity references, the user must create an Attribute node plus any Text and EntityReference nodes, build the appropriate subtree, and use setAttributeNode to assign it as the value of an attribute.

Parameters

name		Name of an attribute
value		Value to set in string form

Exceptions

DOMException

INVALID_NAME_ERR: Raised if an invalid name is specified.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.

removeAttribute

Removes the Attribute with the specified name. If the removed Attribute has a default value it is immediately replaced.

Parameters

name

The name of the attribute to remove

Exceptions

DOMException

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

This method returns nothing.

getAttributeNode

Retrieves an attribute node by name.

Parameters

name

The name of the attribute to retrieve

Return Values: The attribute node with the specified attribute name or null if there is no such attribute.

This method raises no exceptions.

setAttributeNode

Adds a new attribute. If an attribute with that name is already present in the element, it is replaced by the new one.

Parameters

newAttr

The attribute node to add to the attribute list

Return Values: If the newAttr attribute replaces an existing attribute with the same name, the previously existing Attribute node is returned, otherwise null is returned.

Exceptions

DOMException

WRONG_DOCUMENT_ERR: Raised if newAttr was created from a different document than the one that created the element.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

INUSE_ATTRIBUTE_ERR: Raised if newAttr is already an attribute of another Element object. The DOM user must explicitly clone Attribute nodes to re-use them in other elements.

removeAttributeNode

Removes the specified attribute.

Parameters

oldAttr

The Attribute node to remove from the attribute list. If the removed Attribute has a default value it is immediately replaced.

Return Values: Returns the Attribute node that was removed.

Exceptions

DOMException

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.

NOT_FOUND_ERR: Raised if oldAttr is not an attribute of the element.
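The attribute methods of Element can be sketched as follows, assuming Python's xml.dom.minidom as the implementation and hypothetical element names. The example also shows the INUSE_ATTRIBUTE_ERR case and the explicit clone that the text above requires before an Attribute node is re-used on another element.

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString('<root><a id="x"/><b/></root>')
a = doc.getElementsByTagName("a")[0]
b = doc.getElementsByTagName("b")[0]

present = a.getAttribute("id")
absent = a.getAttribute("missing")   # no such attribute: empty string

# The value is stored as literal text and escaped when written out.
a.setAttribute("note", "R&D")
serialized = a.toxml()
a.removeAttribute("note")

# An Attribute node already in use on one element cannot be added to another.
id_attr = a.getAttributeNode("id")
try:
    b.setAttributeNode(id_attr)
    inuse_raised = False
except xml.dom.InuseAttributeErr:
    inuse_raised = True

# Cloning the Attribute first is allowed; nothing is replaced, so null (None).
previous = b.setAttributeNode(id_attr.cloneNode(True))
```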

getElementsByTagName

Returns a NodeList of all descendant elements with a given tag name, in the order in which they would be encountered in a preorder traversal of the Element tree.

Parameters

name

The name of the tag to match on. If the string "*" is given, this method returns all descendant elements of the starting element.

Return Values: This method returns a list of element nodes that have the specified tag name.

This method raises no exceptions.
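A brief sketch of the traversal order, assuming Python's xml.dom.minidom and a hypothetical document. The starting element itself is not included in the result, and the "*" name matches every descendant element.

```python
from xml.dom.minidom import parseString

# Hypothetical tree: <a> contains <b> (which contains <c>), then <c>, then <d>.
doc = parseString("<a><b><c/></b><c/><d/></a>")
root = doc.documentElement

# Preorder: each element appears before its own descendants.
all_names = [e.tagName for e in root.getElementsByTagName("*")]

# Matches at any depth are returned, not only direct children.
c_count = root.getElementsByTagName("c").length
```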

normalize: Puts all Text nodes in the full depth of the sub-tree underneath this Element into a "normal" form where only markup (e.g., tags, comments, processing instructions, CDATA sections, and entity references) separates Text nodes, i.e., there are no adjacent Text nodes. This can be used to ensure that the DOM view of a document is identical to how it would look if saved and re-loaded, and is useful when operations (such as XPointer lookups) that depend on a particular document tree structure are to be used.

This method has no parameters.
This method returns nothing.
This method raises no exceptions.
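The effect of normalize can be sketched as below, assuming Python's xml.dom.minidom as the implementation. The element name and strings are hypothetical; two adjacent Text nodes are created by hand and then merged.

```python
from xml.dom.minidom import Document

doc = Document()
para = doc.createElement("p")
doc.appendChild(para)

# Two adjacent Text nodes with no markup between them.
para.appendChild(doc.createTextNode("foo"))
para.appendChild(doc.createTextNode("bar"))
before = para.childNodes.length

para.normalize()
after = para.childNodes.length
merged = para.firstChild.data
```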

Interface Text

The Text interface represents the textual content (termed character data in XML) of an Element or Attribute. If there is no markup inside an element's content, the text is contained in a single object implementing the Text interface that is the child of the element. Any markup is parsed into child elements that are siblings of the text nodes on either side of it, and whose content is represented as text node children of the markup element.

When a document is first made available to the DOM, there is only one Text node for each block of text. Users may create adjacent Text nodes that represent the contents of a given element without any intervening markup, but should be aware that there is no way to represent the separations between these nodes in XML or HTML, so they will not (in general) persist between DOM editing sessions. The normalize() method on Element merges any such adjacent Text objects into a single node for each block of text; this is recommended before employing operations that depend on a particular document structure, such as navigation with XPointers.

IDL Definition

interface Text : CharacterData {
  Text                      splitText(in unsigned long offset)
                                      raises(DOMException);
};

Methods

splitText

Breaks a text node into two text nodes at the specified offset, keeping both in the tree as siblings.

Parameters

offset

The offset at which to split, starting from 0.

Return Values: This method returns the new text node containing all the content at and after the offset point. The original node contains all the content up to the offset point.

Exceptions

DOMException

INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data.

NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
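A minimal sketch of splitText, assuming Python's xml.dom.minidom as the implementation and a hypothetical paragraph element. After the call, both nodes remain in the tree as siblings.

```python
from xml.dom.minidom import parseString

doc = parseString("<p>HelloWorld</p>")
p = doc.documentElement
original = p.firstChild

# Split after the fifth character; the new node holds the remainder.
new_node = original.splitText(5)
```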

Interface Comment

This represents the content of a comment, i.e. all the characters between the starting '<!--' and ending '-->'. Note that this is the definition of a comment in XML, and, in practice, HTML, although some HTML tools may implement the full SGML comment structure.

IDL Definition

interface Comment : CharacterData {
};

Interface ProcessingInstruction

The ProcessingInstruction interface represents a "processing instruction", used in XML (and legal, though seldom supported, in HTML) as a way to keep processor-specific information in the text of the document. The content of the node is the entire content between the delimiters of the processing instruction.

IDL Definition

interface ProcessingInstruction : Node {
  readonly attribute  wstring              target;
           attribute  wstring              data;
};

Attributes

target: XML defines a target as the first token following the markup that begins the processing instruction. This attribute value is that name. For HTML, the value is null.

data: The content of the processing instruction. In HTML this is from the character immediately after the <? to the character immediately preceding the >. In XML this is from the first non white space character after the target to the character immediately preceding the ?>. On setting, a NO_MODIFICATION_ALLOWED_ERR DOMException is raised when the node is readonly.
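For the XML case, target and data can be read as in this sketch, which assumes Python's xml.dom.minidom as the parser and DOM implementation. The processing instruction shown is a hypothetical example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString('<?xml-stylesheet type="text/css" href="a.css"?><r/>')
pi = doc.firstChild  # the processing instruction precedes the root element

target = pi.target   # first token after "<?"
data = pi.data       # from the first non white space character after the target
```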

1.3. Extended Interfaces

The interfaces defined here form part of the DOM Level 1 Core specification, but objects that expose these interfaces will never be encountered in a DOM implementation that deals only with HTML. As such, HTML-only DOM implementations do not need to have objects that implement these interfaces.

Interface CDATASection

CDATA Sections are used to escape blocks of text containing characters that would otherwise be regarded as markup. The only delimiter that is recognised in a CDATA Section is the "]]>" string that ends the CDATA Section. CDATA Sections can not be nested. The primary purpose is for including material such as XML fragments, without needing to escape all the delimiters.

The data attribute (a wstring) of the Text node holds the text that is contained by the CDATA section. Note that this may contain characters that need to be escaped outside of CDATA sections.

The CDATA Section inherits the CharacterData interface through the Text interface. Adjacent CDATA Sections are not merged by use of the Element.normalize() method.

IDL Definition

interface CDATASection : Text {
};
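The behaviour described above can be sketched as follows, assuming Python's xml.dom.minidom as the implementation and hypothetical content. It contrasts how the same characters are written out from a CDATASection and from an ordinary Text node, and shows that normalize leaves adjacent CDATA sections separate.

```python
from xml.dom.minidom import Document

doc = Document()
root = doc.createElement("r")
doc.appendChild(root)

# Two adjacent CDATA sections holding characters that look like markup.
root.appendChild(doc.createCDATASection("a < b"))
root.appendChild(doc.createCDATASection("c & d"))

root.normalize()                       # does not merge CDATA sections
count_after = root.childNodes.length
serialized = root.toxml()

# Outside a CDATA section the same character must be escaped.
escaped_text = doc.createTextNode("a < b").toxml()
```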

Interface DocumentType

Each document has a (possibly null) attribute that contains a reference to a DocumentType object. The DocumentType class in the DOM Level 1 core provides an interface to the list of entities that are defined for the document, and little else, because the effect of namespaces and the various XML schema efforts on DTD representation is not clearly understood as of this writing.

IDL Definition

interface DocumentType : Node {
  readonly attribute  wstring              name;
  readonly attribute  NamedNodeMap         entities;
  readonly attribute  NamedNodeMap         notations;
};

Attributes

name: The name attribute is a wstring that holds the name of the DTD; i.e., the name immediately following the DOCTYPE keyword.

entities

This is a NamedNodeMap containing the general entities, both external and internal, declared in the DTD. For example in:

<!DOCTYPE ex SYSTEM "ex.dtd" [
  <!ENTITY foo "foo">
  <!ENTITY bar "bar">
  <!ENTITY % baz "baz">
]>
<ex/>

the interface provides access to foo and bar but not baz. All objects supporting the Node interface that are accessed through this attribute also support the Entity interface. For HTML, this is always null.

notations: This is a NamedNodeMap containing the notations declared in the DTD. Each node in this map also implements the Notation interface.
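The entities example can be reproduced approximately as below. This is a sketch assuming Python's xml.dom.minidom as the parser; the external identifier (SYSTEM "ex.dtd") is left out of the DOCTYPE here, since the example relies only on the internal subset and no such file is assumed to exist.

```python
from xml.dom.minidom import parseString

source = """<!DOCTYPE ex [
  <!ENTITY foo "foo">
  <!ENTITY bar "bar">
  <!ENTITY % baz "baz">
]>
<ex/>"""

doctype = parseString(source).doctype
entities = doctype.entities  # general entities only

# The map is unordered, so sort the names for comparison.
entity_names = sorted(entities.item(i).nodeName for i in range(entities.length))

# baz is a parameter entity, so it is not exposed through entities.
has_baz = entities.getNamedItem("baz") is not None
```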

Interface Notation

This interface represents a notation declared in the DTD. A notation either declares, by name, the format of an unparsed entity (see section 4.7 of the XML 1.0 specification), or is used for formal declaration of Processing Instruction targets (see section 2.6 of the XML 1.0 specification). The nodeName attribute inherited from Node is set to the declared name of the notation.

IDL Definition

interface Notation : Node {
  readonly attribute  wstring              publicId;
  readonly attribute  wstring              systemId;
};

Attributes

publicId: The public identifier for the notation. If the public identifier was not specified, this is null.

systemId: The system identifier for the notation. If the system identifier was not specified, this is null.

Interface Entity

This interface represents an entity, either parsed or unparsed, in an XML document. Note that this models the entity itself not the entity declaration. Entity declaration modeling has been left for a later Level of the DOM specification.

An XML processor may choose to completely expand entities before the structure model is passed to the DOM; in this case there will be no entity references in the document tree.

The nodeName attribute that is inherited from Node contains the name of the entity.

The structure of the child list is exactly the same as the structure of the child list for an EntityReference with the same nodeName value.

Level 1 of the DOM API does not support editing Entity declarations; if a user wants to make changes to the contents of an Entity, the EntityReference node has to be replaced in the structure model by a clone of the Entity's contents. All the nodes beneath the entity reference are readonly.

IDL Definition

interface Entity : Node {
  readonly attribute  wstring              publicId;
  readonly attribute  wstring              systemId;
  readonly attribute  wstring              notationName;
};

Attributes

publicId: The public identifier associated with the entity, if specified. If the public identifier was not specified, this is null.

systemId: The system identifier associated with the entity, if specified. If the system identifier was not specified, this is null.

notationName: For unparsed entities, the name of the notation for the entity. For parsed entities, this is null.
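The attributes of Entity and Notation can be inspected as in this sketch, assuming Python's xml.dom.minidom as the parser. The notation, entity names, and identifiers are hypothetical; pic is an unparsed entity and foo is a parsed (internal) one.

```python
from xml.dom.minidom import parseString

source = """<!DOCTYPE ex [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY pic SYSTEM "pic.gif" NDATA gif>
  <!ENTITY foo "foo">
]>
<ex/>"""

doctype = parseString(source).doctype
pic = doctype.entities.getNamedItem("pic")    # unparsed entity
foo = doctype.entities.getNamedItem("foo")    # parsed entity
gif = doctype.notations.getNamedItem("gif")   # notation

# Unparsed entities name their notation; parsed entities report null (None).
notation_names = (pic.notationName, foo.notationName)

# Identifiers that were not specified are null (None).
pic_ids = (pic.publicId, pic.systemId)
gif_ids = (gif.publicId, gif.systemId)
```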

Interface EntityReference

EntityReference objects may be inserted into the structure model when an entity reference is in the source document, or when the user wishes to insert an entity reference. Note that character entities are considered to be expanded by the HTML or XML processor so that characters are represented by their Unicode equivalent rather than by an entity reference. The replacement value of the referenced Entity, if available, appears in the child list of the EntityReference object. Alternatively, the XML processor may completely expand references to entities while building the structure model, instead of providing EntityReference objects.

XML does not mandate that a non-validating XML processor read and process entity declarations made in the external subset or declared in external parameter entities. This means that parsed entities declared in the external subset need not be expanded by some classes of applications, and that the replacement value of the entity may not be available.

The resolution of the children of the EntityReference (the replacement value of the referenced Entity) may be lazily evaluated; actions by the user (such as calling the childNodes method on the EntityReference Node) are assumed to trigger the evaluation.

IDL Definition

interface EntityReference : Node {
};

Exception code	Raised
INDEX_SIZE_ERR	If index or size is negative, or greater than the allowed value
WSTRING_SIZE_ERR	If the specified range of text does not fit into a wstring
HIERARCHY_REQUEST_ERR	If any node is inserted somewhere it doesn't belong
WRONG_DOCUMENT_ERR	If a node is used in a different document than the one that created it (that doesn't support it)
INVALID_NAME_ERR	If an invalid name is specified
NO_DATA_ALLOWED_ERR	If data is specified for a node which does not support data
NO_MODIFICATION_ALLOWED_ERR	If an attempt is made to modify an object where modifications are not allowed
NOT_FOUND_ERR	If an attempt was made to reference a node in a context where it does not exist
NOT_SUPPORTED_ERR	If the implementation does not support the type of object requested
INUSE_ATTRIBUTE_ERR	If an attempt is made to add an attribute that is already in use elsewhere

Interface	nodeName	nodeValue	attributes
Element	tagName	null	NamedNodeMap
Attribute	name of attribute	value of attribute	null
Text	#text	content of the text node	null
CDATASection	#cdata-section	content of the CDATA Section	null
EntityReference	name of entity referenced	null	null
Entity	entity name	null	null
ProcessingInstruction	target	entire content excluding the target	null
Comment	#comment	content of the comment	null
Document	#document	null	null
DocumentType	document type name	null	null
DocumentFragment	#document-fragment	null	null
Notation	notation name	null	null

1.1.5. The `wstring` type