This section defines a minimal set of objects and interfaces for accessing and manipulating document objects. The functionality specified in this section (the Core functionality) should be sufficient to allow software developers and web script authors to access and manipulate parsed HTML and XML content inside conforming products. The DOM Core API also allows population of a Document object using only DOM API calls; creating the skeleton Document and saving it persistently is left to the product that implements the DOM API.
The DOM presents documents as a hierarchy of "Node" objects that also implement other, more specialized interfaces. Some types of nodes may have child nodes of various types, and others are leaf nodes that cannot have anything below them in the document structure. The node types, and which node types they may have as children, are as follows:
The DOM also specifies a "NodeList" interface to handle
ordered lists of Nodes, such as the children of a Node, or the
elements returned by the Element:getElementsByTagName method,
and also a NamedNodeMap interface to handle unordered sets
of Nodes referenced by their name attribute, such as the
Attributes of an Element. NodeLists and NamedNodeMaps
in the DOM are "live", that is, changes to the
underlying document structure are reflected in all relevant
NodeLists and NamedNodeMaps. For example, if a DOM user gets a NodeList object
containing the children of an Element, then subsequently
adds more children to that element (or removes children, or
modifies them), those changes are automatically reflected in
the NodeList without further action on the user's part. Likewise
changes to a Node in the tree are reflected in all references to
that Node in NodeLists and NamedNodeMaps.
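The "live" behaviour described above can be sketched with Python's standard-library xml.dom.minidom module. That module is not mentioned in this specification and is assumed here purely as an example implementation; the sample document and variable names are illustrative. The sketch relies only on childNodes, since an implementation's other collections may or may not be live.

```python
from xml.dom.minidom import parseString

# Assumed sample document; any small tree would do.
doc = parseString("<list><item/><item/></list>")
root = doc.documentElement

children = root.childNodes        # obtained once, never re-fetched
before = children.length

root.appendChild(doc.createElement("item"))
after_append = children.length    # the same object now sees the new child

root.removeChild(root.firstChild)
after_remove = children.length    # and the removal as well
```

The point of the sketch is that `children` is never reassigned, yet its length tracks the tree.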
Most of the APIs defined by this specification are interfaces rather than classes. That means that an actual implementation need only expose methods with the defined names and specified operation, not actually implement classes that correspond directly to the interfaces. This allows the DOM APIs to be implemented as a thin veneer on top of legacy applications with their own data structures, or on top of newer applications with different class hierarchies. This also means that ordinary constructors (in the Java or C++ sense) cannot be used to create DOM objects, since the underlying objects to be constructed may have little relationship to the DOM interfaces. The conventional solution to this in object-oriented design is to define factory methods that create instances of objects that implement the various interfaces. In the DOM Level 1, objects implementing some interface "X" are created by a "createX()" method on the Document interface; this is because all DOM objects live in the context of a specific Document.
The DOM Level 1 API does not define a standard way to create DOMImplementation or Document objects; actual DOM implementations must provide some proprietary way of bootstrapping these DOM interfaces, and then all other objects can be built from the Create methods on Document (or by various other convenience methods).
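As a rough illustration of the factory-method pattern, the sketch below uses Python's xml.dom.minidom (an assumption of this example, not something the specification names). The call to parseString stands in for the implementation-specific bootstrapping step; everything else is built through create methods on the Document. The toxml call is a minidom convenience, not a DOM Level 1 method.

```python
from xml.dom.minidom import parseString

# Implementation-specific bootstrap: Level 1 defines no standard way to
# obtain the first Document, so minidom's parser is used here.
doc = parseString("<root/>")

# All further nodes come from factory methods on the Document.
para = doc.createElement("para")
text = doc.createTextNode("hello")
para.appendChild(text)
doc.documentElement.appendChild(para)

owner_is_doc = para.ownerDocument is doc
serialized = doc.documentElement.toxml()   # minidom-only helper
```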
The Core DOM APIs are designed to be compatible with a wide
range of languages, including both general-user scripting languages and
the more challenging languages used mostly by professional programmers.
Thus, the DOM
APIs need to operate across a variety of memory management
philosophies, from language platforms that do not expose memory
management to the user at all, through those (notably Java) that
provide explicit constructors but provide an automatic garbage
collection mechanism to automatically reclaim unused memory,
to those (especially C/C++) that generally require the
programmer to explicitly allocate object memory, track where
it is used, and explicitly free it for re-use. To ensure a
consistent API across these platforms, the DOM does not
address memory management issues at all,
but instead leaves these for the
implementation. Neither of the explicit language bindings
devised by the DOM Working Group (for ECMAScript and Java)
requires any memory management methods, but DOM bindings for
other languages (especially C or C++) probably will require
such support. These extensions will be the responsibility of
those adapting the DOM API to a specific language, not the DOM
WG.
While it would
be nice to have attribute and method names that are short,
informative, internally consistent, and familiar to users of
similar APIs, the names also should not clash with the names
in legacy APIs supported by DOM implementations.
Furthermore, both OMG IDL and ECMAScript have
significant limitations in their ability to disambiguate names
from different namespaces, which makes it difficult to avoid naming
conflicts with short, familiar names. So, DOM names tend to be
long and quite descriptive in order to be unique across all
environments.
The Working Group has also attempted to be internally
consistent in its use of various terms, even though these may
not be common distinctions in other APIs. For example, we use
the method name "remove" when the method changes the
structural model, and the method name "delete" when the method
gets rid of something inside the structure model. The thing
that is deleted is not returned. The thing that is removed may
be returned, when it makes sense to return it.
The DOM Core APIs present two somewhat different sets of
interfaces to an XML/HTML document: one presenting an "object
oriented" approach with a hierarchy of inheritance, and one presenting a
"simplified" view that allows all manipulation to be done via
the Node interface without requiring casts (in
Java and other C-like languages) or query interface calls in
COM environments. These operations are fairly expensive in Java and
COM, and the DOM may be used in performance-critical
environments, so we allow significant functionality using just the
Node interface. Because many other users will find the
inheritance hierarchy easier to understand than the
"everything is a Node" approach to the DOM, we also support
the full higher-level interfaces for those who prefer a more
object-oriented API.
In practice, this means that there is a certain amount of redundancy in the API. The Working Group considers the "inheritance" approach the primary view of the API, and the full set of functionality on Node to be "extra" functionality that users may employ, but that does not eliminate the need for methods on other interfaces that an object-oriented analysis would dictate. (Of course, when the O-O analysis yields an attribute or method that is identical to one on the Node interface, we don't specify a completely redundant one.) Thus, even though there is a generic nodeName attribute on the Node interface, there is still a tagName attribute on the Element interface; these two attributes must contain the same value, but the Working Group considers it worthwhile to support both, given the different constituencies the DOM API must satisfy.
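The two views can be compared side by side in a short sketch. It assumes Python's xml.dom.minidom as the implementation and an invented one-element document; neither comes from this specification.

```python
from xml.dom.minidom import parseString

doc = parseString("<para>Hello</para>")   # assumed sample document
element = doc.documentElement

# "Everything is a Node" view: no cast needed, dispatch on nodeType.
is_element = element.nodeType == element.ELEMENT_NODE
generic_name = element.nodeName

# Inheritance view: the Element-specific attribute.
specific_name = element.tagName

same_value = generic_name == specific_name
```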
The wstring type
To ensure interoperability, the DOM specifies the wstring type as follows:
A wstring is a sequence of 16-bit quantities. This may be expressed in IDL terms as: typedef sequence<unsigned short> wstring;
Applications must encode wstring using UTF-16 (defined in Appendix C.3 of [UNICODE] and Amendment 1 of [ISO-10646]). The UTF-16 encoding was chosen because of its widespread use in industry. Please note that for both HTML and XML, the document character set (and therefore the notation of numeric character references) is based on UCS-4. A single numeric character reference in a source document may therefore in some cases correspond to two array positions in a wstring (a high surrogate and a low surrogate).
Note: Even though the DOM defines the name of the string type to be wstring, bindings may use different names. For example, for Java, wstring is bound to the String type because it also uses UTF-16 as its encoding.
Note: As of August 1998, the OMG IDL specification included a wstring type. However, that definition did not meet the interoperability criteria of the DOM API since it relied on encoding negotiation to decide the width of a character.
The DOM has many interfaces that imply string matching.
HTML processors generally assume an uppercase (less often,
lowercase) normalization of names for such things as
elements, while XML is explicitly case sensitive. For the
purposes of the DOM, string matching takes place on a character
code by character code basis, on the 16-bit value of a wstring. As such, the DOM assumes that any
normalizations will take place in the processor,
before the DOM structures are built.
This then raises the issue of exactly what normalizations
occur. The W3C I18N working group is in the process of
defining exactly which normalizations are necessary for applications
implementing the DOM.
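To make the 16-bit view of strings concrete, the following sketch splits a Python string into UTF-16 code units and compares two strings unit by unit, with no case normalization. The helper names utf16_units and dom_equal are hypothetical and exist only for this illustration.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def dom_equal(left, right):
    """Compare two strings code unit by code unit, without normalization."""
    return utf16_units(left) == utf16_units(right)


# U+1D11E lies outside the 16-bit range, so it needs a surrogate pair.
clef_units = utf16_units("\U0001D11E")

# Matching is exact: differing case means differing code units.
case_differs = not dom_equal("BODY", "body")
```

A single character reference such as `&#x1D11E;` therefore occupies two positions in the string, which is the situation the text above warns about.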
The interfaces within this section are considered fundamental, and must be fully implemented by all conformant implementations of the DOM, including all HTML DOM implementations.
An integer indicating the type of error generated.
Enumerator Values |
INDEX_SIZE_ERR | If index or size is negative, or greater than the allowed value |
WSTRING_SIZE_ERR | If the specified range of text does not fit into a wstring |
HIERARCHY_REQUEST_ERR | If any node is inserted somewhere it doesn't belong |
WRONG_DOCUMENT_ERR | If a node is used in a different document than the one that created it (that doesn't support it) |
INVALID_NAME_ERR | If an invalid name is specified |
NO_DATA_ALLOWED_ERR | If data is specified for a node which does not support data |
NO_MODIFICATION_ALLOWED_ERR | If an attempt is made to modify an object where modifications are not allowed |
NOT_FOUND_ERR | If an attempt was made to reference a node in a context where it does not exist |
NOT_SUPPORTED_ERR | If the implementation does not support the type of object requested |
INUSE_ATTRIBUTE_ERR | If an attempt is made to add an attribute that is already in use elsewhere |
DOM operations only raise exceptions in "exceptional" circumstances, i.e., when an operation is impossible to perform (either for logical reasons, because data is lost, or because the implementation has become unstable). In general, DOM methods return specific error values in ordinary processing situations, such as out-of-bound errors when using NodeList.
Implementations may raise other exceptions under other circumstances. For example, implementations may raise an implementation-dependent exception if a null argument is passed.
Some languages and object systems do not support the concept of exceptions. For such systems, error conditions may be indicated using native error reporting mechanisms. For some bindings, for example, methods may return error codes similar to those listed in the corresponding method descriptions.
exception DOMException { ExceptionCode code; };
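A binding that does support exceptions might surface the code as in the sketch below. It assumes Python's xml.dom package, whose DOMException class and HIERARCHY_REQUEST_ERR constant belong to that binding rather than to this specification's text; the sample document is invented.

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString("<root>text</root>")   # assumed sample document
text = doc.documentElement.firstChild    # a Text node, which is a leaf

try:
    # Text nodes cannot have children, so this is expected to fail.
    text.appendChild(doc.createElement("child"))
    code = None
except xml.dom.DOMException as exc:
    code = exc.code                      # integer ExceptionCode
```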
The DOMImplementation interface provides a number of methods for performing operations that are independent of any particular instance of the document object model.
The DOM Level 1 does not specify a way of creating a document instance, and hence document creation is an operation specific to an implementation. Future Levels of the DOM specification are expected to provide methods for creating documents directly.
interface DOMImplementation { boolean hasFeature(in wstring feature, in wstring version); };
hasFeature
Test if the DOM implementation implements a specific feature.
feature | The package name of the feature to test. In Level 1, the legal values are "HTML" and "XML" (case-insensitive). |
version | This is the version number of the package name to test. In Level 1, this is the string "1.0". If the version is not specified, supporting any version of the feature will cause the method to return true. |
true if the feature is implemented in the specified version, false otherwise.
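A feature test might look like the following sketch, which assumes Python's xml.dom.minidom and its getDOMImplementation helper (a bootstrap function of that library, not of the DOM). Whether "HTML" is reported as supported depends on the implementation, so the sketch records the answer without assuming it.

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()            # library-specific bootstrap

supports_xml = impl.hasFeature("XML", "1.0")
supports_xml_lower = impl.hasFeature("xml", "1.0")   # names are case-insensitive
supports_html = impl.hasFeature("HTML", "1.0")       # implementation-dependent
```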
DocumentFragment is a "lightweight" or "minimal" Document object. It is very common to want to be able to extract a portion of a document's tree or to create a new fragment of a document. Imagine implementing a user command like cut or rearranging a document by moving fragments around. It is desirable to have an object which can hold such fragments and it is quite natural to use a Node for this purpose. While it is true that a Document object could fulfil this role, a Document object can potentially be a heavyweight object, depending on the underlying implementation. What is really needed for this is a very lightweight object. DocumentFragment is such an object.
Furthermore, various operations -- such as inserting nodes as children of another Node -- may take DocumentFragment objects as arguments; this results in all the child nodes of the DocumentFragment being moved to the child list of this node.
The children of a DocumentFragment node are zero or more nodes representing the tops of any sub-trees defining the structure of the document. DocumentFragment nodes do not need to be well-formed XML documents (although they do need to follow the rules imposed upon well-formed XML parsed entities, which can have multiple top nodes). For example, a DocumentFragment might have only one child and that child node could be a Text node. Such a structure model represents neither an HTML document nor a well-formed XML document.
When a DocumentFragment is inserted into a Document (or indeed any other Node that may take children) the children of the DocumentFragment and not the DocumentFragment itself are inserted into the Node. This makes the DocumentFragment very useful when the user wishes to create nodes that are siblings; the DocumentFragment acts as the parent of these nodes so that the user can use the standard methods from the Node interface, such as insertBefore() and appendChild().
interface DocumentFragment : Node { };
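The sibling-building use of DocumentFragment can be sketched as follows, assuming Python's xml.dom.minidom as the implementation. The element names are invented for the example, and toxml is a minidom helper rather than a DOM Level 1 method.

```python
from xml.dom.minidom import parseString

doc = parseString("<ul/>")               # assumed sample document
fragment = doc.createDocumentFragment()

# Build three sibling elements under the temporary fragment parent.
for label in ("a", "b", "c"):
    item = doc.createElement("li")
    item.appendChild(doc.createTextNode(label))
    fragment.appendChild(item)

# Inserting the fragment moves its children, not the fragment itself.
doc.documentElement.appendChild(fragment)

serialized = doc.documentElement.toxml()
fragment_is_empty = not fragment.hasChildNodes()
```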
The Document interface represents the entire HTML or XML document. Conceptually, it is the root of the document tree, and provides the primary access to the document's data.
Since elements, text nodes, comments, processing instructions, etc. cannot exist outside the context of a Document, the Document interface also contains the factory methods needed to create these objects. The Node objects created have an ownerDocument attribute which associates them with the Document within whose context they were created.
interface Document : Node {
  readonly attribute DocumentType doctype;
  readonly attribute DOMImplementation implementation;
  readonly attribute Element documentElement;
  Element createElement(in wstring tagName) raises(DOMException);
  DocumentFragment createDocumentFragment();
  Text createTextNode(in wstring data);
  Comment createComment(in wstring data);
  CDATASection createCDATASection(in wstring data) raises(DOMException);
  ProcessingInstruction createProcessingInstruction(in wstring target, in wstring data) raises(DOMException);
  Attribute createAttribute(in wstring name) raises(DOMException);
  EntityReference createEntityReference(in wstring name) raises(DOMException);
  NodeList getElementsByTagName(in wstring tagname);
};
doctype
For XML, this provides access to the Document Type
Definition (see DocumentType) associated with
this XML document. For HTML documents and XML documents
without a document type definition this returns
null
.
implementation
A DOM application may use objects from multiple
implementations. This provides access to the
DOMImplementation
object that handles this
document.
documentElement
This is a convenience attribute that allows direct access to the child node that is the root element of the document. For HTML documents, this is the element with the tagName "HTML".
createElement
Create an element of the type specified. Note that the instance returned implements the Element interface, so attributes can be specified directly on the returned object.
tagName |
The name of the element type to
instantiate. For XML, this is case-sensitive. For HTML, the
|
A new Element
object.
INVALID_NAME_ERR: Raised if an invalid name is specified.
createDocumentFragment
Create an empty DocumentFragment
object.
A new DocumentFragment
.
createTextNode
Create a Text
node given the specified
string.
data |
The data for the node. |
The new Text
object.
createComment
Create a Comment
node given the specified
string.
data |
The data for the node. |
The new Comment
object.
createCDATASection
Create a CDATASection
node whose value is
the specified string.
data |
The data for the |
The new CDATASection
object.
NOT_SUPPORTED_ERR: Raised if this document is an HTML document.
createProcessingInstruction
Create a ProcessingInstruction
node given
the specified name and data strings.
target |
The target part of the processing instruction. | |
data |
The data for the node. |
The new ProcessingInstruction
object.
INVALID_NAME_ERR: Raised if an invalid name is specified.
NOT_SUPPORTED_ERR: Raised if this document is an HTML document.
createAttribute
Create an Attribute
of the given name.
Note that the Attribute
instance
can then be set on an Element
using the
setAttribute
method.
name |
The name of the attribute. |
A new Attribute
object.
INVALID_NAME_ERR: Raised if an invalid name is specified.
createEntityReference
Creates an EntityReference object.
name |
The name of the entity to reference. |
The new EntityReference
object.
NOT_SUPPORTED_ERR: Raised if this document is an HTML document.
getElementsByTagName
Returns a NodeList
of all the Elements
with a given tag name in the order in which they would be encountered
in a preorder traversal of the Document
tree.
tagname |
The name of the tag to match on. If the string "*" is given, this method returns all elements in the document |
A new NodeList
object containing
all the Elements
.
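The preorder ordering and the "*" wildcard can be illustrated with a small sketch. It assumes Python's xml.dom.minidom and an invented document; note that a particular implementation's result list may not be live even though the specification asks for it.

```python
from xml.dom.minidom import parseString

doc = parseString("<a><b><c/></b><d/><b/></a>")   # assumed sample document

# "*" matches every element; results follow a preorder traversal.
all_names = [element.tagName for element in doc.getElementsByTagName("*")]

# A specific tag name selects only matching elements.
b_count = doc.getElementsByTagName("b").length
```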
The Node object is the primary datatype for the entire Document Object Model. It represents a single node in the document tree. While all objects implementing the Node interface expose methods for dealing with children, not all objects implementing the Node interface may have children. For example, Text nodes may not have children, and adding children to such nodes results in a DOMException being raised.
The attributes nodeName, nodeValue and attributes are included as a mechanism to get at node information without casting down to the specific derived interface. In cases where there is no obvious mapping of these attributes for a specific nodeType (e.g., nodeValue for an Element or attributes for a Comment), this returns null. Note that the specialized interfaces may contain additional and more convenient mechanisms to get and set the relevant information.
interface Node {
  // NodeType
  const unsigned short ELEMENT_NODE = 1;
  const unsigned short ATTRIBUTE_NODE = 2;
  const unsigned short TEXT_NODE = 3;
  const unsigned short CDATA_SECTION_NODE = 4;
  const unsigned short ENTITY_REFERENCE_NODE = 5;
  const unsigned short ENTITY_NODE = 6;
  const unsigned short PROCESSING_INSTRUCTION_NODE = 7;
  const unsigned short COMMENT_NODE = 8;
  const unsigned short DOCUMENT_NODE = 9;
  const unsigned short DOCUMENT_TYPE_NODE = 10;
  const unsigned short DOCUMENT_FRAGMENT_NODE = 11;
  const unsigned short NOTATION_NODE = 12;

  readonly attribute wstring nodeName;
  attribute wstring nodeValue;
  readonly attribute unsigned short nodeType;
  readonly attribute Node parentNode;
  readonly attribute NodeList childNodes;
  readonly attribute Node firstChild;
  readonly attribute Node lastChild;
  readonly attribute Node previousSibling;
  readonly attribute Node nextSibling;
  readonly attribute NamedNodeMap attributes;
  readonly attribute Document ownerDocument;
  Node insertBefore(in Node newChild, in Node refChild) raises(DOMException);
  Node replaceChild(in Node newChild, in Node oldChild) raises(DOMException);
  Node removeChild(in Node oldChild) raises(DOMException);
  Node appendChild(in Node newChild) raises(DOMException);
  boolean hasChildNodes();
  Node cloneNode(in boolean deep);
};
An integer indicating which type of node this is.
ELEMENT_NODE | The node is an Element. |
ATTRIBUTE_NODE | The node is an Attribute. |
TEXT_NODE | The node is a Text node. |
CDATA_SECTION_NODE | The node is a CDATASection. |
ENTITY_REFERENCE_NODE | The node is an EntityReference. |
ENTITY_NODE | The node is an Entity. |
PROCESSING_INSTRUCTION_NODE | The node is a ProcessingInstruction. |
COMMENT_NODE | The node is a Comment. |
DOCUMENT_NODE | The node is a Document. |
DOCUMENT_TYPE_NODE | The node is a DocumentType. |
DOCUMENT_FRAGMENT_NODE | The node is a DocumentFragment. |
NOTATION_NODE | The node is a Notation. |
The values of nodeName
, nodeValue
,
and attributes
vary according to the node type as follows:
Node type | nodeName | nodeValue | attributes |
Element | tagName | null | NamedNodeMap |
Attribute | name of attribute | value of attribute | null |
Text | #text | content of the text node | null |
CDATASection | #cdata-section | content of the CDATA Section | null |
EntityReference | name of entity referenced | null | null |
Entity | entity name | null | null |
ProcessingInstruction | target | entire content excluding the target | null |
Comment | #comment | content of the comment | null |
Document | #document | null | null |
DocumentType | document type name | null | null |
DocumentFragment | #document-fragment | null | null |
Notation | notation name | null | null |
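A few rows of the table above can be checked against a real tree. The sketch assumes Python's xml.dom.minidom, where null appears as None, and an invented document containing a text node, a comment, and a processing instruction.

```python
from xml.dom.minidom import parseString

# Assumed sample document covering several node types.
doc = parseString('<root id="r">text<!--note--><?target data?></root>')
root = doc.documentElement

# (nodeName, nodeValue, attributes) for each child of the root element.
child_rows = [(n.nodeName, n.nodeValue, n.attributes) for n in root.childNodes]

element_row = (root.nodeName, root.nodeValue, root.attributes is not None)
document_row = (doc.nodeName, doc.nodeValue, doc.attributes)

attr = root.getAttributeNode("id")
attribute_row = (attr.nodeName, attr.nodeValue, attr.attributes)
```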
nodeName
The name of the node depends on its type; see the table above.
nodeValue
The value of a node depends on its type; see the table above. On setting, a NO_MODIFICATION_ALLOWED_ERR DOMException is raised when the node is readonly. On retrieval, a WSTRING_SIZE_ERR DOMException is raised when it would return more characters than fit in a wstring variable on the implementation platform.
nodeType
A code representing the type of the underlying object, as defined above.
parentNode
The parent of the given Node
instance. All nodes,
except Document
, DocumentFragment
, and
Attribute
may have a parent. However, if a
node has just been created and not yet added to the tree, or if it has
been removed from the tree, this is null
.
childNodes
A NodeList object that enumerates all children of this node. If there are no children, this is a NodeList containing no nodes. The content of the returned NodeList is "live" in the sense that, for instance, changes to the children of the node object that it was created from are immediately reflected in the nodes returned by the NodeList accessors; it is not a static snapshot of the content of the Node. This is true for every NodeList, including the ones returned by the getElementsByTagName method.
firstChild
The first child of a node. If there is no such
node, this returns null
.
lastChild
The last child of a node. If there is no such
node, this returns null
.
previousSibling
The node immediately preceding the current node. If there is no
such node, null
is returned.
nextSibling
The node immediately following the current node. If there is no
such node, this returns null
.
attributes
Provides access to a NamedNodeMap
containing the
node's attributes (if it is an Element
) or
null
otherwise.
ownerDocument
Provides access to the Document
object
associated with this Node
. This is also the
Document object used to create new Nodes. When the
Node
is a Document
this is null
.
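The navigation attributes above are enough to walk a whole tree using only the Node interface. The sketch below assumes Python's xml.dom.minidom; the walk generator and the sample document are hypothetical names introduced for illustration.

```python
from xml.dom.minidom import parseString


def walk(node, depth=0):
    """Yield (depth, nodeName) pairs in document order."""
    yield depth, node.nodeName
    child = node.firstChild
    while child is not None:          # null marks the end of the sibling chain
        yield from walk(child, depth + 1)
        child = child.nextSibling


doc = parseString("<a><b>t</b><c/></a>")   # assumed sample document
outline = list(walk(doc))

root = doc.documentElement
root_parent_is_doc = root.parentNode is doc
doc_has_no_owner = doc.ownerDocument is None
last_before = root.lastChild.previousSibling.nodeName
```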
insertBefore
Inserts a child node newChild
before the
existing child node refChild
. If
refChild
is null
, insert
newChild
at the end of the list of children.
If newChild
is a DocumentFragment
object, all of its children are inserted, in the same order, before
refChild
. If the newChild
is already in the
tree, it is first removed.
newChild |
The node to insert | |
refChild |
The reference node, i.e., the node before which the new node must be inserted. |
The node being inserted.
HIERARCHY_REQUEST_ERR: Raised if this node is of a type
that does not allow children of the type of
the newChild
node.
WRONG_DOCUMENT_ERR: Raised if newChild
was created from
a different document than the one that created this node.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
NOT_FOUND_ERR: Raised if refChild
is not a child
of this node.
replaceChild
Replaces the child node oldChild
with
newChild
in the set of children of the given
node, and returns the oldChild
node. If the
newChild
is already in the tree, it is first removed.
newChild |
The new node to put in the child list. | |
oldChild |
The node being replaced in the list. |
The node replaced.
HIERARCHY_REQUEST_ERR: Raised if this node is of a type
that does not allow children of the type of
the newChild
node.
WRONG_DOCUMENT_ERR: Raised if newChild
was created from
a different document than the one that created this node.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
NOT_FOUND_ERR: Raised if oldChild
is not a
child of this node.
removeChild
Removes the child node indicated by
oldChild
from the list of children and returns it.
oldChild |
The node being removed |
The node removed.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
NOT_FOUND_ERR: Raised if oldChild
is not a child
of this node.
appendChild
Adds a child node to the end of the list of children for
this node. If the newChild
is already in the tree, it is
first removed.
newChild |
The node to add. If it is a
|
The node added.
HIERARCHY_REQUEST_ERR: Raised if this node is of a type
that does not allow children of the type of
the newChild
node.
WRONG_DOCUMENT_ERR: Raised if newChild
was created from
a different document than the one that created this node.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
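The four child-manipulation methods can be exercised together in a short sketch. It assumes Python's xml.dom.minidom, with None standing in for null; the document and the child_names helper are invented for the example.

```python
from xml.dom.minidom import parseString

doc = parseString("<r><a/><b/><c/></r>")   # assumed sample document
r = doc.documentElement
a, b, c = list(r.childNodes)


def child_names(node):
    """Hypothetical helper: names of the node's children, in order."""
    return [child.nodeName for child in node.childNodes]


r.insertBefore(c, a)            # c is already in the tree, so it is moved
after_move = child_names(r)

r.insertBefore(a, None)         # a null refChild means "insert at the end"
after_null_ref = child_names(r)

replaced = r.replaceChild(doc.createElement("x"), b)   # returns the old child
removed = r.removeChild(c)                             # returns the removed child
final = child_names(r)
```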
hasChildNodes
This is a convenience method to allow easy determination
of whether a Node
has children or not.
true
if the node has any children,
false
if the node has no children.
cloneNode
Returns a duplicate of the node, i.e., serves as a generic copy constructor for Nodes. The duplicate node has no parent (parentNode returns null). Cloning an Element copies all attributes and their values, including those generated by the XML processor to represent defaulted attributes, but this method does not copy any text it contains unless it is a deep clone, since the text is contained in a child Text node. Cloning any other type of node simply returns a copy of this node.
deep |
If |
The duplicate node.
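The difference between a shallow and a deep clone can be seen in the following sketch, which assumes Python's xml.dom.minidom and an invented one-element document.

```python
from xml.dom.minidom import parseString

doc = parseString('<p class="x">hi</p>')   # assumed sample document
p = doc.documentElement

shallow = p.cloneNode(False)   # attributes copied, children not copied
deep = p.cloneNode(True)       # attributes and the whole subtree copied

shallow_class = shallow.getAttribute("class")
shallow_has_children = shallow.hasChildNodes()
deep_text = deep.firstChild.data
deep_is_detached = deep.parentNode is None
text_is_a_copy = deep.firstChild is not p.firstChild
```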
The NodeList
interface provides the abstraction of an
ordered collection of nodes, without defining or
constraining how this collection is implemented.
The items in the NodeList
are accessible via an
integral index, starting from 0.
interface NodeList { Node item(in unsigned long index); readonly attribute unsigned long length; };
item
Returns the index
th item in the collection.
If index
is greater than or equal to the number
of nodes in the list, null is returned.
index |
Index into the collection |
The node at the index
position in the
NodeList
, or null
if that is not a
valid index.
length
The number of nodes in the NodeList
instance.
The range of valid child node indices is 0 to
length-1
inclusive.
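Index-based access, including the null result for an out-of-range index, can be sketched as follows. Python's xml.dom.minidom is assumed as the implementation and the document is invented.

```python
from xml.dom.minidom import parseString

doc = parseString("<r><a/><b/><c/></r>")   # assumed sample document
nodes = doc.documentElement.childNodes

# Valid indices run from 0 to length-1 inclusive.
names = [nodes.item(i).nodeName for i in range(nodes.length)]

# An index equal to length is out of range and yields null (None).
past_end = nodes.item(nodes.length)
```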
Objects implementing the NamedNodeMap interface are used to represent collections of nodes that can be accessed by name. Note that NamedNodeMap does not inherit from NodeList; NamedNodeMaps are not maintained in any particular order. Objects contained in an object implementing NamedNodeMap may also be accessed by an ordinal index, but this is simply to allow convenient enumeration of the contents of a NamedNodeMap, and does not imply that the DOM specifies an order to these Nodes. DOM implementations should, when possible, preserve the ordering of objects in a NamedNodeMap in case the author of the source document assigned some meaning to this ordering that is not defined in the DOM, XML or HTML specifications.
interface NamedNodeMap {
  Node getNamedItem(in wstring name);
  Node setNamedItem(in Node arg) raises(DOMException);
  Node removeNamedItem(in wstring name) raises(DOMException);
  Node item(in unsigned long index);
  readonly attribute unsigned long length;
};
getNamedItem
Retrieves a node from a list by name
name |
Name of a node to retrieve. |
A Node
(of any type) with the specified
name, or null
if the specified name did not
identify any node in the list.
setNamedItem
Add a node to a NamedNodeMap
using the
nodeName
attribute of the node.
As the nodeName
attribute is used to
derive the name which the node must be stored under, multiple
nodes of certain types (those that have a "special" string
value) cannot be stored as the names would clash. This is seen
as preferable to allowing nodes to be aliased.
arg |
A node to store in a named node list. The node will
later be accessible using the value of the
|
If the new Node replaces an existing node with the same name the previously existing Node is returned, otherwise null is returned.
WRONG_DOCUMENT_ERR: Raised if arg
was created from a different
document than the one that created the NamedNodeMap
.
NO_MODIFICATION_ALLOWED_ERR: Raised if this
NamedNodeMap
is readonly.
INUSE_ATTRIBUTE_ERR: Raised if arg is an Attribute that is already an attribute of another Element object. The DOM user must explicitly clone Attribute nodes to re-use them in other elements.
removeNamedItem
Remove a node identified by its name. If the removed
node is an Attribute
with a default value it is immediately
replaced.
name |
The name of a node to remove |
The node removed from the list or null
if no node with such a name exists.
NOT_FOUND_ERR: Raised if there is no node named
name
in the list.
item
Returns the index
th item in the collection.
If index
is greater than or equal to the number
of nodes in the list, null
is returned.
index |
Index into the collection |
The node at the index
position in the
NamedNodeMap
, or null
if that is not a
valid index.
length
The number of nodes in the NamedNodeMap
instance.
The range of valid child node indices is 0 to
length-1
inclusive.
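The NamedNodeMap methods can be tried on an element's attributes, as in the sketch below. It assumes Python's xml.dom.minidom; the element and attribute names are invented. Because no ordering is guaranteed, the names are sorted before use. In this implementation, removing a name that is absent raises NotFoundErr.

```python
import xml.dom
from xml.dom.minidom import parseString

doc = parseString('<img src="a.png" alt="A"/>')   # assumed sample document
attrs = doc.documentElement.attributes

src = attrs.getNamedItem("src")
missing = attrs.getNamedItem("title")   # no such node: null (None)

# Enumerate by index; sort because the map is unordered.
names = sorted(attrs.item(i).nodeName for i in range(attrs.length))

title = doc.createAttribute("title")
title.value = "Picture"
previous = attrs.setNamedItem(title)    # nothing replaced, so None

removed = attrs.removeNamedItem("alt")
try:
    attrs.removeNamedItem("alt")        # already gone
    not_found_raised = False
except xml.dom.NotFoundErr:
    not_found_raised = True
```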
The CharacterData interface extends Node with a set of attributes and methods for accessing character data in the DOM. This set is defined here rather than on each object that uses these attributes and methods for clarity. No DOM objects correspond directly to CharacterData, though Text and others do inherit the interface from it. All offsets in this interface start from 0.
interface CharacterData : Node {
  attribute wstring data;
  readonly attribute unsigned long length;
  wstring substringData(in unsigned long offset, in unsigned long count) raises(DOMException);
  void appendData(in wstring arg) raises(DOMException);
  void insertData(in unsigned long offset, in wstring arg) raises(DOMException);
  void deleteData(in unsigned long offset, in unsigned long count) raises(DOMException);
  void replaceData(in unsigned long offset, in unsigned long count, in wstring arg) raises(DOMException);
};
data
This provides access to the character data of a node that implements this interface. The DOM implementation may not put arbitrary limits on the amount of data that may be stored in a CharacterData node. However, implementation limits may mean that the entirety of a node's data may not fit into a single wstring. Attempts to retrieve data that does not fit in a single wstring cause a WSTRING_SIZE_ERR DOMException to be raised. In such cases, the user may call substringData to retrieve the data in appropriately sized pieces. In addition, on setting, a NO_MODIFICATION_ALLOWED_ERR DOMException is raised when the node is readonly.
length
This provides access to the number of characters that
are available through data
and the
substringData
method below. This may have the value zero,
i.e., CharacterData
nodes may be empty.
substringData
Extracts a range of data from an object implementing this interface.
offset |
Start offset of substring to extract | |
count |
The number of characters to extract. |
This method returns the specified substring. If the sum of
offset
and count
exceeds the
length
, then all characters to the end of the data are
returned.
INDEX_SIZE_ERR: Raised if the specified offset is negative or greater than the number of characters in data, or if the specified count is negative.
WSTRING_SIZE_ERR: Raised if the specified range of text does
not fit into a wstring
.
appendData
Append the string to the end of the character data in the
object implementing this interface. Upon success,
data
provides access to the concatenation of
data
and the wstring
specified.
arg |
The |
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
insertData
Insert a string at the specified character offset.
offset | The character offset at which to insert.
arg | The wstring to insert.
INDEX_SIZE_ERR: Raised if the specified offset is negative or
greater than the number of characters in data
.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
deleteData
Remove a range of characters from the node. Upon success,
data
and length
reflect the change.
offset | The offset from which to remove characters.
count | The number of characters to delete. If the sum of offset and count exceeds length, then all characters from offset to the end of the data are deleted.
INDEX_SIZE_ERR: Raised if the specified offset is negative or
greater than the number of characters in data
, or if the
specified count
is negative.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
replaceData
Replace the characters starting at the specified character offset with the specified string.
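offset | The offset from which to start replacing.
count | The number of characters to replace. If the sum of offset and count exceeds length, then all characters to the end of the data are replaced.
arg | The wstring with which the range is to be replaced.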
INDEX_SIZE_ERR: Raised if the specified offset is negative or
greater than the number of characters in data
, or if the
specified count
is negative.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
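The four editing methods can be sketched together, again using Python's xml.dom.minidom as a stand-in implementation. The document content and the variable names are hypothetical, and minidom is assumed (not guaranteed) to agree with this draft for the offsets used here.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document with a single Text child.
doc = parseString("<p>Hello, world</p>")
text = doc.documentElement.firstChild

# appendData: the new string is added at the end of data.
text.appendData("!")
after_append = text.data

# insertData: the new string is placed at character offset 5.
text.insertData(5, " there")
after_insert = text.data

# deleteData: six characters are removed starting at offset 0.
text.deleteData(0, 6)
after_delete = text.data

# replaceData: five characters starting at offset 7 are replaced.
text.replaceData(7, 5, "DOM")
after_replace = text.data

# length reflects every change without further action.
final_length = text.length
```

Each call changes the node in place, so data and length always describe the current state of the node.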
The Attribute interface represents an attribute in an Element object. Typically the allowable values for the attribute are defined in a document type definition.
DOM Attribute
objects inherit the Node
interface, but since they are not actually child nodes of the element
they describe, the DOM does not consider them part of the document
tree. Thus, the Node
attributes parentNode
,
previousSibling
, and nextSibling
have a
null value for Attribute
objects. The DOM takes the
view that attributes are properties of elements rather than having an
identity separate from the elements they are associated with;
this should make it more efficient to implement
such features as default attributes associated with all elements of a
given type. Furthermore, Attribute
nodes may not be immediate children of a DocumentFragment
.
However, they can be associated with element nodes contained within
a DocumentFragment
.
In short, users and implementors of the DOM need to be aware that
Attribute
nodes have some things in
common with other objects inheriting the Node
interface,
but they also are quite distinct.
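A short sketch of this distinction, using Python's xml.dom.minidom (which names the interface Attr rather than Attribute and follows a later DOM level, so it serves only as an approximation of this draft). The sample element and variable names are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical sample element carrying one attribute and no children.
doc = parseString('<elementExample id="demo"/>')
element = doc.documentElement
attr = element.getAttributeNode("id")

# The attribute node is reachable from its element ...
name, value = attr.name, attr.value

# ... but it is not a child of that element, so the tree-navigation
# attributes inherited from Node are all null (None in Python).
links = (attr.parentNode, attr.previousSibling, attr.nextSibling)

# The element's child list does not contain the attribute either.
child_count = len(element.childNodes)
```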
The attribute's effective value is determined as follows: if this
attribute has been explicitly assigned any value, that value is the
attribute's effective value; otherwise, if there is a declaration for
this attribute, and that declaration includes a default value, then
that default value is the attribute's effective value; otherwise, the
attribute does not exist on this element in the structure model until
it has been explicitly added. Note that the nodeValue
attribute on the Attribute instance can also be used to
retrieve the string version of the attribute's value(s).
In XML, the value of an attribute is represented by the
child nodes of an Attribute
node, since the value can
contain entity references. Thus, attributes which contain
entity references will have a child list containing both Text
nodes and EntityReference
nodes. In addition, because
the attribute type may be unknown, there are no tokenised attribute
values.
interface Attribute : Node {
  readonly attribute wstring name;
  readonly attribute boolean specified;
  attribute wstring value;
};
name
Returns the name of this attribute.
specified
If this attribute was explicitly given a value in the
original document, this is true
; otherwise, it is
false
. Note that the implementation is in charge
of this attribute,
not the user. If the user changes the value of the attribute (even
if it ends up having the same value as the default value) then the
specified flag is automatically flipped to true
.
To re-specify the attribute as the default value from the DTD,
the user must delete the attribute, and then the implementation
will make a new attribute available with specified == false
and
the default value (if one exists).
In summary:
If specified is true, the value is the assigned value.
If specified is false, the value is the default value in the DTD.
value
When used to get the value of an attribute, returns the value of the attribute as a string. Character and general entity references are replaced with their values in the returned string.
When used to set the value of an attribute, creates a Text node with the unparsed contents of the string.
By far the vast majority of objects (apart from text)
that authors encounter when traversing a document
are Element
nodes.
Assume the following XML document:
<elementExample id="demo">
  <subelement1/>
  <subelement2><subsubelement/></subelement2>
</elementExample>
When represented using DOM, the top node is
"elementExample", which contains two child Element
nodes, one for "subelement1" and one
for "subelement2". "subelement1" contains no
child nodes.
Elements may have attributes associated with them; since the
Element
interface inherits from Node
,
the generic Node
interface method
getAttributes
may
be used to retrieve the set of all attributes for an element.
There are methods on the Element
interface to retrieve either an
Attribute object by name or directly an Attribute value by name.
Attribute objects should be retrieved in XML, where attributes may
contain entity references, meaning that their values may be a fairly
complex sub-tree. On the other hand, in HTML, where all attributes have
simple string values, methods to directly access an Attribute value can
safely be used as a convenience.
interface Element : Node {
  readonly attribute wstring tagName;
  wstring getAttribute(in wstring name);
  void setAttribute(in wstring name, in wstring value) raises(DOMException);
  void removeAttribute(in wstring name) raises(DOMException);
  Attribute getAttributeNode(in wstring name);
  Attribute setAttributeNode(in Attribute newAttr) raises(DOMException);
  Attribute removeAttributeNode(in Attribute oldAttr) raises(DOMException);
  NodeList getElementsByTagName(in wstring name);
  void normalize();
};
tagName
This attribute contains the string that is the element's name. For example, in:
<elementExample id="demo"> ... </elementExample> ,
tagName
has the value
"elementExample"
. Note that this is
case-preserving in XML, as are all of the operations of the DOM.
The HTML DOM returns the tagName
of an HTML element
in the canonical uppercase form, regardless of the case in the
source HTML document.
getAttribute
Retrieves an Attribute
value by name.
name | The name of the attribute to retrieve.
The Attribute
value as a string, or the empty
string if that attribute does not have a specified or
defaulted value.
setAttribute
Adds a new attribute. If an attribute with that
name is already present in the element, its value is changed
to be that of the value parameter. This value is a simple string;
it is not parsed as it is being set, so any markup (such as
syntax to be recognized as an entity reference) is
treated as literal text, and needs to be appropriately
escaped by the implementation when it is written out. In
order to assign an attribute value
that contains entity references, the user must create an
Attribute
node plus any Text
and
EntityReference
nodes, build the appropriate
subtree, and use setAttributeNode
to assign it
as the value of an attribute.
name | Name of an attribute.
value | Value to set in string form.
INVALID_NAME_ERR: Raised if an invalid name is specified.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
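The "literal text" rule for setAttribute can be illustrated with Python's xml.dom.minidom. This is a sketch under the assumption that minidom's serializer is a fair example of an implementation escaping markup when writing out; the attribute name and value are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical empty element to which an attribute is added.
doc = parseString("<elementExample/>")
element = doc.documentElement

# The value is stored exactly as given; "&lt;" is NOT parsed as an
# entity reference, and the bare "&" is kept as an ordinary character.
element.setAttribute("note", "AT&T &lt; rest")
stored = element.getAttribute("note")

# When the element is written out, the implementation escapes the
# literal ampersands so that the text survives a save and re-load.
serialized = element.toxml()
```

Here the stored value still contains the five characters "&lt; " unchanged, while the serialized form carries them as escaped text.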
removeAttribute
Removes the Attribute
with the specified name. If
the removed Attribute
has a default value it is immediately
replaced.
name | The name of the attribute to remove.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
getAttributeNode
Retrieves an attribute node by name.
name | The name of the attribute to retrieve.
The attribute node with the specified attribute
name or null
if there is no such attribute.
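The different results of getAttribute and getAttributeNode for a missing attribute can be compared in a small sketch based on Python's xml.dom.minidom. The element and attribute names are hypothetical, and minidom is used only as one possible implementation.

```python
from xml.dom.minidom import parseString

# Hypothetical element with a single attribute named "id".
doc = parseString('<elementExample id="demo"/>')
element = doc.documentElement

# Value access: a missing attribute yields the empty string.
present_value = element.getAttribute("id")
missing_value = element.getAttribute("class")

# Node access: a missing attribute yields null (None in Python).
present_node = element.getAttributeNode("id")
missing_node = element.getAttributeNode("class")
```

Because getAttribute cannot distinguish a missing attribute from one whose value is the empty string, getAttributeNode is the call to use when that difference matters.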
setAttributeNode
Adds a new attribute. If an attribute with that name is already present in the element, it is replaced by the new one.
newAttr | The attribute node to add to the attribute list.
If the newAttr
attribute replaces
an existing attribute with the same name, the
previously existing Attribute
node is returned, otherwise
null is returned.
WRONG_DOCUMENT_ERR: Raised if newAttr
was
created from a different document than the one that created the
element.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
INUSE_ATTRIBUTE_ERR: Raised if newAttr
is already
an attribute of another Element
object. The
DOM user must explicitly clone Attribute
nodes to re-use them in other elements.
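The return value and the in-use rule of setAttributeNode can be sketched as follows with Python's xml.dom.minidom. The element names a and b are hypothetical, and the exception class xml.dom.InuseAttributeErr is minidom's counterpart of INUSE_ATTRIBUTE_ERR, assumed here to behave as this draft describes.

```python
from xml.dom.minidom import parseString
import xml.dom

# Hypothetical document with two elements, one already carrying "id".
doc = parseString('<root><a id="1"/><b/></root>')
a = doc.getElementsByTagName("a")[0]
b = doc.getElementsByTagName("b")[0]
original = a.getAttributeNode("id")

# Create a fresh attribute node through the Document factory method.
new_attr = doc.createAttribute("id")
new_attr.value = "2"

# Replacing an attribute of the same name returns the old node.
replaced = a.setAttributeNode(new_attr)

# The node now belongs to <a>, so attaching it to <b> is an error.
try:
    b.setAttributeNode(new_attr)
    in_use_rejected = False
except xml.dom.InuseAttributeErr:
    in_use_rejected = True

# An explicit clone can be attached to the other element.
b.setAttributeNode(new_attr.cloneNode(True))
```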
removeAttributeNode
Removes the specified attribute.
oldAttr | The Attribute node to remove from the attribute list.
Returns the Attribute node that was removed.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
NOT_FOUND_ERR: Raised if oldAttr
is not an attribute of
the element.
getElementsByTagName
Returns a NodeList
of all descendant elements
with a given tag name in the order in which they would be encountered
in a preorder traversal of the Element
tree.
name | The name of the tag to match on. If the string "*" is given, this method returns all descendant elements of the starting element.
This method returns a list of element nodes that have the specified tag name.
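The preorder ordering and the "*" wildcard can be observed on the example document used earlier in this section. The sketch relies on Python's xml.dom.minidom as a sample implementation; the variable names are hypothetical.

```python
from xml.dom.minidom import parseString

# The example document from the Element description above.
doc = parseString(
    '<elementExample id="demo">'
    "<subelement1/>"
    "<subelement2><subsubelement/></subelement2>"
    "</elementExample>"
)
root = doc.documentElement

# "*" matches every descendant element, listed in preorder.
all_names = [e.tagName for e in root.getElementsByTagName("*")]

# A specific name matches at any depth below the starting element.
one_name = [e.tagName for e in root.getElementsByTagName("subsubelement")]
```

The starting element itself is not part of the result, since only descendants are returned.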
normalize
Puts all Text
nodes in the full depth of the
sub-tree underneath this Element
into a "normal" form
where only markup (e.g., tags, comments, processing
instructions, CDATA sections, and entity references)
separates Text
nodes, i.e., there are no adjacent
Text nodes. This can be used to
ensure that the DOM view of a document is identical to how
it would look if saved and re-loaded, and is useful when
operations (such as XPointer lookups) that depend on a
particular document tree structure are to be used.
This method has no parameters.
This method returns nothing.
This method raises no exceptions.
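A minimal sketch of normalize, using Python's xml.dom.minidom as an example implementation. The adjacent Text nodes are created through hypothetical calls on a sample document; they stand for any editing sequence that leaves Text nodes side by side.

```python
from xml.dom.minidom import parseString

# Hypothetical element that starts with one Text child.
doc = parseString("<p>one</p>")
p = doc.documentElement

# Editing creates adjacent Text nodes with no markup between them.
p.appendChild(doc.createTextNode(" two"))
p.appendChild(doc.createTextNode(" three"))
count_before = len(p.childNodes)

# normalize merges the adjacent Text nodes into a single node.
p.normalize()
count_after = len(p.childNodes)
merged = p.firstChild.data
```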
The Text
interface represents the textual
content (termed character
data
in XML) of an Element
or
Attribute
.
If there is no markup
inside an element's content, the text is contained in a
single object implementing the Text
interface that
is the child of the element. Any markup is parsed into child
elements that are siblings of the text nodes on either side of
it, and whose content is represented as text node children
of the markup element.
When a document is first made available to the DOM, there is
only one Text node for each block of text. Users may create
adjacent Text nodes that represent the
contents of a given element without any intervening markup, but
should be aware that there is no way to represent the separations
between these nodes in XML or HTML, so they will not (in general)
persist between DOM editing sessions. The normalize()
method on Element
merges any such adjacent Text
objects into a single node for each block of text; this is
recommended before employing operations that depend on a particular
document structure, such as navigation with XPointers.
interface Text : CharacterData {
  Text splitText(in unsigned long offset) raises(DOMException);
};
splitText
Breaks a text node into two text nodes at the specified offset, keeping both in the tree as siblings.
offset | The offset at which to split, starting from 0.
This method returns the new text node containing
all the content at and after the offset
point.
The original node contains all the content up to the
offset
point.
INDEX_SIZE_ERR: Raised if the specified offset is negative or
greater than the number of characters in data
.
NO_MODIFICATION_ALLOWED_ERR: Raised if this node is readonly.
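The effect of splitText on the tree can be sketched with Python's xml.dom.minidom, taken here as one example implementation. The sample text and variable names are hypothetical.

```python
from xml.dom.minidom import parseString

# Hypothetical element with a single Text child.
doc = parseString("<p>Hello, world</p>")
p = doc.documentElement
head = p.firstChild

# Split at offset 5: the original node keeps the first five characters,
# and the returned node holds everything at and after the offset.
tail = head.splitText(5)

head_data = head.data
tail_data = tail.data

# Both nodes stay in the tree as siblings.
child_count = len(p.childNodes)
tail_follows_head = head.nextSibling is tail
```

Calling normalize on the parent element afterwards would merge the two nodes back into one.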
This represents the content of a comment, i.e. all the
characters between the starting '<!--
' and
ending '-->
'. Note that this is the definition
of a comment in XML, and, in practice, HTML, although some HTML
tools may implement the full SGML comment structure.
interface Comment : CharacterData { };
The ProcessingInstruction
interface
represents a "processing instruction", used in XML
(and legal, though seldom supported, in HTML) as a way to keep
processor-specific information in the text of the document. The
content of the node is the entire content between the delimiters
of the processing instruction.
interface ProcessingInstruction : Node {
  readonly attribute wstring target;
  attribute wstring data;
};
target
XML defines a target as the first token following the markup
that begins the processing instruction. This attribute value is that
name. For HTML, the value is null
.
data
The content of the processing instruction. In HTML this is from
the character immediately after the <?
to the
character immediately preceding the >
. In XML this
is from the first non white space character after the target
to the character immediately preceding the ?>
.
On setting, a NO_MODIFICATION_ALLOWED_ERR DOMException
is
raised when the node is readonly.
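For the XML case, the split between target and data can be observed with Python's xml.dom.minidom. This is a sketch with a hypothetical stylesheet instruction; minidom is assumed to be representative of an XML-based implementation and says nothing about the HTML case.

```python
from xml.dom.minidom import parseString
from xml.dom import Node

# Hypothetical document whose first node is a processing instruction.
doc = parseString('<?xml-stylesheet href="style.css" type="text/css"?><ex/>')
pi = doc.firstChild

is_pi = pi.nodeType == Node.PROCESSING_INSTRUCTION_NODE

# target is the first token after the opening delimiter.
target = pi.target

# data starts at the first non-white-space character after the target
# and ends just before the closing delimiter.
data = pi.data
```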
The interfaces defined here form part of the DOM Level 1 Core specification, but objects that expose these interfaces will never be encountered in a DOM implementation that deals only with HTML. As such, HTML-only DOM implementations do not need to have objects that implement these interfaces.
CDATA Sections are used to escape blocks of text containing characters that would otherwise be regarded as markup. The only delimiter that is recognised in a CDATA Section is the "]]>" string that ends the CDATA Section. CDATA Sections cannot be nested. Their primary purpose is to include material such as XML fragments without needing to escape all the delimiters.
The wstring
attribute of the
Text
node holds the text that is contained by the CDATA
section. Note that this may contain characters
that need to be escaped outside of CDATA sections.
The CDATA Section inherits the CharacterData interface through the Text interface. Adjacent CDATA Sections are not merged by use of the Element.normalize() method.
interface CDATASection : Text { };
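The two properties described above (unescaped content, and no merging by normalize) can be sketched with Python's xml.dom.minidom as an example implementation. The section contents are hypothetical, and createCDATASection is the Document factory method as minidom provides it.

```python
from xml.dom.minidom import parseString
from xml.dom import Node

# Hypothetical empty element that receives two adjacent CDATA sections.
doc = parseString("<ex/>")
root = doc.documentElement
root.appendChild(doc.createCDATASection("<b>not markup</b>"))
root.appendChild(doc.createCDATASection("a & b"))

# The node holds the text exactly as written, delimiters included.
first = root.firstChild
is_cdata = first.nodeType == Node.CDATA_SECTION_NODE
held_text = first.data

# normalize leaves adjacent CDATA sections as separate nodes.
root.normalize()
count_after_normalize = len(root.childNodes)

# On output the content stays unescaped inside the CDATA delimiters.
serialized = root.toxml()
```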
Each document has a (possibly null) attribute that
contains a reference to a DocumentType
object.
The DocumentType
class in the DOM Level 1 core
provides an interface to the list of entities that are defined
for the document, and little else because the effect of
namespaces and the various XML schema efforts on DTD
representation is not clearly understood as of this writing.
interface DocumentType : Node {
  readonly attribute wstring name;
  readonly attribute NamedNodeMap entities;
  readonly attribute NamedNodeMap notations;
};
name
The name
attribute is a wstring
that
holds the name of the DTD; i.e., the name immediately
following the DOCTYPE
keyword.
entities
This is a NamedNodeMap
containing
the general entities, both external and internal,
declared in the DTD. For example in:
<!DOCTYPE ex SYSTEM "ex.dtd" [
  <!ENTITY foo "foo">
  <!ENTITY bar "bar">
  <!ENTITY % baz "baz">
]>
<ex/>
the interface provides access to
foo
and
bar
but not baz
. All objects supporting
the Node
interface that are accessed through this
attribute, also support the
Entity
interface. For HTML, this is always
null
.
notations
This is a NamedNodeMap
containing the
notations declared in the DTD. Each node in this map also
implements the Notation
interface.
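The entities example can be reproduced in a sketch with Python's xml.dom.minidom. Two points are assumptions of this sketch rather than part of the text above: the external identifier SYSTEM "ex.dtd" is left out so that no external file is needed, and minidom is taken as a sample implementation of the entities map.

```python
from xml.dom.minidom import parseString

# The declarations from the example above, without the external subset.
source = """<!DOCTYPE ex [
  <!ENTITY foo "foo">
  <!ENTITY bar "bar">
  <!ENTITY % baz "baz">
]>
<ex/>"""

doc = parseString(source)
doctype = doc.doctype

# name is the name that follows the DOCTYPE keyword.
doctype_name = doctype.name

# General entities are listed; the parameter entity baz is not.
entities = doctype.entities
found = {n: entities.getNamedItem(n) is not None for n in ("foo", "bar", "baz")}
```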
This interface represents a notation declared in the DTD. A notation either declares, by name, the format of an unparsed entity (see section 4.7 of the XML 1.0 specification), or is used for formal declaration of Processing Instruction targets (see section 2.6 of the XML 1.0 specification). The nodeName attribute inherited from Node is set to the declared name of the notation.
interface Notation : Node {
  readonly attribute wstring publicId;
  readonly attribute wstring systemId;
};
publicId
The public identifier for the notation. If the
public identifier was not specified, this is
null
.
systemId
The system identifier for the notation. If the
system identifier was not specified, this is
null
.
This interface represents an entity, either parsed or
unparsed, in an XML document. Note that this models the entity
itself, not the entity declaration. Entity
declaration modeling has been left for a later Level of the DOM
specification.
An XML processor may choose to completely expand entities before the structure model is passed to the DOM; in this case there will be no entity references in the document tree.
The nodeName
attribute that is inherited from
Node
contains the name of the entity.
The structure of the child list is exactly the same as
the structure of the child list for an
EntityReference
with the same nodeName
value.
Level 1 of the DOM API does not support editing Entity
declarations; if a user wants to make changes to the contents of an
Entity, the EntityReference node has to be replaced in the
structure model by a clone of the Entity
's contents. All the
nodes beneath the entity reference are readonly.
interface Entity : Node {
  readonly attribute wstring publicId;
  readonly attribute wstring systemId;
  readonly attribute wstring notationName;
};
publicId
The public identifier associated with the entity, if
specified. If the public identifier was not specified, this
is null
.
systemId
The system identifier associated with the entity, if
specified. If the system identifier was not specified, this
is null
.
notationName
For unparsed entities, the name of the notation for the
entity. For parsed entities, this is null
.
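The identifier attributes of Entity and Notation nodes can be inspected in a sketch based on Python's xml.dom.minidom. The notation gif, the unparsed entity pic, and their system identifiers are hypothetical, and minidom is assumed to report them as this draft describes.

```python
from xml.dom.minidom import parseString

# Hypothetical DTD with one notation, one unparsed and one parsed entity.
source = """<!DOCTYPE ex [
  <!NOTATION gif SYSTEM "viewer">
  <!ENTITY pic SYSTEM "pic.gif" NDATA gif>
  <!ENTITY foo "foo">
]>
<ex/>"""

doc = parseString(source)
pic = doc.doctype.entities.getNamedItem("pic")
foo = doc.doctype.entities.getNamedItem("foo")
gif = doc.doctype.notations.getNamedItem("gif")

# Unparsed entity: system identifier and notation name are available,
# while the unspecified public identifier is null (None in Python).
pic_ids = (pic.publicId, pic.systemId, pic.notationName)

# Parsed entity: notationName is null.
foo_notation = foo.notationName

# Notation: nodeName is the declared name of the notation.
gif_ids = (gif.nodeName, gif.publicId, gif.systemId)
```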
EntityReference
objects may be inserted into the
structure model when an entity reference is in the source document,
or when the user wishes to insert an entity reference. Note that
character entities are considered to be expanded by the HTML or XML
processor so that characters are represented by their Unicode
equivalent rather than by an entity reference.
The replacement value of the referenced Entity
, if available,
appears in the child list of the
EntityReference
object. Alternatively, the XML
processor may completely expand references
to entities while building the structure model, instead of
providing EntityReference
objects.
XML does not mandate that a non-validating XML processor read and process entity declarations made in the external subset or declared in external parameter entities. This means that parsed entities declared in the external subset need not be expanded by some classes of applications, and that the replacement value of the entity may not be available.
The resolution of the children of the EntityReference
(the
replacement value of the referenced Entity
) may be lazily
evaluated; actions by the user (such as calling the
childNodes
method on the EntityReference
Node)
are assumed to trigger the evaluation.
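An implementation that takes the second option, expanding references while building the structure model, can be observed with Python's xml.dom.minidom. The sketch below uses a hypothetical entity foo; it shows only the fully expanded case and says nothing about implementations that keep EntityReference nodes.

```python
from xml.dom.minidom import parseString
from xml.dom import Node

# Hypothetical document that refers to an internal parsed entity.
source = """<!DOCTYPE ex [
  <!ENTITY foo "replacement text">
]>
<ex>before &foo; after</ex>"""

doc = parseString(source)
children = doc.documentElement.childNodes

# No EntityReference node appears: the content consists of text only.
only_text = all(c.nodeType == Node.TEXT_NODE for c in children)

# The replacement value has been placed directly in the character data.
content = "".join(c.data for c in children)
```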
interface EntityReference : Node { };