We'll start by considering the basic Node class. All the other DOM nodes -- Document, Element, Text, and so forth -- are subclasses of Node. It's possible to perform many tasks using just the interface provided by Node.
A note on attributes: In the DOM specification, attributes are defined as being accessible through a pair of accessor methods named get_*() and set_*(). For example, code would read the nodeValue attribute by calling the get_nodeValue() method, and set it by calling set_nodeValue(), passing the new value as the method's argument. This is simple to understand, but it's also a bit clumsy. Therefore, the DOM types all define the __getattr__/__setattr__ magic required to let you write node.nodeValue, as if it were an ordinary Python attribute.
All DOM nodes have a name, a value, and a type. These are available as the nodeName, nodeValue, and nodeType attributes. nodeType and nodeName are read-only; you can't mutate a node into another type of node, nor can you change its name. nodeValue may be changeable, depending on what the node is.
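To make the accessor mechanism concrete, here is a minimal sketch of how a class could route attribute access through get_*() and set_*() methods. SketchNode is a hypothetical stand-in written for this illustration; it is not the actual xml.dom.core implementation, which may work differently in its details.

```python
class SketchNode:
    """Hypothetical stand-in for a DOM node; not the real xml.dom.core class."""

    def __init__(self, value):
        # Write to __dict__ directly so that __setattr__ below is bypassed.
        self.__dict__["_nodeValue"] = value

    def get_nodeValue(self):
        return self._nodeValue

    def set_nodeValue(self, value):
        self.__dict__["_nodeValue"] = value

    def __getattr__(self, name):
        # Called only when normal lookup fails, e.g. for node.nodeValue.
        getter = type(self).__dict__.get("get_" + name)
        if getter is None:
            raise AttributeError(name)
        return getter(self)

    def __setattr__(self, name, value):
        # Every assignment is routed to the matching set_*() method.
        setter = type(self).__dict__.get("set_" + name)
        if setter is None:
            raise AttributeError("cannot set " + name)
        setter(self, value)


node = SketchNode("XML bookmarks")
via_attribute = node.nodeValue        # same as node.get_nodeValue()
node.nodeValue = "New title"          # same as node.set_nodeValue("New title")
via_method = node.get_nodeValue()
```

With this arrangement, a read-only attribute is simply one that has a get_*() method but no set_*() method, so assigning to it raises AttributeError.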
nodeType is used to determine the type of a given node, and is an integer from the following list of constants in the xml.dom.core module:
ATTRIBUTE, CDATA_SECTION, COMMENT, DOCUMENT, DOCUMENT_FRAGMENT, DOCUMENT_TYPE, ELEMENT, ENTITY, ENTITY_REFERENCE, NOTATION, PROCESSING_INSTRUCTION, TEXT
nodeName and nodeValue are the name and value of the node. For some node types, one of the two is fixed. For example, a Text node always has the same value for nodeName: "#text". On the other hand, an Element node has nodeName set to the element's name, such as title or body, and its nodeValue is always None.
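The three basic attributes can be inspected as in the following sketch. It assumes the standard library's xml.dom.minidom rather than the xml.dom.core module described here, so the type constants are spelled Node.ELEMENT_NODE and Node.TEXT_NODE instead of ELEMENT and TEXT; the attribute names are the same.

```python
from xml.dom import Node, minidom

# Assumption: minidom stands in for the DOM implementation discussed above.
doc = minidom.parseString("<title>XML bookmarks</title>")
element = doc.documentElement
text = element.firstChild

# An Element node: nodeName is the tag name, nodeValue is always None.
element_info = (element.nodeType == Node.ELEMENT_NODE,
                element.nodeName,
                element.nodeValue)

# A Text node: nodeName is fixed, nodeValue holds the character data.
text_info = (text.nodeType == Node.TEXT_NODE,
             text.nodeName,
             text.nodeValue)

print(element_info)
print(text_info)
```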
An instance of Node will usually (but not always) have a parent node. It may also have children, and, if the node's parent has several children, it will also have siblings. The following attributes are available for exploring a node's parent and children:
| Attribute | Value |
|---|---|
| parentNode | Parent of this node |
| firstChild | First child of this node |
| lastChild | Last child of this node |
| previousSibling | Node immediately preceding this node |
| nextSibling | Node immediately following this node |
| childNodes | List containing all the children of this node |
For each of the attributes in the above table, you can either access the attribute directly, as in node.parentNode, or call an accessor method such as node.get_parentNode(). Any of the above attributes can be None if no corresponding node exists; for example, if the node is an only child, both previousSibling and nextSibling will be None.
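As a rough illustration of these navigation attributes, the sketch below walks a node's children by following nextSibling, and checks the only-child case. It assumes xml.dom.minidom from the standard library, which exposes the attributes directly but not the get_*() accessor methods; the sample document is made up for this example.

```python
from xml.dom import minidom

# Hypothetical sample document, written without whitespace between tags so
# that no extra Text nodes appear among the children.
doc = minidom.parseString(
    "<list><item>a</item><item>b</item><item>c</item></list>")
root = doc.documentElement

# Walk the children from first to last by following nextSibling.
values = []
node = root.firstChild
while node is not None:
    values.append(node.firstChild.nodeValue)
    node = node.nextSibling

# The Text node inside the first <item> is an only child: no siblings.
only_child = root.firstChild.firstChild
has_no_siblings = (only_child.previousSibling is None
                   and only_child.nextSibling is None)

print(values, has_no_siblings)
```

The loop ends when nextSibling returns None, which is what the last child reports.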
Let's look at the values for the first title element in the example tree. That portion of the tree looks like this:
<Element 'folder'>
  <Element 'title'>
    <Text node 'XML bookmarks'>
  <Element 'bookmark'>
    <Element 'title'>
And the values for the various attributes are:
| Attribute | Value |
|---|---|
| parentNode | folder element |
| firstChild | Text node 'XML bookmarks' |
| lastChild | Text node 'XML bookmarks' |
| previousSibling | None |
| nextSibling | bookmark element |
| childNodes | A 1-element list: [ Text node 'XML bookmarks' ] |
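The values in this table can be reproduced with a short sketch, again assuming xml.dom.minidom in place of xml.dom.core. The XML string is a hypothetical reconstruction of the tree fragment shown above; the inner title is left empty because its content isn't shown, and the tags are written without whitespace so that no extra Text nodes appear.

```python
from xml.dom import minidom

# Hypothetical reconstruction of the fragment; not the full example document.
doc = minidom.parseString(
    "<folder><title>XML bookmarks</title>"
    "<bookmark><title/></bookmark></folder>")

title = doc.documentElement.firstChild   # the first title element

parent_name = title.parentNode.nodeName          # the folder element
first_text = title.firstChild.nodeValue          # Text node 'XML bookmarks'
same_child = title.firstChild is title.lastChild # only one child
previous = title.previousSibling                 # nothing precedes title
next_name = title.nextSibling.nodeName           # the bookmark element
child_count = len(title.childNodes)              # 1-element list
```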
parentNode can be None if the node is the root of a DOM tree, or if it's a newly created node that hasn't been added to a tree yet.
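Both cases can be seen in a small sketch, assuming xml.dom.minidom as the DOM implementation. Here the Document node is the root of the tree, and createElement() produces a node that belongs to no tree until it is appended.

```python
from xml.dom import minidom

doc = minidom.parseString("<folder/>")

# The Document node sits at the root of the tree, so it has no parent.
root_parent = doc.parentNode

# A newly created node has no parent until it is added to the tree.
new_node = doc.createElement("bookmark")
parent_before = new_node.parentNode

doc.documentElement.appendChild(new_node)
parent_after = new_node.parentNode
```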