John Resig - Comparing Document Position

Comparing Document Position

A great blog post, for me, was one written by PPK about two years ago in which he explained how the contains() and compareDocumentPosition() methods work in their respective browsers. Since then I’ve done a lot of research into these methods and have used them on a number of occasions. As it turns out, they’re incredibly useful for a number of tasks (especially relating to the construction of pure-DOM selector engines).

DOMElement.contains(DOMNode)

Originally introduced by Internet Explorer this method determines if one DOM Node is contained within another DOM Element. This method can be especially useful when attempting to optimize CSS Selector traversals that look like “#id1 #id2”. With this method you could getElementById both elements then use .contains() to determine that #id1 does, in fact, contain #id2.

There’s one gotcha: .contains() will return true if the DOM Node and DOM Element are identical (even though, technically, an element cannot contain itself).

Here’s a simple implementation wrapper that works in Internet Explorer, Firefox, Opera, and Safari.

function contains(a, b){
  return a.contains ?
    a != b && a.contains(b) :
    !!(a.compareDocumentPosition(b) & 16);
}

Note that we use compareDocumentPosition, which we’ll be discussing next.
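
As a rough illustration of the “#id1 #id2” shortcut described above, the sketch below looks up both ids and uses the wrapper to confirm the ancestor relationship. The name matchDescendantById and the doc parameter are assumptions made for this example, not part of any library; doc is passed in so the logic works with any object that offers getElementById.

```javascript
// The contains() wrapper from above, repeated so this sketch is self-contained.
function contains(a, b){
  return a.contains ?
    a != b && a.contains(b) :
    !!(a.compareDocumentPosition(b) & 16);
}

// Hypothetical helper: resolve "#ancestorId #descendantId" with two id lookups.
// Returns an array holding the descendant when the relationship holds, else [].
function matchDescendantById(doc, ancestorId, descendantId){
  var ancestor = doc.getElementById(ancestorId),
      descendant = doc.getElementById(descendantId);

  return ancestor && descendant && contains(ancestor, descendant) ?
    [descendant] :
    [];
}
```

In a browser this would be called as matchDescendantById(document, "id1", "id2"). Asking for the same id twice returns an empty array, because the wrapper filters out the identical-node case.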

DOMNode.compareDocumentPosition(DOMNode)

This method is part of the DOM Level 3 specification and allows you to determine where two DOM Nodes are in relation to each other. This method is much more powerful than .contains(). One possible use of this method is to re-order DOM nodes to be in a specific order (as was also done by PPK).

With this method you can determine a whole slew of information pertaining to the position of an element. All of this information is returned using a bitmask.

For those who aren’t familiar with it, a bitmask is a way of storing multiple points of data within a single number. You end up turning on/off the individual bits of the number, giving you a final result.

Here are the results from NodeA.compareDocumentPosition(NodeB) along with all the information that you can access:

Bits	Number	Meaning
000000	0	Elements are identical.
000001	1	The nodes are in different documents (or one is outside of a document).
000010	2	Node B precedes Node A.
000100	4	Node A precedes Node B.
001000	8	Node B contains Node A.
010000	16	Node A contains Node B.
100000	32	For private use by the browser.

Now, this means that a possible result could be something like:

<div id="a"><div id="b"></div></div>
<script>
alert( document.getElementById("a")
  .compareDocumentPosition(document.getElementById("b")) == 20);
</script>

Since a node that contains another both “contains” it (+16) and precedes it (+4) the final result is the number 20. It might make more sense if you look at what’s happening to the bits:

000100 (4) + 010000 (16) = 010100 (20)
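
Because each flag occupies its own bit, adding the values gives the same result as OR-ing them, and each flag can be read back with a bitwise AND. The following is a minimal sketch of that decoding; the describePosition name and the flag labels are made up for this example and simply mirror the table above.

```javascript
// Flag values from the table above; the labels are illustrative only.
var POSITION_FLAGS = [
  [1, "disconnected"],
  [2, "B precedes A"],
  [4, "A precedes B"],
  [8, "B contains A"],
  [16, "A contains B"],
  [32, "browser-specific"]
];

// Hypothetical helper: list the flags that are switched on in a bitmask.
function describePosition(mask){
  if (mask === 0) return ["identical"];

  var found = [];
  for (var i = 0; i < POSITION_FLAGS.length; i++) {
    // A non-zero AND means this particular bit is set in the mask.
    if (mask & POSITION_FLAGS[i][0]) found.push(POSITION_FLAGS[i][1]);
  }
  return found;
}

console.log((4 | 16) === 20);                  // true
console.log(describePosition(20).join(", "));  // A precedes B, A contains B
```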

This, undoubtedly, makes for the single most confusing method of the DOM API. However, it’s one that is well worth the effort to learn.

Right now DOMNode.compareDocumentPosition is available in Firefox and Opera. However, there are some tricks that we can use to implement it completely in Internet Explorer. Observe:

// Compare Position - MIT Licensed, John Resig
function comparePosition(a, b){
  return a.compareDocumentPosition ?
    a.compareDocumentPosition(b) :
    a.contains ?
      (a != b && a.contains(b) && 16) +
        (a != b && b.contains(a) && 8) +
        (a.sourceIndex >= 0 && b.sourceIndex >= 0 ?
          (a.sourceIndex < b.sourceIndex && 4) +
            (a.sourceIndex > b.sourceIndex && 2) :
          1) +
      0 :
      0;
}

Internet Explorer provides us with a couple of methods and properties that we can use. To start with, there’s the .contains() method (as we discussed before), which gives us ‘contains’ (+16) and ‘is contained by’ (+8). Internet Explorer also has a .sourceIndex property on all DOM Elements, corresponding to the absolute position of the element within the document. For example, document.documentElement.sourceIndex == 0. Because we have this information we can complete two more pieces of the compareDocumentPosition puzzle: preceded by (+2) and followed by (+4). Additionally, if an element isn’t currently located within a document its .sourceIndex will equal -1, which gives us an answer of 1. Finally, through process of elimination, we can determine if an element is equal to itself, returning an empty bitmask of 0.

This function will work in Internet Explorer, Firefox, and Opera. We’ll only have crippled functionality in Safari: since it only has .contains(), and no .sourceIndex, we’ll only get ‘contains’ (+16) and ‘is contained by’ (+8). The .sourceIndex check always fails there, so every result also includes ‘1’, representing a disconnect, and no ordering information is available.
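
To see how the pieces of the fallback add up, the sketch below runs comparePosition against plain objects that imitate the relevant Internet Explorer members. The stubElement helper and its children array are hypothetical, invented purely for this illustration; real elements supply .contains() and .sourceIndex themselves.

```javascript
// comparePosition from above, repeated so this sketch is self-contained.
function comparePosition(a, b){
  return a.compareDocumentPosition ?
    a.compareDocumentPosition(b) :
    a.contains ?
      (a != b && a.contains(b) && 16) +
        (a != b && b.contains(a) && 8) +
        (a.sourceIndex >= 0 && b.sourceIndex >= 0 ?
          (a.sourceIndex < b.sourceIndex && 4) +
            (a.sourceIndex > b.sourceIndex && 2) :
          1) +
      0 :
      0;
}

// Hypothetical stand-in for an element: .contains() walks a children array,
// and .sourceIndex is only set when a value is supplied.
function stubElement(sourceIndex){
  var el = {
    children: [],
    contains: function(other){
      if (other === el) return true; // mirrors the "contains itself" gotcha
      for (var i = 0; i < el.children.length; i++) {
        if (el.children[i].contains(other)) return true;
      }
      return false;
    }
  };
  if (sourceIndex !== undefined) el.sourceIndex = sourceIndex;
  return el;
}

// IE-like stubs: both .contains() and .sourceIndex are present.
var outer = stubElement(1), inner = stubElement(2), sibling = stubElement(3);
outer.children.push(inner);
var detached = stubElement(-1);

console.log(comparePosition(outer, inner));    // 20 (contains + precedes)
console.log(comparePosition(inner, outer));    // 10 (contained by + preceded by)
console.log(comparePosition(inner, sibling));  // 4
console.log(comparePosition(outer, outer));    // 0
console.log(comparePosition(outer, detached)); // 1

// Safari-like stubs: .contains() only, no .sourceIndex.
var safariOuter = stubElement(), safariInner = stubElement();
safariOuter.children.push(safariInner);
console.log(comparePosition(safariOuter, safariInner)); // 17 (16 + 1)
```

In the last case the ‘contains’ bit can still be tested with & 16, even though the disconnect bit is set alongside it.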

PPK provides a great example of how this new functionality can be used by creating a getElementsByTagNames method. Let’s adapt it to work with our new method:

// Original by PPK quirksmode.org
function getElementsByTagNames(list, elem) {
  elem = elem || document;

  // Split on commas, tolerating spaces such as "h1, h2, h3"
  var tagNames = list.split(/\s*,\s*/), results = [];

  for ( var i = 0; i < tagNames.length; i++ ) {
    var tags = elem.getElementsByTagName( tagNames[i] );
    for ( var j = 0; j < tags.length; j++ )
      results.push( tags[j] );
  }

  // Sort into document order: -1 when a precedes b, 1 when b precedes a
  return results.sort(function(a, b){
    return 3 - (comparePosition(a, b) & 6);
  });
}

We could now use this to construct an in-order table of contents for a site:

getElementsByTagNames("h1, h2, h3");

While both Firefox and Opera have taken some initiative to implement this method, I'm looking forward to seeing more browsers get on board to help push this forward.
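
The sort comparator in getElementsByTagNames keeps only the ‘precedes’ bits (& 6) and subtracts them from 3, which yields -1 for 4 and 1 for 2, but 3 when neither bit is set. A more explicit variant might look like the sketch below. This is an alternative written for illustration, not the comparator from the post; orderFromMask, stubMask and the index property are all made-up names, with stubMask standing in for comparePosition so the example runs without a document.

```javascript
// Alternative comparator sketch (not the post's "3 - (mask & 6)" expression):
// map a compareDocumentPosition-style bitmask to -1, 1, or 0.
function orderFromMask(mask){
  if (mask & 4) return -1; // A precedes B, so A sorts first
  if (mask & 2) return 1;  // B precedes A, so B sorts first
  return 0;                // identical or disconnected: no ordering information
}

// Stand-in for comparePosition(), using a made-up "index" property on plain
// objects in place of real document positions.
function stubMask(a, b){
  return a.index < b.index ? 4 : a.index > b.index ? 2 : 0;
}

var headings = [
  { tag: "h2", index: 7 },
  { tag: "h1", index: 1 },
  { tag: "h3", index: 12 },
  { tag: "h2", index: 4 }
];

var ordered = headings.slice().sort(function(a, b){
  return orderFromMask(stubMask(a, b));
});

var tags = [];
for (var i = 0; i < ordered.length; i++) tags.push(ordered[i].tag);
console.log(tags.join(" ")); // h1 h2 h2 h3
```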

Note: In jQuery you can do $(":header") to select all header elements in order.

Posted: February 19th, 2008


12 Comments

Daniel Glazman (February 19, 2008 at 4:06 am)

“This method can be especially useful when attempting to optimize CSS Selector traversals that look like “#id1 #id2″. With this method you could getElementById both elements then use .contains() to determine that #id1 does, in fact, contain #id2.”

This could be more expensive than the following : find #1 element based on id if it exists then climb up the tree from that point trying to find #2. In particular, if #1 and #2 are both at the traversal end of a WIDE AND DEEP tree, your two getElementById() calls are expensive. In general, matching selectors against the tree is faster if you process from the last selector in the chain of selectors to the first one. Testing simple selectors can also be optimized, see |SelectorMatches()| in nsCSSRuleProcessor.cpp.
Christian Johansen (February 19, 2008 at 4:08 am)

Very interesting read! I guess the last part of getElementsByTagNames really could be used with any selector code that returns more than one node.

By the way: You’ve got a small typo in your last example. You define getElementsByTagNames but call getElementsByTagName :)
Boris (February 19, 2008 at 4:26 am)

Daniel, in Gecko getElementById caches the result in a hashtable, so multiple gets of the same id are fast. And it only requires a traversal for the first several hundred calls. Once there are that many hashtable misses, we switch to just making the hashtable live, so all gets from that point on are just hashtable lookups.

We don’t do that for XML yet, but we will at some point.

I wouldn’t be surprised if other UAs have various optimizations here too…
Jake Archibald (February 19, 2008 at 6:53 am)

Unless I was doing something very wrong, I found Safari 2’s implementation of node.contains() to be very buggy. It seemed to be returning true/false on a whim, sometimes getting it right, sometimes getting it wrong.

Seemed fine in Safari 3 tho.

Jake.
Lars (February 19, 2008 at 10:30 am)

“We’ll only have crippled functionality in Safari (since it only has .contains(), and no .sourceIndex”

There’s a way to fake .sourceIndex in safari. It builds on the fact that document.all is in the right order. So if you really need sourceIndex, the following works, and is not horribly slow:

var all = document.all; for (var i = 0, j = all.length; i<j; i++) all[i].sourceIndex = i;
Maciej Stachowiak (February 19, 2008 at 10:48 am)

The standards-based way to compare document positions would be using the Range.compareBoundaryPoints method from DOM2 Range. It should be usable to implement compareDocumentPosition but it is also sufficient in itself to implement the getElementsByTagNames method above.
Jake Archibald (February 19, 2008 at 11:31 am)

@Maciej

compareDocumentPosition is part of the standard
http://www.w3.org/TR/DOM-Level-3-Core/core.html#Node3-compareDocumentPosition
Weston Ruter (February 19, 2008 at 3:27 pm)

It seems like the need for a new method getElementsByTagNames is not very strong since it is just a subset of querySelectorAll. Likewise, the new getElementsByClassName method is an even more limited subset and can be replaced with querySelectorAll. Is the rationale for introducing these other special-purpose subset methods that they can be optimized for greater speed?
Peter Goodman (February 20, 2008 at 7:40 am)

@Lars

That would require updating the whole dom tree, or at least much of it, in the event that you inject element(s) into it. Thus, that is only a temporary solution.
Searle (February 21, 2008 at 3:45 pm)

&& [int]?

Sure that shouldn’t read

& [int]

[something != 0] && [int]
will always return true imho…
alabaste (March 8, 2008 at 11:53 am)

http://www.codingforums.com/member.php?u=59792
Rowan Nairn (March 28, 2008 at 1:53 pm)

Hi John,

a little typo in the first code listing: arg should be b. Right?
