When working with the DOM .nodeName property there are two hard-and-fast rules that most people abide by:
- The node names of HTML elements are always uppercase, even if they’re explicitly created using lowercase characters. <html> will result in a .nodeName === "HTML" (see the HTML 5 draft).
- The node names of XML elements are always in the original case, as specified when they’re created. <data> will result in a .nodeName === "data", <DATA> will result in a .nodeName === "DATA".
Knowing these rules can be useful because it allows you to optimize your code. If you know that you’re in an HTML document you can avoid having to upper/lowercase your .nodeName checks and you can just always assume that you’re dealing with a .nodeName that’s uppercase. This results in faster selectors for Internet Explorer and other minor optimizations.
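To make the trade-off concrete, here is a minimal sketch of the two styles of check. The helper names isTagFast and isTagSafe are hypothetical and not taken from any library, and plain objects stand in for DOM elements so the snippet runs outside a browser.

```javascript
// Hypothetical helpers for illustration; neither name comes from a real library.

// Fast path: assumes nodeName is already uppercase, as the HTML rule promises.
function isTagFast(elem, upperName) {
    return elem.nodeName === upperName;
}

// Safe path: normalizes the node's name before comparing, at the cost of an
// extra toUpperCase() call on every check.
function isTagSafe(elem, upperName) {
    return elem.nodeName.toUpperCase() === upperName;
}

// Plain objects stand in for DOM elements so the sketch runs anywhere.
var htmlDiv = { nodeName: "DIV" };
var xmlData = { nodeName: "data" };

console.log(isTagFast(htmlDiv, "DIV"));   // true
console.log(isTagFast(xmlData, "DATA"));  // false
console.log(isTagSafe(xmlData, "DATA"));  // true
```

The fast path is only correct while every element really does report an uppercase name.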
However recently I’ve been running across two cases that’ve been especially problematic and have bucked the trend.
Importing Nodes from XML
The first is for browsers that support the adoptNode/importNode DOM methods. These methods allow you to move (or clone) a node from one DOM document to another. In this way you can move an XML node from an XML document and insert it into an HTML document. Normally this shouldn’t matter much but, as it turns out, the original .nodeName case sensitivity is preserved from the original XML-ness of the node.
Thus if you have a lowercase XML element (<data>) and you use adoptNode or importNode to bring it into your HTML document the result will be .nodeName === "data", which completely bucks the trend of “all HTML elements’ node names are always uppercase.” I consider this to be a bug, considering that the DOM element is now in an HTML document, not in an XML document, and should behave as such.
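A rough sketch of how to observe this yourself follows. The function name importChildren and its parameters are made up for illustration; the only DOM calls it relies on are importNode, appendChild, and (in the guarded browser-only part) DOMParser. What it prints depends on the browser, so no expected output is given.

```javascript
// Sketch: clone every element child of an XML element into an HTML container
// and report the nodeName each clone ends up with.
// htmlDoc is the destination document, xmlParent the XML element to read from.
function importChildren(htmlDoc, xmlParent, container) {
    var names = [];
    var nodes = xmlParent.childNodes;
    for (var i = 0; i < nodes.length; i++) {
        // Skip whitespace text nodes and comments; only elements are of interest.
        if (nodes[i].nodeType !== 1) { continue; }
        var copy = htmlDoc.importNode(nodes[i], true); // true = deep clone
        container.appendChild(copy);
        names.push(copy.nodeName);
    }
    return names;
}

// Browser-only usage, guarded so the file also loads outside a browser.
if (typeof document !== "undefined" && typeof DOMParser !== "undefined") {
    var xmlDoc = new DOMParser().parseFromString(
        "<test><data/><DATA/></test>", "text/xml");
    console.log(importChildren(
        document, xmlDoc.documentElement, document.createElement("div")));
}
```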
Unknown HTML 5 Elements
The second bit of weirdness comes from people attempting to use the new elements from HTML 5 in browsers that don’t support it. Most browsers behave perfectly well when using some of the new HTML 5 elements (in that they don’t freak out and support some level of styling). For Internet Explorer you must use the HTML 5 Shim technique – this will give unknown HTML 5 elements the ability to be styled and hold contents (such as a <section> element).
However there is an additional gotcha: When Internet Explorer encounters an element that it doesn’t recognize it leaves the .nodeName in its original case. Thus if you have a <section> element in your HTML page the result will be .nodeName === "section", which directly contradicts the normal case sensitivity of the .nodeName property in HTML documents.
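For readers who haven’t seen the shim, the core of it is simply calling createElement once for each unknown tag name before the page body is parsed. The sketch below assumes a helper name (shimElements) and an illustrative, non-exhaustive list of tag names; it is not the code of any particular shim script.

```javascript
// Sketch of the shim idea: creating each unknown element once lets older
// versions of Internet Explorer style it and give it children.
// The function name and the tag list are illustrative assumptions.
function shimElements(doc, names) {
    for (var i = 0; i < names.length; i++) {
        doc.createElement(names[i]);
    }
    return names.length;
}

// Browser-only usage; this would normally run from a script in the <head>.
if (typeof document !== "undefined") {
    shimElements(document, ["section", "article", "nav", "header", "footer"]);
}
```

Note that the shim only affects parsing and styling; it does nothing about the case of .nodeName described above.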
To try and understand all of this I made a bunch of test cases using a number of doctypes and document styles.
- HTML 5 document – uses the HTML 5 Doctype.
- XHTML document served as text/html.
- HTML document served with no doctype.
- XHTML document served with correct mimetype.
The important part of the test page is quite simple:
<!DOCTYPE html>
<html>
<head>
    <title>Testing nodeName</title>
</head>
<body>
    <div id="test">
        <div></div><DIV></DIV>
        <section></section><SECTION></SECTION>
    </div>
</body>
</html>
and the test cases are as follows:
HTML
Accesses the HTML elements that were originally included in the page (should be case insensitive).
runTest("HTML", function(){ return document.getElementById("test").childNodes; });
HTML createElement
Creates new DOM elements using the same document as the page in which it was shipped (should be case insensitive).
runTest("HTML createElement", function(){
    return [
        document.createElement("div"),
        document.createElement("DIV"),
        document.createElement("section"),
        document.createElement("SECTION")
    ];
});
innerHTML
Attempts to inject the elements using .innerHTML (should be case insensitive).
runTest("innerHTML", function(){
    var test = document.getElementById("test");
    test.innerHTML = "<div></div><DIV></DIV>" +
        "<section></section><SECTION></SECTION>";
    return test.childNodes;
});
For the remaining tests I grab a simple XML document:
<?xml version="1.0" encoding="UTF-8"?>
<test>
    <div></div><DIV></DIV>
    <section></section><SECTION></SECTION>
</test>
like so:
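// Use the native XMLHttpRequest where available, falling back to ActiveX for older IE.
var xhr = window.XMLHttpRequest ?
    new XMLHttpRequest() :
    new ActiveXObject("Microsoft.XMLHTTP");
// Third argument false = synchronous, so responseXML is ready right after send().
xhr.open("GET", "test.xml", false);
xhr.send(null);
var xml = xhr.responseXML;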
XML
Test the elements in the XML document directly (should be case sensitive).
runTest("XML", function(){ return xml.documentElement.childNodes; });
XML createElement
Same as the HTML createElement but done using the XML document (should be case sensitive).
runTest("XML createElement", function(){
    return [
        xml.createElement("div"),
        xml.createElement("DIV"),
        xml.createElement("section"),
        xml.createElement("SECTION")
    ];
});
HTML via importNode
This clones the nodes from the XML document, using importNode, and places them into the HTML document (should be case sensitive).
runTest("HTML via importNode", function(){
    var test = document.getElementById("test");
    // Empty the container first.
    while ( test.firstChild ) {
        test.removeChild( test.firstChild );
    }
    var nodes = xml.documentElement.childNodes, node;
    for ( var i = 0; i < nodes.length; i++ ) {
        // false = shallow clone; the XML document keeps its own nodes.
        node = document.importNode( nodes[i], false );
        test.appendChild( node );
    }
    return test.childNodes;
});
HTML via adoptNode
This moves the nodes from the XML document, using adoptNode, and places them into the HTML document (should be case sensitive).
runTest("HTML via adoptNode", function(){
    var test = document.getElementById("test");
    // Empty the container first.
    while ( test.firstChild ) {
        test.removeChild( test.firstChild );
    }
    var nodes = xml.documentElement.childNodes, node;
    // childNodes is live: each adopted node leaves the XML document,
    // so the list shrinks until it is empty.
    while ( nodes.length ) {
        node = document.adoptNode( nodes[0] );
        test.appendChild( node );
    }
    return test.childNodes;
});
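The runTest helper itself is not shown in the post. As a purely hypothetical stand-in, assuming it does nothing more than collect a name from every element the callback returns and record any error, it might look like the sketch below. The results object and the optional prop argument are inventions for this example.

```javascript
// Hypothetical stand-in for the runTest harness; the original is not shown.
var results = {};

function runTest(name, fn, prop) {
    prop = prop || "nodeName"; // pass "tagName" to read that property instead
    var names = [];
    try {
        var nodes = fn();
        for (var i = 0; i < nodes.length; i++) {
            // childNodes also contains whitespace text nodes, so keep
            // only element nodes (nodeType 1).
            if (nodes[i].nodeType === 1) {
                names.push(nodes[i][prop]);
            }
        }
        results[name] = names;
    } catch (e) {
        // A browser without importNode/adoptNode ends up here.
        results[name] = "Error: " + e.message;
    }
    return results[name];
}
```

Catching the exception per test is what lets one failing case be reported as an error without stopping the rest of the run.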
The Results
I ran these tests in IE 6, IE 7, IE 8, Firefox 3.5, Safari 4.0.3, Chrome 3.0.195, and Opera 10.10. I also tested .tagName in addition to .nodeName and found no discernible difference (you can run your own .tagName tests by appending ?tagName to any test URL).
Note: The HTML 5, XHTML (served as HTML), and no-doctype pages all behaved identically to each other in every browser, so I’m only showing the HTML 5 results; the XHTML (as HTML) and no-doctype tables would be exact copies.
Firefox, Safari, and Chrome all yielded the same results here: Bringing in elements from an external document maintains the case sensitive nature of the .nodeName property – which is unexpected.
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | SECTION | SECTION |
HTML createElement | DIV | DIV | SECTION | SECTION |
innerHTML | DIV | DIV | SECTION | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
Internet Explorer fails in a different manner. To start, Internet Explorer doesn’t support importNode or adoptNode so those particular tests simply don’t run. However we can confirm that the case sensitivity of the unknown HTML 5 element is maintained in HTML, even though it shouldn’t be.
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | section | SECTION |
HTML createElement | DIV | DIV | section | SECTION |
innerHTML | DIV | DIV | section | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | Error: Object doesn’t support this property or method | | | |
HTML via adoptNode | Error: Object doesn’t support this property or method | | | |
Opera ups the ante one further: Since it attempts to simultaneously follow web standards and implement Internet Explorer’s weird quirks, it fails both the importNode/adoptNode and the HTML 5 unknown element cases.
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | section | SECTION |
HTML createElement | DIV | DIV | section | SECTION |
innerHTML | DIV | DIV | section | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
XHTML (served with correct mimetype)
Nearly every browser that supported showing this page (Firefox, Safari, Opera, Chrome) displayed the same, expected, results:
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | div | DIV | section | SECTION |
HTML createElement | div | DIV | section | SECTION |
innerHTML | div | DIV | section | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
An XHTML page served properly is just an XML document – thus the case of elements is sensitive (as to be expected).
… except in Opera. Opera apparently will treat div elements case insensitively, when injected using .innerHTML, even if it’s being served within an XHTML document.
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | div | DIV | section | SECTION |
HTML createElement | div | DIV | section | SECTION |
innerHTML | DIV | DIV | section | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
Update: XHTML as XML Tests
Based upon some suggestions in the comments I’ve run some additional tests. Namely I tested the loading of an XML document that has the correct XHTML namespace attached to it (specifically I used the same XHTML test page that I used for the other tests, just appending a .xml extension instead of .xhtml). The results are rather interesting – and promising, at least. (Note: Internet Explorer continues to fail as it doesn’t have an adoptNode/importNode method.)
Firefox continues to fail the importing of XML nodes, even when they’re coming from an XML document:
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | SECTION | SECTION |
HTML createElement | DIV | DIV | SECTION | SECTION |
innerHTML | DIV | DIV | SECTION | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
XML (XHTML) | div | DIV | section | SECTION |
XHTML via importNode | div | DIV | section | SECTION |
As does Opera:
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | section | SECTION |
HTML createElement | DIV | DIV | section | SECTION |
innerHTML | DIV | DIV | section | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
XML (XHTML) | div | DIV | section | SECTION |
XHTML via importNode | div | DIV | section | SECTION |
BUT both Safari and Chrome PASS on the importing of XHTML nodes, coming from an XML document:
Test | <div> | <DIV> | <section> | <SECTION> |
---|---|---|---|---|
HTML | DIV | DIV | SECTION | SECTION |
HTML createElement | DIV | DIV | SECTION | SECTION |
innerHTML | DIV | DIV | SECTION | SECTION |
XML | div | DIV | section | SECTION |
XML createElement | div | DIV | section | SECTION |
HTML via importNode | div | DIV | section | SECTION |
HTML via adoptNode | div | DIV | section | SECTION |
XML (XHTML) | div | DIV | section | SECTION |
XHTML via importNode | DIV | DIV | SECTION | SECTION |
This, in particular, is great news. It means that at least one browser engine (WebKit, which powers both Safari and Chrome) understands the concept of loading in external (X)HTML into an HTML document and having it continue to work. It’s unfortunate that it doesn’t work in all browsers, though.
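Since the difference in these tests comes down to whether the XML source declares the XHTML namespace (for example <test xmlns="http://www.w3.org/1999/xhtml">), one way to tell the two kinds of node apart before importing is to look at .namespaceURI. The sketch below is an assumption-laden illustration: the helper name isXhtmlElement is made up, and as the tables above show, only some browsers actually uppercase such nodes after import.

```javascript
// The XHTML namespace URI that the XML source must declare.
var XHTML_NS = "http://www.w3.org/1999/xhtml";

// Hypothetical helper: true when a node is an element in the XHTML namespace,
// i.e. a candidate for being treated as a real HTML element after import.
function isXhtmlElement(node) {
    return node.nodeType === 1 && node.namespaceURI === XHTML_NS;
}

// Plain objects stand in for DOM nodes so the sketch runs anywhere.
console.log(isXhtmlElement({ nodeType: 1, namespaceURI: XHTML_NS })); // true
console.log(isXhtmlElement({ nodeType: 1, namespaceURI: null }));     // false
```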
Conclusion
What can we learn from all of this? Unfortunately it appears as if we can’t really trust our “trusted” rules about .nodeName case sensitivity for HTML documents. XML documents are completely safe and work as expected. XHTML (served with the correct mimetype) documents are nearly safe, save for the one bizarre Opera bug.
How will this change the code that we write? In short we can no longer trust the case insensitive nature of HTML documents – we need to assume that BOTH HTML and XML documents will be serving their content in a case sensitive nature – especially as more people start to adopt HTML 5 elements in their pages and expect some level of support in older browsers. This means that a number of selectors and DOM methods will take a performance hit as we can no longer take a case insensitive shortcut in our codebases.
There are a few outstanding jQuery tickets that are the result of these issues cropping up, and now that I know why they’re happening I can strip out all the case-insensitive performance improvements from the codebase – which is really quite unfortunate but at least it’ll behave more consistently. I continue to stand by the thesis from my earlier talk about the DOM: The DOM is a mess and every DOM method and property is broken in some way, in some browser.
Diego Perini (November 24, 2009 at 7:21 pm)
John here are two links that maybe will help in the process:
http://rakaz.nl/item/css_selector_bugs_case_sensitivity
the second one is the current HTML5 work:
http://www.whatwg.org/specs/web-apps/current-work/#selectors
you can find the table of the attributes that needs special handling.
You don’t have all the point about why this is needed, in addition to what you said for example it is necessary for SVG to work in HTML documents to use the namespaced API. So createElementNS() is needed for an SVG to work in HTML document.
You can look in my repo for other needed references on this quite intricated matter.
Hope these additions will be useful for your project as they were in mine, the overhead is not that much as you said. More of a lazy thing.
Mike Taylor (November 24, 2009 at 8:10 pm)
So does this mean if I type in shouty caps Sizzle will be happy?
Mike Taylor (November 24, 2009 at 8:34 pm)
Re: my last comment (should have tested before I commented), it appears that Sizzle does in fact work with uppercase unknown tags. Here’s a test page: http://miketaylr.com/test/html5pseudo_CAPS.html
Mook (November 24, 2009 at 9:00 pm)
But you’re importing nodes from the null namespace, aren’t you? What happens if your imported elements are in the XHTML namespace? What happens if they’re SVG elements instead? If you called createElementNS instead of createElement?
I have no idea what the _right_ answers are, just what things might bring different answers. I was also under the impression that HTML5 has.. _something_ to do with namespaces, but I can’t tell what section 9.3 is (it’s missing). That might affect things as well?
Nate Cavanaugh (November 25, 2009 at 12:50 am)
Heya John, just out of curiosity what kind of performance difference is there? Is it a significant impact or is it under loads on edge cases?
And is it a change that memoization would help or are the property lookups more expensive than the toUpper/toLower methods are? Namely would caching based on the most common result (uppercase), while still allowing for safety for the edge cases help at all or is it not practically helpful in real performance tests?
Antonello Pasella (November 25, 2009 at 3:31 am)
Why not insert a new property in jQuery.support or directly in Sizzle?
$.support.nodeNameUppercase = (document.createElement('fOo').nodeName == 'fOo');
Pete B (November 25, 2009 at 4:30 am)
For some reason I never trusted the nodeName property to always be the same case in all situations.
Neil Rashbrook (November 25, 2009 at 6:21 am)
If you create an XHTML div element in a XUL document then it is case sensitive, but if you import that node into an HTML document then it becomes upper case, the same as if you created it in the HTML document.
Antonello Pasella (November 25, 2009 at 7:06 am)
Correction
$.support.nodeNameUppercase = (document.createElement('fOo').nodeName === 'FOO');
zcorpan (November 25, 2009 at 9:18 am)
The spec says
“Element.tagName and Node.nodeName
These attributes must return element names converted to ASCII uppercase, regardless of the case with which they were created.”
This means that a created in an XML document and moved to an HTML document, nodeName should return “DIV”.
Also note “This does not apply to … elements that are not in the HTML namespace despite being in HTML documents.”, which means that and in HTML will have lowercase nodeName.
John Resig (November 25, 2009 at 11:06 am)
@Mook, Neil, zcorpan: I did some more testing and found some interesting (at least for importing elements from an XML document). In that case you’ll have to specify the XHTML namespace on your XML. I’ve provided more information up in the blog post.
@Antonello: Yes, that would detect that specific problem, for Internet Explorer, but it wouldn’t detect the additional issues with the other browsers (when importing XML into the document). Since every browser has some issue here it’s probably best just to back off and not deal with the issue directly.
@Nate: Unfortunately caching doesn’t really make sense in this situation (and is bound to cause some problems, I’m sure). As it stands the overhead isn’t “that much” but it’s enough that it adds up in the end – running it across a couple hundred nodes will result in a couple millisecond performance hit.
Diego Perini (November 25, 2009 at 11:40 am)
John,
this is what I have in my NWMatcher:
// checks if nodeName comparisons need to be uppercased
TO_UPPER_CASE = typeof doc.createElementNS == 'function' ?
'.toUpperCase()' : '',
// filter IE gEBTN('*') results containing non-elements
SKIP_NON_ELEMENTS = BUGGY_GEBTN ?
'if(e.nodeName.charCodeAt(0)<65){continue;}' : '',
John Resig (November 25, 2009 at 11:50 am)
@Diego: Unfortunately, as you can see from the above test results, that check is not sufficient enough to handle the cases that I’ve outlined. It won’t handle the unknown HTML 5 elements in IE, for example.
Vincent D. (November 25, 2009 at 11:59 am)
Hi John,
and what about using “XHTML5 ?!”
means XML prolog + HTML5 doctype?
////
………
thank you very much,
John-David Dalton (November 25, 2009 at 5:11 pm)
John, even though these issues exist I don’t think its a framework problem. Unsupported elements in IE can be rendered incorrectly causing nodeNames like “/video”. IE doesn’t support XHTML with proper mime-types. Styling invalid elements won’t add functionality to things like <video> and other elements. So this boils down to trying to support unsupported tags (which is like chasing your tail) and proper XML usage. Because importing nodes isn’t cross browser it is pretty edge case as well. It comes down to the dev understanding XML is case-sensitive, HTML is not.
John Resig (November 25, 2009 at 6:13 pm)
@John-David: In order to avoid the bogus tags in IE you need to use the HTML 5 Shiv technique that I mentioned above – once you do that the elements begin to render more-correctly (note that I use the technique on the test pages, as well).
I’m not entirely sure where the comments about XHTML (with proper mimetype) is coming up though. The problem that I outlined occurs in normal HTML pages in browsers that use adoptNode/importNode – which has nothing to do with XHTML, per se.
I do agree that it can be frustrating to try and support this from a framework level but the greater amount of use cases that you support the more likely it’ll be that someone will use the framework for all of their development, rather than just for some. If you can use jQuery to develop a Firefox extension, a web site, an SVG app, an embedded widget, and a mobile app – without changing any code – that’s incredibly enticing.
John-David Dalton (November 25, 2009 at 7:29 pm)
The IE-XHTML comment was showing that IE can be taken out of that use-case example. Sure using the shim you can style some elements but it won’t add functionality to those like the video element so it’s still not a __real__ option for IE. So removing unsupported elements from the equation the nodeName issue isn’t really a concern. You can treat HTML as case-insensitive and XML as case-sensitive. So devs will know to use `doc.getElementsByTagName('MiXedCase')` when in XHTML instead of `doc.getElementsByTagName('mixedcase');`. The framework won’t have much to do with it.
I don’t think it’s frustrating to support this, I think it’s unnecessary. Avoiding a ton of `.nodeName.toUpperCase() === …` is ideal.
Diego Perini (November 26, 2009 at 9:56 am)
W3C browser also have ‘getElementsByTagNameNS’ to help in these tasks.
IE doesn’t recognize ‘application/xhtml+xml’ headers and make it just dependent on the filename extension, mostly like Opera.
Firefox and Safari work correctly with the “xhtml.xhtml” test and Opera only has a bug with innerHTML as showed by your tests other extensions are also accepted if correct headers are sent from the server.
HTML is not the type of document to be used to do such imports even if some browser seems to accept that, parsed code and final representation of the fragments may differ from source.
So the black sheep in this game is surely IE, and that is not something new. IE8 is quite fresh but has the exact same problems; this is Microsoft’s responsibility, not a failing of the frameworks.
Henri Sivonen (November 27, 2009 at 5:02 am)
In browsers that have HTML5-compliant Element.localName (Firefox 3.6 and various WebKit-based browsers but *not* Firefox 3.5), you may rely on .localName returning in the true DOM-internal case. Also, in browsers that have HTML5-compliant .namespaceURI, if you import nodes from XML to HTML, the nodes need to have the namespace URI http://www.w3.org/1999/xhtml in order to have HTMLness.
Diego Perini (November 27, 2009 at 6:51 am)
Henri,
thank you for the extra details.
I noticed FF 3.5.5 also has a working “.localName” property on elements but since you said Firefox 3.6, maybe these lower versions still have bugs.
Henri Sivonen (November 27, 2009 at 10:32 am)
Diego, .localName and .namespaceURI in text/html exist in Firefox 3.5 but they aren’t HTML5-compliant. In Firefox 3.6 they have been fixed to be HTML5-compliant.
John-David Dalton (November 28, 2009 at 3:33 am)
I typo’d my last comment, should have read “…ame('MiXedCase')` when in XML instead”.
Something else interesting is XHTML should always use lowercase element names so import/adopting nodes that aren’t is technically invalid XHTML.
http://www.w3.org/TR/xhtml1/#h-4.2
http://www.w3.org/TR/DOM-Level-2-HTML/html.html#ID-5353782642-h2
Also import/adopting elements from non-like-type documents should be avoided because of various bugs/spec vagueness. Import/adopting between like-type documents (ex: HTML file A and HTML file B) should be OK though.
http://tinyurl.com/yg3uqc4 (reference.sitepoint.com)
For importing into html/xhtml from non-like-type documents you might try something based on the import solution by Anthony Holdener. http://gist.github.com/244442
John-David Dalton (November 29, 2009 at 3:00 am)
In addition Brad Neuberg points out that the IE shim hack for unsupported elements can’t support nested tags.
Brad’s Post:
http://ajaxian.com/archives/adding-custom-tags-to-internet-explorer-the-official-way
Example:
http://jsbin.com/iwexu
Also the IE shim can produce nodeNames in uppercase if written in uppercase.
http://jsbin.com/ehoto
So again this should not be a framework concern. The nodeName code optimizations can stay :D
Michael Bolin (December 9, 2009 at 12:29 pm)
This is some great work! Have you considered porting your tests over to Steve Souders’s Browserscope project (http://www.browserscope.org/)? Then you could distribute the cross-browser testing issue!
Sherwin Shine (January 27, 2010 at 2:52 pm)
XML documents are safe. DOM methods broken in many web browsers
khs4473 (March 6, 2010 at 9:58 am)
This is an interesting question from a library design standpoint. In any case, the nodeName comparison can be optimized:
// NOTE: "test" is always input as uppercase! e.g. tagEQ(e.nodeName, "DIV")
function tagEQ(nodeName, test)
{
return nodeName === test || (
nodeName.length === test.length &&
(nodeName.charCodeAt(0) & 0x5F) === test.charCodeAt(0) &&
nodeName.toUpperCase() === test
);
}
I’ve only tested the function in Safari 4 but the only case where the performance is as bad as always converting to lower case is when the input values are the same length and start with the same character but are actually different, e.g. tagEQ("TD", "TH").
khs4473 (March 6, 2010 at 9:42 pm)
Using some dynamic programming will probably give us better results, though:
var tagEQ2 = (function()
{
var uc = {};
return function(nodeName, test)
{
return nodeName.length === test.length && (
nodeName === test ||
(uc[nodeName] || (uc[nodeName] = nodeName.toUpperCase())) === test
);
};
})();
John-David Dalton (March 21, 2010 at 10:56 am)
I noticed my previous example referencing a claim by Brad Neuburg at
http://jsbin.com/iwexu was invalid.
Brad also raises concern on another post:
http://remysharp.com/2009/01/07/html5-enabling-script/comment-page-2/#comment-142631
But I haven’t been able to reproduce an issue at the moment.
The IE createElement() technique will have issues if any new elements have an empty content model.
http://bit.ly/9ACzSy
Michael (June 22, 2010 at 10:23 am)
Possibly dumb questions, but:
Would it be possible to implement both case-insensitive and case-sensitive selector engines, and allow users to opt-in to the faster one? And/or detect the doctype+browser and default to the case-insensitive when possible?
What happens with attribute names?