John Resig - Thoughts on querySelectorAll

I don’t think there’s a single JavaScript developer who isn’t excited about the new Selectors API specification (and the upcoming implementations). I’ve been following the progress of the specification (and implementations) and have been asked to provide some feedback to the Web API working group. What follows is an email that I sent to the public-webapi mailing list.

I just wanted to quickly pull together some of my thoughts concerning querySelectorAll. I’ve been asked by a number of people to provide my feedback here. Please forgive me if I’ve missed some previous discussions on the subject matter.

There are three major points that I wanted to discuss:

DOMElement.querySelectorAll returning incorrect elements

This is the most critical issue. As it stands DOM Element-rooted queries are borderline useless to libraries – and users. Their default behavior is unexpected and confusing. Demonstrated with an example, using Dojo:

  <div><p id="foo"><span></span></p></div>
  <script src="http://o.aolcdn.com/dojo/1.1.0/dojo/dojo.xd.js"></script>
  <script>
  var foo = document.getElementById("foo");
  // alerts 0: there is no DIV inside #foo, so nothing should match
  alert( dojo.query('div span', foo).length );
  // alerts 1: querySelectorAll returns the SPAN (booo!)
  alert( foo.querySelectorAll('div span').length );
  </script>

The demo can be run online here:
https://johnresig.com/files/bugs/qsa-root/dojo.html

This is due to the fact that element-rooted queries are handled by “finding all the elements that match the given selector, rooted in the document, then filtering by the ones that have the specified element as an ancestor.” This is completely unacceptable. Not only is it not intuitive (finding elements that don’t match the correct expression) but it goes against what every single JavaScript library provides. If this behavior were to persist, there would be serious ramifications for the usefulness of this function in the wild.
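
To make the difference concrete, here is a minimal sketch that models both behaviors on plain objects rather than a real DOM. It assumes selectors made up only of tag names joined by descendant combinators. The names `node`, `descendants`, `matchesChain`, `specQuery` and `libraryQuery` are hypothetical and exist only for this illustration; they are not part of any browser or library API.

```javascript
// Build a tiny tree node; `parent` links are filled in for the children.
function node(tag, children) {
  var n = { tag: tag, parent: null, children: children || [] };
  n.children.forEach(function (c) { c.parent = n; });
  return n;
}

// Collect all descendants of `root` in document order (root excluded).
function descendants(root) {
  var out = [];
  root.children.forEach(function (c) {
    out.push(c);
    out = out.concat(descendants(c));
  });
  return out;
}

// Does `el` match the tag chain (e.g. ["div", "span"]) when ancestors are
// searched up to, but not including, `limit`? Pass null to search all the
// way to the top of the tree.
function matchesChain(el, chain, limit) {
  var i = chain.length - 1;
  if (el.tag !== chain[i]) return false;
  i--;
  for (var a = el.parent; a && a !== limit && i >= 0; a = a.parent) {
    if (a.tag === chain[i]) i--;
  }
  return i < 0;
}

// Draft spec behavior: match against the whole document, then keep the
// matches that are descendants of the context element.
function specQuery(context, selector) {
  var chain = selector.split(/\s+/);
  return descendants(context).filter(function (el) {
    return matchesChain(el, chain, null);
  });
}

// Library behavior: every part of the selector must be found below the
// context element.
function libraryQuery(context, selector) {
  var chain = selector.split(/\s+/);
  return descendants(context).filter(function (el) {
    return matchesChain(el, chain, context);
  });
}

// <div><p id="foo"><span></span></p></div>
var span = node("span");
var foo = node("p", [span]);
var div = node("div", [foo]);

console.log(specQuery(foo, "div span").length);    // 1
console.log(libraryQuery(foo, "div span").length); // 0
```

The only difference between the two functions is where the ancestor walk stops, which is exactly the point of contention: the spec lets the DIV outside the context element satisfy the first half of the selector.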

I asked some of the other library developers what their thoughts were and they agreed with my conclusion.

Andrew Dupont (creator of Prototype’s selector engine):

My issue with this is that it violates principle of least surprise and bears no resemblance to the APIs in the wild.

Alex Russell (creator of Dojo’s selector engine):

This is a spec bug.

Combinator-rooted Queries

I read about some prior discussion concerning this (especially in relation to DOMElement.querySelectorAll-style queries). Queries that begin with a combinator are an important part of most libraries as they stand. Maciej’s proposed solution of using :root to allow for leading combinators is perfectly acceptable to me (where :root is made equivalent to the element the query is rooted on, not the document element).

  // jQuery
  $("#foo").find("> span");
  
  // DOM
  document.getElementById("foo").querySelectorAll(":root > span")

This is something that a library can easily detect and inject.
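
As a rough sketch of that detect-and-inject step, a library could rewrite each selector group that starts with a combinator before handing it to the native method. `injectRoot` is a hypothetical helper, and it relies on the :root behavior proposed above (matching the context element), which is an assumption of this proposal rather than something implementations provide. It also splits groups naively on commas, so a comma inside an attribute value or a `:not()` argument would need a real tokenizer.

```javascript
// Prefix every selector group that begins with a combinator (>, + or ~)
// with ":root", leaving all other groups untouched.
function injectRoot(selector) {
  return selector
    .split(",")
    .map(function (group) {
      var trimmed = group.trim();
      return /^[>+~]/.test(trimmed) ? ":root " + trimmed : trimmed;
    })
    .join(", ");
}

console.log(injectRoot("> span"));     // ":root > span"
console.log(injectRoot("div a, + p")); // "div a, :root + p"
```

With something like this in place, a call such as `$("#foo").find("> span")` could be forwarded as `element.querySelectorAll(injectRoot("> span"))`.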

Error-handling

I’m perfectly fine with the proposed try/catch solution; however, there must be a way of easily determining what the invalid portion of the selector was. Currently the following occurs in Safari:

  try {
    document.querySelectorAll("div:foo");
  } catch(e) {
    alert(e); // "Error: SYNTAX_ERR: DOM Exception 12"
  }

Extra properties on the exception that point to the offending part of the selector would be extremely valuable. Probably the best solution (for both implementors and JavaScript library authors) would be to simply provide a character index, working something like the following:

  var selector = "div:foo";
  try {
    document.querySelectorAll(selector);
  } catch(e) {
    alert(selector.slice(e.position)); // ":foo"
  }

The resulting solution in most libraries would then look something like (of course different caching could take place, as well):

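  var results;
  try {
    // fast path: the native implementation handles the whole selector
    results = document.querySelectorAll(selector);
  } catch(e) {
    // run the valid prefix natively, then filter by the unsupported
    // remainder using the library's own filterQuery
    results = filterQuery(
      document.querySelectorAll( selector.slice(0, e.position) ),
      selector.slice(e.position)
    );
  }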

There will be some form of a performance hit here but I think, if done correctly, it’ll be negligible (especially in comparison to the benefits that are being received).
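
The pieces above can be put together in a small helper. This is only a sketch under the assumptions of this proposal: `e.position` is the suggested property on the exception (it does not exist today), and `filterQuery` stands in for whatever JavaScript-based filtering a library already has. The names `queryWithFallback` and `fakeRoot` are made up for illustration, and the root is passed in as a parameter so the function can be exercised without a browser.

```javascript
// Try the native query first; if it throws and the error reports where
// parsing failed, run the valid prefix natively and hand the remainder of
// the selector to the library's own filter.
function queryWithFallback(root, selector, filterQuery) {
  try {
    return root.querySelectorAll(selector);
  } catch (e) {
    // `position` is the proposed property; rethrow when it is missing or
    // when no usable prefix precedes the failure.
    if (typeof e.position !== "number" || e.position <= 0) {
      throw e;
    }
    return filterQuery(
      root.querySelectorAll(selector.slice(0, e.position)),
      selector.slice(e.position)
    );
  }
}

// Stand-in for a document whose engine rejects ":foo" and reports its index.
var fakeRoot = {
  querySelectorAll: function (sel) {
    var at = sel.indexOf(":foo");
    if (at !== -1) {
      var err = new Error("SYNTAX_ERR");
      err.position = at;
      throw err;
    }
    return ["<div> #1", "<div> #2"];
  }
};

var seen = [];
var results = queryWithFallback(fakeRoot, "div:foo", function (nodes, rest) {
  seen.push(rest); // the part the native engine could not parse
  return nodes;    // a real filter would narrow `nodes` using `rest`
});

console.log(seen[0]);        // ":foo"
console.log(results.length); // 2
```

Rethrowing when no position is available keeps today's behavior intact, so a library could adopt this shape before any browser exposes the extra property.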

I hope these proposed changes work well for the members of this group as they will greatly benefit general web developers – and especially library developers.

Posted: April 30th, 2008

15 Comments

Paul Bakaus (April 30, 2008 at 4:17 pm)

Makes perfect sense to me.

Borgar (April 30, 2008 at 5:10 pm)

A while back I wrote a selector engine.

During testing and adding querySelectorAll support I filed a bug against this behavior with WebKit. For the obvious reason that it was doing something unexpected.

I had to read the spec a few more times until it finally sank in. Why why why would they do this? (I would still like to know why and how this conclusion was reached.)

I foresee that this will cause a great many more bug reports.

Also, ironically, since we’re basically stuck with mutating the selector (I use selector.replace(/(^|,)\s*/g,'$1#'+context.id+' ');), error reporting back to the user becomes slightly presumptuous.

But hey, on the bright side, MSIE followed the spec on this one.

John Resig (April 30, 2008 at 5:48 pm)

Note: There’s an active discussion occurring over on the webapi mailing list concerning my suggestions:
http://lists.w3.org/Archives/Public/public-webapi/2008Apr/thread.html#msg251

Sean Hogan (April 30, 2008 at 8:22 pm)

I suspect that finding querySelectorAll intuitive or not depends on how you read css selectors.

If “div span” is read as “find div elements and for each of those find descendant span elements (exclude duplicates)” then you might expect foo.querySelectorAll("div span") to find div descendants of #foo and then span descendants of the divs.

If “div span” is read as “find span elements that are descendants of div elements” then you might expect foo.querySelectorAll("div span") to find descendants of #foo that are span elements and descendants of div elements.

The second way matches the way the w3 selectors spec is written.

None of this is to say the current way is right (or even preferable).

voracity (April 30, 2008 at 9:45 pm)

HTML et al. seem to have major problems with modularity — and it harms our code’s simplicity, security and independence.

The global scope is not special; it’s just another node in the hierarchy. Same with the document scope.

Jake Archibald (May 1, 2008 at 8:37 am)

I believe that Firefox processes selectors in reverse when it applies CSS rules. I could be completely wrong, but if this is the trend of the browser manufacturers it might explain why the spec went that way.

Have there been any experiments writing a selector query in JS this way? Could it be faster?

Borgar (May 1, 2008 at 3:54 pm)

@Jake Archibald:

I’m pretty sure NWMatcher processes selectors left-to-right, though I could be wrong. It is very fast on most browsers.

Johan Sundström (May 1, 2008 at 3:55 pm)

From my perspective, the W3C-suggested behaviour is useful too, though quite likely for a much less frequent use case: finding all nodes below your context node that get CSS rules from the given selector applied.

Don’t knock that use case as completely useless just because it is not the problem you want to use querySelectorAll for. You are most likely quite correct, however, that it is of much more marginal interest than the use case of finding nodes where all parts of the selector are found below the context node, and that the API consumer should have to walk that extra mile for the uncommon use case rather than having it be the default behaviour.

Rob (May 2, 2008 at 7:37 am)

I may well be missing something here but why not just use xpath?

John Resig (May 2, 2008 at 1:07 pm)

@Rob: querySelectorAll is orders-of-magnitude faster than XPath.

Tom (May 2, 2008 at 1:07 pm)

Rob, in IE (6, at least), XPath doesn’t work for plain HTML. Also, it’s nice to use the same query language from scripts as for CSS.

Mariusz Nowak (May 2, 2008 at 3:05 pm)

@Rob
I think it went that way because developers are much more familiar with Selectors (because of CSS), and it’s not that easy for a novice to dive into XPath.
But XPath is indeed much more powerful, and I miss the ability to query in any direction with Selectors; it’s a big drawback.
As for the speed difference between the two, I’m not sure whether it really counts at that level of implementation (?)

@John
When I first read the article above I agreed with every word, but after reading the mailing list I’m not so sure. After all, the way they designed it makes sense; just adding ‘:scope’ (rather than using ‘:root’) would help.
Due to the limitations of Selectors there will always be some limits. For example, getting the first preceding div sibling sounds simple, but we can’t do that with Selectors.
By the way, how do you deal with such limits in your work? Have you thought of some ‘reverse Selectors’ engine for use in JavaScript, or do you just iterate over the DOM tree for such cases?

Tino Zijdel (May 3, 2008 at 8:05 pm)

John, wasn’t it you who criticized other libraries for paying too much attention to the performance of very uncommon uses of query selectors? Are you now seeking a means to rebut them through this new API?

Although selector performance may be an issue in some cases, I’d say that focussing on something that is usually not an issue at all on common pages, or focussing on core JS performance in general when it’s not an issue compared to the real performance bottleneck (the rendering of DOM changes), is just not very useful.

I’m working on some benchmarks that test actual browser dynamic-render-performance; if you have some ideas to share I’m all ears.

Rasmus (May 6, 2008 at 8:17 am)

The naming seems a bit Monty Pythonesque:

“And now for something completely different…”

Given getElementsByTagName and getElementById you’d think something like getElements and getElement would be next.

“querySelectorAll” doesn’t even make sense; 3 words thrown together! When I first came across getElementsByTagName I thought “what a long-winded method name” – but I immediately understood what it was for!

Naming a method should optimally result in a short descriptive name (in a perfect world). There are lots of examples of short abstractly named methods out there and examples of long paint-me-a-picture named methods. But why would you choose the worst of both worlds – short and abstract?

It’s like getting a Porsche – but pink and brown striped… but it’s a Porsche though. ;)

Asbjørn Ulsberg (May 8, 2008 at 8:19 am)

Rasmus, I completely agree. querySelectorAll is a horrible name. But it was discussed for a long while, and finally an internal, W3C-member-only vote was cast which declared querySelectorAll the winner. I don’t think it won by a large margin; it was more the option that got the fewest negative votes.

I also agree with your comments, John. The only way I think querySelectorAll makes sense when rooted at an Element is how it’s implemented in jQuery. The algorithm proposed by W3C isn’t consistent with anything deployed in the wild, and not even with W3C’s own CSS selector proposals. Thus, foo.querySelectorAll('span a') should only return a elements that are descendants of span elements that are descendants of the element foo.

How this is implemented in browsers is pretty uninteresting and whether this is less optimal than the existing CSS algorithms (that I presume are all rooted in the document object before evaluation) is of less importance than whether it will be faster than a JavaScript-only solution. As long as it’s faster than what we have, who cares whether it’s 53ms slower than the optimal implementation or not?
