No escape() from JavaScript

A couple of days ago, we got caught out by a few encoding issues in a site at $WORK. The Perl-related ones were fairly self-explanatory, and I’d seen them before (e.g. not calling decode_utf8() on the query string parameters). But the JavaScript part was new to me.

The problem was that we were using JavaScript to create a URL, but it wasn’t encoding some characters correctly. After a bit of investigation, the problem came down to the difference between escape() and encodeURIComponent().

input   escape(…)    encodeURIComponent(…)
a&b     a%26b        a%26b
1+2     1+2          1%2B2
café    caf%E9       caf%C3%A9
Ādam    %u0100dam    %C4%80dam
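To check the table above for yourself, here is a small sketch that runs both functions over the same four inputs. The variable names (inputs, rows) are mine, purely for illustration; escape() is the legacy global that browsers and Node.js still ship for compatibility.

```javascript
// Inputs taken from the table above.
const inputs = ["a&b", "1+2", "café", "Ādam"];

// Collect both encodings for each input.
const rows = inputs.map((input) => ({
  input,
  escaped: escape(input),             // legacy global, kept only for compatibility
  encoded: encodeURIComponent(input), // percent-encodes the UTF-8 bytes
}));

// Print one tab-separated line per input: input, escape(), encodeURIComponent().
for (const row of rows) {
  console.log(`${row.input}\t${row.escaped}\t${row.encoded}`);
}
```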

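Note that escape() leaves the + alone, which most servers will then decode as a space in a query string, and that it encodes é as a single Latin-1 byte rather than as UTF-8. The last row is particularly troublesome, as no server I know of supports decoding that %u form.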
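As a rough illustration of what a strict UTF-8 decoder makes of that output, the sketch below uses JavaScript’s own decodeURIComponent() as a stand-in for the server side. That is an assumption on my part: real servers vary, and some will pass the bad sequences through or garble them rather than raise an error. The tryDecode() helper is hypothetical.

```javascript
// Hypothetical helper: decode a percent-encoded string as UTF-8,
// returning null instead of throwing when the input is malformed.
function tryDecode(encoded) {
  try {
    return decodeURIComponent(encoded);
  } catch (err) {
    if (err instanceof URIError) return null;
    throw err; // anything else is unexpected, so re-throw it
  }
}

const fromEscapeU = tryDecode("%u0100dam"); // %u is not valid percent-encoding
const fromEscapeLatin1 = tryDecode("caf%E9"); // a lone 0xE9 byte is not valid UTF-8
const fromEncode = tryDecode("%C4%80dam");   // well-formed UTF-8

console.log(fromEscapeU, fromEscapeLatin1, fromEncode);
```

Only the encodeURIComponent() output survives the round trip; the two escape() outputs are rejected.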

The takeaway is that encodeURIComponent() always encodes as UTF-8 and doesn’t leave characters like + unencoded. As far as I can see, this means you should simply never use escape(), which is why I’ve asked Douglas Crockford to add it as a warning to JSLint.

Once we switched the site’s JavaScript from escape() to encodeURIComponent(), everything worked as expected.
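For what it’s worth, the shape of the fix looks something like the sketch below. This is not our actual site code: buildUrl(), the example.com address and the parameter names are all made up for illustration. The point is simply that every key and value goes through encodeURIComponent() before being joined into the query string.

```javascript
// Hypothetical helper: append query parameters to a base URL,
// percent-encoding every key and value as UTF-8.
function buildUrl(base, params) {
  const query = Object.entries(params)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  // Only add the "?" when there is at least one parameter.
  return query ? `${base}?${query}` : base;
}

const url = buildUrl("https://example.com/search", { q: "1+2", name: "Ādam" });
console.log(url);
```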

Comments 2

  1. David Smiley wrote:

    FWIW, IntelliJ IDEA warns me of this problem (escape() being deprecated). Consequently, I haven’t used escape() in a long time because of it.

    Posted 22 Oct 2009 at 05:10
  2. dom wrote:

    Thanks — that’s useful to know, particularly now IntelliJ is freely available. But I really do feel that it should be part of JSLint as so many people rely on it…

    Posted 22 Oct 2009 at 11:36