Categories
Uncategorized

No escape() from JavaScript

A couple of days ago, we got caught out by a few encoding issues in a site at $WORK. The Perl related ones were fairly self explanatory and I’d seen before (e.g. not calling decode_utf8() on the query string parameters). But the JavaScript part was new to me.

The problem was that we were using JavaScript to create an URL, but this wasn’t encoding some characters correctly. After a bit of investigation, the problem comes down to the difference between escape() and encodeURIComponent().

#escaper {
border: 1px dotted black;
text-align: center;
}
#escaper thead {
background: #eee;
}
#escaper thead th {
border-bottom: 1px solid black;
}
#escaper td, #escaper th {
padding: .25em;
}

input escape(…) encodeURIComponent(…)
a&b a%26b a%26b
1+2 1+2 1%2B2
café caf%E9 caf%C3%A9
Ādam %u0100dam %C4%80dam

The last is particularly troublesome, as no server I know of will support decoding that %u form.

The takeaway is that encodeURIComponent() always encodes as UTF-8 and doesn’t miss characters out. As far as I can see, this means you should simply never use escape(). Which is why I’ve asked Douglas Crockford to add it as a warning to JSLint.

Once we switched the site’s JavaScript from escape() to encodeURIComponent(), everything worked as expected.

2 replies on “No escape() from JavaScript”

FWIW, IntelliJ IDEA warns me of this problem (encode being deprecated). Consequently, I haven’t used encode() in a long time because of it.

Thanks — that’s useful to know, particularly now IntelliJ is freely available. But I really do feel that it should be part of JSLint as so many people rely on it…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s