Closures

One of the most common questions asked on StackOverflow is about closures.

The posted code looks like this:

for (var i = 0; i < 5; i += 1) {
    document.getElementById("button" + i).onclick = function (event) {
        alert(i);
    };
}

And the question is: why does it alert 5 for all the buttons?

The simple answer is: you must bind the value of i to the onclick function, since we’re in fact creating 5 functions. Because we don’t bind the value of i, they all share the same variable – and by the time an onclick handler runs its value is 5.

for (var i = 0; i < 5; i += 1) {
    document.getElementById("button" + i).onclick = (function (bound_i) {
        return function (event) {
            alert(bound_i);
        };
    }(i));
}

With this anonymous function creator we bind the value of i. As you can see, it doesn’t get in the way of the event argument.

If you use a framework that has a map or each function I recommend you use that instead:

// array_map here stands for your framework's map/each helper,
// iterating over an array of button ids
array_map(buttons, function (button_id, index) {
    document.getElementById(button_id).onclick = function () {
        alert(index);
    };
});

The function creator is now implicit with the help of the map function.

Error handling in JavaScript: a better way

Let’s not avoid the elephant in the room: error handling in JavaScript is quite abysmal. I’ll try to outline what is wrong with it and present a better way.

The largest problem by far is that it’s hard to find a good example of how to do error handling in JavaScript. If you search for it you’ll probably find something to do with the window.onerror event or a script called stacktrace.js.

window.onerror

onerror is an event on the global object (window). Any uncaught error will appear as an onerror event. onerror has some very large problems:

  • onerror doesn’t give you much information about the error. It gives you the message, the file name and the line number. It doesn’t give you a stack, and it doesn’t give you any kind of context (see the sketch after this list).
  • onerror triggers for all JavaScript errors occurring in that window. If a visitor of your site uses a browser plugin that runs some JavaScript which throws an error, you will get that error in onerror. If you’re using a 3rd party script, such as an analytics tool or an ad supplier, you will get their uncaught errors too. There are of course ways to filter these errors out, but that’s not a very robust way of dealing with the problem.
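
For reference, here is a minimal handler sketch showing the little you get:

window.onerror = function (message, filename, lineno) {
    // that is all you receive: a message string, a file name and a line number
    console.log(message, filename, lineno);
    return true; // returning true suppresses the browser's default reporting
};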

These are strong indications that onerror is not the tool you should be using.

stacktrace.js

stacktrace.js takes a different approach to error handling. It doesn’t use onerror; instead it uses the try-catch language feature. It tries to build a stacktrace using a very crude form of reflection: it walks the caller/callee chain and toStrings each function to find its name and signature.
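
To get a feel for how crude that is, here is a sketch of the technique (a hypothetical helper; it fails in strict mode and loops forever on recursion):

function crude_stack() {
    var frames = [];
    var fn = arguments.callee.caller; // walk up the chain of calling functions
    while (fn) {
        frames.push(fn.toString().split("{")[0]); // e.g. "function some_name(a, b) "
        fn = fn.caller;
    }
    return frames;
}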

Again, this is a strong indication that we’re not using the right tool.


So what is the right tool? What are we missing?

We’re missing proper errors. “Exceptions” as in Java and PHP are pretty good; can we have those? Yes, actually, we can. JavaScript has the Error class. Let me explain what it is and why you should use it.

In JavaScript you can throw any value; there is no native concept like Java’s “Throwable” interface. That’s not necessarily a good thing. This is one of those cases where too much freedom can be a bad thing. I don’t necessarily believe JavaScript should enforce this, but you should at least enforce it yourself. My recommendation is: always throw things that are Errors, or that inherit from Error.

throw "Unexpected input"; // bad
throw new Error("Unexpected input"); // good

Inheritance in JavaScript is terribly messy, so use a framework and get it out of the way. Inheriting from the Error class is a good idea if you want to add more information to your errors, or if you want to distinguish between different types. Example:

// `extend` is your framework's inheritance helper (a minimal sketch follows below)
var ContextError = extend(Error, function (message, context) {
    this.message = message;
    this.context = context;
});
ContextError.prototype.getContext = function () {
    return this.context;
};

try {
    try {
        throw new ContextError("foo", { bar: "baz" });
    } catch (e) {
        if (e instanceof ContextError) {
            console.log(e.message, e.getContext());
        } else {
            throw e; // rethrow
        }
    }
} catch (e) {
    console.log(e.message);
}

It’s not rocket science. This is what you should be doing.
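
The extend helper used above is assumed to come from your framework of choice; a minimal ES5 sketch might look like this:

// minimal sketch of an `extend` helper; real frameworks do more
function extend(Parent, constructor) {
    constructor.prototype = Object.create(Parent.prototype); // set up the chain
    constructor.prototype.constructor = constructor;
    return constructor;
}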

Another massive advantage of using Errors is that browser runtimes are starting to augment the Error class with all the tools we want. Errors have a stack property in modern browsers – I’ve confirmed it exists in Chrome, Firefox and IE10. This is a huge boon. Generating a readable stacktrace from the stack property is fairly trivial. I expect support for the Error class to grow in the future.
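
For example (a sketch; the exact format of the stack string differs per browser):

try {
    some_undefined_function(); // throws a ReferenceError
} catch (e) {
    console.log(e.stack); // a readable, multi-line stacktrace
}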

Chrome/v8

Chrome’s v8 engine goes one step further and tries to give you even better stacktraces. Error.prepareStackTrace allows you to specify what the Error.stack property will look like.

Error.prepareStackTrace = function (error, stack) {
	return stack; // return the raw CallSite objects instead of a formatted string
};

This gives you access to the full StackFrame objects that v8 supplies: http://code.google.com/p/v8/wiki/JavaScriptStackTraceApi
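
For instance, here is a sketch that formats each frame as “name (file:line)” using the CallSite methods documented there:

Error.prepareStackTrace = function (error, stack) {
    return stack.map(function (frame) {
        return frame.getFunctionName() + " (" +
            frame.getFileName() + ":" + frame.getLineNumber() + ")";
    }).join("\n");
};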

By default Chrome limits the length of the stacktrace to 10. You can increase the limit by:

Error.stackTraceLimit = 50;

There is one more thing I want to share.

If you’re using asynchronous functions and events (which you should), you’ll notice that errors thrown in other events aren’t caught by your try-catch, and when you do catch them your stacktraces are cut off at the start of each event. This is a limitation in the way try-catch works: it doesn’t automatically cooperate with a language that has an event loop. If you do the following:

try {
    document.getElementById("a_link").onclick = function () {
        throw new Error("catch me if you can (you can't)");
    };
} catch (e) {
    console.log(e); // :<
}

The error thrown in the click event will not be caught by that try-catch statement. This is because it occurs in a different event.

My recommendation: wrap all event handlers in your own try-catch error handler. I’ve found there are actually only 3 places where new events originate: setTimeout, setInterval and addEventListener.

function handle_error(error) {
    console.log("Gotcha: ", error.message);
}
function on (eventname, object, handler) {
    object.addEventListener(eventname, function (event) {
        try {
            handler(event);
        } catch (e) {
            handle_error(e);
        }
    });
}
on("click", document.getElementById("a_link"), function () {
    throw new Error("catch me if you can");
});

This allows you to catch all errors, but what about getting the stack from the parent event?

I have a solution for this, but it’s not a perfect solution: it leaks memory. The more events you register and the ‘deeper’ you go, the more memory it needs. It does work though, and luckily in most cases the memory requirement doesn’t cause any issues. That is the disclaimer; here is the code:

var wrap_try_catch = (function () {
	var exception_stack = [];
	return function (func) {
		var live_exception_stack = exception_stack.concat([]);    // clone exception_stack
		live_exception_stack.push(new Error("Capturing context before asynchronous call"));
		return function try_catch() {
			exception_stack = live_exception_stack;
			try {
				return func.apply(this, arguments);
			} catch (e) {
				var exception;
				if (typeof e === "string") {
					exception = new Error("String thrown: `" + e + "` throw a 'new Error' to get a better stacktrace");
				} else {
					exception = e;
				}
				handle_exception(exception_stack.concat([ exception ]));
				throw e;
			}
		};
	};
}());

// Usage:

function on(eventname, object, handler) {
    object.addEventListener(eventname, wrap_try_catch(handler));
}

wrap_try_catch makes an Error when you call it, so you have access to a stack up to that point. For each new event it adds another Error onto the stack. When an Error occurs it combines that Error with the stack of Errors it has collected, which gives you a full stacktrace. Again, this leaks memory of course, because it has to keep track of all these Error objects. It’s also not super fast, as making an Error object is non-trivial (it takes some time). I’ve found no other, faster way to get a stack.
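
The handle_exception function in the code above is one you supply yourself; a hypothetical version that stitches the collected stacks together might look like this:

// hypothetical handler: print one stitched stacktrace across event boundaries
function handle_exception(errors) {
    var trace = errors.map(function (error) {
        return error.stack;
    }).join("\n--- event boundary ---\n");
    console.log(trace);
}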

Cross-site requests aka Cross-Origin Resource Sharing (CORS)

Everybody knows you can’t do AJAX requests to other domains .. well actually, you can! This is where CORS (Cross-Origin Resource Sharing) comes in.

Normally, when you try to do a cross-domain request your browser will block it and throw an Error instead. This restriction is voluntarily imposed by the browser: technically the request could be made, but it isn’t allowed because of the potential for abuse, and operating on a whitelist basis is easier to secure. By sending the right headers you can convince the browser that: yes indeed, I am allowed to make this request.

Before a cross-domain AJAX request is made (at least, any request that isn’t a “simple” GET or POST) your browser will initiate a pre-flight OPTIONS request. This request is part of the HTTP standard (the foresight these guys had was amazing). If the reply is favorable (i.e. the right headers are present) it will then proceed to do the actual AJAX request.
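
A simplified sketch of that exchange (headers abridged):

OPTIONS /api HTTP/1.1
Host: webservice.com
Origin: http://my.domain.com
Access-Control-Request-Method: POST

HTTP/1.1 200 OK
Access-Control-Allow-Origin: http://my.domain.com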

Let’s say your webservice is located at webservice.com/api and you are calling it from my.domain.com. The header you should emit on webservice.com/api is:

Access-Control-Allow-Origin: http://my.domain.com

Note: Keep in mind that http and https are considered to be different domains.

Note: If you are using https make sure your certificate is valid. That is, it’s signed and verified, or you’ve added it to your local whitelist. If not, the pre-flight request will silently fail.

Implementation

The first thing you might think of is emitting a header that just whitelists everything:

    header("Access-Control-Allow-Origin: *");

But do you really want to allow everything? No, of course not. You want a whitelist:

    header("Access-Control-Allow-Origin: http://my.domain.com https://my.domain.com http://my.otherdomain.com");

But now anyone doing an OPTIONS request can see which domains we support, i.e. our whole whitelist. And as the whitelist grows, the header gets very, very long. Worse: in practice browsers only accept a single origin (or *) in this header, so a space-separated list won’t work reliably anyway. We can do better!

Look at the domain that’s calling, and emit just that domain:


$allowed_domains = array("http://my.domain.com", "https://my.domain.com", "http://my.otherdomain.com");
$calling_domain = get_calling_domain($_SERVER); // e.g. reads $_SERVER["HTTP_ORIGIN"]
if (in_array($calling_domain, $allowed_domains)) {
    // emit this on every response; the actual request needs the header too
    header("Access-Control-Allow-Origin: " . $calling_domain);
}
if ($_SERVER["REQUEST_METHOD"] === "OPTIONS") { // special CORS track
    exit; // no need to do anything else for an OPTIONS request
}

The browser just uses the OPTIONS request to check the capabilities of your server; you don’t actually need to return any content.

Headers

Another useful header is:

Access-Control-Allow-Headers: Content-Type, X-Custom-Header

This allows the client to pass extra headers. Cross-site AJAX requests are very limited by default: if you want to send even a simple Content-Type header (like application/json) you need to explicitly whitelist it. You can send any header you want, but you must whitelist it. Content-Type is one you’ll want if you’re doing something with JSON or content that isn’t text/plain.
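
For example, once Content-Type is whitelisted, a cross-site JSON POST works; a sketch, assuming the hypothetical endpoint from before:

var request = new XMLHttpRequest();
request.open("POST", "http://webservice.com/api", true);
request.setRequestHeader("Content-Type", "application/json"); // whitelisted above
request.send(JSON.stringify({ foo: "bar" }));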

Cookies

If you have authentication on your webservice you’ll need to send some cookies, or at least a session identifier. You must specify this as well.

To allow sending of cookies emit the header:

Access-Control-Allow-Credentials: true

Additionally, in your XMLHttpRequest (JavaScript, on the calling domain) you must set:

xml_http_request.withCredentials = true;

This will allow you to use sessions and cookies. Note that when you allow credentials, Access-Control-Allow-Origin must name a specific origin; the wildcard * is not accepted.
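
A minimal client-side sketch, again with the hypothetical endpoint:

var xml_http_request = new XMLHttpRequest();
xml_http_request.open("GET", "http://webservice.com/api", true);
xml_http_request.withCredentials = true; // include cookies in the request
xml_http_request.onload = function () {
    console.log(xml_http_request.responseText);
};
xml_http_request.send();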

Note: these cookies are considered third party cookies. If your visitors have disabled third party cookies this approach won’t work, and you must first coax the user into allowing third party cookies for your domain.

CORS works with XMLHttpRequest in the latest versions of Firefox and Chrome, and in Internet Explorer since version 8 (via XDomainRequest). The old ActiveXObject and XDomainRequest don’t support extra headers and cookies. JSONP is a decent fallback in this case, but beware of its limitations.

How to cancel events in JavaScript

If you’re writing client-heavy interactive JavaScript applications you’ll have to deal with events a lot. Most people who have come this far will have learned that you can return false at the end of an event handler to cancel it. But this is not the best way to go about it.

Sample code:

function my_onclick (event) {
	do_some_processing();
	return false;
}

The largest downside is: if an Error occurs in your event handler, the function will not return but throw instead. This means your event is not cancelled. This is particularly embarrassing if the default action is something that you don’t want to happen, such as a submit on a form with an invalid action. You could surround your code with a try-catch statement, but since there is a better solution, let’s not get into that.

Another disadvantage is that return false is not equivalent to event.preventDefault(); it also triggers event.stopPropagation(). Most of the time you only need event.preventDefault(). It’s worth noting that some event-related features cannot be built if you don’t distinguish between these two (see below).

So how do you do it?

Well, you call the right function for the right job. You can use event.preventDefault() to cancel the default behaviour. Prevent a page change when you click on a link. Prevent a submit on a form (AJAX anyone?). Prevent a focus on an input. Prevent a check on a checkbox.
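
For example, cancelling a form submit so you can handle it with AJAX instead (hypothetical element id):

document.getElementById("my_form").addEventListener("submit", function (event) {
    event.preventDefault(); // the browser will not perform the normal submit
    // ... do the AJAX submit here instead ...
});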

You can use event.stopPropagation() to trap events; to prevent them from going up the DOM-tree. For instance: you have a tooltip that you want to close when you click anywhere on the page, except when you click on the tooltip itself. Add an onclick handler that does an event.stopPropagation() on the element that contains the tooltip and you’re done. This doesn’t break any functionality: links still work in the tooltip, and so do other things with click events, such as input elements. I bet you didn’t think it was that simple, huh? Well, it is. This is why events are awesome.
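
A sketch of that tooltip setup (hypothetical element id and close function):

// close the tooltip on any click that bubbles up to the document...
document.addEventListener("click", function () {
    close_tooltip(); // hypothetical close function
});
// ...except clicks inside the tooltip container, which never bubble past it
document.getElementById("tooltip").addEventListener("click", function (event) {
    event.stopPropagation();
});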

Where did this come from? Why aren’t people just calling the right function? Honestly this isn’t so complex.

I think this might stem from a browser incompatibility: Internet Explorer has event.cancelBubble where other browsers use event.stopPropagation, while return false works in all browsers.

foreach in JavaScript

I assume you know about JavaScript’s for (key in object) loop and the hasOwnProperty function.

hasOwnProperty filters out inherited properties. You might write your for-in loop like this:

var my_obj = {
	foo: "bar"
};

function my_map(my_obj) {
	var key;
	for (key in my_obj) {
		if (my_obj.hasOwnProperty(key)) {
			console.log(key, my_obj[key]);
		}
	}
}

But there is a case that will fail. Can you see it? What do you think happens if I give it this object?

var my_obj = {
	hasOwnProperty: function () {
		return false;
	},
	"nothing to see here": "move along"
};

The object’s own hasOwnProperty shadows the one inherited from Object.prototype, so the check always returns false and nothing is logged. So what’s the solution?

function my_map(my_obj) {
	var key;
	for (key in my_obj) {
		if (Object.prototype.hasOwnProperty.call(my_obj, key)) {
			console.log(key, my_obj[key]);
		}
	}
}

Now isn’t that cool?

Readability vs Performance

A question that pops up often on StackOverflow is: which method is better, x or y?

For example, which is better?

lazy-or

if ($a === "foo" || $a === "bar" || $a === "baz") {
    do_something();
} else {
    do_something_else();
}

in_array

if (in_array($a, array("foo", "bar", "baz"))) {
    do_something();
} else {
    do_something_else();
}

There are several ways to approach this. A clever programmer might say: faster is better, right? Let’s benchmark! So he writes a script to accurately time the difference.
That’s not as easy as it may seem. The performance characteristics vary case by case.
What if you have not 3 but 10 values? What if you have 100? What if the first case is the most likely one and the other 2 only rarely get hit?

Let’s take a step back. In this example, does it really matter which method is faster? I would argue: no. Imagine your program has a bottleneck and you’re tasked with finding and resolving it. If you saw either of these code samples, would you bother checking whether the bottleneck is here? No, of course not. If you saw this code and you had to add a value, would you take the time to convert one form to the other? Probably not (I wouldn’t).

That is why in almost all cases performance doesn’t matter. If your input is really, really small (fewer than 10 items) don’t bother with optimization.
What you should bother with is readability. Which of these methods is more readable? I would say, in this case, it doesn’t really matter. If the list grows to 4 or 5 values then maybe the in_array one is preferred, because it fits on one line. If your list of allowed values is dynamic, because of a clever algorithm, then sure, go for the in_array variant. If you see the lazy-or variant and you need to add a 4th case, I wouldn’t object if you added another ||.

Most of the arguments I use come from experience: seeing a lot of code, seeing code that works and code that doesn’t. The most common way to get a bug is to write code that is easy to misunderstand or misuse. Even as a novice programmer you know this; what you don’t know is whether the code you’re currently writing is readable or not. You haven’t yet learned the warning signs of smelly code. A general rule of thumb: simpler is better. Simpler code is easier to write, easier to read, easier to debug, and it probably performs really well too.