JavaScript: Asynchronous abortable long running tasks

Each browser window has a single thread which it uses to render the page, process events and execute JavaScript code. So when programming in JavaScript, you’re pretty much stuck in a single-threaded environment.

This might sound surprising considering that everybody is using AJAX nowadays, which is short for Asynchronous JavaScript + XML. And we all know that pages using AJAX let you keep working on the page while the AJAX calls are executed.

What happens is that the server calls run in the background, but as soon as you process the results, you’re back on the single per-window thread. This means that you will only be able to process the results of the AJAX call when this thread is not busy doing something else. It also means that even though the callback is executed asynchronously, it will block the thread: if what you process in there takes too long, the UI will become unresponsive.

[Figure: browser warning dialog about unresponsive JavaScript]

This all works with a queue. Every piece of JavaScript code is executed on the single thread. If other pieces of code have to be executed during this time, they are queued and run once the thread is free. This applies to AJAX callbacks, event listeners and code scheduled for execution using setTimeout.
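To make the queueing concrete, here is a tiny model of that mechanism (a sketch of the idea, not the browser’s actual implementation):

```javascript
// A minimal model of the browser's task queue: a single thread runs tasks
// strictly in order, and nothing queued runs before the current script ends.
var queue = [];
function schedule(task) { queue.push(task); }   // roughly what setTimeout(fn, 0) does
function runEventLoop() {                       // the single thread draining the queue
    while (queue.length > 0) {
        queue.shift()();
    }
}

var log = [];
schedule(function () { log.push('ajax callback'); });
schedule(function () { log.push('click listener'); });
log.push('current script');   // the running script always finishes first
runEventLoop();
// log is now: ['current script', 'ajax callback', 'click listener']
```

Whatever the current script does, the queued callbacks only run once it has finished, which is exactly why a long-running callback freezes the UI.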

So what do we do if we need to process large amounts of data or perform some CPU-intensive processing and still do not want the UI to freeze? There are basically three types of solutions.

Perform heavy processing on the server

Since you can call the server and do something else while the browser waits for a response, the obvious choice is to move as much of the heavy processing to the server. The programming language you use for your server-side logic is probably able to perform this processing just as well as JavaScript would anyway. This is the solution you should use in all scenarios where it is possible. Unfortunately there are a few classes of problems which cannot be solved by moving processing from the client to the server:

  1. You are in an environment where the server has limited resources and you want to take advantage of client resources.
  2. Moving data to the server for processing and back to the client is prohibitively expensive.
  3. The server cannot perform this kind of processing because you need access to something which is only available on the client.

The use case I just had is related to point 3. My heavy processing involved DOM manipulation, cloning many HTML elements on the fly and setting many listeners. This is typically something you cannot do on the server because it has no access to the DOM on the client.

Using Web Workers

Web Workers allow you to run scripts in background threads. They run in an isolated OS-level thread. Isolated means here that they can only communicate with our single thread through message passing, exchanging serialized objects.

In order to avoid concurrency problems, Web Workers have no access to non-thread-safe components or the DOM.

I’ll only show a simple example of how to work with dedicated Web Workers. For more information, just ask Google.

Creating a web worker is very easy:

var worker = new Worker('myscript.js');

Normally, once the worker is created, you will start it by posting a message to it (the message argument is required, even if the worker ignores its content):

worker.postMessage('start');

postMessage() is the mechanism used to communicate from your code to the worker. You can use it to send data to your worker:

worker.postMessage('How are you?');

For the communication from the worker back to your code, you have to register a listener (callback) which will be called when the worker posts a message:

worker.addEventListener('message', function(e) {
    console.log('Got a message from the worker: ', e.data);
}, false);

From the worker side, it works the same way. The worker can trigger your listener by using postMessage and receive the messages you’ve sent by registering a listener.
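For illustration, the worker-side script (the hypothetical myscript.js from above) might look like this; inside a worker, self refers to the worker’s own global scope:

```javascript
// myscript.js — a sketch of the script running inside the worker.
// Inside a worker, `self` is the worker's global scope; there is no DOM.

// The actual work, kept in a plain function.
function handleMessage(data) {
    return 'Worker received: ' + data;
}

// Wire the function up when actually running inside a worker.
if (typeof self !== 'undefined' && typeof self.postMessage === 'function') {
    self.addEventListener('message', function (e) {
        self.postMessage(handleMessage(e.data));   // reply to the page
    }, false);
}
```

The guard around the wiring is only there so the processing function can also be exercised outside a worker; in a real worker script you would register the listener unconditionally.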

Of course, in most cases the data posted in the message will not be simple strings like “How are you?” but structured JavaScript objects (for example following a Command pattern).
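Such a command-style protocol could look like this; the { command, payload } shape is my own convention, not part of any API:

```javascript
// Sketch of a command-style message protocol between the page and the worker.
// The { command, payload } shape is an assumption chosen for this example.
function makeMessage(command, payload) {
    return { command: command, payload: payload };
}

// Worker-side dispatcher: picks the processing logic based on the command.
function dispatch(message) {
    switch (message.command) {
        case 'sum':
            return message.payload.reduce(function (a, b) { return a + b; }, 0);
        case 'count':
            return message.payload.length;
        default:
            throw new Error('Unknown command: ' + message.command);
    }
}

// On the page:    worker.postMessage(makeMessage('sum', [1, 2, 3]));
// In the worker:  self.postMessage(dispatch(e.data));
```

Since messages are serialized between the page and the worker, they can carry any structured-clonable object, which makes this kind of dispatch straightforward.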

So this is all great, but why did I mention that there is a third possibility if Web Workers provide all I need? Because in my scenario, it was not possible to use them. So when does that happen?

  1. First, Web Workers are relatively new and not supported by all browsers out there: Internet Explorer 8 and 9 do not support them. Unfortunately there are still a few companies which haven’t moved to newer and better browsers, so in some intranet scenarios, using Web Workers might not be an option.
  2. Second, and this is the reason why I couldn’t use them, Web Workers work in an isolated way and cannot perform any DOM manipulation. If your heavy processing is exactly that, then they won’t help you.

Splitting your processing in batches

In cases where you can neither move the processing to the server nor use Web Workers, you’ll have to split your processing into small batches and push each of these batches onto the processing queue. Each batch should be fast enough that it doesn’t block the single JavaScript thread for a noticeable time. This means that having a batch take 300 milliseconds or more is not an option. I actually try to keep the processing time for a batch around 50 milliseconds.

In my scenario, I had two processing steps which were both long running. Gathering a list for the first step didn’t take long, so this could be done immediately and the list for the second step was determined by the first processing step.

So these are the steps to be performed:

  1. Fetch data with an AJAX call to the server
  2. Process the data in a first step and gather a list for the second step
  3. Process the list in a second step

So the code for the first step looks like this:

function startProcessing() {
    if (window.firstStepAjax != null) {
        window.firstStepAjax.abort();
    }
    window.firstStepAjax = $.ajax({
        type: "POST",
        url: "xxx"
    }).done(function(data) {
        window.firstStepAjax = null;
        window.firstStepInput = data.GetObjectResult;
        if (window.firstStepTimer != null) {
            clearTimeout(window.firstStepTimer);
        }
        window.firstStepTimer = setTimeout(processFirstStep, 0);
    });
    return false;
}

First we abort any existing AJAX call. If there is already a call going on, it means that the user changed some parameters before the call finished, so we can discard the previous one.

Then we make the AJAX call and register a callback. In the callback we reset the reference to the AJAX call and get the data. The next step is to trigger the first processing step asynchronously, cancelling beforehand any step which might still be ongoing. We use a timeout of 0 milliseconds so that the processing starts as soon as this queue entry is reached. processFirstStep is a function performing the first processing step:

function processFirstStep() {
    for (var i = 0; i < 5; i++) {
        var data = window.firstStepInput.shift();
        if (data != null) {
            //process this piece of data
            //and gather a list of data to be processed in window.secondStepInput
        }
    }
    if (window.firstStepInput.length == 0) {
        if (window.secondStepTimer != -1) {
            clearTimeout(window.secondStepTimer);
        }
        window.secondStepTimer = setTimeout(processSecondStep, 0);
        window.firstStepTimer = null;
    } else {
        window.firstStepTimer = setTimeout(processFirstStep, 0);
    }
    return false;
}

Since I know that it takes around 7 milliseconds to process one entry in window.firstStepInput, I process 5 entries at a time, so that I only block the single JavaScript thread for about 35 milliseconds per step. Once the 5 entries are processed (if fewer than 5 entries are available, data will be null and nothing will happen), I check whether I’m done. If not, the next batch is queued using setTimeout. Otherwise, I stop a possible second step processing still running and trigger the second step processing.

The second step processing basically works the same way:

function processSecondStep() {
    for (var i = 0; i < 5; i++) {
        var data = window.secondStepInput.pop();
        if (data != null) {
            //do some processing
        }
    }
    if (window.secondStepInput.length == 0) {
        window.secondStepTimer = -1;
    } else {
        window.secondStepTimer = setTimeout(processSecondStep, 0);
    }
}

So using this mechanism, you can perform multiple steps of long running processing and abort each of them by using abort for AJAX calls and clearTimeout for functions.

Of course, you can only use this technique if you are able to split your processing into small enough batches that the JavaScript thread is only blocked for a short time.
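The pattern above can be generalized into a small reusable helper (a sketch; all names here are my own, not from the original code):

```javascript
// One batch: process up to batchSize items and report whether more remain.
function runBatch(items, processItem, batchSize) {
    for (var i = 0; i < batchSize && items.length > 0; i++) {
        processItem(items.shift());
    }
    return items.length > 0;
}

// Keeps scheduling batches with setTimeout until the list is empty,
// and returns a handle that can abort the whole run at any time.
function processInBatches(items, processItem, batchSize, onDone) {
    var timer = setTimeout(function step() {
        if (runBatch(items, processItem, batchSize)) {
            timer = setTimeout(step, 0);
        } else if (onDone) {
            onDone();
        }
    }, 0);
    return {
        abort: function () { clearTimeout(timer); }
    };
}

// Usage (hypothetical names):
//   var handle = processInBatches(bigList, handleItem, 5, allDoneCallback);
//   handle.abort();   // cancel, e.g. when the user changes parameters
```

Keeping the single-batch logic in its own function makes it easy to tune the batch size against the per-item processing time, as done with the 5-entry batches above.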

Conclusion

So depending on what kind of scenario you are handling and what’s required for your processing, you’ll have to choose between the three methods shown above:

  1. Move the processing to the server if possible
  2. Use Web Workers if you do not need to support Internet Explorer 9 or lower and your processing does not require DOM manipulation.
  3. Use batches triggered by setTimeout if you can split your processing into small batches.

If none of them apply, well I guess you’re doing something wrong 😉

4 thoughts on “JavaScript: Asynchronous abortable long running tasks”

  1. I was thinking of a do-while between the JS and the controller. Just send an AJAX request to the server, the server does partial work and then returns to the browser, which issues another AJAX call. Hence it becomes like a while loop.

    1. It all basically depends on what kind of logic you have in this long running task. If you can process a part on the server, then you can get some asynchronous behavior by issuing multiple calls to your controller on the server (since the network calls are indeed done in parallel). But e.g. in my case, most of the time is spent doing DOM manipulation (which doesn’t work on the server), and each step is pretty fast (less than 10 milliseconds) but has to be executed over a list of thousands of entries, so in such cases you have to be careful that the overhead introduced by the AJAX calls is not prohibitively high.

    1. I think a discussion about “real programming languages” would require an article of its own. But my point of view is that for each task, given some limiting factors (e.g. existing code, deployment, knowledge…), you end up having a limited choice of programming languages. When running in a browser, probably even more so. All programming languages are real. Each of them was developed to solve a specific class of problems. Using them to solve a different class of problems is wrong. Ruling out a programming language which would be a good fit is only OK if there are some limiting factors making it not a good fit after all. Ruling out a programming language because it’s not a real programming language sounds a little childish to me.
