Thursday, September 3, 2009

Gmail for Mobile HTML5 Series: Reducing Startup Latency

On April 7th, Google launched a new version of Gmail for mobile for iPhone and Android-powered devices. We shared the behind-the-scenes story through this blog and decided to share more of what we've learned in a brief series of follow-up blog posts. This week, I'll talk about how modularization can be used to greatly reduce the startup latency of a web app.

To a user, the startup latency of an HTML 5 based application is critical. It is their first impression of the application's performance. If it's really slow, they might not even bother to wait for the app to load before navigating away. Even if your application is blazing fast after it loads, the user may never get the chance to experience it.

There are several aspects of an HTML 5 based application that contribute to startup latency:
  1. Network time to fetch the application (JavaScript + HTML)
  2. JavaScript parse time
  3. Code execution time to fetch the data and render the home page of your application
The third issue is up to you! The first two issues, however, are directly correlated with the size of the application. This is a tricky problem since as your application matures, it will have more features and the code size will get bigger. So, what to do? Modularize your application! Split up your code into independent, standalone modules. Consider splitting each view/screen of your application and implement each new feature as its own module. This is only half the story. Now that you have your code modularized, you need to decide which subset of these modules are critical to load your application's home page. All the non-core modules should be downloaded and parsed at a later time. With a consistent code size for your startup code, you can maintain a consistent startup time. Now, let's go into some nitty gritty details of how we built an application with lazy-loaded modules.

How to Split Your Code into Modules

Splitting an application into individual modules might not be as simple as you think. Code that serves a common purpose/functionality should be grouped together and form a module (comparable to a library). As mentioned earlier, we selected which modules are critical to the home page of the app and which modules can be lazy-loaded at a later time. Let's use a Weather application as an example:

High Level Functionality:
  • A "Weather in my Favourite Cities" home page
  • Click on a city to view the cities entire week forecast
  • Weather data comes from an external web service
Possible Module Separation:
  • Weather data model
  • Weather web service API
  • Common UI widgets (buttons, toolbars, navigation, etc)
  • Favourite Cities page
  • City Weather Forecast page
Now let's say your users want a "breaking news" feature. No problem: just put the page, the news data API and the data model into a new module.

One thing to keep in mind is the dependency order of your modules. For modules that have many downstream dependencies, it might make sense to include them as part of the core modules.

How to Lazy Load the Modules

Option 1: Script as DOM

This method uses JavaScript to insert SCRIPT tags into the HEAD's DOM.
<script type="text/JavaScript">
  function loadFile(url) {
    var script = document.createElement('SCRIPT');
    script.src = url;
    document.getElementsByTagName('HEAD')[0].appendChild(script);
  }
</script>
Option 2: XmlHttpRequest (XHR)

This method sets up XmlHttpRequests to retrieve the JavaScript . The returned string should be evaluated in the XHR callbacks (using the eval(string) method). This method is a little more complicated but it gives you more control over error handling.
<script type="text/JavaScript">
  function loadFile(url) {
     function callback() {
      if (req.readyState == 4) { // 4 = Loaded
        if (req.status == 200) {
          eval(req.responseText);
        } else {
          // Error
        }
      }
    };
    var req = new XMLHttpRequest();
    req.onreadystatechange = callback;
    req.open("GET", url, true);
    req.send("");
  }
</script>
The next question is, when to lazy load the modules? One strategy is to lazy load the modules in the background once the home page has been loaded. This approach has some drawbacks. First, JavaScript execution in the browser is single threaded. So while you are loading the modules in the background, the rest of your app becomes non-responsive to user actions while the modules load. Second, it's very difficult to decide when, and in what order, to load the modules. What if a user tries to access a feature/page you have yet to lazy load in the background? A better strategy is to associate the loading of a module with a user's action. Typically, user actions are associated with an invocation of an asynchronous function (for example, an onclick handler). This is the perfect time for you to lazy load the module since the code will have to be fetched over the network. If mobile networks are slow, you can adopt a strategy where you prefetch the code of the modules in advance and keep them stored in the javascript heap. Only then parse and load the corresponding module on user action. One word of caution is that you should make sure your prefetching strategy doesn't impact the user's experience - for example, don't prefetch all the modules while you are fetching user data. Remember, dividing up the latency has far better for users than bunching it all together during startup.

For an HTML 5 application that takes advantage of the application cache to reduce startup latency and to serve the application offline, there are a few caveats one should be aware of. Mobile networks have decent bandwidth, but poor round trip latency, so listing each module as a separate resource in the manifest incurs quite a bit of extra startup latency when the application cache is empty. Also, if one of the module resources fails to be downloaded by the application cache (e.g. disconnected from network), additional error handling code needs to be written to handle such a case. Finally, applications today have no control when the application cache decides to download the resources in the manifest (such a feature is not defined in the current specification of the draft standard). Typically, resources are downloaded once the main page is loaded, but that's not an ideal time since that's when the application requests user data.

To work-around these caveats, we found a trick that allows you to bundle all of your modules into a single resource without having to parse any of the JavaScript. Of course, with this strategy, there is greater latency with the initial download of the single resource (since it has all your JavaScript modules), but once the resource is stored in the browser's application cache, this issue becomes much less of a factor.

To combine all modules into a single resource, we wrote each module into a separate script tag and hid the code inside a comment block (/* */). When the resource first loads, none of the code is parsed since it is commented out. To load a module, find the DOM element for the corresponding script tag, strip out the comment block, and eval() the code. If the web app supports XHTML, this trick is even more elegant as the modules can be hidden inside a CDATA tag instead of a script tag. An added bonus is the ability to lazy load your modules synchronously since there's no longer a need to fetch the modules asynchronously over the network.

On an iPhone 2.2 device, 200k of JavaScript held within a block comment adds 240ms during page load, whereas 200k of JavaScript that is parsed during page load added 2600 ms. That's more than a 10x reduction in startup latency by eliminating 200k of unneeded JavaScript during page load! Take a look at the code sample below to see how this is done.
<html>
...
<script id="lazy">
// Make sure you strip out (or replace) comment blocks in your JavaScript first.
/*
JavaScript of lazy module
*/
</script>

<script>
  function lazyLoad() {
    var lazyElement = document.getElementById('lazy');
    var lazyElementBody = lazyElement.innerHTML;
    var jsCode = stripOutCommentBlock(lazyElementBody);
    eval(jsCode);
  }
</script>

<div onclick=lazyLoad()> Lazy Load </div>
</html>
In the future, we hope that the HTML5 standard will allow more control over when the application cache should download resources in the manifest, since using comments to pass along code is not elegant but worked nicely for us. In addition, the snippets of code are not meant to be a reference implementation and one should consider many additional optimizations such as stripping white space and compiling the JavaScript to make its parsing and execution faster. To learn more about web performance, get tips and tricks to improve the speed of your web applications and to download tools, please visit http://code.google.com/speed.

Previous posts from Gmail for Mobile HTML5 Series

HTML5 and Webkit pave the way for mobile web applications
Using AppCache to launch offline - Part 1
Using AppCache to launch offline - Part 2
Using AppCache to launch offline - Part 3
A Common API for Web Storage
Suggestions for better performance
Cache pattern for offline HTML5 web application


No comments:

Post a Comment