Saturday, April 14, 2012

High Performance Webapps. Use Data URIs. Theory.

I continue to write tips for perfomance optimization of websites. The last post was about jQuery objects. This post is about data URIs. Data URIs are an interesting concept on the Web. Read "Data URIs explained" please if you don't know what it does mean. Data URIs are a technique for embedding resources as base 64 encoded data, avoiding the need for extra HTTP requests. It gives you the ability to embed files, especially images, inside of other files, especially CSS. Not only images are supported by data URIs, but embedded inline images are the most interesting part of this technique. This technique allows separate images to be fetched in a single HTTP request rather than multiple HTTP requests, what can be more efficient. Decreasing the number of requests results in better page performance. "Minimize HTTP requests" is actually the first rule of the "Yahoo! Exceptional Performance Best Practices", and it specifically mentions data URIs.

"Combining inline images into your (cached) stylesheets is a way to reduce HTTP requests and avoid increasing the size of your pages... 40-60% of daily visitors to your site come in with an empty cache. Making your page fast for these first time visitors is key to a better user experience."

Data URI format is specified as

data:[<mime type>][;charset=<charset>][;base64],<encoded data>

We are only interesting for images, so that mime types can be e.g. image/gif, image/jpeg or image/png. Charset should be omitted for images. The encoding is indicated by ;base64. One example of a valid data URI:
<img src="
        AAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO
        9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot">
HTML fragments with inline images like the above example are not really interesting because they are not cached. Data URIs in CSS files (style sheets) are cached along with CSS files and that brings benefits. Some advantages describing in Wikipedia:
  1. HTTP request and header traffic is not required for embedded data, so data URIs consume less bandwidth whenever the overhead of encoding the inline content as a data URI is smaller than the HTTP overhead. For example, the required base64 encoding for an image 600 bytes long would be 800 bytes, so if an HTTP request required more than 200 bytes of overhead, the data URI would be more efficient.
  2. For transferring many small files (less than a few kilobytes each), this can be faster. TCP transfers tend to start slowly. If each file requires a new TCP connection, the transfer speed is limited by the round-trip time rather than the available bandwidth. Using HTTP keep-alive improves the situation, but may not entirely alleviate the bottleneck.
  3. When browsing a secure HTTPS web site, web browsers commonly require that all elements of a web page be downloaded over secure connections, or the user will be notified of reduced security due to a mixture of secure and insecure elements. On badly configured servers, HTTPS requests have significant overhead over common HTTP requests, so embedding data in data URIs may improve speed in this case.
  4. Web browsers are usually configured to make only a certain number (often two) of concurrent HTTP connections to a domain, so inline data frees up a download connection for other content.
Furthermore, data URIs are better than sprites. Images organized as CSS sprites (many small images combined to one big) are difficult to be maintained. Maintenance costs are high. Imagine, you want to change some small images in the sprite, their position, size, color or whatever. Well, there are tools allowing to generate sprites, but later changes are not easy. Especially changes in size cause a shift of all positions and a lot of CSS changes. And don't forget - a sprite still requires one HTTP request :-).

What browsers support data URIs? Data URIs are supported for all modern browsers: Gecko-based (Firefox, SeaMonkey, Camino, etc.), WebKit-based (Safari, Google Chrome), Opera, Konqueror, Internet Explorer 8 and higher. For Internet Explorer 8 data URIs must be smaller than 32 KB. Internet Explorer 9 does not have this 32 KB limitation. IE versions 5-7 lack support of data URIs, but there is MHTML – when you need data URIs in IE7 and under.

Are there tools helping with automatic data URI embedding? Yes, there are some tools. The most popular is a command line tool CSSEmbed. Especially if you need to support old IE versions, you can use this command line tool which can deal with MHTML. Maven plugin for web resource optimization, which is a part of PrimeFaces Extensions project, has now a support for data URIs too. The plugin allows to embed data URIs for referenced images in style sheets at build time. This Maven plugin doesn't support MHTML. It's problematic because you need to include CSS files with conditional comments separately - for IE7 and under and all other browsers. How does the conversion to data URIs work?
  1. Plugin reads the content of CSS files. A special java.io.Reader implementation looks for tokens #{resource[...]} in CSS files. This is a syntax for image references in JSF 2. Token should start with #{resource[ and ends with ]}. The content inside contains image path in JSF syntax. Theoretically we can also support other tokens (they are configurable), but we're not interested in such kind of support :-) Examples:
    .ui-icon-logosmall {
        background-image: url("#{resource['images/logosmall.gif']}") !important;
    }
    
    .ui-icon-aristo {
         background-image: url("#{resource['images:themeswitcher/aristo.png']}") !important;
    }
    
  2. In the next step the image resource for each background image is localized. Images directories are specified according to the JSF 2 specification and suit WAR as well as JAR projects. These are ${project.basedir}/src/main/webapp/resources and ${project.basedir}/src/main/resources/META-INF/resources. Every image is tried to be found in those directories.
  3. If the image is not found in the specified directories, then it doesn't get transformed. Otherwise, the image is encoded into base64 string. The encoding is performed only if the data URI string is less than 32KB in order to support IE8 browser. Images larger than that amount are not transformed. Data URIs looks like
    .ui-icon-logosmall {
        background-image: url(" ... ASUVORK5CYII=") !important;
    }
    
    .ui-icon-aristo {
        background-image: url(" ... BJRU5ErkJggg==") !important;
    }
    
Configuration in pom.xml is simple. To enable this feature set useDataUri flag to true. Example:
<plugin>
    <groupId>org.primefaces.extensions</groupId>
    <artifactId>resources-optimizer-maven-plugin</artifactId>
    <configuration>
        <useDataUri>true</useDataUri>
        <resourcesSets>
            <resourcesSet>
                <inputDir>${project.build.directory}/webapp-resources</inputDir>
            </resourcesSet>
        </resourcesSets>
    </configuration>
</plugin>
Enough theory in this post. The next one will describe a practice part. I will expose some measurements, screenshots and give tips how large images should be, where CSS should be placed, what is the size of CSS file with data URIs and whether a GZIP filter can help here. Stay tuned.

3 comments:

  1. Thanks Oleg.
    Great article.
    Waiting for the next part.

    ReplyDelete
  2. Perfect and simple way, maven is best suitable for such kind of tasks, thanks Oleg for explanation!

    ReplyDelete
  3. Thanks for a great post, thanks ! All these tips are the most promising and profitable ways to increase the conversion rate of a website. I just do recommend your post to everyone I know.

    ReplyDelete