Knowing when to change

May 18th 2016

Sometimes it is worth starting over. Knowing when to concede is important, as we all have constraints and time is ultimately our most valuable resource.

When I started creating this blog I wanted it to be a fully interactive web app, partly to practise my expanding web app toolkit. It worked well, and I had lots of ideas for extending it to make it really interactive and engaging.

Soon after launching the initial version, however, I realised that I had massively overlooked SEO. Optimising content for search is crucial on a blog: the whole point of sharing information is defeated if the site cannot be crawled and indexed.

Fortunately there are ways to work around this, and they revolve around how search engines deal with JavaScript.

Escaped Fragments

Googlebot will, if instructed, fetch an alternative version of each page via URLs known as escaped fragments.

The idea is that you generate a separate static HTML version of your single page app and redirect the crawler to it. The crawler then doesn't have to interpret any JavaScript and can index your site correctly.

To notify the bot you have to either:

  • Use #! in your site's hash fragments, or
  • Add this meta tag to the head of pages that don't use a hash fragment:
<meta name="fragment" content="!">

For the complete details see the documentation.
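
To make this concrete: when the crawler encounters a #! URL such as http://example.com/#!/about, it instead requests http://example.com/?_escaped_fragment_=/about, and your server can detect that query parameter and respond with a pre-rendered snapshot. Below is a minimal sketch of that detection, assuming an Express server and snapshots saved under public/static/ (the path used in the Grunt config further down); the fragment-to-file mapping is illustrative, not part of any library:

var express = require('express');
var path = require('path');

var app = express();

//intercept crawler requests before the normal app routes
app.use(function (req, res, next) {
  var fragment = req.query._escaped_fragment_;
  if (fragment === undefined) {
    //a normal visitor: fall through and serve the JavaScript app
    return next();
  }
  //map the fragment to a snapshot file, e.g. '/about' -> 'about.html'
  var page = fragment.replace(/^\//, '') || 'index';
  res.sendFile(path.join(__dirname, 'public/static', page + '.html'));
});

app.listen(3000);

Pages using the meta tag instead of a #! URL are fetched with an empty fragment (e.g. http://example.com/?_escaped_fragment_=), which the sketch above maps to the index snapshot.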

Serving up the static HTML is straightforward using a headless browser such as PhantomJS, which can be configured to automatically create HTML snapshots of your pages. I used the Grunt HTML Snapshot module for this and it worked really well. The great thing about running it through Grunt is that it slots straight into your automated build process.

The configuration is really easy to set up:

htmlSnapshot: {
  all: {
    options: {
      //that's the path where the snapshots should be placed
      //it's empty by default, which means they will go into the directory
      //where your Gruntfile.js is placed
      snapshotPath: 'public/static/',
      //this should be either the base path to your index.html file
      //or your base URL. Currently the task does not use its own
      //webserver, so if your site needs a webserver to be fully
      //functional, configure it here.
      sitePath: '//domain.com',
      //you can choose a prefix for your snapshots
      //by default it's 'snapshot_'
      fileNamePrefix: 'index',
      //by default the task waits 500ms before fetching the html.
      //this is to give the page enough time to assemble itself.
      //if your page needs more time, tweak here.
      msWaitForPages: 1000,
      //if you would rather not keep the script tags in the html snapshots
      //set `removeScripts` to true. It's false by default
      removeScripts: false,
      //here goes the list of all urls that should be fetched
      urls: [
        'index.html',
        'about.html',
        'contact.html'
      ]
    }
  }
}
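
To tie it into the build, the task just needs to be loaded and registered in the Gruntfile like any other npm task. A sketch, assuming the module is installed as grunt-html-snapshot:

module.exports = function (grunt) {
  grunt.initConfig({
    htmlSnapshot: {
      //...the config shown above
    }
  });

  //load the task from the installed npm module
  grunt.loadNpmTasks('grunt-html-snapshot');

  //run with: grunt snapshot
  grunt.registerTask('snapshot', ['htmlSnapshot']);
};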

While it does work, it is not a clean solution and certainly not DRY. The whole process of generating dynamic content to create static versions just for SEO felt rather backwards.

Looking at the implementation, there was no real advantage to the site being so JavaScript based.

With that realisation I decided to serve all of the content statically and switched to Octopress.

You can see the single page blog's source here.
