As mentioned in my last post about Web SQL databases, each Web SQL database consists of a unique name and version.  Each database, however, can only have a single version at any given time, so there’s no possibility of older and newer versions running concurrently.  To quote the spec, this is provided to “…allow authors to manage schema changes incrementally and non-destructively, and without running the risk of old code (e.g. in another browser window) trying to write to a database with incorrect assumptions.”   In other words, instead of trying to manage the hassle of temporarily offloading data, rebuilding the data structure from scratch, and then reloading data when code changes, versioning allows for the detection of the client’s current db version, with the option of programmatically upgrading them to a different (read “new’) version of the database schema.  Of course, you could conceivably also use this to detect old versions and point to different code entirely, but that would be a lot of code to manage…

My idea on this is to use the versioning as a way to push incremental database schema changes in a systematic way.  For example, you could easily define a complex series of schema updates that take the user’s database version (whatever it might be) directly to the most current version.  However, this seems like an undue amount of work to me.  Rather, I think that maintaining a procedural update history of each version makes more sense.

Here’s the idea.  Let’s say that we’re building a blog.  Perhaps version 1 of our database is very simple: just a “posts” table that captures title, content, publishdate, and author.

VERSION 1
posts
      postid
      title
      content
      publishdate
      author

But let’s say in version 2, we add two more tables (comments and categories, for example).

VERSION 1          --->    VERSION 2
posts                      posts
      postid                    postid
      title                     title
      content                   content
      publishdate               publishdate
      author                    author
                           comments
                                commentid
                                content
                                author
                                publishdate
                                postid
                           categories
                                categoryid
                                category

Now our approach is increasing in complexity, for not only do we need to give new users the full version 2 schema, but we also need to upgrade version 1 users to the new schema without destroying their existing data.

But let’s push this out into the future a bit.  Perhaps by version 3, we’ve added a few more tables.  Now we have to give new users the version 3 schema and upgrade BOTH version 2 and potential version 1 users to the new schema, taking into the account the discrepencies between version 1 and 2, and between both 1 and 3 and 2 and 3.

VERSION 1          --->    VERSION 3
posts                      posts
      postid                    postid
      title                     title
      content                   content
      publishdate               publishdate
      author                    author
                           comments
                                commentid
                                content
                                author
                                publishdate
                                postid
                           categories
                                categoryid
                                category
                           tags
                                tagid
                                tag
                           bloginfo
                                title
                                owner
                                tagline

VERSION 2          --->    VERSION 3
posts                      posts
      postid                    postid
      title                     title
      content                   content
      publishdate               publishdate
      author                    author
comments                   comments
      commentid                 commentid
      content                   content
      author                    author
      publishdate               publishdate
      postid                    postid
categories                 categories
      categoryid                categoryid
      category                  category
                           tags
                                tagid
                                tag
                           bloginfo
                                title
                                owner
                                tagline

Oh, and we just realized we need to add an additional column to our original “posts” table, so version 3.1 increases the complexity yet again.

How do we manage this?  Do we write a full complement of schema updates to get each version (1, 2, and 3) to the current version (3.1)?  If we do, this will be a lot of code.  Do we simply write a bunch of clunky “if exists” procedures in our sql to account for gaps in our schemas at any stage in the versioning?  It’s a better solution, but is still lacking, IMO, because it doesn’t really show how the changes are occurring across versions.  If something goes wrong at some point, it’s not immediately obvious where the discrepancy originated.

So my solution is to manage the version updates through incremental upgrades to the newest version by way of any versions in between.  Therefore, if our current version is 3.1, a user with the 1.0 version will be upgraded to 3.1 by first upgrading to 2, then to 3, then to 3.1.  While this requires a bit more effort than the “if exists” SQL-based solution, it provides an immediate version history, providing great introspection into how each version is related to the previous, and to the next.  Let’s look at how this plays out:

First, I create an array representing the complete version history of the database:

this.apphistory    =    ['1.00','2.00','3.00','3.10'];
this.appversion    =    '3.10';

Then, upon each opening of the database, I invoke a custom method “setSchema” which checks the current version of the database against the current “official” version of the database, and handles the incremental updates if needed:

setSchema = function() {
     // set array for holding our sql statements; sql lite can only process one statement at a time
     var sql    = new Array();
     // get history object-- returns current version and whether version is equal to the "official" or current version
     var history = this.getHistory();
     // if client db version is not equal to current appversion, start iterative update
     var ver = history.iscurrent ? this.appversion : history.current;
     if(!history.iscurrent) {
          switch(ver) {
               case '1.00':
                    sql[0]    =    "create table  comments(commentid,content,author,publishdate,postid);";
                    break;
               case '2.00':
                    sql[0]    =    "create table categories(categoryid,category)";
                    sql[1]    =    "create table tags(tagid,tag)";
                    break;
               case '3.00':
                    sql[0]     =   "alter table posts add created";
                    sql[1]    =    "alter table posts add updated";
                    break;
               // default will be a snapshot of the current version of the schema;
               default:
                    sql[0]    =    "create table posts(postid integer primary key  autoincrement,title,publishdate,author,content,created,updated);";
                    sql[1]    =    "create table  comments(commentid,content,author,publishdate,postid);";
                    sql[2]    =    "create table categories(categoryid,category)";
                    sql[3]    =    "create table tags(tagid,tag)";
                    sql[4]    =    "create table bloginfo(name,owner,tagline)";
          }
          this.database.transaction(function (t) {
          for(var i=0;i<sql.length;i++) {
               t.executeSql(sql[i]);
          }
      });
      this.setVersion();
   }
}

What’s basically happening here is that the “default” case will always have what the “official” version of the database should be.  In this case, this is the “3.10” version of the database.  So if any new users come to the site, they will get the full-on copy of this database written to their client.  If, however, they have a previous version of the database, they don’t need a full upgrade to the 3.1.  Rather, they only need incremental changes to get them to the current database version.  Therefore, if the client has a “2.00” database version, they will execute SQL statements to get them to the “3.00” version, and on the next pass of this method, they will get the final incremental change to “3.10.”

And in the future, when additional tables or columns are added/modified, the “default” case can be updated to include the changes, and a case for “3.10” can be added to manage the incremental change to the newest version.

As I mentioned before, this is perhaps not the prettiest way to manage this.  The thing I like, however, is how explicitly it defines the relationships between the versions.  With a quick glance, we can see what changes are occurring from version to version.  And, because this is an iterative process, we don’t have to worry about updating the “version 2.00” case to get the client database directly to the newest version (imagine doing that for 50 versions…yikes!). Rather, because the upgrade from 2.00 to 3.00 will ALWAYS be the same, we only have to worry about keeping the newest and most recent versions synched, as this process will take any previous version iteratively through to the official current version.