Saturday, February 27, 2010

SQL Azure Pros and Cons

A friend of mine is working at a start-up and they are using Azure to host their website and database. I’ve been discussing ways that they can manage the database development cycle specifically as it relates to SQL Azure.

For traditional SQL Server development, my preferred development process leverages Visual Studio Database Edition. We generally only use RTM products for client development, so I have been using Visual Studio 2008 Database Edition with SQL Server 2008 Database Projects for some time now. I am looking forward to seeing what new features are in Visual Studio 2010.

Generally, members of my development team have their own development databases where they use whatever tools they prefer – SQL Server Management Studio, Visual Studio, etc., to make schema changes and add or update stored procedures, functions, views, etc. Then they use the Schema Compare functionality to sync those changes back to our project and then to source control, where other members of the team can get to them and we can keep version history, validate the builds before deploying, and either deploy new instances or use schema compare to help create migration scripts for existing databases. These tools are fantastic, and although there is definitely a learning curve for the team, I highly recommend using them.

So my friend was asking me what I recommend for their development process for the database against SQL Azure. They are in dev/QA right now and are expecting to start getting real data into the SQL Azure database that they are going want to persist, which means moving from a simple “drop the entire database and then run the big create script” to a managed approach where copies of the deployed database can be migrated to the latest version.

In this scenario, here is generally what I would want to do in an ideal world:

  1. Take a backup/snapshot of the SQL Azure database that is deployed.
  2. Restore that backup/snapshot to another location – it could be in the cloud or could be local.
  3. Run a Schema Compare from the Database Project to the restored snapshot and create a migration script.
  4. Test the migration script against the restored snapshot and verify that everything went as expected.
  5. Take the web site offline (so there are no more changes to the database) and take another snapshot of the SQL Azure DB just in case.
  6. Update the deployed SQL Azure database with the migration script, and take another snapshot post-migration.
  7. Bring up the new website (without getting into detail, assume you deploy the new code to Staging, test it, then swap it into Production).

This would all be great, but there are several gaps in the toolset for SQL Azure:

  1. VSDB does not support database projects for SQL Azure (see this thread). Apparently, neither does VS 2010 (yet – I hope they are planning this). The workaround here would be to create you DB project for SQL 2008. Then you have to run any scripts you generate through tools / update them to be SQL Azure compliant. The SQL Azure Migration Wizard can help you do this, but I still consider this all a big hack that is going to take time away from raw development tasks.
  2. SQL Azure does not have a mechanism to take a snapshot or a backup. This is a HUGE shortcoming. I wouldn’t consider putting a production database up in SQL Azure without the ability to take snapshots or backups. I wouldn’t be comfortable running a migration script against it if I didn’t have a backup and if I haven’t been able to test that script against the real database. Fortunately, this feature is in the works. You won’t be able to restore a backup to a local machine, though, at least that doesn’t appear to be planned. See this thread on MSDN. Also, the SQL Migration Wizard can help you copy schema and data, but that’s no replacement for a real backup mechanism.

Sure there are ways to get around all this. You can use “Generate Scripts” in SSMS for SQL 2008 R2 from a SQL Azure database, and you can use BCP. But none of that replaces a real backup. You have make sure nothing is changing the database – e.g. what if you get inconsistent data which causes problems on the other end do to changing data, foreign keys, etc.? I’d rather have a real backup. Also, once I have the scripts and data I need to restore them to a local database somehow through this manual process and then sync that back to my project / source control system. Some folks out there have written some great tools to help do this, and even RedGate has some SQL Azure tools in beta – but here is what I would like to see:

  1. Backup/restore support in SQL Azure with the ability to restore a backup to another SQL Azure instance
  2. Staging/Production support in SQL Azure, similar to how you do it with Azure web sites.
  3. Ability to take a backup (snapshot) from SQL Azure and restore it to a local development environment.
  4. A “SQL Azure” mode in a local SQL Server environment that has all of the same rules/restrictions as SQL Azure to simulate SQL Azure locally
  5. Full support for a SQL Azure Database Project in Visual Studio 2008 Database Edition as well as the equivalents in VS 2010 (Database Projects AND Data Tier Projects).

I’m sure we’ll see all or most of this eventually – until then we will have lower productivity and we will have to rely on hacks and community / 3rd party tools.

Big SQL Server Log File and Recovery Mode

Occasionally I will be working on a project where we have some database operations that delete/update/insert large amounts of data by design. Every now and then we get caught by surprise when one of our development or QA servers runs low on disk space, and then we discovery that the log files for the databases are HUGE.

In most cases it turns out that we had the databases set to FULL RECOVERY MODE. You can find many articles about this on the Internet and on MSDN, but the bottom line is that for our purposes SIMPLE recovery mode is more than adequate and it will greatly reduce the size of these log files. Full recovery would allow us to roll the database back to point in time – generally we don’t need this, especially on our development and QA databases.

I have some cool backup/restore scripts and we simply added an ALTER DATABASE statement to end of the restore script to make sure that the database is in SIMPLE recovery mode. Then I got to thinking, why not just create a one-liner that will set all of the user databases on a given server to SIMPLE recovery mode? Well – here it is. Enjoy!

EXECUTE sp_msforeachdb 'IF ''?'' NOT IN (''tempdb'', ''master'', ''model'', ''msdb'') exec(''ALTER DATABASE [?] SET RECOVERY SIMPLE'')'

In case you are wondering why the ALTER above is in an EXEC, apparently SQL server will parser the ALTER statement and stop on tempdb without even executing the script, so even though tempdb would be excluded in the IF statement, SQL still sees the ALTER code and doesn’t like it. Turning this into a dynamic query avoids the problem.