rake db:test:purge and rake db:test:clone

With an increasing number of Rails developers adopting DB2 as their database of choice, and the welcoming approach towards suggestions and requests on our end, it comes as no surprise that the feedback is rolling in faster than ever. We are very pleased about this, and in this short article I’d love to address two of the most requested features.

rake db:test:purge

Running rake db:test:purge generates the following error:

  rake aborted!
  Task not supported by 'ibm_db'

db:test:purge is a rake task defined in the Rails gem, in the file rails-1.2.3/lib/tasks/databases.rake. Among other things, this file defines the behavior specific to each database adapter known to Rails when the user requests a database purge. It is essentially a big case statement containing the Ruby code that drops all the user objects from each supported database.

ibm_db currently ships independently of Rails, so it is missing from that case statement. As a result, the task reports that it is not supported by ibm_db.
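To see why the error appears, here is a minimal, self-contained sketch of that kind of dispatch. It is a simplified model and not the actual Rails source: the method name purge_strategy and the returned symbol are hypothetical, and only the shape of the case statement and the error message follow the description above.

```ruby
# Simplified model of an adapter dispatch (hypothetical names, not Rails code).
def purge_strategy(adapter)
  case adapter
  when "mysql", "postgresql", "sqlite3"
    # Stand-in for the adapter-specific purge code of a known adapter
    :adapter_specific_purge
  else
    # Any adapter without its own branch ends up here
    raise "Task not supported by '#{adapter}'"
  end
end

begin
  purge_strategy("ibm_db")
rescue RuntimeError => e
  puts e.message   # prints: Task not supported by 'ibm_db'
end
```

Adding a when "ibm_db" branch is therefore all it takes for the task to stop falling through to the error.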

If you are using the ibm_db adapter for DB2, you don’t want to miss the opportunity to use rake db:test:purge and the other tasks that rely on it. What you can do is manually insert a snippet of code for the ibm_db case in databases.rake.

An easy and straightforward implementation would be to drop the database and recreate it from scratch. I’m not allowed to look into the other adapters’ implementations, but I assume this is what some of them may be doing. I believe this is not the right thing to do with DB2, though, because it has two main drawbacks. Firstly, creating a database every time makes the task quite slow, because the creation of a database in DB2 is a “magical process” that can take up to a minute (for good reason, and that minute can save you lots of money in the years to come as you use the database). Secondly, the database that you created in the first place may have many options and parameters configured, and collecting them all and reapplying them may not be the easiest or the smartest thing to do.

A different approach would be to handle this the right way, by dropping all the user schemas and the objects contained within the database. The code needs to be placed within the case statement we mentioned above:

    desc "Empty the test database"
    task :purge => :environment do
      abcs = ActiveRecord::Base.configurations
      case abcs["test"]["adapter"]
        #...
      end
    end

With the current (Rails 1.2.3) databases.rake you can practically just copy and paste the following code at line 145 in the file:

when "ibm_db"
  ActiveRecord::Base.establish_connection(:test)
  conn = ActiveRecord::Base.connection.connection

  begin
    # Required for the stored procedure ADMIN_DROP_SCHEMA
    ActiveRecord::Base.connection.execute("CREATE TABLESPACE SYSTOOLSPACE")
    systool_existing = false
  rescue
    # The SYSTOOLSPACE tablespace already exists
    systool_existing = true
  end

  # Collects all the user defined schemas
  user_schemas_sql = "SELECT SCHEMANAME FROM SYSCAT.SCHEMATA WHERE DEFINER <> 'SYSIBM' AND 
  SCHEMANAME NOT IN ('NULLID', 'ERRORSCHEMA', 'SYSTOOLS')"
  schemas = ActiveRecord::Base.connection.select_all(user_schemas_sql)

  unless schemas.empty?
    errortabschema = 'ERRORSCHEMA'
    errortab = 'ERRORTABLE'

    # Drop each schema and all its objects
    schemas.each do |schema|
      schema_name = schema["schemaname"].strip.upcase
      sql = "CALL SYSPROC.ADMIN_DROP_SCHEMA('#{schema_name}', NULL, ?, ?)"
      stmt = IBM_DB::prepare(conn, sql)
      IBM_DB::bind_param(stmt, 1, "errortabschema", IBM_DB::SQL_PARAM_INPUT)
      IBM_DB::bind_param(stmt, 2, "errortab", IBM_DB::SQL_PARAM_INPUT)
      IBM_DB::execute(stmt)
    end

    # If the tablespace SYSTOOLSPACE didn't exist initially, it gets dropped  
    ActiveRecord::Base.connection.execute("DROP TABLESPACE SYSTOOLSPACE") unless systool_existing

    # Drops the remaining schema "ERRORSCHEMA"
    ActiveRecord::Base.connection.execute("DROP SCHEMA ERRORSCHEMA RESTRICT")
  end

This code can also be used to define a method of its own (e.g. purge_database(:mydb)), should you require that functionality somewhere in your code. In that case, just make sure to modify the first lines so that they use the already established connection, or define your own.
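If you do extract the logic into a method, one option is to keep the SQL generation separate from its execution so that it can be checked without a DB2 connection. The following is only a sketch under that assumption: drop_schema_calls is a hypothetical helper name, and the rows are made-up sample data in the shape returned by the SYSCAT.SCHEMATA query in the snippet above.

```ruby
# Hypothetical helper: turns rows from the SYSCAT.SCHEMATA query into the
# CALL statements used in the purge code. It only builds strings; preparing,
# binding, and executing them is left to the caller's own connection.
def drop_schema_calls(schema_rows)
  schema_rows.map do |row|
    schema_name = row["schemaname"].strip.upcase
    "CALL SYSPROC.ADMIN_DROP_SCHEMA('#{schema_name}', NULL, ?, ?)"
  end
end

# Made-up sample rows, including one with padding and lowercase letters
rows = [{ "schemaname" => "app   " }, { "schemaname" => "REPORTS" }]
drop_schema_calls(rows).each { |sql| puts sql }
```

Each returned statement still needs the two error-table parameters bound before it is executed, exactly as in the task code.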

rake db:test:clone

Now that rake db:test:purge is working, you will be able to successfully run rake db:test:clone. Out of the box, though, there are two limitations: Rails doesn’t acknowledge tablespaces or foreign keys (the lack of the first is understandable, as it is strongly related to DB2, but the lack of the second is much less justifiable).

This doesn’t affect many developers, but if it does affect you, it is both annoying and problematic. Suppose, for instance, that you have created the database objects in your development database through migrations. You may have specified a certain tablespace name by passing :options => "IN mytablespace" to the create_table method. Running rake db:test:clone will generate the table in the test database within the default tablespace (USERSPACE1) rather than in the one that you specified for the development database. Moreover, if you manually defined foreign keys by executing SQL statements in the migrations or directly, these will not appear in the cloned test database.
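The tablespace problem can be illustrated with a small, self-contained model. This is a sketch only: build_create_table is a hypothetical stand-in and not the Rails create_table method, and it assumes that the :options string is appended verbatim to the generated CREATE TABLE statement. The table and column names are made up.

```ruby
# Simplified model of how a :options string could reach the generated SQL.
# build_create_table is a hypothetical stand-in, not the Rails method.
def build_create_table(name, column_sql, options = {})
  suffix = options[:options] ? " #{options[:options]}" : ""
  "CREATE TABLE #{name} (#{column_sql})#{suffix}"
end

# What the migration asks for in the development database
development_sql = build_create_table("orders", "id INTEGER NOT NULL",
                                     :options => "IN mytablespace")

# What a schema dump without the :options entry leads to in the test database
cloned_sql = build_create_table("orders", "id INTEGER NOT NULL")

puts development_sql
puts cloned_sql
```

Because the dumped schema carries no :options entry, the second statement has no IN clause and DB2 falls back to the default tablespace.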

This is not an issue specific to DB2; it is just the way Rails works at the moment. In fact, there are third-party plugins that attempt to introduce these and other features that Rails’ core lacks (e.g. the Foreign Key Schema Dumper Plugin for MySQL and PostgreSQL).

To address these two concerns when using DB2, you don’t have to change the definition of the task itself. The db:test:clone task simply loads the dumped schema into the test environment, so the culprit is ActiveRecord’s SchemaDumper, which doesn’t know anything about DB2 tablespaces or foreign keys. Changes to it will also affect the db:schema:dump task, which will in turn produce more correct and “database aware” db/schema.rb files.

The file schema_dumper.rb within the ActiveRecord gem (activerecord-1.15.3/lib/active_record/schema_dumper.rb) can be modified directly for your specific needs. At line 21 the dump method becomes:

    def dump(stream)
      header(stream)
      tables(stream)
      # Foreign keys are added for DB2 only
      if @connection.adapter_name == "IBM_DB"
        foreign_keys(stream)
      end
      trailer(stream)
      stream
    end

At line 89 within the table method, we need to specify code to handle the possibility of a non-default tablespace:

# Options to retrieve the right tablespace are enabled for DB2 only
if @connection.adapter_name == "IBM_DB"
  tbspace_sql = "select TBSPACE from syscat.tables where tabname='#{table.upcase}'"
  table_space = @connection.select_one(tbspace_sql)["tbspace"]
  if table_space != "USERSPACE1"
    # A different tablespace was defined       
    tbl.print %Q(, :options => "IN #{table_space}")
  end
end
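The decision made in that snippet can also be written as a small pure function, which makes it easy to try without a database. This is a sketch rather than part of the patch: tablespace_option is a hypothetical name, and the nil handling is an added assumption for the case where the catalog query returns no row.

```ruby
# Hypothetical helper: returns the text to append to a dumped create_table
# line, or an empty string when the default tablespace is in use.
def tablespace_option(row, default = "USERSPACE1")
  # row is the hash returned by the catalog query, or nil if nothing matched
  table_space = row && row["tbspace"]
  return "" if table_space.nil?

  table_space = table_space.strip
  return "" if table_space.empty? || table_space == default

  %Q(, :options => "IN #{table_space}")
end

puts tablespace_option({ "tbspace" => "MYTABLESPACE" })
puts tablespace_option({ "tbspace" => "USERSPACE1" }).inspect
```

The first call produces the fragment that ends up in schema.rb; the second returns an empty string, so tables in USERSPACE1 are dumped unchanged.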

At this point, just after the index method, we need to define the method foreign_keys:

def foreign_keys(stream)
  references = @connection.select_all("SELECT * FROM SYSCAT.REFERENCES")
  for reference in references
    constraint = reference["constname"]
    schema = reference["tabschema"]
    table = reference["tabname"]
    cols = reference["fk_colnames"]
    ref_schema = reference["reftabschema"]
    ref_table = reference["reftabname"]
    ref_cols = reference["pk_colnames"]
    if reference["updaterule"] == "R"
      update_action = "RESTRICT"
    else
      update_action = "NO ACTION"
    end

    delete_action = case reference["deleterule"]
                          when "A"
                            "NO ACTION"
                          when "R"
                            "RESTRICT"
                          when "C"
                            "CASCADE"
                          when "N"
                            "SET NULL"
                          end

    # Emit an execute(...) call into schema.rb that recreates the constraint
    foreign_key_sql = "  execute(\"ALTER TABLE #{schema}.#{table} ADD " +
      "CONSTRAINT #{constraint} FOREIGN KEY (#{cols.strip}) " +
      "REFERENCES #{ref_schema}.#{ref_table} (#{ref_cols.strip}) " +
      "ON UPDATE #{update_action} ON DELETE #{delete_action} " +
      "ENFORCED ENABLE QUERY OPTIMIZATION\")"
    stream.print foreign_key_sql
    stream.puts
  end
end
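To check the rule mapping without a DB2 connection, the statement building can be pulled out into plain functions. The sketch below is an illustration only: the helper names are hypothetical, the sample row is made up in the shape of a SYSCAT.REFERENCES record, and the trailing ENFORCED ENABLE QUERY OPTIMIZATION clause is left out for brevity.

```ruby
# Hypothetical helpers mirroring the rule mapping in foreign_keys.
def update_action_for(rule)
  rule == "R" ? "RESTRICT" : "NO ACTION"
end

def delete_action_for(rule)
  case rule
  when "A" then "NO ACTION"
  when "R" then "RESTRICT"
  when "C" then "CASCADE"
  when "N" then "SET NULL"
  end
end

# Builds the ALTER TABLE statement for one catalog row (a Hash).
def foreign_key_statement(reference)
  "ALTER TABLE #{reference["tabschema"]}.#{reference["tabname"]} " \
  "ADD CONSTRAINT #{reference["constname"]} " \
  "FOREIGN KEY (#{reference["fk_colnames"].strip}) " \
  "REFERENCES #{reference["reftabschema"]}.#{reference["reftabname"]} " \
  "(#{reference["pk_colnames"].strip}) " \
  "ON UPDATE #{update_action_for(reference["updaterule"])} " \
  "ON DELETE #{delete_action_for(reference["deleterule"])}"
end

# Made-up sample row; the column-name fields are padded and need strip
sample = {
  "constname" => "FK_ORDERS_CUSTOMERS",
  "tabschema" => "APP", "tabname" => "ORDERS",
  "fk_colnames" => " CUSTOMER_ID",
  "reftabschema" => "APP", "reftabname" => "CUSTOMERS",
  "pk_colnames" => " ID",
  "updaterule" => "A", "deleterule" => "C"
}
puts foreign_key_statement(sample)
```

Keeping the mapping in small functions like these also makes it simpler to reuse from a plugin instead of a patched gem file.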

As you can imagine, it is possible to avoid modifying the file directly by reopening the SchemaDumper class in, for example, a plugin. You would have to override the original methods within the SchemaDumper class:

module ActiveRecord
  class SchemaDumper

    # ...

    def dump(stream)
      #...
    end

    private

    # ...

    def table(table, stream)
      # ...
    end

    # ...

    def foreign_keys(stream)
      # ...
    end

  end
end
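The mechanism at work here is Ruby’s open classes. The following self-contained sketch shows it on a toy class: ToyDumper is a made-up stand-in and not ActiveRecord::SchemaDumper, and the lines it writes are placeholders.

```ruby
require "stringio"

# Stand-in for a library class (NOT ActiveRecord::SchemaDumper).
class ToyDumper
  def dump(stream)
    header(stream)
    tables(stream)
    stream
  end

  private

  def header(stream)
    stream.puts "# schema"
  end

  def tables(stream)
    stream.puts "create_table :orders"
  end
end

# Reopening the class, as a plugin would, replaces dump and adds a
# new private method without touching the original file.
class ToyDumper
  def dump(stream)
    header(stream)
    tables(stream)
    foreign_keys(stream)
    stream
  end

  private

  def foreign_keys(stream)
    stream.puts "# foreign keys would be dumped here"
  end
end

puts ToyDumper.new.dump(StringIO.new).string
```

The second class body does not replace the class; it only redefines dump and adds foreign_keys, while header and tables keep their original definitions.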

It would be beneficial to aggregate several improvements in a “DB2 PowerPack” plugin of some sort, and it’s very likely that we will eventually work on publishing something like this.


Posted by Antonio Cangiano | June 20 2007 02:30 pm | How-to

One Response to “rake db:test:purge and rake db:test:clone”

  1. steven_parkes@esseff.org on 12 Jul 2007 at 9:23 pm #

    So, a few thoughts:

    Does it make sense to be able to map a db2 schema into the rails
    concept of a database? I can always choose to use two different db2
    databases, but it would be nice to have the choice. It’s a lot less
    costly for me (in terms of resources) if I can just use different
    schemas in the same database.

    Along the same lines, is there an easy way to “copy” a schema to
    another? I find myself wanting to take a test database and copy it to
    do some experiments that might corrupt the data. The database has
    enough live-ish data that I can’t generate it synthetically.

    Finally, should this kind of discussion go on somewhere else? Not
    everybody would know about the blog, and/or it’s not the easiest thing
    to track. Some kind of forum? I guess there’s the gem forum, but it’s not
    the easiest thing to read (e.g., as far as I can tell, I can’t get an
    atom feed from rubyforge). The mailing list? I guess this could be
    considered a dev thing, but there’s probably the need for user stuff
    as well.