I’m pleased to announce that the migrate_d2d module, long in sandbox mode, is now a full-fledged drupal.org project, and the first release candidate is available. This module extends the Migrate framework with a set of Migration classes for importing into a Drupal 7 installation from existing Drupal 5, Drupal 6, or Drupal 7 sites.
Motivation
There were two primary motivations for the development of this module:
- Taking the opportunity to refactor. It’s difficult to refactor a Drupal site in-place - there’s no simple way to change a blog node into a forum node, or change the type of an existing field. At Acquia we see many situations like this - for the project where the basic structure of migrate_d2d was first developed, in the course of upgrading from Drupal 6 to Drupal 7 the customer wanted to consolidate a structure that had grown out of control (136 content types with 916 distinct CCK fields) to something more manageable. In another case, a customer needed to migrate several Drupal 6 and Drupal 7 sites with similar but not identical content structures to a standard set of content types on Enterprise Drupal Gardens.
- Proof-of-concept for using migration for core upgrades. It has been proposed that for upgrades to Drupal 8 a migration approach (creating the new site and importing content from the original site) could replace the traditional upgrade-in-place approach. The current Migrate module would be the basis for the underlying framework for this - migrate_d2d gives us an opportunity to work out how the upgrade scenario might work.
Basic usage
The migrate_d2d module provides a set of Migration classes that understand the core schemas for Drupal versions 5 through 7, as well as the CCK schema for Drupal 5 and Drupal 6. From that knowledge, it provides default mappings from any of those Drupal versions to Drupal 7 core content. A basic migration to Drupal 7 can be implemented simply by registering a batch of migrations with appropriate arguments.
If you’re already familiar with the Migrate module, you know that if you define a class derived from Migration and clear your cache, you will have an instance of that class automatically registered and available for importing. Migrate also allows you to define “dynamic” migration classes (derived, naturally, from DynamicMigration) - simply defining the class does not result in any usable instance being registered, but you can explicitly create more than one instance of such a class, giving each one a distinct machine name and array of arguments. migrate_d2d is built on the dynamic migration concept - rather than defining distinct classes for, say, Drupal 6 forum and blog nodes, you can simply instantiate DrupalNode6Migration twice with distinct arguments for the two content types.
Let’s start with a simple example - we’re going to implement the module example_d2d to do a very basic migration of core data from Drupal 6 to Drupal 7. Here’s a first draft of example_d2d.module, registering a user migration:
<?php
/**
 * Implementation of hook_flush_caches().
 *
 * There needs to be some point at which your migrations are registered, or
 * previously-registered migrations are updated with changed arguments. We choose to
 * do this on a cache clear.
 */
function example_d2d_flush_caches() {
  /**
   * Each migration being registered takes an array of arguments, some required
   * and some optional. Start with the common arguments required by all - the
   * source_connection (connection key, set up in settings.php, pointing to
   * the Drupal 6 database) and source_version (major version of Drupal).
   */
  $common_arguments = array(
    'source_connection' => 'legacy',
    'source_version' => 6,
  );
  // The description and the migration machine name are also required arguments,
  // which will be unique for each migration you register.
  $arguments = $common_arguments + array(
    'description' => t('Migration of users from Drupal 6'),
    'machine_name' => 'ExUser',
  );
  // We just use the migrate_d2d D6 migration class as-is.
  Migration::registerMigration('DrupalUser6Migration', $arguments['machine_name'],
    $arguments);
}
?>
With just an example_d2d.info file and the above .module, all you have to do is define a ‘legacy’ connection in settings.php pointing to a Drupal 6 database, and you can migrate all the users and their core data (the contents of the users table) from your legacy site to your Drupal 7 site.
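For reference, a ‘legacy’ connection in the Drupal 7 site’s settings.php might look something like the sketch below. The database name, credentials, and host are placeholders (assumptions, not values from any real site); the only part that has to match the registration code is the ‘legacy’ key, which is what source_connection refers to.

```php
<?php
// In sites/default/settings.php, alongside the existing 'default' connection.
// Every value below is a placeholder; substitute the details of your own
// Drupal 6 database.
$databases['legacy']['default'] = array(
  'driver' => 'mysql',
  'database' => 'drupal6_db',
  'username' => 'drupal6_user',
  'password' => 'secret',
  'host' => 'localhost',
  'prefix' => '',
);
```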
Taxonomy terms take a couple more arguments - source_vocabulary, identifying what vocabulary you’re drawing from on the legacy system, and destination_vocabulary, identifying which vocabulary will hold the imported terms. With taxonomy, it’s important to note that before Drupal 7 vocabularies did not have machine names, so for a Drupal 5 or Drupal 6 source you must use the numeric vocabulary ID (vid).
<?php
$arguments = $common_arguments + array(
  'description' => t('Migration of category terms from Drupal 6'),
  'machine_name' => 'ExCategories',
  'source_vocabulary' => '2', // vid of the Drupal 6 “Categories” vocabulary
  'destination_vocabulary' => 'categories',
);
// First argument is the migration class name
Migration::registerMigration('DrupalTerm6Migration', $arguments['machine_name'],
  $arguments);
?>
Let's consider nodes next. As you may anticipate, similar to vocabularies we will have source_type and destination_type arguments. There is an additional wrinkle, however - if you want to maintain node authorship information, you need to map old user IDs (uids) to new user IDs. For example, the user account 'janeroe' may have had uid 45 on the legacy system, but it is created with uid 28 by the user migration defined above. The legacy nodes created by janeroe thus have a uid of 45 assigned. If you're experienced with the Migrate module, you probably know that we can have the uid translated to its new value by using sourceMigration('user migration machine name') on the uid field mapping. That mapping is buried inside migrate_d2d, so how can it know what the machine name of our user migration is? The answer is the user_migration argument:
<?php
$arguments = $common_arguments + array(
  'description' => t('Migration of core blog nodes from Drupal 6 to custom blog nodes'),
  'machine_name' => 'ExBlog',
  'source_type' => 'blog',
  'destination_type' => 'custom_blog_type',
  'user_migration' => 'ExUser', // Machine name of our user migration above
);
Migration::registerMigration('DrupalNode6Migration', $arguments['machine_name'],
  $arguments);
?>
The user_migration argument both establishes the sourceMigration() call to properly map uids, and adds the ExUser migration to the dependency list to be sure this migration runs after the user migration.
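Conceptually, the uid translation is a lookup in the mapping that the user migration recorded as it ran. The plain-PHP sketch below illustrates only that idea - it is not how Migrate implements it, and the function name and the array standing in for the map are hypothetical.

```php
<?php
/**
 * Illustration only: translate a legacy ID through a source-to-destination map.
 *
 * $map stands in for the mapping a migration records as it imports rows.
 * The function name is hypothetical and not part of Migrate or migrate_d2d.
 */
function example_translate_id(array $map, $legacy_id) {
  // An ID that was never migrated has no destination counterpart.
  return isset($map[$legacy_id]) ? $map[$legacy_id] : NULL;
}

// janeroe had uid 45 on the legacy site and was imported as uid 28.
$user_map = array(45 => 28);

$legacy_node_uid = 45;
$new_node_uid = example_translate_id($user_map, $legacy_node_uid);
// $new_node_uid is 28, so the imported node is still authored by janeroe.
```

This is also why the dependency matters: if the node migration ran before the user migration, the map would be empty and there would be nothing to translate to.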
Now, the standard migrate_d2d classes know how to map all the core fields - besides the obvious (the legacy node title maps to the destination node title), they know that the Drupal 6 comment timestamp field should be mapped to the created and changed fields in Drupal 7. But they don't know how you want to map any of your custom fields - you need to tell them. You can do this by extending the migrate_d2d classes and adding your own field mappings. For example, consider this code added to a node.inc file (which, of course, should be listed in example_d2d.info):
<?php
class ExArticleMigration extends DrupalNode6Migration {
  public function __construct(array $arguments) {
    parent::__construct($arguments);
    // Make sure the category terms are imported before the articles.
    $this->dependencies[] = 'ExCategories';
    // We're replacing the legacy field_published, which was a text field,
    // with a new date field.
    $this->addFieldMapping('field_publish_date', 'field_published');
    // Terms assigned to a Drupal 6 node are exposed as a source field named
    // for the vocabulary's vid ('2' here). Map them to the new term reference
    // field, translating each legacy tid to the tid assigned on import.
    $this->addFieldMapping('field_article_category', '2')
      ->sourceMigration('ExCategories');
  }
}
?>
Add this to your registration code:
<?php
$arguments = $common_arguments + array(
  'description' => t('Migration of article nodes from Drupal 6'),
  'machine_name' => 'ExArticle',
  'source_type' => 'article',
  'destination_type' => 'article',
  'user_migration' => 'ExUser', // Machine name of our user migration above
);
Migration::registerMigration('ExArticleMigration', $arguments['machine_name'],
  $arguments);
?>
Of course, you can do much more when you override the migrate_d2d classes - add a prepareRow() implementation to manipulate the data, for example (don't forget to call parent::prepareRow()!). We'll discuss this further in the next installment.
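As a small taste, here is a minimal sketch of a prepareRow() override that tidies up legacy titles. The class name ExCleanArticleMigration and the helper example_d2d_clean_title() are hypothetical, and the sketch assumes the source row exposes the node title as $row->title. The class_exists() guard is only there so the snippet can also be run on its own, outside Drupal.

```php
<?php
/**
 * Hypothetical helper: collapse runs of whitespace in a legacy title.
 */
function example_d2d_clean_title($title) {
  return trim(preg_replace('/\s+/', ' ', $title));
}

// Declare the class only when migrate_d2d is loaded, so this file can also
// be executed standalone to try out the helper.
if (class_exists('DrupalNode6Migration')) {
  class ExCleanArticleMigration extends DrupalNode6Migration {
    public function prepareRow($row) {
      // Let the parent class do its own preparation first. A FALSE return
      // means the row should be skipped, so pass that along.
      if (parent::prepareRow($row) === FALSE) {
        return FALSE;
      }
      // Assumption: the node title is available as $row->title.
      $row->title = example_d2d_clean_title($row->title);
      return TRUE;
    }
  }
}
```

Keeping the actual data manipulation in a plain function makes it easy to check in isolation before running a full import.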
Conclusion
In part 2, we’ll discuss details on the architecture of migrate_d2d, and how to extend it for more advanced projects.