Cross-posted with permission from Blue Coda
With enterprise adoption of Drupal increasing at a rapid rate, many companies are interested in the options available to migrate from legacy systems. For companies using Microsoft’s SharePoint as an external web platform (check out our post about using SharePoint as a CMS), the thought of migrating thousands of pieces of content can seem daunting. SharePoint often requires substantial resources, in both people and capital. This can leave companies feeling that their version of SharePoint is customized to the point of being inoperable by outside groups. There are usually a handful (or fewer) of company-specific specialists, and without them, the learning curve seems too steep for external resources to be used efficiently.
Luckily, the situation is rarely that dire. With some organization and planning, migrating from SharePoint to Drupal can be an efficient process.
The high-level steps you’ll need to work through are:
- Get organized by determining which content should be moved to the new site, where it should go within the new site structure, and how the content should be re-tagged
- Retrieve the content from the SharePoint site
- Import your content into the new Drupal site, stripping any undesired elements in the process to ensure it assumes the look of the new site
- Review and cleanup
Step 1: Getting Organized
Regardless of how the content will be physically moved and imported, you can start tackling step one immediately. We recommend creating an Excel spreadsheet that lists all of the content you wish to move, where it can be found on the old site, where it should live on the new site, how it should be tagged there, and any new metadata to be applied. Try to avoid the temptation to move everything over wholesale. Just as a house move turns up boxes in the basement that you never open, you likely have a significant percentage of content that is old, never read, or just plain out of date. Focusing on the high-priority content will greatly simplify the overall migration and shorten the project timeline. As for the rest of the content, you can always move it later, and may even want to consider a rewrite beforehand.
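To make the inventory concrete, here is a minimal Python sketch that reads such a spreadsheet after it has been exported to CSV and keeps only the rows flagged for migration. The column names (source_url, new_path, tags, migrate), the semicolon-separated tag format, and the example URLs are all assumptions made for illustration; use whatever columns your own spreadsheet has.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class InventoryRow:
    source_url: str   # where the page lives on the old site
    new_path: str     # where it should live on the new site
    tags: list        # new tags to apply on the new site
    migrate: bool     # False means "leave behind for now"


def load_inventory(csv_file):
    """Read inventory rows from an open CSV file object.

    The column names used here are hypothetical, not a required format.
    """
    rows = []
    for record in csv.DictReader(csv_file):
        rows.append(InventoryRow(
            source_url=record["source_url"].strip(),
            new_path=record["new_path"].strip(),
            tags=[t.strip() for t in record["tags"].split(";") if t.strip()],
            migrate=record["migrate"].strip().lower() == "yes",
        ))
    return rows


def pages_to_migrate(rows):
    """Keep only the rows flagged for migration."""
    return [row for row in rows if row.migrate]


# Illustrative data standing in for the exported spreadsheet.
SAMPLE = """source_url,new_path,tags,migrate
http://old.example.com/Pages/about.aspx,/about,company; overview,yes
http://old.example.com/Pages/news-2009.aspx,/news/archive,news,no
"""

inventory = load_inventory(io.StringIO(SAMPLE))
selected = pages_to_migrate(inventory)
```

Keeping the "migrate" decision in the spreadsheet rather than in code means content owners can change priorities without touching the script.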
Step 2: Retrieving Your Content
SharePoint web services can be used to retrieve the content and all of the associated metadata. This is, on paper, the sleeker, sexier way to go about the effort. However, keep in mind that much of the tagging information is likely to change during the migration. And unless you are simply replatforming the site but otherwise keeping everything (e.g., design, navigation, and layout) the same, you won’t be moving the content over as is. We’ll assume you are in the vast majority and are taking this opportunity to redesign and otherwise improve upon your site (taking advantage of what a true external web platform such as Drupal has to offer). Of course, the further you stray from a wholesale as-is migration, the less value there is to be gained from a fancy SharePoint retrieval.
For these reasons and more, the approach we take most often is to use a script that systematically retrieves the body content from each page through HTTP requests and stores it in a more desirable format in a temporary location. The good news is that the detailed spreadsheet you created in step 1 gives the script everything it needs to find the content to be moved. Also worth keeping in mind is that this approach works when migrating from just about any content management system.
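As a rough illustration of that retrieval script, the Python sketch below fetches each page over HTTP, keeps only the text inside one container element, and writes one JSON file per page to a temporary directory. It is a sketch under stated assumptions, not our production script: the container id "content", the JSON layout, and the file naming are hypothetical, and the simple depth counter assumes reasonably well-formed HTML (unclosed tags in legacy markup would need extra handling).

```python
import json
import tempfile
import urllib.request
from html.parser import HTMLParser
from pathlib import Path

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BodyExtractor(HTMLParser):
    """Collect the text inside the element whose id matches container_id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0      # 0 means we are outside the container
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth and data.strip():
            self.chunks.append(data.strip())


def fetch_html(url):
    """Download a page over HTTP and return it as text."""
    with urllib.request.urlopen(url, timeout=30) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def retrieve(urls, out_dir, container_id="content", fetch=fetch_html):
    """Fetch each URL, extract the body text, and save one JSON file per page."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        parser = BodyExtractor(container_id)
        parser.feed(fetch(url))
        parser.close()
        record = {"source_url": url, "body": "\n".join(parser.chunks)}
        target = out_dir / f"page-{index:04d}.json"
        target.write_text(json.dumps(record, indent=2), encoding="utf-8")
        saved.append(target)
    return saved


# Offline demonstration with a stand-in fetcher, so nothing touches the network.
def fake_fetch(url):
    return ('<html><body><div id="nav">Menu</div>'
            '<div id="content"><h1>About</h1><p>We build<br>things.</p></div>'
            '</body></html>')


with tempfile.TemporaryDirectory() as tmp:
    files = retrieve(["http://old.example.com/Pages/about.aspx"], tmp,
                     fetch=fake_fetch)
    record = json.loads(files[0].read_text(encoding="utf-8"))
```

In a real run you would drop the `fetch=fake_fetch` argument and pass the source URLs taken from the step 1 spreadsheet. Because the fetcher is a parameter, the extraction logic can be tested against saved HTML before it is pointed at the live site.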
Step 3: Migrating to Drupal
When it comes to the actual import into Drupal, the technically inclined might initially toy with the idea of writing a bunch of SQL scripts and doing a direct import into the database. Our recommendation: absolutely, positively do not waste any time on this approach. The complexity of a CMS database structure negates any advantage you might gain and makes it likely that you will miss key fields, leading to a long-term mess. Furthermore, systems like Drupal that provide their own browser-based UI for defining new content types add a whole other level of complexity to the table structure.
Proper Drupal migrations typically involve one of the following:
- Using the Drupal Feeds and Feeds Tamper modules to import content directly into the content types you’ve created within the Drupal site. The Feeds module handles the initial import, whereas the Feeds Tamper module is very useful for perfecting the process.
- Using the Drupal Migrate module, which provides a flexible framework for migrating content into Drupal from sources such as SharePoint. Migrate provides out-of-the-box support for creating core Drupal objects including nodes (Drupal’s term for individual pieces of content), users, files, terms, and comments. Note that Migrate can easily be extended to migrate other kinds of content.
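Whichever module you choose, the retrieved content usually needs presentational markup stripped before import so that it picks up the look of the new site. The Python sketch below shows one way to do that and then write a CSV file, a format the Feeds module can parse. The column names, the lists of attributes and tags to drop, and the sample SharePoint-style class name are assumptions for illustration; the mapping from CSV columns to Drupal fields would still be configured in Feeds itself, and some of this cleanup could instead be done with Feeds Tamper.

```python
import csv
import io
from html import escape
from html.parser import HTMLParser

# Hypothetical cleanup rules; adjust to the markup your old site produces.
DROP_ATTRS = {"style", "class", "id", "align", "bgcolor"}
DROP_TAGS = {"font", "span"}   # wrapper tags are removed, their text is kept


class StyleStripper(HTMLParser):
    """Rebuild an HTML fragment without presentational tags and attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def _attrs(self, attrs):
        parts = []
        for name, value in attrs:
            # Drop presentational attributes and inline event handlers.
            if name in DROP_ATTRS or name.startswith("on"):
                continue
            parts.append(' {}="{}"'.format(name, escape(value or "")))
        return "".join(parts)

    def handle_starttag(self, tag, attrs):
        if tag not in DROP_TAGS:
            self.out.append("<{}{}>".format(tag, self._attrs(attrs)))

    def handle_startendtag(self, tag, attrs):
        if tag not in DROP_TAGS:
            self.out.append("<{}{} />".format(tag, self._attrs(attrs)))

    def handle_endtag(self, tag):
        if tag not in DROP_TAGS:
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        # Text arrives unescaped, so escape it again on the way out.
        self.out.append(escape(data, quote=False))


def clean_html(fragment):
    """Return the fragment with presentational markup removed."""
    parser = StyleStripper()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


def write_feed_csv(pages, csv_file):
    """Write one CSV row per page, cleaning each body along the way."""
    writer = csv.DictWriter(csv_file,
                            fieldnames=["title", "body", "tags", "path"])
    writer.writeheader()
    for page in pages:
        writer.writerow({
            "title": page["title"],
            "body": clean_html(page["body"]),
            "tags": ", ".join(page["tags"]),
            "path": page["path"],
        })


# Illustrative page record, as it might come out of the retrieval step.
pages = [{
    "title": "About",
    "body": ('<p class="ms-rteElement-P" style="color:red">Hello '
             '<font size="2">world</font> &amp; more</p>'),
    "tags": ["company", "overview"],
    "path": "/about",
}]
buffer = io.StringIO()
write_feed_csv(pages, buffer)
```

Note that HTML comments are dropped as a side effect, since the parser only re-emits tags and text. Cleaning before the import keeps the rules in one place, which makes them easy to adjust between migration runs.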
Step 4: Review and Cleanup
One-off issues can easily and quickly be handled by hand. As such, the goal here is to identify patterns in any issues you discover, and to use those patterns to fine-tune the migration process. Finding issues is to be expected and is by no means an emergency early on. To be clear, it will be essential to repeat steps 2 through 4 at least a few times, not only to fine-tune the process but also to retrieve the latest content right before you launch the new site.
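One way to spot patterns rather than one-offs is to run the same set of checks over every imported page and count how often each issue appears. The Python sketch below assumes the imported bodies are available as a mapping of path to HTML; the specific checks, the old-site hostname, and the sample pages are hypothetical examples, not an exhaustive list.

```python
import re
from collections import Counter

# Hypothetical checks; each returns True when the imported body shows the issue.
CHECKS = {
    "empty body": lambda body: not body.strip(),
    "link to old site": lambda body: "old.example.com" in body,
    "inline style left over":
        lambda body: re.search(r"\sstyle\s*=", body, re.I) is not None,
    "SharePoint class left over":
        lambda body: re.search(r'class="ms-', body) is not None,
}


def audit(pages):
    """Return (issue counts, mapping of issue to affected paths)."""
    counts = Counter()
    affected = {}
    for path, body in pages.items():
        for issue, check in CHECKS.items():
            if check(body):
                counts[issue] += 1
                affected.setdefault(issue, []).append(path)
    return counts, affected


# Illustrative imported content, keyed by path on the new site.
imported = {
    "/about": '<p>See <a href="http://old.example.com/Pages/team.aspx">the team</a>.</p>',
    "/news/launch": '<p style="color:red">Launch day</p>',
    "/contact": "",
    "/services": '<p>See <a href="http://old.example.com/Pages/x.aspx">details</a>.</p>',
}
counts, affected = audit(imported)
```

An issue that shows up on many pages, such as links still pointing at the old site in this sample, is a candidate for a rule in the migration script; an issue that shows up once is usually quicker to fix by hand.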
Closing Thoughts
The hurdle standing in front of you may at first have seemed insurmountable, but if you take anything away from this article, we hope it is that you need not remain stuck on SharePoint forever. There is indeed a light at the end of the tunnel, and you can start working towards it right away.