Data migrations into Gutenberg Blocks WordPress Editor

As more publishers acquire or merge with other titles, one major technical question keeps coming up: “How do you consolidate two gigantic websites that have totally different styles, layouts and architecture?”

Fortunately, solving complex publishing problems like this is what we love most at The Code Company. And if you’re looking for strategic advice, read this guide to creating sensible publishing architecture.

However, if you’re seeking something more technical, here’s a step-by-step guide to migrating content from WordPress Classic or another CMS into Gutenberg Blocks.

Let’s start with the basics…

What is the Gutenberg WordPress Editor (and why should you use it)?

It’s been three years since WordPress 5.0 introduced a new editor called Gutenberg, which now has over 60 million users.

Otherwise referred to as the WordPress Block Editor, Gutenberg is more powerful than its predecessors and allows publishers to easily generate incredibly rich, yet consistent content.

Gutenberg content is divided into reusable “blocks”. Blocks are essentially sections of content that follow a set pattern/structure. Some Gutenberg Block examples are a heading, an embedded video, a photo gallery or a quote.

The best part about Gutenberg is that you can also create your own blocks.

These blocks can be customised to your site’s specific publishing and content needs. This is great news for niche publishers as it enables them to create custom blocks for things like ingredients lists for a cooking recipe, a rating breakdown for a customer review, or a collection of related posts with titles and thumbnails.

Anatomy of a Gutenberg Block

Some blocks are composed of nothing more than basic HTML. Others are more complex and combine HTML comments with HTML content. The HTML comments mark where the block starts and ends, and store data related to the block that isn’t presented as-is. The HTML content makes up the visible content and structure of the block. The structure of the native “paragraph” block in WordPress is shown below as an example:

<!-- wp:paragraph -->
<p>Welcome to WordPress. This is your first post. Edit or delete it, then start writing!</p>
<!-- /wp:paragraph -->

The content for a block is stored in the post content, so there is no additional metadata required or saved. Everything that’s needed by the block is available right here.
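To make the anatomy concrete, here is a minimal sketch in plain PHP that wraps a piece of HTML in block comment delimiters. The helper name wrap_in_block() is hypothetical and not part of WordPress. It only illustrates the comment-plus-HTML shape described above, and it assumes the attributes are a simple array that can be JSON encoded.

```php
<?php
/**
 * Wrap inner HTML in Gutenberg-style block comment delimiters.
 * Hypothetical helper for illustration; not a WordPress API.
 *
 * @param string $name       Block name without the "wp:" prefix, e.g. 'paragraph'.
 * @param string $inner_html Visible HTML for the block.
 * @param array  $attrs      Optional block attributes stored in the opening comment.
 *
 * @return string
 */
function wrap_in_block( $name, $inner_html, array $attrs = array() ) {
  $json = '';

  if ( ! empty( $attrs ) ) {
    // Attributes live in the opening comment as JSON.
    $json = json_encode( $attrs, JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE );

    // A literal "--" would end the HTML comment early, so escape it.
    $json = str_replace( '--', '\u002d\u002d', $json ) . ' ';
  }

  return '<!-- wp:' . $name . ' ' . $json . "-->\n"
    . $inner_html . "\n"
    . '<!-- /wp:' . $name . ' -->';
}

// Example: a paragraph block with no attributes.
echo wrap_in_block( 'paragraph', '<p>Hello from the old CMS.</p>' );
```

The steps later in this guide build the markup by hand instead, which is easier to compare line by line against what the editor saves.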

Migrating into a Gutenberg Block

To migrate content from an existing CMS (including WordPress classic content) into WordPress Gutenberg blocks, follow these steps:

1. Establish which blocks you’re using

It’s important to focus on the Gutenberg blocks you actually intend to use. It might be tempting to convert everything over into blocks, but it’s rarely necessary. If you’re migrating from WordPress classic content or standard HTML content from another CMS, you’ll find that the WordPress “classic” block will often be good enough to handle the majority of content like paragraphs, inline images, lists, etc.

If your new system uses bespoke Gutenberg blocks, make sure they’ve been built (or are mostly complete) before you commit to migrating the content for them. Blocks often change during iterative development, so waiting until their markup is stable saves you from repeating the migration work unnecessarily.

2. Understand the structure of the block

In order to create a block programmatically, you have to understand the block’s structure so you can replicate it. The easiest way to do this is hands-on, in the WordPress Gutenberg editor.

  • Create a new post and insert the block into the post’s content (you can select a block by clicking the + icon in the top left corner of your screen and then selecting the desired block from the list).
  • Add some content to your block.
  • Once you’re happy, click the three dots and select “Edit as HTML” to see the actual markup.


Not all blocks can be edited this way so you may have to save the post and then retrieve the post_content from the wp_posts table (search by ID of the post). Doing this will require you to have access to the WordPress database and a basic knowledge of SQL.

/* Use this SQL query to get the content for the post in the database. */
SELECT `post_content` FROM `wp_posts` WHERE `ID` = 1

Let’s look at the structure of the “Media with Text” block as an example.

“Media with Text” block with no content.

“Media with Text” block with an image and content added via Gutenberg.

<!-- wp:media-text {"mediaId":10,"mediaLink":"http://wp.local/2020/09/23/hello-world/miguel-bruna-ddianwock0c-unsplash/","mediaType":"image"} -->
<div class="wp-block-media-text alignwide is-stacked-on-mobile"><figure class="wp-block-media-text__media"><img src="http://wp.local/wp-content/uploads/2020/09/miguel-bruna-DdiaNwoCK0c-unsplash.jpg" alt="" class="wp-image-10 size-full"/></figure><div class="wp-block-media-text__content"><!-- wp:paragraph {"placeholder":"Content…","fontSize":"large"} -->
<p class="has-large-font-size">Example content</p>
<!-- /wp:paragraph --></div></div>
<!-- /wp:media-text -->

“Media with Text” block content that is saved in the post content in WordPress.

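You can see that the block has an opening comment, <!-- wp:media-text, followed by a JSON object that holds the block’s attributes (here mediaId, mediaLink and mediaType). It also contains a nested paragraph block with its own pair of comments. The block otherwise is mostly made up of simple HTML.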
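If you are inspecting a lot of saved markup, it can help to pull the block name and attributes out of the opening comment in code. The sketch below uses only plain PHP; read_block_opening() is a hypothetical helper, and its regular expression is a simplification that reads the first opening comment it finds rather than parsing the whole document.

```php
<?php
/**
 * Read the block name and attributes from the first opening block comment.
 * Hypothetical helper for illustration; not a WordPress API.
 *
 * @param string $markup Saved block markup.
 *
 * @return array|null Array with 'name' and 'attrs' keys, or null if no opening comment is found.
 */
function read_block_opening( $markup ) {
  // Match "<!-- wp:name {json} -->" where the JSON part is optional.
  $pattern = '/<!--\s+wp:([a-z][a-z0-9_\/-]*)\s+(\{.*?\}\s+)?-->/s';

  if ( ! preg_match( $pattern, $markup, $matches ) ) {
    return null;
  }

  $attrs = array();

  if ( ! empty( $matches[2] ) ) {
    // Decode the JSON attributes into an associative array.
    $decoded = json_decode( trim( $matches[2] ), true );
    $attrs   = is_array( $decoded ) ? $decoded : array();
  }

  return array(
    'name'  => $matches[1],
    'attrs' => $attrs,
  );
}

// Example: inspect a shortened version of the "Media with Text" markup.
$info = read_block_opening( '<!-- wp:media-text {"mediaId":10,"mediaType":"image"} --><div></div><!-- /wp:media-text -->' );
print_r( $info );
```

Knowing which attributes a block stores tells you which values your migration has to supply and which ones can be hard-coded.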

3. Understand the data from the old system

Now that you understand the structure of the Gutenberg block in WordPress, you need to take stock of the data you have available from the old system that you’re migrating from.

For simplicity’s sake, let’s use the pull quote block. Let’s imagine you’re migrating from a CMS that has four separate fields for this content:

  • A string containing a quote
  • A string containing the name of the person being quoted
  • A string containing the quote’s citation
  • A string containing a URL to more related quotes

Below is an example of such data. Note you will need to manage how you acquire this data in your PHP code.

// Example data from the old CMS.
$example_old_cms_data = array(
  'title'        => 'Article title from old CMS',
  'content'      => 'Main body content from the old CMS.',
  'quote'        => 'The truth is rarely pure and never simple.',
  'by'           => 'Oscar Wilde',
  'citation'     => 'The Importance of Being Earnest',
  'citation_url' => 'https://www.goodreads.com/work/quotes/649216',
);

For more complex or bespoke blocks, watch for cases where your data doesn’t match the new design. Your bespoke blocks should ideally be designed to handle such discrepancies. Otherwise, you will need to decide how to deal with them yourself.

Here are your options:

  • Allow the block to work with the limited data from the old CMS
  • Don’t convert the content to a block seeing as you don’t have the data for it anyway
  • Look for ways to intuit what the data would be based on what you have

You should now have a good understanding of the static and dynamic components of your Gutenberg block and where each value comes from: pulled from your old CMS or computed dynamically.
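The three options above can be combined in a small clean-up step that runs before any block markup is generated. This is only a sketch: prepare_pullquote_data() is a hypothetical helper, the field names come from the example data, and falling back to the URL’s host name when the citation text is missing is an assumption made purely for illustration.

```php
<?php
/**
 * Decide how to treat a pull quote record from the old CMS.
 * Hypothetical helper for illustration; field names match the example data.
 *
 * @param array $data Raw record from the old CMS.
 *
 * @return array|null Cleaned data, or null when there is nothing worth converting.
 */
function prepare_pullquote_data( array $data ) {
  $clean = array();

  // Default every expected field to an empty string so later code can rely on the keys.
  foreach ( array( 'quote', 'by', 'citation', 'citation_url' ) as $key ) {
    $clean[ $key ] = isset( $data[ $key ] ) ? trim( (string) $data[ $key ] ) : '';
  }

  // Don't convert: without a quote there is no block to build.
  if ( '' === $clean['quote'] ) {
    return null;
  }

  // Intuit a missing value: a URL with no citation text falls back to the URL's host name (an assumption).
  if ( '' === $clean['citation'] && '' !== $clean['citation_url'] ) {
    $host              = parse_url( $clean['citation_url'], PHP_URL_HOST );
    $clean['citation'] = is_string( $host ) ? $host : '';
  }

  // Work with limited data: anything still empty is left empty for the block builder to handle.
  return $clean;
}
```

Keeping these decisions in one place makes it easier to change your mind later without touching the code that generates the markup.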

4. Build the block programmatically

By now you should understand the blocks you are working with, their structure in WordPress and the data you are trying to migrate into them from your old system. So it’s time to start building the block in code.

A simple way of doing this is to take the structure of the block in WordPress and paste it into your IDE as a PHP comment and then follow it line by line.

You can create a string variable to hold the block content and continuously concatenate it as you move through the block HTML line by line. The example below uses simple PHP concatenation but use whatever method you prefer for readability. Follow the structure as closely as you can – including whitespace.

Since you’ll likely be creating these blocks a lot, you should use a function to store the logic. This keeps it reusable and easy to read.

/**
 * Generate the HTML content for a Gutenberg 'pullquote' block.
 *
 * @param array $data Array of data from the old CMS.
 *
 * @return string
 */
function generate_pullquote_block_content( $data ) {
  $block_content = '';
 
  // For readability, extract the data into variables.
  // Missing keys fall back to an empty string so PHP doesn't raise notices.
  $quote        = $data['quote'] ?? '';
  $by           = $data['by'] ?? '';
  $citation     = $data['citation'] ?? '';
  $citation_url = $data['citation_url'] ?? '';
 
  // Only generate block content if we have sufficient data.
  if ( ! empty( $quote ) ) {
 
    // Define EOL characters for readability.
    $eol = "\r\n";
 
    // The content we want to generate looks like this:
    // <!-- wp:pullquote -->
    // <figure class="wp-block-pullquote"><blockquote><p>The truth is rarely pure and never simple.</p><cite><strong>Oscar Wilde, </strong><a href="https://www.goodreads.com/work/quotes/649216">The Importance of Being Earnest</a></cite></blockquote></figure>
    // <!-- /wp:pullquote -->
 
    // Open the block.
    $block_content .= '<!-- wp:pullquote -->' . $eol;
 
    // Open the block's content.
    $block_content .= '<figure class="wp-block-pullquote"><blockquote>';
 
    // Add the quote.
    $block_content .= '<p>' . esc_html( $quote ) . '</p>';
 
    // Open the citation.
    $block_content .= '<cite>';
 
    // Add the 'by' information.
    $block_content .= '<strong>' . esc_html( $by ) . ', </strong>';
 
    // Add the citation link.
    $block_content .= '<a href="' . esc_url( $citation_url ) . '">' . esc_html( $citation ) . '</a>';
 
    // Close the citation and the remainder of the block's content.
    $block_content .= '</cite></blockquote></figure>' . $eol;
 
    // Close the block.
    $block_content .= '<!-- /wp:pullquote -->';
  }
 
  return $block_content;
}
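Because the markup is assembled by hand, it is easy to forget a closing comment. A quick sanity check over the generated string can catch that before anything is saved. The sketch below is plain PHP; block_delimiters_balanced() is a hypothetical helper that only checks that opening and closing comments pair up, and it does not validate the HTML inside the block.

```php
<?php
/**
 * Check that every opening block comment has a matching closing comment.
 * Hypothetical helper for illustration; not a WordPress API.
 *
 * @param string $content Generated post content.
 *
 * @return bool True when all block comments are correctly paired and nested.
 */
function block_delimiters_balanced( $content ) {
  // Capture an optional leading "/" (closing comment), the block name,
  // and an optional trailing "/" (self-closing block with no content).
  $pattern = '/<!--\s+(\/)?wp:([a-z][a-z0-9_\/-]*)\s+(?:\{.*?\}\s+)?(\/)?-->/s';

  preg_match_all( $pattern, $content, $matches, PREG_SET_ORDER );

  $open = array();

  foreach ( $matches as $match ) {
    $is_closer = '/' === $match[1];
    $is_void   = isset( $match[3] ) && '/' === $match[3];
    $name      = $match[2];

    if ( $is_void ) {
      // Self-closing blocks need no partner.
      continue;
    }

    if ( ! $is_closer ) {
      $open[] = $name;
      continue;
    }

    // A closing comment must match the most recently opened block.
    if ( array_pop( $open ) !== $name ) {
      return false;
    }
  }

  // Anything left on the stack was never closed.
  return empty( $open );
}
```

You could run a check like this on the output of generate_pullquote_block_content() during development and log any record that fails.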

5. Save the post content

As mentioned earlier, blocks are simply saved in the post’s content in WordPress. You can call wp_insert_post to save the content for a new or existing post (include the ID arg if updating an existing one).

Using our example, here’s some code to handle the generation of the post content.

/**
 * Accept an array of data from the old CMS and migrate it as a post into WordPress.
 *
 * @param array $data Old CMS data.
 *
 * @return int|WP_Error
 */
function migrate_post( $data ) {
  $post_content = $data['content'];
 
  $pullquote_content = generate_pullquote_block_content( $data );
 
  if ( ! empty( $pullquote_content ) ) {
    $post_content .= $pullquote_content;
  }
 
  $post_data = [
    'post_title' => $data['title'],
    'post_content' => $post_content,
  ];
 
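  // Pass true as the second argument so a failure is returned as a WP_Error instead of 0.
  // wp_insert_post() expects slashed data, so slash it first to preserve any backslashes.
  $post_id = wp_insert_post( wp_slash( $post_data ), true );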
 
  return $post_id;
}
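A real migration runs over thousands of records rather than one, so it is worth collecting failures instead of stopping at the first problem. The sketch below is plain PHP and makes a few assumptions: migrate_records() is a hypothetical helper, and the callback you pass in stands in for a function such as migrate_post() that returns a positive post ID on success.

```php
<?php
/**
 * Run a migration callback over many records and collect the failures.
 * Hypothetical helper for illustration; not a WordPress API.
 *
 * @param array    $records List of old CMS records.
 * @param callable $migrate Callback that returns a positive post ID on success.
 *
 * @return array Array with 'migrated' (post IDs) and 'failed' (record indexes).
 */
function migrate_records( array $records, callable $migrate ) {
  $report = array(
    'migrated' => array(),
    'failed'   => array(),
  );

  foreach ( $records as $index => $record ) {
    $result = $migrate( $record );

    // Anything other than a positive integer (0, false, an error object) counts as a failure.
    if ( is_int( $result ) && $result > 0 ) {
      $report['migrated'][] = $result;
    } else {
      $report['failed'][] = $index;
    }
  }

  return $report;
}
```

Inside WordPress you might call it as migrate_records( $all_records, 'migrate_post' ) and then re-run only the indexes listed under 'failed' once the underlying data has been fixed.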

The migrated content as it appears in the Gutenberg WordPress editor:

Full code example

To make this process easier to follow, here is all the code in one place:

<?php
 
// Example data from the old CMS.
$example_old_cms_data = array(
  'title'        => 'Article title from old CMS',
  'content'      => 'Main body content from the old CMS.',
  'quote'        => 'The truth is rarely pure and never simple.',
  'by'           => 'Oscar Wilde',
  'citation'     => 'The Importance of Being Earnest',
  'citation_url' => 'https://www.goodreads.com/work/quotes/649216',
);
 
// Migrate the example data to WordPress.
$post_id = migrate_post( $example_old_cms_data );
 
if ( is_wp_error( $post_id ) ) {
  // Handle error.
} else {
  // Report success.
}
 
/**
 * Generate the HTML content for a Gutenberg 'pullquote' block.
 *
 * @param array $data Array of data from the old CMS.
 *
 * @return string
 */
function generate_pullquote_block_content( $data ) {
  $block_content = '';
 
  // For readability, extract the data into variables.
  // Missing keys fall back to an empty string so PHP doesn't raise notices.
  $quote        = $data['quote'] ?? '';
  $by           = $data['by'] ?? '';
  $citation     = $data['citation'] ?? '';
  $citation_url = $data['citation_url'] ?? '';
 
  // Only generate block content if we have sufficient data.
  if ( ! empty( $quote ) ) {
 
    // Define EOL characters for readability.
    $eol = "\r\n";
 
    // The content we want to generate looks like this:
    // <!-- wp:pullquote -->
    // <figure class="wp-block-pullquote"><blockquote><p>The truth is rarely pure and never simple.</p><cite><strong>Oscar Wilde, </strong><a href="https://www.goodreads.com/work/quotes/649216">The Importance of Being Earnest</a></cite></blockquote></figure>
    // <!-- /wp:pullquote -->
 
    // Open the block.
    $block_content .= '<!-- wp:pullquote -->' . $eol;
 
    // Open the block's content.
    $block_content .= '<figure class="wp-block-pullquote"><blockquote>';
 
    // Add the quote.
    $block_content .= '<p>' . esc_html( $quote ) . '</p>';
 
    // Open the citation.
    $block_content .= '<cite>';
 
    // Add the 'by' information.
    $block_content .= '<strong>' . esc_html( $by ) . ', </strong>';
 
    // Add the citation link.
    $block_content .= '<a href="' . esc_url( $citation_url ) . '">' . esc_html( $citation ) . '</a>';
 
    // Close the citation and the remainder of the block's content.
    $block_content .= '</cite></blockquote></figure>' . $eol;
 
    // Close the block.
    $block_content .= '<!-- /wp:pullquote -->';
  }
 
  return $block_content;
}
 
/**
 * Accept an array of data from the old CMS and migrate it as a post into WordPress.
 *
 * @param array $data Old CMS data.
 *
 * @return int|WP_Error
 */
function migrate_post( $data ) {
  $post_content = $data['content'];
 
  $pullquote_content = generate_pullquote_block_content( $data );
 
  if ( ! empty( $pullquote_content ) ) {
    $post_content .= $pullquote_content;
  }
 
  $post_data = [
    'post_title' => $data['title'],
    'post_content' => $post_content,
  ];
 
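  // Pass true as the second argument so a failure is returned as a WP_Error instead of 0.
  // wp_insert_post() expects slashed data, so slash it first to preserve any backslashes.
  $post_id = wp_insert_post( wp_slash( $post_data ), true );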
 
  return $post_id;
}
 
?>

Important considerations when migrating data into Gutenberg Blocks

While these steps are relatively simple, it’s no secret that data migrations can be fraught with risk. Migration problems can cause your site to crash, or data can be compromised or lost. This all impacts your reputation and, ultimately, your bottom line.

To mitigate these risks, we typically recommend continuous migration, which you can learn more about below.

READ MORE: How to avoid data migration issues

Scott Commins

Scott was previously a Senior WordPress and backend engineer at The Code Company.