Mass mutation of nodes - best approach

We’re investigating how to best do a mass mutation of content in XP, and we’re curious to hear experiences from others who have done something similar.

Basically, we need to replace every occurrence of a particular word in input fields/text areas with a new spelling.

Summary of requirements:

  • Date stamps cannot change as a result of the mutation process. Our planned approach so far is to use the Node Library and modify the nodes directly, so that no dates are updated.
  • We’ll query and traverse every single node in draft and master and then do the relevant search and replace on each field in the node.
  • We have a lot of content types with dynamic content and metadata. Rather than looking through all keys in a node, we’re thinking of creating a library of keys where the find & replace should apply.
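
A minimal sketch of the "library of keys" idea from the last point, assuming node data can be treated as a plain nested object and that the library holds dot-separated paths. The names `REPLACE_PATHS`, `replaceValue` and `replaceInNode`, and the sample paths, are hypothetical and not part of any XP API.

```typescript
// Hypothetical library of dot-separated paths where find & replace may apply.
const REPLACE_PATHS: string[] = ["data.title", "data.body", "data.tags"];

type Json = string | number | boolean | null | Json[] | { [key: string]: Json };

// Replaces every occurrence of `from` in a string, or in each string of an array.
function replaceValue(value: Json, from: string, to: string): Json {
  if (typeof value === "string") return value.split(from).join(to);
  if (Array.isArray(value)) return value.map((v) => replaceValue(v, from, to));
  return value; // objects, numbers, booleans and null are left untouched
}

// Visits only the listed paths and mutates the node in place.
// Returns true if at least one value changed.
function replaceInNode(
  node: { [key: string]: Json },
  paths: string[],
  from: string,
  to: string
): boolean {
  if (from === "") return false; // nothing sensible to search for
  let changed = false;
  for (const path of paths) {
    const keys = path.split(".");
    const last = keys.pop() as string;
    let parent: Json = node;
    for (const key of keys) {
      if (parent === null || typeof parent !== "object" || Array.isArray(parent)) {
        parent = null;
        break;
      }
      parent = parent[key] ?? null;
    }
    if (parent === null || typeof parent !== "object" || Array.isArray(parent)) continue;
    if (!(last in parent)) continue; // path does not exist on this node
    const before = parent[last] ?? null;
    const after = replaceValue(before, from, to);
    if (JSON.stringify(before) !== JSON.stringify(after)) {
      parent[last] = after;
      changed = true;
    }
  }
  return changed;
}
```

The boolean return value makes it easy to skip writing nodes that did not actually change, which also keeps the number of touched nodes as small as possible.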

While we are investigating our own suggested approach, any hints and experiences are very welcome!

Hi Terje.

You are heading in the right direction. There are some caveats and tricks:

  • Build a node-level editor that is capable of identifying and replacing the phrase
  • The schema API (beta) may be used to identify fields of certain types in every content type, x-data and component automatically

Running the update

  • Query the draft branch for all content items (node type “content”) in the “ready” state that contain the phrase
    ** Apply the “patch” editor on these items, and use the “node push” to move the change into master branch (without changing any dates)
  • Next, query all items in master branch that still contain the phrase
    ** Apply the patch to these items directly on master (keep the list of keys)
  • Finally, update items that are being edited
    ** Query the draft branch for items that are “not ready”
    ** Apply the patch one final time on these (this will cover any content currently being edited, or not yet published).
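
The three passes above can be sketched roughly as follows. This is only an illustration of the ordering: the `Branch` interface is a hypothetical abstraction, not the Node lib API, and in XP its three methods would have to be backed by the actual query, modify and push calls on a repo connection (check the signatures in the docs).

```typescript
// Hypothetical abstraction over one branch of the content repo.
interface Branch {
  // IDs of content nodes containing the phrase, filtered by workflow state.
  findIds(phrase: string, state: "ready" | "notReady" | "any"): string[];
  // Applies the editor to one node; assumed not to touch any dates.
  patch(id: string, editor: (text: string) => string): void;
  // Moves the current version of the given nodes to the other branch.
  push(ids: string[]): void;
}

interface UpdateReport {
  pushed: string[];     // patched in draft, then pushed to master
  masterOnly: string[]; // patched directly on master
  draftOnly: string[];  // patched in draft only (being edited or unpublished)
}

function runUpdate(draft: Branch, master: Branch, from: string, to: string): UpdateReport {
  const editor = (text: string): string => text.split(from).join(to);

  // Pass 1: "ready" items in draft are patched and pushed to master.
  const pushed = draft.findIds(from, "ready");
  pushed.forEach((id) => draft.patch(id, editor));
  draft.push(pushed);

  // Pass 2: anything in master that still contains the phrase is patched in place.
  const masterOnly = master.findIds(from, "any");
  masterOnly.forEach((id) => master.patch(id, editor));

  // Pass 3: items that are "not ready" are patched in draft only.
  const draftOnly = draft.findIds(from, "notReady");
  draftOnly.forEach((id) => draft.patch(id, editor));

  return { pushed, masterOnly, draftOnly };
}
```

Returning the three ID lists is one way to "keep the list of keys" for later verification or a re-run.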

Also, if you are using layering, you must perform a similar step for each layer, but also consider the localization state of each item.

FYI: In the upcoming XP8, we are introducing a new API function that simplifies this update process significantly, which may come in handy the next time you need to update your content.

Thanks, that’s great advice! :pray: We’ll look into the strategy of applying changes to draft nodes and then pushing them to master, that seems like a better way to do it.

Good to know that there’s a new API coming for XP8 - I’m sure this won’t be the last time we have to make bulk changes like this…

Quick follow-up question:

When using Node Lib, how can we get the content item into a “Published” state? If I use the “push” method, it pushes the node to master, but it’s only marked as “Edited”.

We would like the content item to remain in its original state after our mutation is done, so as not to confuse the owners of the content.

Generally, when manipulating content, you should use the Content API.

To get the “published” entry in version history, you need to use “content.publish”, which is a higher-level variant of node.push.

I thought so… However, from what I understand, using publish from the Content API will also update the dates of the content, which is something we need to avoid in this case. That’s why I’ve been focusing on the Node API.

We might have to rethink our strategy and solve this in another way :sunny:

Publish from the Content API will not change any dates, unless this is the first time the content is published. In that case, you probably don’t want to publish the item anyway.

My mistake - I thought the Content API publish method would update the field modifiedTime… We’ll dig around and see if there is in fact a way of solving this.
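
One way to verify this on a test set is to snapshot the date fields before the mutation and compare them afterwards. A minimal sketch, assuming items are plain objects with an `_id` and ISO date strings; `snapshotDates` and `changedDates` are hypothetical helper names, and which fields to track (for example `modifiedTime`) is up to the caller.

```typescript
type DateSnapshot = { [id: string]: { [field: string]: string | undefined } };

// Records the chosen date fields for each item, keyed by item id.
function snapshotDates(
  items: Array<{ _id: string; [field: string]: unknown }>,
  fields: string[]
): DateSnapshot {
  const snapshot: DateSnapshot = {};
  for (const item of items) {
    const entry: { [field: string]: string | undefined } = {};
    for (const field of fields) {
      const value = item[field];
      entry[field] = typeof value === "string" ? value : undefined;
    }
    snapshot[item._id] = entry;
  }
  return snapshot;
}

// Lists "id.field" for every tracked field whose value differs after the mutation.
function changedDates(before: DateSnapshot, after: DateSnapshot): string[] {
  const diffs: string[] = [];
  for (const id of Object.keys(before)) {
    const beforeEntry = before[id] ?? {};
    const afterEntry = after[id] ?? {};
    for (const field of Object.keys(beforeEntry)) {
      if (beforeEntry[field] !== afterEntry[field]) diffs.push(`${id}.${field}`);
    }
  }
  return diffs;
}
```

An empty result from `changedDates` means none of the tracked dates moved; any entry points at the exact item and field to investigate.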