How will 7.0 handle versions?


#1

There are some big issues with how versions are handled in 6.x, so it would be interesting to hear how this will change in 7.0.

For us the big issues are:

  • Every change results in a new version. With each version comes a new blob file and 1-2 extra folders. This happens even for low-level repo content you create outside Content Studio.
  • Users are versioned. Every time you log in, the last-login timestamp is updated, and that results in a new version.
  • Vacuum only removes files and not folders…

#2

Hi Tommy!

Here is an update on what we are working on, which will hopefully also be part of the 7.0 release.

NB! The description below relates to how XP behaves under the hood in the low-level storage, and how this relates to versions created by the modify operation.

In 6.x, XP is greedy. It creates a version every time a node is modified (this includes every save in Content Studio), and all versions are stored forever as long as the content is not deleted. A single version contains a lot of info (the node data, indexconfig, and metadata such as permissions). If a content is deleted and vacuum is executed, the underlying files in the blobstore will be removed. Vacuum is currently a slow process which has to go through the entire system before anything can be deleted.

With 7.0 we are changing this rather radically, but with minimal impact on the APIs.

  1. Every modify will still create an underlying version, to avoid “updating” existing items.
  2. However, the default assumption is now that no version will be stored forever. Only versions that are “committed” will avoid being vacuumed (more on this later).
  3. We are also optimizing the blobstore heavily, splitting the current “nodeversion” blob into 3 different files: nodeversion, indexconfig and permissions. This reduces the size of the files dramatically, and allows for heavy re-use across node versions. For instance, the current indexconfig + permissions dataset is typically 3-5 times the size of the actual version data - so each modify will use less space, and be much faster to update.
  4. Commits will be a new concept, where developers may flag a version to be kept at will. This will also be built into the “content publish” method, for instance: a single commit will be created on publish and will hold all affected versions. We will work to automate commits for Content Studio as far as we can. In short, we keep relevant versions but throw away as much as possible. This functionality is inspired by Git.
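
To make points 3 and 4 a bit more concrete, here is a rough Python sketch of the idea, not XP's actual implementation. It assumes a content-addressed store (a blob's key is a hash of its content), which is one way to get the re-use described above. The class and function names (`BlobStore`, `Version`, `modify`) and the commit id `publish-1` are made up for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field


def blob_key(payload: dict) -> str:
    """Content-addressed key (assumption): identical payloads share one key."""
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


@dataclass
class BlobStore:
    # One key -> payload map per segment, mirroring the three files in point 3.
    segments: dict = field(default_factory=lambda: {
        "nodeversion": {}, "indexconfig": {}, "permissions": {}})

    def put(self, segment: str, payload: dict) -> str:
        key = blob_key(payload)
        # Nothing new is written if an identical blob is already stored.
        self.segments[segment].setdefault(key, payload)
        return key


@dataclass(frozen=True)
class Version:
    node_id: str
    data_key: str
    index_key: str
    perm_key: str


def modify(store, node_id, data, index_config, permissions) -> Version:
    """Every modify yields a new version record; unchanged parts reuse blobs."""
    return Version(
        node_id,
        store.put("nodeversion", data),
        store.put("indexconfig", index_config),
        store.put("permissions", permissions),
    )


store = BlobStore()
index_config = {"analyzer": "default"}            # hypothetical payload
permissions = {"role:admin": ["READ", "MODIFY"]}  # hypothetical payload

v1 = modify(store, "node-1", {"title": "Draft"}, index_config, permissions)
v2 = modify(store, "node-1", {"title": "Final"}, index_config, permissions)

# A commit flags a set of versions to keep, e.g. everything in one publish.
commits = {"publish-1": {v2}}
```

In this sketch the two modifies store two nodeversion blobs but only one indexconfig blob and one permissions blob, and only `v2` is protected by a commit.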

A new vacuum tool will be released after 7.0 (likely 7.1), and will be optimized to run much faster. Only versions over a certain age will be vacuumed. Naturally, versions that are in a branch will not be vacuumed. Eventually, vacuuming will run as a background task, sweeping away unused “versions” continuously.
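
As a rough illustration of the vacuum rules described here, the selection could look something like the Python sketch below. This is only a sketch under assumptions: the 30-day threshold, the `VersionInfo` type and the `vacuum_candidates` function are hypothetical, and the real tool may work quite differently.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class VersionInfo:
    version_id: str
    created: datetime


def vacuum_candidates(versions, committed_ids, branch_ids, now, min_age):
    """Return versions that are old enough, uncommitted, and not in any branch."""
    cutoff = now - min_age
    return [
        v for v in versions
        if v.created < cutoff                   # only versions over a certain age
        and v.version_id not in committed_ids   # committed versions are kept
        and v.version_id not in branch_ids      # versions in a branch are kept
    ]


now = datetime(2018, 12, 1)
versions = [
    VersionInfo("v1", now - timedelta(days=90)),  # old and unreferenced
    VersionInfo("v2", now - timedelta(days=90)),  # old, but committed
    VersionInfo("v3", now - timedelta(days=90)),  # old, but active in a branch
    VersionInfo("v4", now - timedelta(days=1)),   # too recent
]

removable = vacuum_candidates(
    versions,
    committed_ids={"v2"},
    branch_ids={"v3"},
    now=now,
    min_age=timedelta(days=30),  # hypothetical threshold
)
print([v.version_id for v in removable])  # prints ['v1']
```

Only `v1` is selected: it is the one version that is old, not part of a commit, and not referenced by a branch.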

We will also look into cleaning unused blobstore folders!

Hope this helps, and let us know if you have more questions!

