Sunday, June 21, 2009

JBPM Migrator



Overview

We are using the JBPM workflow library (version 3.2.2) on a project here at Intelliware. After some analysis, we chose JBPM as our process modelling tool because it was open source and it was easy to integrate into our technology stack (Java 4, Hibernate for persistence).

Over time, a process definition needs to change. Usually these changes reflect new business requirements, but they can also be related to a bug fix or an improvement in the existing process. So when we release a new version of a software product, we may need to update the older process instances in the database. Rather than provide a mechanism to migrate process instances, the JBPM library supports multiple versions of a process definition simultaneously:

Process instances always execute to the process definition that they are started in. But JBPM allows for multiple process definitions of the same name to coexist in the database. So typically, a process instance is started in the latest version available at that time and it will keep on executing in that same process definition for its complete lifetime. When a newer version is deployed, newly created instances will be started in the newest version, while older process instances keep on executing in the older process definitions. [1]

This wasn't very appealing to us. Our application has processes that can be resumed at any point in the future (potentially years later) so by following the JBPM prescribed approach our developers would have to support - and the QA folks would have to test - outdated process instances for years to come.

What we wanted was the ability to migrate outdated process instances to the current process definition. Indeed, the JBPM documentation addresses this approach:

An alternative approach to changing process definitions might be to convert the executions to a new process definition. Please take into account that this is not trivial due to the long-lived nature of business processes. Currently, this is an experimental area so for which there are not yet much out-of-the-box support.

As you know there is a clear distinction between process definition data, process instance data (the runtime data) and the logging data. With this approach, you create a separate new process definition in the JBPM database (by e.g. deploying a new version of the same process). Then the runtime information is converted to the new process definition. This might involve a translation cause tokens in the old process might be pointing to nodes that have been removed in the new version. So only new data is created in the database. But one execution of a process is spread over two process instance objects...

JBPM doesn't provide a tool to do this, but the code is open source and well documented, so we built a jbpm-migrator ourselves.

The Migrator

Given an old process instance, the migrator is responsible for transferring its data to a new process instance created from the latest process definition. The migrator transfers:


  1. All of the tokens. This is facilitated through the use of mappings.


  2. All persistent and transient variables.


  3. It adds a migration memo (a String in the persistent variables map) to the new process instance, which records information about the migration (the current date, the old process definition version number, the old process instance id, etc.).

The migration is performed recursively, so each sub-process is migrated according to the above steps.

Mapping Token Nodes

One of the key challenges when migrating a process instance is the renaming or removal of wait state nodes. Wait state nodes (the green boxes in the diagram below) are where tokens reside when a process instance is persisted. Determining where a token should be placed is facilitated through a migration. A Migration contains a map that tells the Migrator where to put a token from a deprecated wait state node in the current process. Consider the following three versions of a Process called 'Application':

[Diagram: three versions of the 'Application' process definition, with wait state nodes shown as green boxes.]

For the Application Process, two migrations would be written [2]:


  1. Migration #1 (maps tokens from version #1 to version #2):
    {'init' => 'start', 'invalid' => 'Requires Review', 'end' => 'application completed'}


  2. Migration #2 (maps tokens from version #2 to version #3):
    {'start' => 'application received'}

Our migrator takes these individual migration maps and creates a composite map. With the composite map, the migrator can map a token from any outdated wait-state node directly to the appropriate node in the current process. The composite map of these two migrations would be written as:

    {'init' => 'application received', 'start' => 'application received', 'invalid' => 'Requires Review', 'end' => 'application completed'}

Note that the map only needs to explain what to do with tokens on deprecated nodes (e.g. init, start, invalid, and end). No mapping is required for non-deprecated nodes (e.g. managerial audit, application completed, and Requires Review). By default, if no mapping exists for a wait state node (i.e. it is not deprecated), the migrator will attempt to move the token to a node with the same name in the new version.
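
To make the composition concrete, here is a minimal sketch in plain Java of how individual migration maps could be folded into a composite map, including the same-name default for unmapped nodes. It is not the migrator's actual code: the class CompositeNodeMapSketch and its methods compose, resolve, and mapOf are hypothetical names invented for this illustration, and plain java.util maps stand in for StateNodeMap.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not the migrator's real class): composes per-version node maps. */
final class CompositeNodeMapSketch {

    /** Folds migrations, given oldest first, into one map from any deprecated node to its current node. */
    static Map<String, String> compose(List<Map<String, String>> migrations) {
        Map<String, String> composite = new LinkedHashMap<String, String>();
        for (Map<String, String> migration : migrations) {
            // Re-point earlier entries whose target has itself been deprecated by this migration.
            for (Map.Entry<String, String> entry : composite.entrySet()) {
                String newerTarget = migration.get(entry.getValue());
                if (newerTarget != null) {
                    entry.setValue(newerTarget);
                }
            }
            // Then record the nodes that this migration deprecates.
            composite.putAll(migration);
        }
        return composite;
    }

    /** Assumption: a node without a mapping keeps its name in the current version. */
    static String resolve(Map<String, String> composite, String oldNodeName) {
        String mapped = composite.get(oldNodeName);
        return mapped != null ? mapped : oldNodeName;
    }

    /** Builds a map from {old, new} pairs, mirroring the String[][] style used for migrations. */
    static Map<String, String> mapOf(String[][] pairs) {
        Map<String, String> map = new LinkedHashMap<String, String>();
        for (String[] pair : pairs) {
            map.put(pair[0], pair[1]);
        }
        return map;
    }

    public static void main(String[] args) {
        Map<String, String> migration1 = mapOf(new String[][] {
            {"init", "start"}, {"invalid", "Requires Review"}, {"end", "application completed"}});
        Map<String, String> migration2 = mapOf(new String[][] {
            {"start", "application received"}});

        Map<String, String> composite = compose(Arrays.asList(migration1, migration2));
        System.out.println(composite);
        System.out.println(resolve(composite, "init"));
        System.out.println(resolve(composite, "managerial audit"));
    }
}
```

Because deprecated node names are never reused, a later migration can only extend the composite map or re-point existing targets, which is what keeps this fold simple.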

The example I am using includes a discrete migration for each version of the process but this is not always required. Depending on the changes being made to the Process Definition, it is possible that the developer will not be required to include a migration at all. This is a good thing. It means that we don't have to write a migration for every single process definition change and there is almost no configuration required on the part of the developer.

But there is a cost. Deprecated nodes can never be used in future definitions of your process. So with our 'Application' Process, the init, start, invalid, and end nodes can never be used again in the process definition (as wait states). Doing so would break the migrator.

Defining a Migration

Migrations are written as Java classes. The class must implement the Migration interface, it must not be abstract, and it must contain a default constructor. The Migration interface declares one method that must be implemented:

    public StateNodeMap createNodeMap();

Here is how you would express the first migration for the 'Application' Process Definition example:

    public class ApplicationProcessMigration001 implements Migration {
        public StateNodeMap createNodeMap() {
            // Each pair maps a deprecated wait state node to its replacement in the next version.
            return new StateNodeMap(new String[][] {
                {"init", "start"},
                {"invalid", "Requires Review"},
                {"end", "application completed"}
            });
        }
    }

Defining a Migrator

How do we create the migrator and use it to perform a migration? Like this:

    Migrator migrator = new Migrator("ApplicationProcess", jbpmContext, "com.foobar.ApplicationProcessMigration");
    ProcessInstance newProcess = migrator.migrate(oldProcess);

The parameters used to create the Migrator instance are:


  1. The name of the Process Definition that it will be migrating.


  2. A JbpmContext instance. The migrator requires this to look up the latest Process Definition.


  3. The Migration base class name. The migrator assumes that your migrations use the pattern package.ClassName{migration#}. For the base Class name “com.foobar.ApplicationProcessMigration”, the migrator will attempt to load and instantiate classes named “com.foobar.ApplicationProcessMigration001”, “com.foobar.ApplicationProcessMigration002”, etc, until it can’t find any valid classes.

The third point is another example of convention over configuration, resulting in less maintenance and tedium for the developer.
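
The naming convention can be illustrated with a short, self-contained sketch of how numbered classes might be discovered by reflection. This is an assumption about one way to do it, not the migrator's actual implementation: NodeMigration is a hypothetical stand-in for the Migration interface, and MigrationLoaderSketch, SampleMigration001, and SampleMigration002 are invented for the example.

```java
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical loader: finds BaseName001, BaseName002, ... until a number is missing or invalid. */
final class MigrationLoaderSketch {

    static List<NodeMigration> load(String baseClassName) {
        List<NodeMigration> migrations = new ArrayList<NodeMigration>();
        for (int number = 1; ; number++) {
            String className = baseClassName + String.format(Locale.ROOT, "%03d", number);
            Class<?> candidate;
            try {
                candidate = Class.forName(className);
            } catch (ClassNotFoundException e) {
                break; // the first gap in the numbering ends the search
            }
            // A valid migration implements the interface and is concrete.
            boolean valid = NodeMigration.class.isAssignableFrom(candidate)
                    && !Modifier.isAbstract(candidate.getModifiers());
            if (!valid) {
                break;
            }
            try {
                // Requires a no-argument constructor, as the convention demands.
                migrations.add((NodeMigration) candidate.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Cannot instantiate " + className, e);
            }
        }
        return migrations;
    }

    public static void main(String[] args) {
        // Derive the base name from a sample class so the demo works in any package.
        String sampleName = SampleMigration001.class.getName();
        String baseName = sampleName.substring(0, sampleName.length() - 3);
        System.out.println(load(baseName).size() + " migrations found");
    }
}

/** Stand-in for the migrator's Migration interface; the name and method are hypothetical. */
interface NodeMigration {
    String[][] nodePairs();
}

class SampleMigration001 implements NodeMigration {
    public String[][] nodePairs() {
        return new String[][] {{"init", "start"}};
    }
}

class SampleMigration002 implements NodeMigration {
    public String[][] nodePairs() {
        return new String[][] {{"start", "application received"}};
    }
}
```

Adding a new migration then only requires adding the next numbered class; nothing has to be registered anywhere.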

Unit and Integration Testing

I'm putting this section last, but it was one of our top concerns when considering an approach to migrations. We debated a number of approaches to testing and most of them were deemed to be too complex and error-prone.

We already unit tested our JBPM process definitions to make sure that transitions point to valid nodes and that all actions declared in the process were available on the classpath. With regard to the migrations, we have a base test that asserts that:


  1. A developer has not introduced a deprecated node into the current process definition.


  2. All current nodes in the composite map exist in the process definition.


  3. All current nodes in the composite map are valid wait state nodes.

Our application is deployed several times a week to a test environment where QA folks test functionality and validate stories. This provides a great opportunity to discover problems with our application early, and the migration code is no exception. We write migrations throughout the entire development cycle (not just when we release), so if a QA person loads an outdated process from last week, the migrator will be run. This gives us a chance to find runtime bugs that the unit tests can't catch.
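
The three assertions made by the base test could be sketched roughly as follows. This is an illustration under stated assumptions rather than our actual test code: MigrationChecksSketch and findProblems are hypothetical names, the process definition is reduced to two sets of node names, the composite map is a plain java.util.Map, and the "validate" node in the demo is invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical checker mirroring the three assertions of the base test. */
final class MigrationChecksSketch {

    /** Returns one message per violated rule; an empty list means the composite map is consistent. */
    static List<String> findProblems(Set<String> allNodes, Set<String> waitStates,
                                     Map<String, String> composite) {
        List<String> problems = new ArrayList<String>();
        for (Map.Entry<String, String> entry : composite.entrySet()) {
            String deprecated = entry.getKey();
            String target = entry.getValue();
            // 1. A deprecated node must not reappear as a wait state in the current definition.
            if (waitStates.contains(deprecated)) {
                problems.add("deprecated node reintroduced: " + deprecated);
            }
            // 2. Every target node must exist in the current definition.
            if (!allNodes.contains(target)) {
                problems.add("unknown target node: " + target);
            } else if (!waitStates.contains(target)) {
                // 3. Every target node must be a wait state.
                problems.add("target is not a wait state: " + target);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // "validate" is an invented non-wait node used only for this demo.
        Set<String> waitStates = new HashSet<String>(Arrays.asList(
            "application received", "managerial audit", "Requires Review", "application completed"));
        Set<String> allNodes = new HashSet<String>(waitStates);
        allNodes.add("validate");

        Map<String, String> composite = new LinkedHashMap<String, String>();
        composite.put("init", "application received");
        composite.put("start", "application received");
        composite.put("invalid", "Requires Review");
        composite.put("end", "application completed");

        System.out.println(findProblems(allNodes, waitStates, composite));
    }
}
```

A unit test would simply assert that the returned list is empty for every process definition that has migrations.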




  1. http://docs.jboss.com/jbpm/v3.2/userguide/html/jpdl.html#processversioning
  2. I'm using the Ruby literal syntax for a Hash because there isn't one in Java.

19 comments:

  1. This is really cool !

    We have been planning to build such a thing for a long time. You've done it even more extensively than we aimed for. Great work. Kudos.

    Any chance that you want to contribute this ?

    If yes, post a confirmation and a link to this blog on our developer forum : http://www.jboss.org/index.html?module=bb&op=viewforum&f=219

    Then we'll guide you through the contribution process.

    Keep up the good work !

    regards, tom.

  2. I'd like that. I will need to confirm some Intellectual Property issues first. I will be in touch shortly.

  3. Would be nice to have a ui for this as well. I've seen one in a another system where you get two lists and can do the mapping on screen.

  4. This looks really great. I'd love to see this included in jBPM. Congratulations on this great piece of work.

  5. This would be incredibly helpful to us (we're a nonprofit that chose to use jBPM, but we're now stuck with a lot of obsolete process instances). Is there any likelihood you might release this migrator?

  6. @Synthetic Zero

    When I originally wrote this post, my employer wasn't planning to open source the tool. We are currently in the planning stages of a new project with one of our clients, so my efforts to open-source the jbpm-migrator are competing with some other priorities.

    Things are taking a little longer than I would like, but it won't take forever. I will follow up on the status of my open-source efforts with a new post very soon.

    Stay tuned and please keep the comments coming!

  7. this is going to be so useful my current project. looking forward to seeing the code asap.

  8. Thanks for contributing your code, we've been looking for something like this to aid us in our current project. I did download the source from JBoss's svn repository, and I do have some questions. I did not see how the current code handles existing task instances. This is a requirement for the current project I'm working on: we need to preserve existing processes' task instances and assignments. Is there a plan to include this feature, moving forward?

  9. @Anonymous

    There is no plan as of yet. I have asked Tom and some of the other developers to take a look at the code and get back to me. I will include this request in whatever features are deemed necessary.

  10. Can anyone point me in the right direction in finding this piece of code?

    Where is it released to? Can anyone give me a link?

    I have pretty tight deadline, and would very much like to use this tool, in stead of hacking away on the database tables :)

    br Kim

  11. Hi,

    I apologize, but I haven't had a chance to post the software anywhere for download. When we contributed it to the jBPM project, they took the source and adapted it for the 4.2 release.

    That said, I will follow up and find out if there is a location on the JBoss servers to post this (if not, I will look at sourceforge or something similar).

    In the meantime, send me an email (caleb dot powell at gmail dot com) and I will email you the binary and src artifacts so that you can start using them right away.

    Cheers!

    Caleb

  12. I've put the initial binary and source release up on google code:

    http://code.google.com/p/jbpm-instance-migrator/downloads/list

    Cheers!

  13. Great!
    Thank you very much for the quick response. I've downloaded the sources and will look into it right away.
    We're using the 3.2.2 version, is it much work in making it work for this version?

    Thanks again!

  14. It was written to work with 3.2.2 so you should be in good shape. There is no documentation yet, so you can refer to this blog. Send me an email if you have any questions.

  15. Good Job,

    I have a question: Does the JBPM Migrator migrate the related tasks of the migrated process instance?
    I tried it but it seems not happen... or I use it incorrectly.

    We have exactly your problems but I thought the migrator migrates anything related to the previous process instance containing the actors. Am I right?

    Thank You.
    Alireza

  16. Hi Alireza,

    no, we don't currently migrate Task nodes. This appears to be an oversight on our part (we don't currently use Task nodes so we left them out).

    I would be happy to make the modification though. Please join the project and submit your Issue (http://code.google.com/p/jbpm-instance-migrator/issues/list)

  17. Hi Caleb,
    And thank for your rapid reply,

    I joined the project and submitted an issue in it:
    "Migrate Task Nodes"

    I hope the issue will be implemented ASAP, otherwise I may participate in implementing this feature.

    Best Wishes
    Alireza

  18. Hi Celeb,
    i am in serious close deadline.
    we want to create the same stuff(to migrate the old process instance to new one).

    Can you please send the the code.
    my email id is - pravintarte@gmail.com

  19. Hi Celeb
    I am from Toronto as well and currently fighting with the same issue. My email id is qheider@gmail.com. Is it possible to reach you by any means (phone or email) other than blog please let me know.
    kind regards
    Quazi Heider
