2014/02/10

Reuse your business assets with jBPM 6

As described in the article about the deployment model in jBPM 6, business assets are packaged in so-called knowledge archives - kjars. A kjar is nothing more than a regular jar file, built and managed by Apache Maven, that contains a knowledge descriptor - kmodule.xml. But there is more to it...

Since kjars are managed by Maven, one kjar can declare another kjar as its dependency, which means that assets included in the dependency are available for execution as well. All of this is available when working with jbpm console (aka kie workbench). To show how it works, let's look at an example.

Use case definition

We need to prepare a simple registration process that asks the user who starts the process for basic personal information such as name and age. Then business rules evaluate whether that person is an adult or a teenager. Once that is completed, the result is presented to a reviewer who can see the details of the evaluation. Last but not least, the process proceeds with the actual registration in the system. Part of this logic can very well be considered reusable: the part responsible for gathering information about a person.

So let's design it this way:

  • first project - reusable project - deals only with gathering personal information and presenting it to verifying personnel after the business rules have been evaluated. Besides business assets, the data model is included in reusable-project as well, so it can be used by projects that declare it as a dependency, same as with any other Maven based project.
  • second project - top project - provides additional process logic on top of the common collect info procedure and does the actual registration.
So, this is the structure of the projects we are going to use to support the case described.

What must actually be done to make this work? First of all, the reusable project needs to be created, since it will be a dependency of the top project and therefore must exist. In the reusable project we define a knowledge base and a knowledge session explicitly to disable auto deploy, as we don't want it deployed to the runtime as a standalone project but only included in the top project. With that said, we create:
  • one knowledge base (kbase) - ensure it's not marked as default
  • one stateful knowledge session (ksession) - ensure it's not marked as default
  • include all packages - use * wildcard for it
Note: we do this to illustrate what configuration options we have here and to ensure that auto deployment to runtime environment will not be possible - no default knowledge base and knowledge session.
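In kmodule.xml terms, the configuration listed above might look roughly like the fragment below. This is a sketch only: the names kbase and ksession come from the text, while the descriptor that Project Editor actually generates may carry additional attributes.

```xml
<kmodule xmlns="http://jboss.org/kie/6.0.0/kmodule">
  <!-- neither element is default, so the kjar cannot be auto deployed on its own -->
  <kbase name="kbase" default="false" packages="*">
    <ksession name="ksession" type="stateful" default="false"/>
  </kbase>
</kmodule>
```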

Let's create the collect info process. It is a simple process with two user tasks and a business rule task. So what will it do:
  • Enter person details will collect personal information from a user - name and age
  • Evaluate will execute business rules that are responsible for marking given person as adult if (s)he is older than 21
  • Verify simply presents the results of the process
Both the rule task and the user tasks operate on the data model, more specifically on the org.jbpm.test.Person class. It was created using Data Modeler and placed inside the kjar.
Next, process and task forms are generated and altered to ask for the right information only. The Person class includes three properties:
  • name - string
  • age - integer
  • adult - boolean
Since business rules evaluate whether the user is an adult or a teenager, we don't want to ask for that via forms. So the adult field is removed from the "Enter person details" task form.
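To make the data model and the rule concrete, here is a minimal plain Java sketch. It is only an illustration under assumptions: the real Person class is generated by Data Modeler and the real evaluation is done by business rules, so the evaluate method and the constructor below are hypothetical stand-ins.

```java
import java.io.Serializable;

class PersonSketch {

    // Plain Java stand-in for the org.jbpm.test.Person class from the text.
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;

        private String name;    // entered in "Enter person details"
        private Integer age;    // entered in "Enter person details"
        private Boolean adult;  // set by the rules, never asked for in a form

        Person(String name, Integer age) {
            this.name = name;
            this.age = age;
        }

        String getName() { return name; }
        Integer getAge() { return age; }
        Boolean getAdult() { return adult; }
        void setAdult(Boolean adult) { this.adult = adult; }
    }

    // Hypothetical equivalent of the "Evaluate" step: adult means older than 21.
    static void evaluate(Person person) {
        person.setAdult(person.getAge() != null && person.getAge() > 21);
    }

    public static void main(String[] args) {
        Person person = new Person("john", 25);
        evaluate(person);
        System.out.println(person.getName() + " adult: " + person.getAdult());
    }
}
```

Note that a person who is exactly 21 is not marked as adult, because the rule says "older than 21".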
With all that we are ready to build the project so it can be used as a dependency. Just hit the build and deploy button in Project Editor and observe the Problems Panel. If everything went OK, it will display a single error saying that deployment failed because defaultKBase cannot be found. That is expected: we defined a knowledge base and a knowledge session that are not marked as default, and thus auto deploy fails. But the kjar is available in the Maven repository, so it can be used as a dependency.

Next we create the top project and add a single dependency to it: reusable-project. This is done in Project Editor, in the Dependencies list section. You can add it from the repository as it's already built. Next we need to define a knowledge base and a knowledge session:
  • one knowledge base (kbase-top) - ensure it's marked as default
  • one stateful knowledge session (ksession-top) - ensure it's marked as default
  • include all packages - use * wildcard for it
  • include kbase defined in reusable project - kbase
Note: make sure that names do not collide between kjars, as that will cause compilation of the knowledge base to fail.
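As a rough sketch of what this setup corresponds to on the Maven and kmodule.xml level, consider the two fragments below. The GAV values in the dependency are placeholders (the article does not state them); the kbase and ksession names are the ones listed above.

```xml
<!-- pom.xml of the top project; groupId, artifactId and version are placeholders -->
<dependency>
  <groupId>org.jbpm.test</groupId>
  <artifactId>reusable-project</artifactId>
  <version>1.0</version>
</dependency>
```

```xml
<!-- kmodule.xml of the top project -->
<kmodule xmlns="http://jboss.org/kie/6.0.0/kmodule">
  <!-- includes="kbase" pulls in the knowledge base defined in the reusable project -->
  <kbase name="kbase-top" default="true" packages="*" includes="kbase">
    <ksession name="ksession-top" type="stateful" default="true"/>
  </kbase>
</kmodule>
```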

Now we are ready to create our top level process.

Again, it is a simple process that will:

  • log incoming request for registration using script task - Log registration
  • invoke common part to collect info - Collect Info - by invoking the reusable project process, rules, forms etc
  • and lastly will show the outcome of the collection process for approval
The most important part here is that the Collect Info activity uses a process (and other assets) from another kjar, thanks to Maven dependencies and kbase inclusion.

To examine this example in detail you can clone the repository in your jbpm console (kie workbench) and build both projects - first the reusable project and then the top project.

This illustrates only the tip of the iceberg of what Maven dependencies and knowledge inclusion provide in jBPM 6. I would like to encourage you to explore these possibilities and look at options to reuse your knowledge in a structured and controlled way - remember that this is all standardized by Maven, so things like versioning are supported as well.

This feature will be available in 6.1.0, so feel free to jump into the wild right away by trying it in the nightly builds. Comments and feedback are welcome.


2014/02/06

jBPM 6 - store your process variables anywhere

Most jBPM users are aware of how jBPM stores process variables, but let's recap it here for completeness.

NOTE: this article covers jBPM that uses persistence as without persistence process variables are kept in memory only.

jBPM puts a single requirement on the objects that are used as process variables:
  • the object must be serializable (simply put, it must implement the java.io.Serializable interface)
With that, the jBPM engine is able to store all process variables as part of the process instance, using a marshaling mechanism backed by Google Protocol Buffers. That means the actual instances are marshaled into bytes and stored in the database. This is not always desired, especially for objects that are not actually owned by the process instance. For example:
  • JPA entities of another system
  • documents stored in document/content management system 
  • etc
Luckily, jBPM has a solution for that as well, called pluggable Variable Persistence Strategy. Out of the box jBPM provides two strategies:
  • serialization based, mentioned above, which works on all object types as long as they are serializable (org.drools.core.marshalling.impl.SerializablePlaceholderResolverStrategy)
  • JPA based, which works on objects that are entities (org.drools.persistence.jpa.marshaller.JPAPlaceholderResolverStrategy)
Let's spend some time on the JPA based strategy, as it can be rather useful in many cases where jBPM is used in embedded mode. Consider a scenario where our business process uses entities as process variables. The same entities might be altered from outside of the process, and we would like to keep them up to date within the process as well. To do so, we need to use the JPA based strategy for variable persistence, which is capable of storing entities in the database and retrieving them later.
To configure a variable persistence strategy you need to place it into the environment that is then used when creating knowledge sessions. Note that the order of the strategies is important: they are consulted in the order they are given, and the first one that accepts a variable is used. Best practice is to always put the serialization based strategy last.
An example of how you can use it with RuntimeManager:


// create entity manager factory
EntityManagerFactory emf = Persistence.createEntityManagerFactory("org.jbpm.sample");

RuntimeEnvironment environment =
     RuntimeEnvironmentBuilder.Factory.get().newDefaultBuilder()
     .entityManagerFactory(emf)
     .addEnvironmentEntry(EnvironmentName.OBJECT_MARSHALLING_STRATEGIES,
          new ObjectMarshallingStrategy[]{
               // set the entity manager factory for the JPA strategy so it
               // knows how to store and read entities
               new JPAPlaceholderResolverStrategy(emf),
               // set the serialization based strategy as the last one to
               // deal with non-entity classes
               new SerializablePlaceholderResolverStrategy(
                          ClassObjectMarshallingStrategyAcceptor.DEFAULT)
          })
     .addAsset(ResourceFactory.newClassPathResource("cmis-store.bpmn"),
               ResourceType.BPMN2)
     .get();

// create the runtime manager and start using entities as part of your process
RuntimeManager manager =
     RuntimeManagerFactory.Factory.get().newSingletonRuntimeManager(environment);

Now that we know how to configure it, let's take some time to understand how it actually works. Every process variable, at the time it is about to be persisted, is evaluated against the strategies, and it's up to each strategy to accept or reject the given variable. If a strategy accepts the variable, only that strategy is used to persist it; if it rejects the variable, the next strategies are consulted.

Note: make sure that you add your entity classes to the persistence.xml that will be used by the JPA strategy.

The JPA strategy accepts only classes that declare a field with the @Id annotation (javax.persistence.Id), which ensures there is a unique id to use when retrieving the variable.
The serialization based strategy simply accepts all variables by default, and thus it should be the last strategy in line. This default behavior can be altered by providing another acceptor implementation.
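The selection logic described above can be sketched with the JDK alone. Everything in this block is a hypothetical stand-in: the Strategy interface is reduced to accept, the nested Id annotation replaces javax.persistence.Id, and EntityStrategy only approximates the check that the JPA based strategy performs.

```java
import java.io.Serializable;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;

class StrategyOrderSketch {

    // Stand-in for javax.persistence.Id so the sketch needs only the JDK.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Id {}

    // Reduced to the single method that drives strategy selection.
    interface Strategy {
        boolean accept(Object object);
    }

    // Accepts objects whose class (or a superclass) declares an @Id field.
    static class EntityStrategy implements Strategy {
        public boolean accept(Object object) {
            if (object == null) return false;
            for (Class<?> c = object.getClass(); c != null; c = c.getSuperclass()) {
                for (Field field : c.getDeclaredFields()) {
                    if (field.isAnnotationPresent(Id.class)) return true;
                }
            }
            return false;
        }
        public String toString() { return "entity"; }
    }

    // Accepts everything, like the serialization based strategy does by default.
    static class CatchAllStrategy implements Strategy {
        public boolean accept(Object object) { return true; }
        public String toString() { return "serialization"; }
    }

    // The first strategy that accepts the variable wins; later ones are never asked.
    static Strategy select(List<Strategy> strategies, Object variable) {
        for (Strategy strategy : strategies) {
            if (strategy.accept(variable)) return strategy;
        }
        throw new IllegalStateException("No strategy accepts " + variable);
    }

    static class Customer implements Serializable {
        @Id Long id;
        String name;
    }

    public static void main(String[] args) {
        List<Strategy> ordered = Arrays.asList(new EntityStrategy(), new CatchAllStrategy());
        List<Strategy> reversed = Arrays.asList(new CatchAllStrategy(), new EntityStrategy());
        System.out.println(select(ordered, new Customer()));
        System.out.println(select(ordered, "plain text"));
        System.out.println(select(reversed, new Customer()));
    }
}
```

With the catch-all strategy placed first, the entity strategy is never consulted, which is why the serialization based strategy belongs at the end of the array.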

Once a strategy accepts the variable, it performs the marshaling operation to store the variable, and the unmarshaling operation to retrieve it from the back end store (of the type it supports).

In the case of JPA, marshaling checks whether the entity is already stored - that is, whether it has its id set - and:

  • if not, it will persist the entity using the entity manager factory that was assigned to it
  • if yes, it will merge it with the persistence context to make sure up to date information is stored
When unmarshaling, it uses the unique id of the entity to load it from the database and provides it as the process variable. It's that simple :)
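The persist-or-merge behavior can be illustrated with an in-memory map standing in for the database. This is a sketch under assumptions, not the code of JPAPlaceholderResolverStrategy: the Customer class, the map and the method names are all hypothetical, and no real JPA calls are made.

```java
import java.util.HashMap;
import java.util.Map;

class JpaStrategySketch {

    static class Customer {
        Long id;      // plays the role of the @Id field
        String name;

        Customer(Long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    // In-memory stand-in for the database table behind the entity.
    private final Map<Long, Customer> table = new HashMap<>();
    private long nextId = 1;

    // Marshal: store a new entity or update an existing one, then return only
    // the id, which is what needs to be kept with the process instance.
    Long marshal(Customer entity) {
        if (entity.id == null) {
            entity.id = nextId++; // "persist": the store assigns the id
        }
        // for a new entity this is the insert, for an existing one the "merge"
        table.put(entity.id, new Customer(entity.id, entity.name));
        return entity.id;
    }

    // Unmarshal: load the current state of the entity by its id.
    Customer unmarshal(Long id) {
        Customer row = table.get(id);
        if (row == null) {
            throw new IllegalStateException("No entity with id " + id);
        }
        return new Customer(row.id, row.name);
    }

    // Simulates another application changing the entity outside of the process.
    void updateOutsideProcess(Long id, String newName) {
        table.get(id).name = newName;
    }

    public static void main(String[] args) {
        JpaStrategySketch store = new JpaStrategySketch();
        Long id = store.marshal(new Customer(null, "john"));
        store.updateOutsideProcess(id, "john smith");
        System.out.println(store.unmarshal(id).name);
    }
}
```

Because only the id travels with the process instance, a change made outside of the process is visible the next time the variable is loaded.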

With that, we have quickly covered the default (serialization based) strategy and the JPA based strategy. But the title of this article says we can store variables anywhere, so how is that possible?
It's possible because of the nature of variable persistence strategies - they are pluggable. We can create our own and simply add it to the environment, and process variables that meet the acceptance criteria of that strategy will be persisted by it. So as not to leave you empty-handed, let's look at another implementation I created for the purpose of this article (although, having worked on it, I believe it will become more than just an example for this article).

Implementing a variable persistence strategy is actually very simple: it's a matter of implementing a single interface, org.kie.api.marshalling.ObjectMarshallingStrategy

public interface ObjectMarshallingStrategy {
    
    public boolean accept(Object object);

    public void write(ObjectOutputStream os,
                      Object object) throws IOException;
    
    public Object read(ObjectInputStream os) throws IOException, ClassNotFoundException;
    

    public byte[] marshal( Context context,
                           ObjectOutputStream os,
                           Object object ) throws IOException;
    
    public Object unmarshal( Context context,
                             ObjectInputStream is,
                             byte[] object,
                             ClassLoader classloader ) throws IOException, ClassNotFoundException;

    public Context createContext();
}

The most important methods for us are:

  • accept - decides if this strategy will be responsible for persistence of the given object
  • marshal - performs the operation to store the process variable
  • unmarshal - performs the operation to retrieve the process variable
The remaining ones exist for backward compatibility with the old marshaling framework used prior to protobuf. Implementing them is not mandatory, but it's worth putting the logic there too, as most likely it will be the same as for marshal (write) and unmarshal (read).

The example implementation mentioned above stores and retrieves process variables as documents in Content/Document management systems that support access to the repository using CMIS. I used Apache Chemistry as the integration component, as it can easily talk to CMIS enabled systems such as Alfresco.


So, first a bit about requirements:

  • process variables must be of certain type to be stored in the content repository
  • documents (process variables stored in cms) can be:
    • created
    • updated (with versioning)
    • read
  • process variables must be kept up to date
All of this sounds simple, and of course that's the point: to keep it simple at this stage. A CMS can be used for much more, but we wanted to get started and then enhance it if needed. So the strategy implementation org.jbpm.integration.cmis.impl.OpenCMISPlaceholderResolverStrategy supports the following:
  • when marshaling
    • create a new document if the variable does not have an object id assigned yet
    • update the document if the variable already has an object id assigned
      • by overriding existing content
      • by creating new major version of the document 
      • by creating new minor version of the document
  • when unmarshaling
    • load the content of the document based on given object id
So you can actually use this strategy for:
  • creating new documents from the process based on custom content
  • updating existing documents with custom content
  • loading existing documents into a process variable based on object id only
These are very high level details, so let's look at the actual code that does the "magic". Let's start with the marshal logic - note that it is a bit simplified here for readability; the complete code can be found on github.


public byte[] marshal(Context context, ObjectOutputStream os, Object object) throws IOException {
    Document document = (Document) object;
    // connect to repository
    Session session = getRepositorySession(user, password, url, repository);
    try {
        if (document.getDocumentContent() != null) {
            if (document.getObjectId() == null) {
                // no object id yet, let's create the document
                Folder parent = ... // find folder by path
                if (parent == null) {
                    parent = ... // create folder
                }
                // now we are ready to create the document in CMS
            } else {
                // object id exists so time to update
            }
        }
        // now we need to store some info as part of the process instance
        // so we can look it up later on; in this case it is the object id and the class
        // that we use as process variable, so we can recreate the instance on read
        ByteArrayOutputStream buff = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(buff);
        oos.writeUTF(document.getObjectId());
        oos.writeUTF(object.getClass().getCanonicalName());
        oos.close();
        return buff.toByteArray();
    } finally {
        // let's clear the session in the end
        session.clear();
    }
}

As you can see, it first deals with the actual storage (in this case a CMIS based repository) and then saves a few small details needed to recreate the actual object instance on read: the objectId and the fully qualified class name of the process variable. And that's it. A process variable of type Document will be stored inside the content repository.

Next, let's look at the unmarshal method:


public Object unmarshal(Context context, ObjectInputStream ois, byte[] object, ClassLoader classloader) throws IOException, ClassNotFoundException {
    DroolsObjectInputStream is = new DroolsObjectInputStream(new ByteArrayInputStream(object), classloader);
    // first we read out the object id and class name we stored during marshaling
    String objectId = is.readUTF();
    String canonicalName = is.readUTF();
    // connect to repository
    Session session = getRepositorySession(user, password, url, repository);
    try {
        // get the document from repository and create new instance of the variable class
        CmisObject doc = .....
        Document document = (Document) Class.forName(canonicalName).newInstance();
        // populate process variable with meta data and content
        document.setObjectId(objectId);
        document.setDocumentName(doc.getName());
        document.setFolderName(getFolderName(doc.getParents()));
        document.setFolderPath(getPathAsString(doc.getPaths()));
        if (doc.getContentStream() != null) {
            ContentStream stream = doc.getContentStream();
            document.setDocumentContent(IOUtils.toByteArray(stream.getStream()));
            document.setUpdated(false);
            document.setDocumentType(stream.getMimeType());
        }
        return document;
    } catch (Exception e) {
        throw new RuntimeException("Cannot read document from CMIS", e);
    } finally {
        // do some clean up...
        is.close();
        session.clear();
    }
}

It is nothing more than the logic to read the id and class name so the instance can be recreated, and to load the document from the CMS repository, and we're done :)
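The part of marshal and unmarshal that does not depend on the repository, writing the object id and class name and reading them back, can be tried with the JDK alone. The sketch below is an assumption-laden illustration: SampleDocument is a hypothetical variable type and a plain ObjectInputStream replaces DroolsObjectInputStream. Unlike the listings above, it records getName() rather than getCanonicalName(), because Class.forName expects the binary name and the two differ for nested classes, and it passes the class loader explicitly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class ReferenceRoundTrip {

    // Hypothetical process variable type; only the id matters for this sketch.
    static class SampleDocument {
        String objectId;
    }

    // What ends up in the process instance: the object id plus the class name.
    static byte[] writeReference(String objectId, Class<?> type) {
        ByteArrayOutputStream buff = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(buff)) {
            oos.writeUTF(objectId);
            oos.writeUTF(type.getName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buff.toByteArray();
    }

    // Reads the two values back and recreates an empty instance of the class.
    static SampleDocument readReference(byte[] data, ClassLoader classLoader) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            String objectId = in.readUTF();
            String className = in.readUTF();
            SampleDocument document = (SampleDocument) Class.forName(className, true, classLoader)
                    .getDeclaredConstructor().newInstance();
            // a real strategy would now load name and content from the repository
            document.objectId = objectId;
            return document;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot recreate process variable", e);
        }
    }

    public static void main(String[] args) {
        byte[] stored = writeReference("doc-42", SampleDocument.class);
        SampleDocument restored = readReference(stored, ReferenceRoundTrip.class.getClassLoader());
        System.out.println(restored.objectId + " (" + stored.length + " bytes stored)");
    }
}
```

The stored payload stays small no matter how large the document is, since the content itself lives in the repository.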

Last but not least, the accept method.


public boolean accept(Object object) {
    // only Document instances are handled by this strategy
    return object instanceof Document;
}

And that is all that is needed to implement your own variable persistence strategy. The only thing left is to register the strategy in the environment so it will be evaluated when storing and retrieving variables. That is done the same way as described for the JPA based one.

Complete source code, with some tests showing a complete usage case from a process, can be found here. Enjoy and feel free to provide feedback. Maybe it's worth starting a repository of such strategies, so we can have a rather rich set of them to choose from...

2014/02/01

How to deploy processes in jBPM 6?

After the release of jBPM 6.0, a number of questions came from the community about how processes can be deployed into the new and shiny jbpm console.

So let's start with a short recap of what the deployment model looks like in jBPM 6. In version 5.x, processes were stored in so-called packages produced by Guvnor and then downloaded by jbpm console for execution using KnowledgeAgent. Alternatively, one could drop process files (bpmn2 files) into a predefined directory that was scanned on jbpm console start. That was it.

That forced users to always use Guvnor when dynamic deployment was needed. There is nothing wrong with that (it was actually the recommended approach), but not everyone was happy with that setup.

Version 6, on the other hand, moves away from proprietary packages in favor of the well known and mature Apache Maven based packaging - known as knowledge archives, or kjars. What does that mean? First of all, processes, rules etc (aka business assets) are now part of a simple jar file built and managed by Maven. Along with the business assets, java classes and other file types are stored in the jar file too. Moreover, like any other Maven artifact, a kjar can define dependencies on other artifacts, including other kjars.
What makes a kjar special compared with regular jars? It is a single descriptor file kept inside the META-INF directory of the kjar - kmodule.xml. That descriptor allows you to define:

  • knowledge bases and their properties
  • knowledge sessions and their properties
  • work item handlers
  • event listeners
By default, this descriptor is empty (just the kmodule root element) and is treated as a marker file. Whenever a runtime component (such as jbpm console) is about to process a kjar, it looks up kmodule.xml to build its runtime representation. See the documentation for more details about kmodule.xml and kjars.
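For reference, an empty descriptor of that kind could look like the fragment below; it is given here as a minimal sketch of META-INF/kmodule.xml rather than the exact file the tooling generates.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<kmodule xmlns="http://jboss.org/kie/6.0.0/kmodule"/>
```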

Alright, now we know a bit more about what is actually placed in the runtime environment - the kjar. So how can we deploy a kjar into a running jbpm console? There are several ways:

Design and build your kjar inside jbpm console

The easiest way is to use jbpm console to build the kjar completely. For that purpose there is an entire perspective available - the Authoring perspective - that consists of quite a big set of editors tailored for various asset types.

First, you need a repository where your projects (which become kjars once they are built) will be stored. When running the demo setup of jbpm console (installed by jbpm installer), you will have two repositories already available - jbpm-playground and uf-playground. You can use either of these or create a new repository.
Once you have a repository available, create a new item - a project - for which you need to specify a GAV (GroupId, ArtifactId, Version) to name your project.
Next you create business assets in it, like business processes, rules, data model, forms, etc. Now we are at the stage where we should build and deploy our project into the runtime. Nothing simpler than that - just press the "Build & Deploy" button and you're ready to rock!

Is it really that simple?! In most cases, yes, it really is. But you need to be aware of several rules (convention over configuration) that drive the build and deploy. The first rule is that everything needs to be properly designed - processes, rules, etc. This is checked in the build phase, which catches any compilation or validation errors and provides feedback to the user via the Problems Panel.
Assuming all assets are built successfully, the deploy phase comes into the picture. The deploy phase is actually a two step process:
  • Maven phase
    • installs the built project (by now it's a kjar) into the local Maven repository (usually ~/.m2/repository, but that can be configured the standard Maven way - settings.xml)
    • deploys the built project into the jbpm console embedded Maven repository - a remote repository accessible over http that can be declared in pom.xml or settings.xml like any other Maven repository
  • Runtime phase
    • once the Maven phase has completed successfully, jbpm console attempts to deploy the kjar into the runtime environment for execution. One of the following must hold for that to happen:
      • kmodule.xml is empty - which it is by default unless you edited it via Project Editor
      • kmodule.xml has at least one knowledge base and one stateful knowledge session defined and marked as default
When both phases have completed successfully, your kjar is deployed to the runtime environment and ready for execution. Simply go to Process Management --> Process Definitions to examine what's there and start executing your freshly deployed processes.

So that's the first and easiest way to get started with deployments in jBPM 6.

Build project in IDE and push to console for build and deploy

Another approach is for those who prefer to do the heavy work in an IDE like Eclipse (since the desktop modeling capabilities - bpmn2 modeler - are only available in Eclipse). The steps are pretty much the same, although there is no need to create a repository here; instead you clone an existing one from jbpm console. So you start with cloning an existing repository:

git clone ssh://{jbpmconsole-host}:{port}/{repository-name}

Then create a Maven project. You can do that with the jBPM Project wizard in Eclipse, which creates a simple Maven project with a sample business process and an executable class in it to get you started much faster.

Note: make sure you place the project in the cloned repository structure so it can be later on pushed back.

The generated project declares a dependency on the jbpm-test module to be able to execute the sample process.
Once you have a mavenized project, you're ready to start working on your business assets, data model and more.
When done, you're ready to push your project into jbpm console so it can be built and deployed to the runtime environment. To do so, you can use any Git tool that allows you to pull and push changes between your working copy and the master repository. To add all files in your working copy to the commit index:

git add -A

Then check that you haven't added too much, like the target folder. If you have, create or edit the .gitignore file to exclude unneeded files. And commit:

git commit -m "my first jbpm project"

Once committed, push it to origin:

git push origin master

Now go to the jbpm console Authoring perspective, and you should see your project in the repository, ready to be built and deployed. Just follow the same steps as in the first approach to build and deploy it.
That was the second approach to deploying business assets into jbpm console version 6. It sits somewhere in between developers and business users. It can also be seen as a way to collaborate: business users initially create high level processes, rules etc, and then developers step in and add implementation details and some "glue code" to make it all fully executable.

Build and deploy to Maven from IDE

This one focuses completely on developers and allows them to do the work without being too much aware of jbpm console. Here developers build regular Maven projects that include business assets, java classes and forms, and then add kmodule.xml to turn the jar into a kjar. Everything is done in the IDE of the developer's choice. The same goes for the version control system: in this case it does not have to be Git any more. That is because jbpm console won't be used as a source management tool for these projects but only for its execution capabilities.

Once you're done with development, you simply build the project with Maven (mvn clean install). That makes it directly available to any other component on your local machine that would like to use it. So if you're running jbpm console on your machine, you can skip directly to the deployment part (three paragraphs below ;))

When jbpm console is running on a remote host, you have two ways to make it aware of artifacts built externally:
  • deploy (using Maven) your projects into the jbpm console Maven repository - as this is like any other repository, you can use the Maven deploy goal after you have defined that repository in either your pom.xml or settings.xml
  • make the Maven installation used by jbpm console aware of any external Maven repositories it should consider while deploying kjars
The first one, deploying to the Maven repository, is nothing special: it's as simple as defining the repository in settings.xml so it can be contacted successfully and the artifact can be stored when running mvn clean install deploy.
The second approach is again the standard Maven way. On the machine (and for the user) that jbpm console is running on, you need to add your main Maven repository to settings.xml, so whenever jbpm console attempts to deploy the kjar it will look it up in your Maven repository.
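For the second approach, a settings.xml along the following lines is a reasonable starting point. It is a sketch only: the profile id, repository id and URL are placeholders that need to be replaced with the details of your own repository.

```xml
<settings>
  <profiles>
    <profile>
      <id>kjar-repositories</id>
      <repositories>
        <repository>
          <!-- placeholder id and URL: point these at your own repository -->
          <id>company-repo</id>
          <url>http://repo.example.com/maven2</url>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>true</enabled></snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>
  <activeProfiles>
    <activeProfile>kjar-repositories</activeProfile>
  </activeProfiles>
</settings>
```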

With all these steps done, jbpm console is capable of finding kjars that are outside of its local Maven repository, so it can download them when needed. You can now go to the jbpm console Deploy --> Deployments perspective, where you can add new deployment units manually.
It's as simple as providing the GAV of the project you want to deploy and, optionally, knowledge base and knowledge session names if you defined more than what is there by default.
In addition, you can select the runtime strategy that fits your requirements for that given kjar - choose one of Singleton, Per Request or Per Process Instance.

That concludes the deployment options available in jBPM version 6. It promotes well known standards defined by Maven and allows various ways of integrating with the jbpm console functionality. You, as a user, are in control of how you work with the tooling: you can leverage its full power to do everything over the web, integrate with the Git server, or do everything externally and use the console only for execution.

I hope this sheds some light on how you can use jBPM 6 out of the box and that it empowers your day to day work. As usual, I'm ending with the same sentence: all comments and ideas are more than welcome.