2015/09/10

Unified KIE Execution Server - Part 2

This blog post is a continuation of the first in the series about KIE Execution Server. In this article KIE Server Client will be introduced and used for basic operations on KIE Execution Server.

In the first part, we went through the details of installation on Wildfly and verified with a simple REST client that it's actually working. This time we do pretty much the same verification, although we expand it with further operations and do it via KIE Server Client instead.

So let's get started. We are going to use the same container project (hr - org.jbpm:HR:1.0) that includes the hiring process. That process has a set of user tasks that we will be creating and working with. To be able to work on tasks, our user (kieserver) needs to be a member of the following roles used by the hiring process:

  • HR
  • IT
  • Accounting
So to add these roles to our user we again use the add-user script that comes with Wildfly to simply update the already existing user


NOTE: don't forget that kieserver user must have kie-server role assigned as well.

With that we are ready to start the server again

KIE Server Client

KIE Server Client is a lightweight library that custom applications written in Java can use to interact with KIE Execution Server. The library greatly simplifies usage of the KIE Execution Server and makes it easier to migrate between versions, because it hides all internals that might change between versions.

To illustrate that it is actually lightweight, here is the list of dependencies needed at runtime to execute KIE Server Client


[INFO]
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ kie-server-client ---
[INFO] org.kie.server:kie-server-client:bundle:6.3.0-SNAPSHOT
[INFO] +- org.kie:kie-api:jar:6.3.0-SNAPSHOT:compile
[INFO] +- org.kie:kie-internal:jar:6.3.0-SNAPSHOT:compile
[INFO] +- org.kie.server:kie-server-api:jar:6.3.0-SNAPSHOT:compile
[INFO] |  +- org.drools:drools-core:jar:6.3.0-SNAPSHOT:compile
[INFO] |  |  +- org.mvel:mvel2:jar:2.2.6.Final:compile
[INFO] |  |  \- commons-codec:commons-codec:jar:1.4:compile
[INFO] |  +- org.codehaus.jackson:jackson-core-asl:jar:1.9.9:compile
[INFO] |  +- com.thoughtworks.xstream:xstream:jar:1.4.7:compile
[INFO] |  |  +- xmlpull:xmlpull:jar:1.1.3.1:compile
[INFO] |  |  \- xpp3:xpp3_min:jar:1.1.4c:compile
[INFO] |  \- org.apache.commons:commons-lang3:jar:3.1:compile
[INFO] +- org.jboss.resteasy:jaxrs-api:jar:2.3.10.Final:compile
[INFO] |  \- org.jboss.logging:jboss-logging:jar:3.1.4.GA:compile
[INFO] +- org.kie.remote:kie-remote-common:jar:6.3.0-SNAPSHOT:compile
[INFO] +- org.codehaus.jackson:jackson-xc:jar:1.9.9:compile
[INFO] +- org.codehaus.jackson:jackson-mapper-asl:jar:1.9.9:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.2:compile
[INFO] +- org.jboss.spec.javax.jms:jboss-jms-api_1.1_spec:jar:1.0.1.Final:compile
[INFO] +- com.sun.xml.bind:jaxb-core:jar:2.2.11:compile
[INFO] \- com.sun.xml.bind:jaxb-impl:jar:2.2.11:compile


So let's set up a simple maven project that will use KIE Server Client to interact with the execution server

<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.jbpm.test</groupId>
  <artifactId>kie-server-test</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  
  <dependencies>
    <dependency>
        <groupId>org.kie</groupId>
        <artifactId>kie-internal</artifactId>
        <version>6.3.0-SNAPSHOT</version>
    </dependency>
    <dependency>
        <groupId>org.kie.server</groupId>
        <artifactId>kie-server-client</artifactId>
        <version>6.3.0-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>ch.qos.logback</groupId>
      <artifactId>logback-classic</artifactId>
      <version>1.1.2</version>
    </dependency>
  </dependencies>
</project>

Those are all the dependencies needed to have KIE Server Client embedded in a custom application. Equipped with this we can start running KIE Server Client against a given server instance

The following code snippet constructs a KIE Server Client instance using REST as transport

// imports needed by this snippet
import org.kie.server.api.marshalling.MarshallingFormat;
import org.kie.server.client.KieServicesClient;
import org.kie.server.client.KieServicesConfiguration;
import org.kie.server.client.KieServicesFactory;

String serverUrl = "http://localhost:8230/kie-server/services/rest/server";
String user = "kieserver";
String password = "kieserver1!";

String containerId = "hr";
String processId = "hiring";

KieServicesConfiguration configuration = KieServicesFactory.newRestConfiguration(serverUrl, user, password);
// other supported formats are MarshallingFormat.JSON and MarshallingFormat.XSTREAM
configuration.setMarshallingFormat(MarshallingFormat.JAXB);
// if custom classes are used, they need to be added to the configuration and the client
// needs to be created with a class loader that has these classes available
//configuration.addJaxbClasses(extraClasses);
//KieServicesClient kieServicesClient =  KieServicesFactory.newKieServicesClient(configuration, kieContainer.getClassLoader());
KieServicesClient kieServicesClient =  KieServicesFactory.newKieServicesClient(configuration);

Once we have the client instance we can start executing operations. We start by checking whether the container we want to work with is already deployed and, if not, deploy it

boolean deployContainer = true;
KieContainerResourceList containers = kieServicesClient.listContainers().getResult();
// check if the container is not yet deployed, if not deploy it
if (containers != null) {
    for (KieContainerResource kieContainerResource : containers.getContainers()) {
        if (kieContainerResource.getContainerId().equals(containerId)) {
            System.out.println("\t######### Found container " + containerId + " skipping deployment...");
            deployContainer = false;
            break;
        }
    }
}
// deploy container if not there yet        
if (deployContainer) {
    System.out.println("\t######### Deploying container " + containerId);
    KieContainerResource resource = new KieContainerResource(containerId, new ReleaseId("org.jbpm", "HR", "1.0"));
    kieServicesClient.createContainer(containerId, resource);
}

Next let's check what is available in terms of processes and get some details about the process id we are going to start


// query for all available process definitions
QueryServicesClient queryClient = kieServicesClient.getServicesClient(QueryServicesClient.class);
List<ProcessDefinition> processes = queryClient.findProcesses(0, 10);
System.out.println("\t######### Available processes: " + processes);

ProcessServicesClient processClient = kieServicesClient.getServicesClient(ProcessServicesClient.class);
// get details of process definition
ProcessDefinition definition = processClient.getProcessDefinition(containerId, processId);
System.out.println("\t######### Definition details: " + definition);

We have all the details so we are ready to start the process instance for hiring process. We set two process variables:

  • name - of type string 
  • age - of type integer


// start process instance
Map<String, Object> params = new HashMap<String, Object>();
params.put("name", "john");
params.put("age", 25);
Long processInstanceId = processClient.startProcess(containerId, processId, params);
System.out.println("\t######### Process instance id: " + processInstanceId);

Once the process instance is started we can fetch tasks waiting to be completed for the kieserver user

UserTaskServicesClient taskClient = kieServicesClient.getServicesClient(UserTaskServicesClient.class);
// find available tasks
List<TaskSummary> tasks = taskClient.findTasksAssignedAsPotentialOwner(user, 0, 10);
System.out.println("\t######### Tasks: " + tasks);

// complete task
Long taskId = tasks.get(0).getId();

taskClient.startTask(containerId, taskId, user);
taskClient.completeTask(containerId, taskId, user, null);


Since the task has been completed and the process has moved on to the next one, we can continue for as long as there are tasks available, or we can simply abort the process instance to quit working on it. Before we abort the process instance let's examine what nodes have been completed so far

List<NodeInstance> completedNodes = queryClient.findCompletedNodeInstances(processInstanceId, 0, 10);
System.out.println("\t######### Completed nodes: " + completedNodes);

This tells us whether the task has already been completed and the process has moved on. Now let's abort the process instance

// at the end abort process instance
processClient.abortProcessInstance(containerId, processInstanceId);

ProcessInstance processInstance = queryClient.findProcessInstanceById(processInstanceId);
System.out.println("\t######### ProcessInstance: " + processInstance);

In the last step we fetch the process instance to check that it was properly aborted - the process instance state should be set to 3 (aborted).
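The numeric state corresponds to the STATE_* constants declared on org.kie.api.runtime.process.ProcessInstance. As a small illustration, here is a sketch that turns the number into a readable label. The class name ProcessStateNames is made up for this example and is not part of KIE Server Client.

```java
// Illustrative helper only, not part of KIE Server Client.
// The numbers mirror the STATE_* constants of org.kie.api.runtime.process.ProcessInstance.
final class ProcessStateNames {

    private ProcessStateNames() {
    }

    // Maps the numeric process instance state to a readable label.
    static String describe(int state) {
        switch (state) {
            case 0:
                return "PENDING";
            case 1:
                return "ACTIVE";
            case 2:
                return "COMPLETED";
            case 3:
                return "ABORTED";
            case 4:
                return "SUSPENDED";
            default:
                return "UNKNOWN(" + state + ")";
        }
    }

    public static void main(String[] args) {
        // state 3 is what we expect after abortProcessInstance
        System.out.println("state 3 means " + describe(3));
    }
}
```

With something like this in place the last println could log describe(processInstance.getState()) instead of the raw object.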

Last but not least, KIE Server Client can be used to insert facts and fire rules in a very similar way

// work with rules
List<GenericCommand> commands = new ArrayList<GenericCommand>();
BatchExecutionCommandImpl executionCommand = new BatchExecutionCommandImpl(commands);
executionCommand.setLookup("defaultKieSession");

InsertObjectCommand insertObjectCommand = new InsertObjectCommand();
insertObjectCommand.setOutIdentifier("person");
insertObjectCommand.setObject("john");

FireAllRulesCommand fireAllRulesCommand = new FireAllRulesCommand();

commands.add(insertObjectCommand);
commands.add(fireAllRulesCommand);

RuleServicesClient ruleClient = kieServicesClient.getServicesClient(RuleServicesClient.class);
ruleClient.executeCommands(containerId, executionCommand);
System.out.println("\t######### Rules executed");

So that concludes a simple usage scenario of KIE Server Client that covers

  • containers
  • processes
  • tasks
  • rules
A complete maven project with this sample execution can be found here.

Enjoy and stay tuned for more to come about awesome KIE Execution Server :)



Unified KIE Execution Server - Part 1

This blog post initiates the series of articles about KIE Execution Server and its capabilities provided in version 6.3. Here is a short description of what you can expect:

  1. Introduction to KIE Execution Server and installation notes
  2. Use of KIE Server Client to interact with KIE Execution Server
  3. KIE Execution Server managed vs unmanaged
  4. KIE Execution Server with non java clients
  5. KIE Execution Server clustering/scalability
These are just starting points as more articles most likely will follow depending on interest ... so let's start with first and foremost - the introduction and installation

KIE Execution Server introduction


Version 6.2 introduced KIE Execution Server, which targeted Drools users and provided an out of the box execution environment accessible via REST and JMS interfaces. It was designed to be a standalone and lightweight component that can be deployed to either application servers or web containers (with the obvious limitation - no JMS on web containers).

As it proved to be a valid option as a standalone component that can be easily deployed and scaled, version 6.3 will bring the so called unified KIE Execution Server with more capabilities for end users:

  • BRM capability that is what was in 6.2 providing rules execution
  • BPM capability that brings jBPM into the picture
    • process execution
    • task execution
    • asynchronous jobs execution
All of these are provided in a unified way and exposed via REST and JMS interfaces. On top of that, a KIE Server Client is delivered that makes using this server very easy in a Java environment.
The unification means that, from the end user point of view, you will not have to switch between different servers to take advantage of rule or process execution; the same client can be used for both, and so on. Unified terminology is used as well to avoid confusing users, so here are the most important terms:
  • server - is the actual instance of the execution server
  • container - is execution representation of the kjar/KieContainer that can be composed of various assets (rules, processes, data model, etc) - there can be multiple containers on single server
  • process - business process definition available in given container - can be many per container
  • task - user task definition available in given container - can be many per container
  • job - asynchronous job that is/was scheduled in the execution server
  • query - predefined set of queries to retrieve data out from the execution server

NOTE: Very important note to take into account is that all operations that modify data like:
  • insert fact
  • fire rules
  • start process
  • complete task
must always be referenced via container to guarantee that all configuration is properly set - class loader for custom data, handlers and listeners registered in time, etc.
Access to read only data such as queries, on the other hand, is simplified and expects only the minimum set of data needed to find the details. E.g. get process instance requires only the process instance id, as that is enough to find it, and it returns all the details required to perform operations on it - including container id (same goes for tasks etc).

Installation

Let's start with standalone mode running on Wildfly 8.1.0.Final (8.1.0 is used as it was tested with both kie server and kie workbench so better stick to just one version of the application server at the beginning :))

So we have to start with downloading the Wildfly distribution and unzipping it to the desired location - referred to as WILDFLY_HOME. Here we start with configuration:
  • create user in application realm 
    • name: kieserver 
    • password: kieserver1!
    • roles: kie-server
NOTE: these are the defaults that can be changed but if you decide to change them you'll need to provide changed values via system properties upon server startup. So for the sake of simplicity let's start with defaults.
To add the user you can use the add-user.sh (or add-user.bat on Windows) script that comes with the Wildfly distribution. Just go to WILDFLY_HOME/bin and invoke the add-user script:
  • next download the EE7 version of kie execution server 6.3.0 from here
  • the downloaded war shall be copied to WILDFLY_HOME/standalone/deployments
    • personally I usually change the name of the war file to not include version and classifier, as the file name is used as the context path of the deployed application and would make all urls much longer
    • so optionally you can rename the war file to a short version like kie-server.war
We are almost ready to start, last thing is to prepare set of system properties that we will use to start our server with fully featured environment:
  • first of all we must start wildfly server with full profile that activates JMS support
    • --server-config=standalone-full.xml
  • optionally, though useful when we have many wildfly instances running on same machine, let's specify port offset for wildfly server
    • -Djboss.socket.binding.port-offset=150
  • next we give the kie server instance an identifier - it's optional, as one will be generated if not given, though it will be less human readable so let's give it a name
    • -Dorg.kie.server.id=first-kie-server
  • let's specify the url location at which our kie server will be accessible - this is important when running in managed mode (see part 3 of this series) but it's good practice to always give it
    • -Dorg.kie.server.location=http://localhost:8230/kie-server/services/rest/server
With that we are ready to launch our kie server in standalone mode; use this command from WILDFLY_HOME/bin:

./standalone.sh \
  --server-config=standalone-full.xml \
  -Djboss.socket.binding.port-offset=150 \
  -Dorg.kie.server.id=first-kie-server \
  -Dorg.kie.server.location=http://localhost:8230/kie-server/services/rest/server
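Note how the port in org.kie.server.location relates to the port offset: Wildfly's default HTTP port is 8080, so an offset of 150 gives 8230. A minimal sketch of that arithmetic follows, assuming the default HTTP port and the war renamed to kie-server.war; the class KieServerLocation is a made up helper for illustration, not something shipped with KIE.

```java
// Sketch only: derives the org.kie.server.location value from the port offset.
// Assumes Wildfly's default HTTP port (8080) has not been changed.
final class KieServerLocation {

    static final int DEFAULT_HTTP_PORT = 8080;

    private KieServerLocation() {
    }

    // contextPath is the war file name without the .war extension
    static String locationFor(String host, int portOffset, String contextPath) {
        int port = DEFAULT_HTTP_PORT + portOffset;
        return "http://" + host + ":" + port + "/" + contextPath + "/services/rest/server";
    }

    public static void main(String[] args) {
        // the same values as used in the startup command
        System.out.println(locationFor("localhost", 150, "kie-server"));
    }
}
```

If you pick a different offset or keep the original war file name, the location url has to change accordingly.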

Once the application server (and application) starts you should be able to issue a simple GET request to the server using the org.kie.server.location url to get information about the running server:
When opening this page you will be prompted for user name and password; use the ones you created at the beginning of the installation process - kieserver with password kieserver1!

So we have kie server up and running with following capabilities:
  • KieServer - this is always present as it provides deployment operations to be able to deploy/undeploy containers on kie server instance
  • BRM - rules execution
  • BPM - process, tasks and jobs execution
Version of the kie server is also available (in this case is 6.4.0-SNAPSHOT as already running on latest master version - though at the time of writing this 6.3.0 is exactly the same)

Unified kie server is built on top of extensions aka capabilities and they can be turned on or off via system properties if one does not need some:
  • -Dorg.drools.server.ext.disabled=true - to disable BRM extension
  • -Dorg.jbpm.server.ext.disabled=true - to disable BPM extension
When disabling the BPM extension you will see a lot fewer things being bootstrapped upon server start - no persistence is involved. So let's disable the BPM capability: simply shut down the server and start it with the following command:
./standalone.sh \
  --server-config=standalone-full.xml \
  -Djboss.socket.binding.port-offset=150 \
  -Dorg.kie.server.id=first-kie-server \
  -Dorg.kie.server.location=http://localhost:8230/kie-server/services/rest/server \
  -Dorg.jbpm.server.ext.disabled=true

Watch the server startup logs and then issue the same url request as previously to see the server info response:
As you can see there are no BPM capabilities any more, which means any attempt to contact any of the REST/JMS apis that belong to BPM will fail.

Let's get back to the fully featured KIE Execution Server, deploy a container to it and run a simple process to verify it does work.
To do so, I'll use a REST client in Firefox that allows executing any HTTP method against a given endpoint. So we start with creating/deploying a container to the running KIE Execution Server

Endpoint:
  • http://localhost:8230/kie-server/services/rest/server/containers/hr
  • where hr is the name of the container
Method:
  • PUT
Request body:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<kie-container container-id="hr">
    <release-id>
        <group-id>org.jbpm</group-id>
        <artifact-id>HR</artifact-id>
        <version>1.0</version>
    </release-id>
</kie-container>

This is one of the standard example projects that come with every version of jBPM and it's part of the jbpm-playground repository. Make sure it was built at least once and is available in a maven repository that your server has access to, or is in your local maven repo (usually at ~/.m2/repository)
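If you prefer code over a browser plugin, the same PUT request can be prepared with the HTTP client that ships with Java 11 and later. This is only a sketch: the class DeployContainerRequest and its methods are made up for this example, the application/xml content type and basic authentication are assumptions about how the server is accessed, and the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch only: builds (does not send) the container deployment request shown above.
final class DeployContainerRequest {

    private DeployContainerRequest() {
    }

    // Renders the kie-container request body for the given release id.
    static String containerXml(String containerId, String groupId, String artifactId, String version) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n"
                + "<kie-container container-id=\"" + containerId + "\">\n"
                + "    <release-id>\n"
                + "        <group-id>" + groupId + "</group-id>\n"
                + "        <artifact-id>" + artifactId + "</artifact-id>\n"
                + "        <version>" + version + "</version>\n"
                + "    </release-id>\n"
                + "</kie-container>";
    }

    // Value of the Authorization header for HTTP basic authentication (assumed auth scheme).
    static String basicAuth(String user, String password) {
        byte[] raw = (user + ":" + password).getBytes(StandardCharsets.UTF_8);
        return "Basic " + Base64.getEncoder().encodeToString(raw);
    }

    static HttpRequest build(String serverUrl, String containerId, String user, String password, String body) {
        return HttpRequest.newBuilder(URI.create(serverUrl + "/containers/" + containerId))
                .header("Content-Type", "application/xml")
                .header("Authorization", basicAuth(user, password))
                .PUT(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) {
        String body = containerXml("hr", "org.jbpm", "HR", "1.0");
        HttpRequest request = build("http://localhost:8230/kie-server/services/rest/server",
                "hr", "kieserver", "kieserver1!", body);
        System.out.println(request.method() + " " + request.uri());
        // To actually send it against a running server:
        // java.net.http.HttpClient.newHttpClient()
        //         .send(request, java.net.http.HttpResponse.BodyHandlers.ofString());
    }
}
```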


When request is finished successfully you should see following response being returned:


That tells us we have single container deployed and it is in status STARTED - meaning ready to accept and process requests. So let's see if it actually is ready...

First let's see what processes we have available there
Endpoint:
  • http://localhost:8230/kie-server/services/rest/server/queries/processes/definitions
Method:
  • GET

When successfully executed you should find a single process available, with process id hiring inside container id hr


That tells us we have some processes to be executed, so let's create one instance of hiring process with some process variables

Endpoint:
  • http://localhost:8230/kie-server/services/rest/server/containers/hr/processes/hiring/instances
  • where hr is the name of the container and hiring is the process id
Method:
  • POST
Request body:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<map-type>
    <entries>
        <entry>
            <key>age</key>
            <value xsi:type="xs:int" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">25</value>
        </entry>
        <entry>
            <key>name</key>
            <value xsi:type="xs:string" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">john</value>
        </entry>
    </entries>
</map-type>
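Writing that payload by hand gets tedious quickly, so here is a rough sketch of how it could be generated from a map of process variables. StartProcessPayload is a made up helper that only handles Integer and String values and does not escape XML special characters; in a Java application KIE Server Client does the marshalling for you.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Sketch only: renders the map-type payload shown above for Integer and String variables.
// Keys and values are NOT XML-escaped here.
final class StartProcessPayload {

    private static final String NAMESPACES = " xmlns:xs=\"http://www.w3.org/2001/XMLSchema\""
            + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\"";

    private StartProcessPayload() {
    }

    // Picks the xsi:type attribute value for the supported Java types.
    static String xsType(Object value) {
        Objects.requireNonNull(value, "value");
        if (value instanceof Integer) {
            return "xs:int";
        }
        if (value instanceof String) {
            return "xs:string";
        }
        throw new IllegalArgumentException("Unsupported type in this sketch: " + value.getClass());
    }

    static String toXml(Map<String, Object> variables) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>\n");
        xml.append("<map-type>\n    <entries>\n");
        for (Map.Entry<String, Object> entry : variables.entrySet()) {
            xml.append("        <entry>\n");
            xml.append("            <key>").append(entry.getKey()).append("</key>\n");
            xml.append("            <value xsi:type=\"").append(xsType(entry.getValue())).append("\"")
                    .append(NAMESPACES).append(">").append(entry.getValue()).append("</value>\n");
            xml.append("        </entry>\n");
        }
        xml.append("    </entries>\n</map-type>");
        return xml.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> variables = new LinkedHashMap<>();
        variables.put("age", 25);
        variables.put("name", "john");
        System.out.println(toXml(variables));
    }
}
```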

So let's issue the start process request...

And examine response...


As we can see we have successfully created process instance of hiring process and the returned process instance id is 1.

As last verification step let's list active process instances available on our kie server instance
Endpoint:
  • http://localhost:8230/kie-server/services/rest/server/queries/processes/instances
Method:
  • GET




So that's all for the first article, introducing unified KIE Execution Server and its first steps - installation and verification that it actually works. Stay tuned for more coming ... a lot more :)

2015/09/07

Improved signaling in jBPM 6.3

One of the very powerful features of BPMN2 is signaling. It is realized by throw (send signal) and catch (receive signal) constructs. Depending on which type of signal we need it can be used in different places in the process:

  • throw events
    • intermediate event
    • end event
  • catch events
    • start event
    • intermediate event
    • boundary event

It is powerful as is, but it has been enhanced in jBPM 6.3 in two areas:
  • introduction of signal scopes for throwing events
  • support for parameterized signal names - both throw and catch signal events

Signal scopes

Signals by default rely on the process engine (ksession) signaling mechanism, which until version 6.3 was scoped only to the same ksession instance, meaning it was not able to properly signal things outside of the given ksession. This was especially visible when using a strategy other than singleton, e.g. per process instance.
Version 6.3 is equipped with predefined scopes to eliminate this problem and further provide fine grained control over what is going to be signaled.

NOTE: signal scopes apply only to throw events.

  • process instance scope - is the lowest in the hierarchy of scopes and narrows down the signal to the given process instance. That means only catch events within the same process instance will be signaled, nothing outside of the process instance will be affected
  • default (ksession) scope - same as in previous versions (and thus called default) that signals only elements known to ksession - behavior will vary depending on what strategy is used 
    • singleton - will signal all instances available for this ksession
    • per request - will signal only currently processed process instance and those with start signal events
    • per process instance - same as per request - will signal only currently processed process instance and those with start signal events
  • project scope - will signal all active process instances of given deployment and start signal events (regardless of the strategy)
  • external scope - allows signaling both in the project scope way and across deployments - for cross deployment signals it requires a process variable called 'SignalDeploymentId' that provides information about what deployment/project should be the target of the signal. Requiring the deployment id was done on purpose, as an overall broadcast would have a negative impact on performance in bigger environments
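To make the scope rules a bit more tangible, here is a toy model of them in plain Java. It is not jBPM code and simplifies a lot: it only answers whether a waiting catch event in an already active process instance would be reached, it assumes one ksession per deployment for the singleton strategy, and it leaves out start signal events as well as the SignalProcessInstanceId variable. All class and method names are made up for this sketch.

```java
// Toy model of the scope rules listed above; not jBPM code.
final class SignalScopeModel {

    enum Scope { PROCESS_INSTANCE, DEFAULT, PROJECT, EXTERNAL }

    enum Strategy { SINGLETON, PER_REQUEST, PER_PROCESS_INSTANCE }

    // A catch event waiting in some active process instance.
    static final class Catcher {
        final String deploymentId;
        final long processInstanceId;

        Catcher(String deploymentId, long processInstanceId) {
            this.deploymentId = deploymentId;
            this.processInstanceId = processInstanceId;
        }
    }

    private SignalScopeModel() {
    }

    // signalDeploymentId models the optional SignalDeploymentId process variable (may be null).
    static boolean reaches(Scope scope, Strategy strategy, String throwerDeployment,
                           long throwerInstance, String signalDeploymentId, Catcher catcher) {
        boolean sameDeployment = throwerDeployment.equals(catcher.deploymentId);
        boolean sameInstance = sameDeployment && throwerInstance == catcher.processInstanceId;
        switch (scope) {
            case PROCESS_INSTANCE:
                return sameInstance;
            case DEFAULT:
                // singleton shares one ksession, other strategies only see the current instance
                return strategy == Strategy.SINGLETON ? sameDeployment : sameInstance;
            case PROJECT:
                return sameDeployment;
            case EXTERNAL:
                String target = signalDeploymentId != null ? signalDeploymentId : throwerDeployment;
                return target.equals(catcher.deploymentId);
            default:
                throw new IllegalStateException("Unknown scope " + scope);
        }
    }

    public static void main(String[] args) {
        Catcher waiting = new Catcher("single-project", 2L);
        for (Scope scope : Scope.values()) {
            boolean reached = reaches(scope, Strategy.PER_PROCESS_INSTANCE,
                    "single-project", 1L, null, waiting);
            System.out.println(scope + " reaches other instance: " + reached);
        }
    }
}
```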

To illustrate this with an example let's consider few very simple processes:
  • starting up with those that will receive signals - here there is no difference
Intermediate catch signal event

Start signal event

  • next those that will throw events with different scopes

Process instance scoped signal

Default (ksession) scoped signal
Project  scoped signal

External scoped signal
Process instance, default and project scopes do not require any additional configuration to work properly, though external does. This is because external signal uses a work item handler as a backend to allow pluggable execution (out of the box jBPM comes with one that is based on JMS). It supports both queue and topic, although it is configured with a queue in jbpm console/kie workbench.
So to be able to use external signals one must register a work item handler that can deal with them. The one that comes with jBPM can be easily registered via deployment descriptor (either on server level or project level)
Registered External Send Task work item handler for external scope signals

Some might ask why it is not registered there by default - and the reason is that jBPM supports multiple application servers and all of them deal with JMS differently - mainly they will have different JNDI names for queues and connection factories.
JMS based work item handler supports that configuration but requires to specify these JNDI look up names when registering handler.
As illustrated in the screenshot above, when running on JBoss AS/EAP/Wildfly you can simply register it via mvel resolver with the default (no arg) constructor and it will pick up the preconfigured queue (queue/KIE.SIGNAL) and connection factory (java:/JmsXA). For other cases you need to specify JNDI names as constructor arguments:

new org.jbpm.process.workitem.jms.JMSSendTaskWorkItemHandler("jms/CF", "jms/Queue")

Since external signals support cross project signaling, they go even further than just broadcast. They allow specifying which project needs to be signaled and even which process instance within that project. That is all controlled by process variables of the process that is going to throw the signal. The following are supported:
  • SignalProcessInstanceId - target process instance id
  • SignalDeploymentId - target deployment (project)

Both are optional and if not given engine will consider same deployment/project as the process that throws the signal, and broadcast in case of missing process instance id. When needed it does allow fine grained control even in cross project signaling.
declared SignalDeploymentId process variable for external scope signal

You can already give it a try yourself by cloning this repo and working with these two projects:

  • single-project - contains all process definitions that work within the same project
  • external-project - contains process definition that uses external scope signal (includes a form to enter target deployment id)
But what are the results with these sample processes?
  • When using process that signals only with process instance scope (process id: single-project.throw-pi-signal) it will only signal event based subprocess included in the same process definition nothing else
  • When using process that signals with default scope (process id: single-project.throw-default-signal) it will start a process (process id: single-project.start-with-signal) as it has signal start event (regardless of what strategy is used) but will not trigger process that waits in intermediate catch event for other strategies than singleton
  • When using process that signals with project scope (process id: single-project.throw-project-signal) it will start a process (process id: single-project.start-with-signal) as it has signal start event and will trigger process that waits in intermediate catch event (regardless of what strategy is used)
  • When using process that signals with external scope (process id: external-project.throw-external-signal) it will start a process (process id: single-project.start-with-signal) as it has signal start event and will trigger process that waits in intermediate catch event (regardless of what strategy is used) assuming the SignalDeploymentId was set to org.jbpm.test:single-project:1.0.0-SNAPSHOT on start of the process

Parameterized signal names

Another enhancement to signals in jBPM 6.3 is to allow signal names to be parameterized. That means you don't have to hardcode signal names in the process definition but can simply refer to them via process variables.
That gives an extremely valuable approach to dynamically driven process definitions, which can change the signal they throw or catch based on the state of the process instance.

One use case where this is needed is when multi instance is used and we want individual instances to react to different signals.

Simply refer to it via variable expression as already supported in data input and outputs, user task assignments etc.

#{mysignalVariable}

then make sure that you define mysignalVariable variable in your process and it has a value before it enters the signal event node.
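Conceptually the engine replaces the expression with the current value of the process variable before the signal is thrown or awaited. The following is only a sketch of that idea and not the jBPM implementation; SignalNameResolver is a made up class, and failing on a missing variable is an assumption of this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch only: resolves #{variable} expressions in a signal name against process variables.
final class SignalNameResolver {

    private static final Pattern EXPRESSION = Pattern.compile("#\\{([^}]+)\\}");

    private SignalNameResolver() {
    }

    static String resolve(String signalName, Map<String, Object> variables) {
        Matcher matcher = EXPRESSION.matcher(signalName);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String variableName = matcher.group(1);
            Object value = variables.get(variableName);
            if (value == null) {
                // assumption of this sketch: a missing value is treated as an error
                throw new IllegalStateException("Variable '" + variableName + "' has no value");
            }
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(String.valueOf(value)));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("mysignalVariable", "approved");
        System.out.println(resolve("#{mysignalVariable}", variables));
    }
}
```

Signal names without any expression pass through unchanged, so hardcoded names keep working as before.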

And that's it for now, stay tuned for more news about jBPM 6.3 that is almost out the door.

2015/08/20

Asynchronous processing with jBPM 6.3

As described in previous article, jBPM executor has been enhanced to provide more robust and powerful execution mechanism for asynchronous tasks. That is based on JMS. So let's take a look at the actual improvements by bringing this into the real process execution.

The use case is rather simple to understand but puts quite a load on the process engine and asynchronous execution capabilities.

  • main process that uses multi instance subprocess to create another process instance to carry out additional processing and then waits for a signal informing about completion of the child process
    • one version that uses Call Activity to start sub process
    • another that uses AsyncStartProcess command instead of Call Activity
  • sub process that has responsibility to execute a job in asynchronous fashion

Main process with call activity to start sub process


Main process with async start process task to start subprocess
Sub process that is invoked from the main process
So what we have here and what's the difference between two main process versions:

  • main process will create as many new process instances as given in collection that is an input to multi instance subprocess - that is driven by process variable that user needs to provide on main process start
  • then in one version to create new process instance as part of multi instance it will use Call Activity BPMN2 construct to create process - that is synchronous way
  • in the second version, on the other hand, multi instance will use Async Start Process command (via async task) to start process instance in asynchronous way
While these two achieve pretty much the same they do differ quite a lot. First of all, using Call Activity will result in following:
  • main process instance will not finish until all sub process instances are created - depending on their number that might take milliseconds, seconds or even minutes (in case of a really huge set of sub process instances)
  • creation of main process and sub process instances is done in a single transaction - all or nothing, so if one of the subprocesses fails for whatever reason all will be rolled back, including the main process instance
  • it takes time to commit all data into the database after creating all process instances - note that each process instance (and session instance when using per process instance strategy) has to be serialized using protobuf and then sent to the db as a byte array, and all other inserts as well (for process, tasks, history log etc). That all takes time and might exceed the transaction timeout, which will cause a rollback again...
When using async start process command the situation is slightly different:
  • main process instance will wait only for creating job requests to start all subprocess instances, this is not really starting any process instance yet
  • rollback will affect only main process instance and job requests, meaning it is still consistent as unless main process is committed no sub process instances will be created
  • subprocess instances are started independently meaning a failure of one instance does not affect another, moreover since this is async job it will be retried and can actually be configured to retry with different delays
  • each sub process instance is carried out within its own transaction, which is much smaller and finishes way faster (almost no risk of encountering transaction timeouts), with much less data to be sent to the database - just one instance (and session in case of per process instance strategy)

That concludes the main use case here. There is one additional scenario, though, that in normal processing will cause issues - a single parent process instance that must be notified by a huge number of child process instances, and that can happen at pretty much the same time. That will cause concurrent updates to the same process instance, which will result in optimistic lock exceptions (the famous StaleObjectStateException). That is expected and the process engine can cope with it to some extent - by using a retry interceptor in case of optimistic lock exceptions. There might, however, be so many concurrent updates that some of them exceed the retry count and fail to notify the process instance. Besides that, each such failure will cause errors to be printed to the logs, which reduces visibility in the logs and can cause alerts in production systems.
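The general idea behind such a retry interceptor is simple: run the operation again when it fails with an optimistic lock error, but only a limited number of times. The sketch below illustrates that idea in plain Java; it is not the jBPM interceptor, and OptimisticLockRetry as well as StaleStateException (a stand-in for Hibernate's StaleObjectStateException) are made up for this example.

```java
import java.util.function.Supplier;

// Sketch only: bounded retry on optimistic lock failures, not the actual jBPM interceptor.
final class OptimisticLockRetry {

    // Stand-in for an optimistic lock failure such as StaleObjectStateException.
    static final class StaleStateException extends RuntimeException {
        StaleStateException(String message) {
            super(message);
        }
    }

    private OptimisticLockRetry() {
    }

    // Runs the action once and retries it up to maxRetries times on optimistic lock failures.
    static <T> T execute(Supplier<T> action, int maxRetries) {
        int retries = 0;
        while (true) {
            try {
                return action.get();
            } catch (StaleStateException e) {
                if (retries >= maxRetries) {
                    // too many concurrent updates, give up and let the caller see the failure
                    throw e;
                }
                retries++;
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = execute(() -> {
            // fail twice to simulate concurrent updates of the parent process instance
            if (calls[0]++ < 2) {
                throw new StaleStateException("parent instance was updated concurrently");
            }
            return "parent notified";
        }, 3);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a huge number of children notifying at once, some callers will run out of retries no matter how the limit is chosen, which is exactly the problem described above.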

So how to deal with this?
The idea is to skip the regular notification mechanism that directly calls the parent process instance, to avoid concurrent updates, and instead use signal events (catch in the main process instance and throw in the subprocess instance).
Main process catch signal intermediate event
Sub process throw signal end event
But the use of signal catch and throw events does not solve the problem by itself. The game changer is the scope of the throw event, which allows use of the so called 'External' scope that utilizes JMS messaging to deliver the signal from the child to the parent process instance. Since the main process instance uses a multi instance subprocess to create child process instances, there will be multiple catch signal events (as many as there are sub process instances) waiting for the notification.
Because of that the signal name cannot be a constant, as the first signal from a sub process instance would trigger all catch events and finish the multi instance too early.

To support this case signal names must be dynamic - based on a process variable. Let's enumerate the steps these two processes go through when being executed:
  • main process: upon start it creates the given number of subprocesses, each of which calls a new process instance (child process instance)
  • main process: upon requesting the sub process instance creation (regardless of whether it's via call activity or async task) it passes a signal name that is built from a constant plus a unique item (serialized-#{item}), where the item represents a single entry from the multi instance input collection
  • main process: it then moves on to the intermediate catch signal event, whose name is the same as the one given to the sub process (child), and awaits it (serialized-#{item})
  • sub process: after executing, it throws an event via the end signal event with the signal name given as input parameter when it was started (serialized-#{item}) and uses the external scope, so the signal is sent via JMS in a transactional way - delivered only when the subprocess completes (and commits) successfully
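
The naming scheme from the steps above can be sketched in plain Java, without any jBPM classes. Everything in this example is hypothetical: the SignalRouter class, its methods and the item values are made up for illustration and only mimic the idea of catch signal events waiting for a named signal. It compares a constant signal name with the serialized-#{item} style name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the parent process instance: this is NOT jBPM API,
// it only mimics catch signal events waiting inside a multi instance subprocess.
final class SignalRouter {
    // signal name -> number of catch events still waiting for it
    private final Map<String, Integer> waiting = new HashMap<>();
    private final List<String> completed = new ArrayList<>();

    // Same shape as in the post: constant prefix + multi instance item
    static String signalName(Object item) {
        return "serialized-" + item;
    }

    void registerCatch(String name) {
        waiting.merge(name, 1, Integer::sum);
    }

    // Delivers a signal and returns how many catch events it triggered
    int signal(String name) {
        Integer count = waiting.remove(name);
        if (count == null) {
            return 0;
        }
        for (int i = 0; i < count; i++) {
            completed.add(name);
        }
        return count;
    }

    int completedCount() {
        return completed.size();
    }

    boolean allDone() {
        return waiting.isEmpty();
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("1", "2", "3");

        // Constant signal name: the first child completes every branch at once
        SignalRouter constant = new SignalRouter();
        items.forEach(item -> constant.registerCatch("done"));
        System.out.println("constant name, first signal triggers: " + constant.signal("done"));

        // Dynamic signal name: each child completes only its own branch
        SignalRouter dynamic = new SignalRouter();
        items.forEach(item -> dynamic.registerCatch(signalName(item)));
        System.out.println("dynamic name, first signal triggers: " + dynamic.signal(signalName("1")));
    }
}
```

With the constant name a single signal releases all waiting branches, which is exactly the "multi instance finished too early" problem; with the per item name each signal releases one branch only.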

External scope for throw signal events is backed by a WorkItemHandler for pluggability reasons, so it can be realized in many ways, not only the default JMS way. That said, JMS provides a comprehensive messaging infrastructure that is configurable and cluster aware. To completely solve the problem of concurrent updates to the parent process instance, we need to configure the receiver of the signals accordingly. The configuration boils down to a single property - an activation specification property that limits the number of sessions for a given endpoint.
In JBoss EAP/Wildfly it can be given as a simple entry in the configuration of the MDB defined in workbench/jbpm console.

In the default installation the signal receiver MDB does not limit concurrent processing and looks like this (WEB-INF/ejb-jar.xml):

  <message-driven>
    <ejb-name>JMSSignalReceiver</ejb-name>
    <ejb-class>org.jbpm.process.workitem.jms.JMSSignalReceiver</ejb-class>
    <transaction-type>Bean</transaction-type>
    <activation-config>
      <activation-config-property>
        <activation-config-property-name>destinationType</activation-config-property-name>
        <activation-config-property-value>javax.jms.Queue</activation-config-property-value>
      </activation-config-property>
      <activation-config-property>
        <activation-config-property-name>destination</activation-config-property-name>
        <activation-config-property-value>java:/queue/KIE.SIGNAL</activation-config-property-value>
      </activation-config-property>
    </activation-config>
  </message-driven>
To enable serialized processing, the MDB configuration should look like this:

  <message-driven>
    <ejb-name>JMSSignalReceiver</ejb-name>
    <ejb-class>org.jbpm.process.workitem.jms.JMSSignalReceiver</ejb-class>
    <transaction-type>Bean</transaction-type>
    <activation-config>
      <activation-config-property>
        <activation-config-property-name>destinationType</activation-config-property-name>
        <activation-config-property-value>javax.jms.Queue</activation-config-property-value>
      </activation-config-property>
      <activation-config-property>
        <activation-config-property-name>destination</activation-config-property-name>
        <activation-config-property-value>java:/queue/KIE.SIGNAL</activation-config-property-value>
      </activation-config-property>
      <activation-config-property>
        <activation-config-property-name>maxSession</activation-config-property-name>
        <activation-config-property-value>1</activation-config-property-value>
      </activation-config-property>
    </activation-config>
  </message-driven>

That ensures that all messages (even if they are sent concurrently) will be processed serially. The parent process instance is thereby notified in a non concurrent way, which ensures that all notifications are delivered and do not cause conflicts (concurrent updates on the same process instance).
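
To get a feeling for why a single session removes the conflicts, here is a small simulation in plain Java. It is only a sketch under stated assumptions and not jBPM or JMS code: the parent process instance is reduced to a version counter updated with an optimistic check, a thread pool plays the role of the MDB sessions, and a failed compare-and-set stands in for StaleObjectStateException. The class and method names are made up.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simulation: a version counter plays the parent process instance
final class SerializedDelivery {
    private final AtomicInteger version = new AtomicInteger();
    private final AtomicInteger conflicts = new AtomicInteger();

    // One notification: read the version, then write only if it is unchanged
    private void notifyParent() {
        int read = version.get();
        Thread.yield(); // widen the window between read and write
        if (!version.compareAndSet(read, read + 1)) {
            conflicts.incrementAndGet(); // stands in for an optimistic lock failure
        }
    }

    // Returns {delivered, conflicts} for the given number of consumer sessions
    static int[] run(int sessions, int notifications) {
        SerializedDelivery parent = new SerializedDelivery();
        ExecutorService consumers = Executors.newFixedThreadPool(sessions);
        for (int i = 0; i < notifications; i++) {
            consumers.execute(parent::notifyParent);
        }
        consumers.shutdown();
        try {
            consumers.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { parent.version.get(), parent.conflicts.get() };
    }

    public static void main(String[] args) {
        int[] serial = run(1, 1000);
        System.out.println("1 session:  delivered=" + serial[0] + " conflicts=" + serial[1]);
        int[] parallel = run(8, 1000);
        System.out.println("8 sessions: delivered=" + parallel[0] + " conflicts=" + parallel[1]);
    }
}
```

With one session every notification is delivered and the conflict count stays at zero. With several sessions the number of conflicts varies from run to run, and in this sketch (which has no retry) each conflict is a lost notification.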

With that we have a fully featured solution that deals with a complex process requiring high throughput with asynchronous processing. So now it's time to see what results we can expect from execution and whether the different versions of the main process differ in execution times.

Sample execution results

The following table presents sample execution results of the described process. They might differ between environments, and anyone is more than welcome to give it a try and report back how it actually performed.


                                                                     100 instances   300 instances   500 instances
Call Activity with JMS executor                                      7 sec           24 sec          41 sec
Async Start Task with JMS executor                                   4 sec           21 sec          28 sec
Call Activity with polling executor (1 thread, 1 sec interval)       1 min 44 sec    5 min 11 sec    8 min 44 sec
Async Start Task with polling executor (1 thread, 1 sec interval)    3 min 21 sec    10 min          17 min 42 sec
Call Activity with polling executor (10 threads, 1 sec interval)     17 sec          43 sec          2 min 13 sec
Async Start Task with polling executor (10 threads, 1 sec interval)  20 sec          1 min 2 sec     1 min 41 sec

Conclusions:

As you can see, JMS based processing is extremely fast compared to polling based only. In fact the fastest option is using the async start process command for starting child process instances. The difference increases with the number of sub process instances to be created.
On the other hand, using only the polling based executor with the async start process command is the slowest in the single thread configuration, and that is expected as well, as all start process commands are still handled by the polling executor, which will not run fast enough.
In all cases the processing completed successfully, but the time required to complete it differs significantly.


If you're willing to try that yourself, just download the 6.3.0 version of jBPM console (aka kie-wb) and then clone this repository into your environment. Once you have that in place, go to the async-perf project, then build and deploy it. Once it's deployed successfully you can play around with the async execution:
  • miprocess-async is the main process that uses async start process command to start child process instances
  • miprocess is the main process that uses call activity to start child process instances
In both cases upon start you'll be asked for number of subprocesses to create. Just pick a number and run it!

Note that by default the JMS receiver receives signals concurrently, so unless you reconfigure it you'll see concurrent updates to the parent process failing for some requests.

Have fun! Comments and result reports are welcome.


Shift gears with jBPM executor

Since version 6.0 jBPM comes with a component called jBPM executor that is responsible for carrying out background (asynchronous) tasks. Users started to rely on it more and more with the release of 6.2, and even more so with the coming 6.3, where a number of enhancements are based on that component:

  • async continuation 
  • async throw signals
  • async start process instance
jBPM executor uses by default a polling mechanism with a backend database that stores jobs to be executed. There are a couple of reasons to use that mechanism:
  • supported on any runtime environment (application server, servlet container, standalone)
  • decouples requesting the job from executing the job
  • allows a configurable retry mechanism for failed jobs
  • provides search API to look through available jobs
  • allows jobs to be scheduled instead of being executed immediately
The following diagram illustrates the sequence of events in the default (polling based) mechanism of jBPM executor (credits for creating this diagram go to Chris Shumaker)
The executor runs in a sort of event loop manner - there are one or more threads that constantly (at defined intervals) poll the database to see if there are any jobs to be executed. If so, a thread picks a job and delegates it for execution. The delegation differs between runtime environments:
  • environment that supports EJB - it will delegate to an EJB asynchronous method for execution
  • environment that does not support EJB - it will execute the job in the same thread that polls the db
This in turn drives the configuration options that look pretty much like this:
  • in an EJB environment a single thread is usually enough, as it is used only for polling and handing jobs over, not for doing the actual work, so the number of threads should be kept to a minimum and the interval should be used to fine tune the speed of processing of async jobs
  • in a non EJB environment the number of threads should be increased to improve processing power, as each thread will actually be doing the work
In both cases users must take into account the actual needs for execution, as more threads and more frequent polls cause higher load on the underlying database (regardless of whether there are jobs to execute or not). So keep that in mind when fine tuning the executor settings.
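
The polling loop described above can be sketched with the JDK scheduler alone. This is a minimal illustration for the non EJB case, assuming an in-memory queue in place of the job table; PollingSketch and its methods are hypothetical names and have nothing to do with the actual executor classes. Note how every poll is counted even when it finds nothing, which is the database load mentioned above.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a polling executor; the queue stands in for the job table
final class PollingSketch {
    private final Queue<Runnable> jobTable = new ConcurrentLinkedQueue<>();
    private final AtomicInteger polls = new AtomicInteger();
    private ScheduledExecutorService pollers;

    void schedule(Runnable job) {
        jobTable.add(job);
    }

    // One poll: it costs a "query" even if there is nothing to run
    boolean pollOnce() {
        polls.incrementAndGet();
        Runnable job = jobTable.poll();
        if (job == null) {
            return false;
        }
        job.run(); // non EJB style: the polling thread does the work itself
        return true;
    }

    // Starts the given number of polling threads, each polling at the given interval
    void start(int threads, long intervalMillis) {
        pollers = Executors.newScheduledThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pollers.scheduleAtFixedRate(this::pollOnce, 0, intervalMillis, TimeUnit.MILLISECONDS);
        }
    }

    void stop() {
        if (pollers != null) {
            pollers.shutdownNow();
        }
    }

    int polls() {
        return polls.get();
    }

    int pending() {
        return jobTable.size();
    }

    public static void main(String[] args) throws InterruptedException {
        PollingSketch executor = new PollingSketch();
        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < 20; i++) {
            executor.schedule(done::incrementAndGet);
        }
        executor.start(2, 50); // 2 threads, 50 ms interval
        while (executor.pending() > 0) {
            Thread.sleep(10);
        }
        executor.stop();
        System.out.println("jobs executed: " + done.get() + ", polls issued: " + executor.polls());
    }
}
```

Since each poll executes at most one job, throughput is bounded by threads divided by interval, so raising either one speeds things up at the price of more polls.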

So while this fits a certain set of use cases, it does not scale well for systems that require high throughput in a distributed environment. A huge number of jobs to be executed as soon as possible requires a more robust solution that can cope with the load in reasonable time and without too heavy a load on the underlying database.
This brought us to an enhancement that allows much faster (immediate, compared to polling) execution and yet still provides the same capabilities as polling:
  • jobs are searchable 
  • jobs can be retried
  • jobs can be scheduled
The solution chosen for this is based on a JMS destination that receives triggers to perform the operations. That eliminates the need to poll for available jobs, as the JMS provider will invoke the executor to process the job. An even better thing is that the JMS message carries only the job request id, so the executor fetches the job from the db by id - the most efficient retrieval method, instead of running a query by date.
JMS allows clustering support and fine tuning of JMS receiver sessions to improve concurrency, all in the standard JEE way.
The executor discovers JMS support and, if available, uses it (all supported application servers); otherwise it falls back to the default polling mechanism.
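
The split between the two paths can be sketched as follows. This is a hypothetical, in-memory model and not the jBPM implementation: a map plays the job table, a list of ids plays the JMS queue, and all names are invented. It assumes, as described here, that every job is stored in the db, that only immediate jobs get a JMS trigger, and that everything else is left for the poller.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the jBPM implementation: a map plays the job table
final class JobTriggerSketch {
    static final class Job {
        final long id;
        final long dueAt;
        boolean done;

        Job(long id, long dueAt) {
            this.id = id;
            this.dueAt = dueAt;
        }
    }

    private final Map<Long, Job> jobTable = new LinkedHashMap<>();
    private final List<Long> jmsQueue = new ArrayList<>(); // messages carry ids only
    private final boolean jmsAvailable;
    private long nextId = 1;

    JobTriggerSketch(boolean jmsAvailable) {
        this.jmsAvailable = jmsAvailable;
    }

    // Every job is stored; only immediate jobs get a JMS trigger
    long request(long now, long dueAt) {
        Job job = new Job(nextId++, dueAt);
        jobTable.put(job.id, job);
        if (jmsAvailable && dueAt <= now) {
            jmsQueue.add(job.id);
        }
        return job.id;
    }

    // JMS path: each message leads to a direct lookup by id
    int drainJms() {
        int executed = 0;
        for (Long id : jmsQueue) {
            Job job = jobTable.get(id);
            if (job != null && !job.done) {
                job.done = true;
                executed++;
            }
        }
        jmsQueue.clear();
        return executed;
    }

    // Polling path: scan the table for jobs that are due
    int poll(long now) {
        int executed = 0;
        for (Job job : jobTable.values()) {
            if (!job.done && job.dueAt <= now) {
                job.done = true;
                executed++;
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        JobTriggerSketch executor = new JobTriggerSketch(true);
        executor.request(0, 0);  // immediate
        executor.request(0, 0);  // immediate
        executor.request(0, 60); // scheduled for later
        System.out.println("executed via JMS: " + executor.drainJms());
        System.out.println("executed via poll at t=0: " + executor.poll(0));
        System.out.println("executed via poll at t=60: " + executor.poll(60));
    }
}
```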

NOTE: JMS is only supported for immediate job requests and not for scheduled ones

The polling mechanism is still there as its responsibility is still significant:
  • deals with retries
  • deals with scheduled jobs
The need for high throughput on polling is removed, though. That means that users of JMS should consider changing the poll interval to a higher value, like every minute instead of every 3 seconds. That will reduce the load on the db but still provide a very performant execution environment.

The next article will illustrate the performance improvements when using the JMS based executor compared with the default polling based one. Stay tuned, and comments as usual are more than welcome.


2015/05/27

jBPM talk at JBCNConf - polyglot and reactive jBPM

With the recent trend of moving to lightweight, container-less runtime environments, jBPM, to prove it fits this approach too, came up with an integration with Vert.x (2.x). This integration shows users how to move towards reactive, event driven applications without the need to run on any container while still using BPM capabilities.

So if you're interested in how this looks, join us at JBCNConf - Barcelona, 26 - 27 June 2015.

Together with Mauricio "Salaboy" Salatino we are going to introduce you to "Polyglot and reactive jBPM". This talk is intended for developers and gives basic information about both jBPM and Vert.x and how they work together.
As part of the talk (actually the bigger part of the talk) we will perform a live demo that will illustrate:

  • jBPM as vert.x module
  • running jBPM projects (aka kjars) inside vert.x instance - one kjar one instance
  • use of clustered vert.x event bus to exchange information between jBPM projects on runtime
  • integration with KIE workbench to prove you can combine these two without affecting each other
  • use of different languages (Java, JavaScript, Groovy, Scala, Ceylon)  to interact with jBPM running on vert.x
So come and join us to see jBPM and Vert.x in action!

2015/04/24

Asynchronous continuation in jBPM 6.3

It's been a while since the release of 6.2.0.Final, but jBPM is not staying idle; quite the opposite, lots of changes are coming in. Here is a quick heads up on a feature that has been requested many times - asynchronous continuation.

So what is that? Asynchronous continuation is all about allowing process designers to decide which activities should be executed asynchronously without any additional work required. Some of you might already be familiar with async work item handlers, which require commands that carry out the actual work. While this is a very powerful feature, it requires additional coding - wrapping business logic in a command. Another drawback is flexibility - one could not easily change whether work is executed synchronously or asynchronously.

Nevertheless, let's take a look at the evolution of that concept, which allows users to decide themselves what should be executed in the background and when. Let's take a quick look at a simple process that is composed of service tasks

You can notice that this process has two types of tasks (look at their names):

  • Async service
  • Sync service
As you can imagine, Async service will be executed in the background while Sync service will be executed on the same thread as its preceding node - so if the preceding node is an async node, the sync node will directly follow it within the same thread.

That's all clear and simple, but then how do users define whether the service task is async or sync? That's again simple - it's enough to define a dataInput named 'async' on the task
That is the key information for the engine, which tells it how to deal with the given node.
Above is the configuration of an Async Service with the 'async' data input defined. The next image shows the same configuration but for Sync Service
There is no 'async' dataInput defined.

Here is where I would like to ask for feedback: is this way of defining the async behavior of a node sufficient? There is no general BPMN2 property for that behavior, and extending BPMN2 xml with custom tags/attributes is not too good in my opinion.
We could simplify that on the editor level, where the user could simply use a checkbox which would define the dataInput for the user. All comments are welcome :)

So what will happen if we run this process?


// first async service followed directly by sync service (same thread id)
16:42:26,973 INFO (EJB default - 7) EJB default - 7 Service invoked with name john
16:42:26,977 INFO (EJB default - 7) EJB default - 7 Service invoked with name john

// second async service followed directly by sync service (same thread id)
16:42:29,958 INFO (EJB default - 9) EJB default - 9 Service invoked with name john
16:42:29,962 INFO (EJB default - 9) EJB default - 9 Service invoked with name john

// last async service
16:42:32,954 INFO (EJB default - 1) EJB default - 1 Service invoked with name john


If you look at the timestamps you will see that they match the default settings of jBPM executor - one async thread running every 3 seconds. These are of course configurable so you can fine tune them according to your requirements.

Each process instance of this process will be divided into three steps
Even though Service Tasks are synchronous by nature in BPMN2, with just a single setting we can make them execute in the background without any coding.

Moreover, those of you who are already familiar with how jBPM works internally might have noticed that these blue boxes actually represent transaction boundaries as well (well, not entirely, as start and end nodes are part of a transaction too). So with this we explored another advantage of this feature - the possibility to easily define transaction scopes, meaning which nodes should be executed in a single transaction. I believe that is another very important feature requested by many jBPM users.

Last but not least, a bit of technical detail. This feature is backed by jBPM executor, which is the backbone of asynchronous processing in jBPM 6. That means you need to have the executor configured and running to be able to take advantage of this feature.
If you run on jBPM console (aka kie workbench) there is no need to do anything; you're already fully equipped to do async continuation for all your processes.
When you use jBPM in embedded mode, some additional steps are required, depending on how you utilize the jBPM API.
  1. Direct use of KIE API (KieBase and KieSession) - here you need to configure ExecutorService and add it to the kieSession environment under the "ExecutorService" key. Once it's there it will process the nodes in an async way
  2. RuntimeManager API - similar to KIE API, though you should add ExecutorService as one of the environment entries when setting up RuntimeEnvironment
  3. jBPM services API - you need to add ExecutorService as an attribute of DeploymentService; if you use CDI or EJB it will be injected automatically for you (assuming all dependencies are available to the container)
This feature is available for:
  • all task types (service, send, receive, business rule, script, user task)
  • subprocesses (embedded and reusable)
  • multi instance task and subprocess

But what happens if a user marks a node as async but there is no ExecutorService available? The process will still run, but it will report a warning in the log and proceed with the nodes in synchronous execution. So it's safe to model your process definition in an async way even if there is no async behavior available (yet).
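
The fallback rule can be illustrated with a few lines of plain Java. This is only a sketch of the decision and not engine code: a plain map plays the session environment, the JDK's java.util.concurrent.ExecutorService is used purely as a stand-in for the jBPM ExecutorService, and the class and method names are hypothetical. Only the "ExecutorService" key comes from the description above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch of the decision only; the JDK ExecutorService is a
// stand-in here and is NOT the jBPM ExecutorService.
final class AsyncFallbackSketch {

    // Runs a "node" and returns the name of the thread that executed it
    static String execute(Map<String, Object> environment, boolean async) {
        Object executor = environment.get("ExecutorService");
        if (async && executor instanceof ExecutorService) {
            Callable<String> node = () -> Thread.currentThread().getName();
            try {
                // The sketch waits for the result only to report the thread name
                return ((ExecutorService) executor).submit(node).get();
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }
        if (async) {
            System.err.println("WARN: node marked as async but no executor found, running synchronously");
        }
        return Thread.currentThread().getName();
    }

    public static void main(String[] args) {
        Map<String, Object> environment = new HashMap<>();
        System.out.println("without executor: " + execute(environment, true));

        ExecutorService pool = Executors.newSingleThreadExecutor();
        environment.put("ExecutorService", pool);
        System.out.println("with executor:    " + execute(environment, true));
        pool.shutdown();
    }
}
```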

Eclipse use

For those using eclipse modeler instead of web designer: In the new BPMN2 editor (1.2.2), there is a new element under general tab for all tasks, called Metadata. All that needs to be done is add an entry named customAsync and value = true. This will mark the task as asynchronous.

Hope you will like this feature and don't hesitate to leave some comments with feedback and ideas! 

P.S.
This feature is currently on jBPM master and scheduled to go out with 6.3, so if you would like to try it take the latest nightly build or build jBPM from source.

More to come with jBPM so stay tuned...