jBPM technical error handling is based on transactionality and going back to the last stable state. That means any error that is not handled by the process results in a rollback of the entire transaction, leaving the process instance in its previous wait state. Any trace of this is visible only in the logs and is usually displayed to the caller (who sent the request to the process engine).
That might not be enough in some cases, so additional error handling is required to provide:
- Better traceability
- Visibility in case of critical processes
- Reporting and analytics based on error situations
- External system error handling and compensation
Overview
Configurable error handling was introduced in version 7.1. It is responsible for catching any technical error thrown during process engine execution (including the task service). A technical exception is:
- Anything that extends java.lang.Throwable
- Anything that was not handled earlier, for example by process level error handling
The entry point from the process engine's point of view is ExecutionErrorManager. It is integrated with RuntimeManager, which is responsible for providing it to the underlying components, KieSession and TaskService. From the API point of view, ExecutionErrorManager gives access to:
- ExecutionErrorHandler - the heart of the error handling mechanism
- ExecutionErrorStorage - pluggable storage for execution error information
The error handler is kept informed about:
- Starting processing of a given node instance
- Completion of processing of a given node instance
- Starting processing of a given task instance
- Completion of processing of a given task instance
Such information is mainly used for errors of unknown type, in other words errors that carry no information about the process context. For example, a database exception at commit time carries no process information, which on its own would make the error information poor and pretty much useless.
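To illustrate why this tracking matters, below is a minimal sketch of a handler that remembers which node or task is in progress and attaches it to an error that carries no context of its own. Keep in mind that TrackingHandlerSketch, RecordedError and all method names are made up for this example; this is a simplified model, not the jBPM ExecutionErrorHandler API.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model for illustration only; NOT the jBPM ExecutionErrorHandler API.
final class TrackingHandlerSketch {

    /** What gets stored: the failure plus whatever context was being tracked. */
    static final class RecordedError {
        final String message;
        final String lastNode; // null when no node instance was in progress
        final String lastTask; // null when no task instance was in progress

        RecordedError(String message, String lastNode, String lastTask) {
            this.message = message;
            this.lastNode = lastNode;
            this.lastTask = lastTask;
        }
    }

    private String currentNode;
    private String currentTask;
    private final List<RecordedError> stored = new ArrayList<>();

    // The four notifications listed above, reduced to plain names.
    void nodeStarted(String node) { currentNode = node; }
    void nodeCompleted(String node) { if (node.equals(currentNode)) currentNode = null; }
    void taskStarted(String task) { currentTask = task; }
    void taskCompleted(String task) { if (task.equals(currentTask)) currentTask = null; }

    /** Records the error together with the context tracked so far. */
    RecordedError handle(Throwable error) {
        RecordedError recorded = new RecordedError(error.getMessage(), currentNode, currentTask);
        stored.add(recorded);
        return recorded;
    }

    List<RecordedError> stored() { return stored; }

    public static void main(String[] args) {
        TrackingHandlerSketch handler = new TrackingHandlerSketch();
        handler.nodeStarted("Save order");
        // The commit failure itself knows nothing about the process...
        RecordedError recorded = handler.handle(new IllegalStateException("commit failed"));
        // ...but the tracked context says which node was active.
        System.out.println(recorded.message + " @ " + recorded.lastNode);
    }
}
```

The exception in the example says nothing about the process, yet the stored record still points at the node that was running when it happened.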
ExecutionErrorStorage is a pluggable strategy that allows various ways of persisting information about execution errors. The store is used directly by the handler, which gets an instance of the store upon creation (at the time the RuntimeEngine is created). The default store implementation is based on a database table. Every error is stored in that table with all the information available. Not all errors have all the details, as these depend on the type of the error and on whether the information can be extracted from it.
Error types and filters
Since error handling attempts to catch and handle any kind of error, it needs a way to categorize errors so that it can properly extract information from them. It also needs to be pluggable, as users might throw their own error types and want them handled differently than what is provided out of the box.
Error categorization and filtering is based on so-called ExecutionErrorFilters. This is a simple interface that is solely responsible for building the ExecutionError instance that is later stored via the ExecutionErrorStorage. It has the following methods:
- accept to indicate if given error can be handled by the filter
- filter where the actual filtering/handling etc happens
- getPriority indicates the priority which is used when calling filters
ExecutionErrorFilter implementations can be provided using the ServiceLoader mechanism, which is simple and proven, so extending the error handling capability is straightforward.
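To make the three methods more concrete, here is a minimal sketch of a filter for a custom exception type. Note that ErrorContext, ErrorRecord, ErrorFilter and PaymentGatewayException are simplified stand-ins invented for this example so that it runs on its own. They are not the real jBPM types; a real filter would implement ExecutionErrorFilter and be registered through ServiceLoader.

```java
// Stand-in types for illustration only; NOT the real jBPM interfaces.
final class CustomFilterSketch {

    /** Hypothetical context handed to a filter: the error and the last node seen. */
    static final class ErrorContext {
        final Throwable cause;
        final String lastNode;

        ErrorContext(Throwable cause, String lastNode) {
            this.cause = cause;
            this.lastNode = lastNode;
        }
    }

    /** Hypothetical record a filter builds for storage. */
    static final class ErrorRecord {
        final String type;
        final String message;
        final String node;

        ErrorRecord(String type, String message, String node) {
            this.type = type;
            this.message = message;
            this.node = node;
        }
    }

    /** Mirrors the three methods described above. */
    interface ErrorFilter {
        boolean accept(ErrorContext context);
        ErrorRecord filter(ErrorContext context);
        int getPriority();
    }

    /** Hypothetical application exception that deserves its own category. */
    static final class PaymentGatewayException extends RuntimeException {
        PaymentGatewayException(String message) { super(message); }
    }

    static final class PaymentErrorFilter implements ErrorFilter {
        @Override
        public boolean accept(ErrorContext context) {
            return findPaymentCause(context.cause) != null;
        }

        @Override
        public ErrorRecord filter(ErrorContext context) {
            PaymentGatewayException e = findPaymentCause(context.cause);
            return new ErrorRecord("Payment", e.getMessage(), context.lastNode);
        }

        @Override
        public int getPriority() {
            return 50; // assumption: a low value so this filter is consulted early
        }

        // Exceptions are often wrapped, so walk the cause chain.
        private static PaymentGatewayException findPaymentCause(Throwable t) {
            while (t != null) {
                if (t instanceof PaymentGatewayException) {
                    return (PaymentGatewayException) t;
                }
                t = t.getCause();
            }
            return null;
        }
    }
}
```

Walking the cause chain in accept is a design choice of this sketch: the interesting exception is frequently wrapped by the time it reaches the error handling layer.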
Out of the box ExecutionErrorFilters:
| Class name | Type | Priority |
|---|---|---|
| org.jbpm.runtime.manager.impl.error.filters.ProcessExecutionErrorFilter | Process | 100 |
| org.jbpm.runtime.manager.impl.error.filters.TaskExecutionErrorFilter | Task | 80 |
| org.jbpm.runtime.manager.impl.error.filters.DBExecutionErrorFilter | DB | 200 |
| org.jbpm.executor.impl.error.JobExecutionErrorFilter | Job | 100 |
The lower the priority value, the earlier the filter is invoked. The filters in the table above are therefore invoked in the following order:
- Task
- Process
- Job
- DB
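The ordering rule can be sketched as a plain sort on the priority value. FilterInfo and invocationOrder are hypothetical names used only here. Process and Job share priority 100; the sketch relies on a stable sort so that they keep their original relative order, which matches the list above, but how the engine itself breaks such ties is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration of "lower priority value runs first"; not jBPM code.
final class FilterOrderSketch {

    /** Hypothetical holder for a filter's type and priority. */
    static final class FilterInfo {
        final String type;
        final int priority;

        FilterInfo(String type, int priority) {
            this.type = type;
            this.priority = priority;
        }
    }

    /** Returns the filter types in the order they would be invoked. */
    static List<String> invocationOrder(List<FilterInfo> filters) {
        List<FilterInfo> sorted = new ArrayList<>(filters);
        // List.sort is stable, so equal priorities keep their input order.
        sorted.sort(Comparator.comparingInt((FilterInfo f) -> f.priority));
        List<String> order = new ArrayList<>();
        for (FilterInfo f : sorted) {
            order.add(f.type);
        }
        return order;
    }

    public static void main(String[] args) {
        List<FilterInfo> outOfTheBox = Arrays.asList(
                new FilterInfo("Process", 100),
                new FilterInfo("Task", 80),
                new FilterInfo("DB", 200),
                new FilterInfo("Job", 100));
        System.out.println(invocationOrder(outOfTheBox));
    }
}
```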
Error acknowledgment
By definition, every error that is caught and stored is unacknowledged, which means it still has to be handled by someone or something (in case of automatic error recovery). This is the basis for filtering existing errors by whether they have already been taken care of or not. Acknowledging an error records the user who did the acknowledgment and the time stamp, for traceability purposes.
Since the ExecutionErrorFilter is responsible for creating the ExecutionError instance, different implementations might decide to set the acknowledgement to true immediately when the error is handled, for example because a notification was sent to an issue tracking system or an email was sent to an administrator. Again, that is up to the concrete implementation of the filters or even the storage.
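As a rough picture of the acknowledgment state, the sketch below models an error record that starts unacknowledged and stores who acknowledged it and when. ErrorRecord, acknowledge and needingAttention are names invented for this example; the real ExecutionError type is not reproduced here.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Simplified model of acknowledgment state; not the jBPM ExecutionError class.
final class AcknowledgmentSketch {

    static final class ErrorRecord {
        final String id;
        boolean acknowledged;   // false when the error is first stored
        String acknowledgedBy;  // who took care of it
        Instant acknowledgedAt; // when it was taken care of

        ErrorRecord(String id) {
            this.id = id;
        }

        void acknowledge(String user, Instant when) {
            this.acknowledged = true;
            this.acknowledgedBy = user;
            this.acknowledgedAt = when;
        }
    }

    /** Filters the errors that still require attention. */
    static List<ErrorRecord> needingAttention(List<ErrorRecord> all) {
        List<ErrorRecord> open = new ArrayList<>();
        for (ErrorRecord error : all) {
            if (!error.acknowledged) {
                open.add(error);
            }
        }
        return open;
    }
}
```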
Auto acknowledgement of execution errors
By default, execution errors are created unacknowledged and thus require a manual action; otherwise they will always be seen as information that requires attention. With bigger volumes, manual actions can be time consuming and are not suitable in some situations. To help with that, auto acknowledgement of errors is provided. It is based on scheduled jobs (via the jBPM executor) and there are three types of jobs available:
- org.jbpm.executor.commands.error.JobAutoAckErrorCommand
  - Job responsible for finding jobs that previously failed but are now either cancelled, completed or rescheduled for another execution. This job only acknowledges execution errors of type "Job".
- org.jbpm.executor.commands.error.TaskAutoAckErrorCommand
  - Job responsible for auto acknowledgment of user task execution errors for tasks that previously failed but are now in one of the exit states (completed, failed, exited, obsolete). This job only acknowledges execution errors of type "Task".
- org.jbpm.executor.commands.error.ProcessAutoAckErrorCommand
  - Job responsible for auto acknowledgment of errors attached to process instances. It acknowledges errors when the process instance is already finished (completed or aborted) or when the task that the error originated from is already finished, based on the init_activity_id value. This job acknowledges errors of any type that match the above criteria.
The last parameter that these jobs support is EmfName, which provides a custom name of the entity manager factory to be used when searching for errors to acknowledge. All of these parameters are optional.
There is a base class that is extended by the individual jobs and can serve as the starting point for additional auto acknowledge implementations:
org.jbpm.executor.commands.error.AutoAckErrorCommand
Once extended, there are two methods to be implemented:
- protected abstract List<ExecutionErrorInfo> findErrorsToAck(EntityManager em);
- protected abstract String getAckRule();
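The shape of such an extension can be sketched with a small stand-in base class. To keep the example runnable on its own, an in-memory list replaces the EntityManager and a hypothetical ErrorInfo class replaces ExecutionErrorInfo, so the method signature differs from the real one shown above. How the real base class uses the value of getAckRule is not covered in this post; recording it in acknowledgedBy is purely an assumption of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the template-method shape; NOT the real AutoAckErrorCommand.
abstract class AutoAckSketch {

    /** Hypothetical, simplified error row. */
    static final class ErrorInfo {
        final String type;
        final boolean originFinished; // whether the failing task has since finished
        boolean acknowledged;
        String acknowledgedBy;

        ErrorInfo(String type, boolean originFinished) {
            this.type = type;
            this.originFinished = originFinished;
        }
    }

    // The two extension points; a list stands in for the EntityManager.
    protected abstract List<ErrorInfo> findErrorsToAck(List<ErrorInfo> store);
    protected abstract String getAckRule();

    /** Acknowledges whatever the subclass selects and returns how many were hit. */
    final int run(List<ErrorInfo> store) {
        List<ErrorInfo> toAck = findErrorsToAck(store);
        for (ErrorInfo error : toAck) {
            error.acknowledged = true;
            error.acknowledgedBy = "SYSTEM (" + getAckRule() + ")"; // assumption
        }
        return toAck.size();
    }

    /** Example subclass: acknowledge Task errors whose task has finished. */
    static final class FinishedTaskAutoAck extends AutoAckSketch {
        @Override
        protected List<ErrorInfo> findErrorsToAck(List<ErrorInfo> store) {
            List<ErrorInfo> result = new ArrayList<>();
            for (ErrorInfo error : store) {
                if (!error.acknowledged && "Task".equals(error.type) && error.originFinished) {
                    result.add(error);
                }
            }
            return result;
        }

        @Override
        protected String getAckRule() {
            return "task already in an exit state";
        }
    }
}
```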
Services and access to error information
Access to error information (for the out of the box storage) is through jBPM services. Two admin-facing services provide basic access to the error information and allow errors to be acknowledged:
- ProcessInstanceAdminService
  - finds execution errors of any type, mainly focusing on search capabilities around the process instance
- UserTaskAdminService
  - finds errors of type Task and focuses on searches around task details such as name or id
Similar access and capabilities are exposed over the KIE Server remote API and its client library.
Clean up mechanism
To keep the ExecutionErrorInfo table in good health, it needs to be cleaned up from time to time. Since errors can stay there for quite some time, depending on the life cycle of the processes, there is no direct API to clean it up. Instead, there is a jBPM executor command that can be scheduled for recurring execution to clean up errors periodically. The clean up command supports the following options:
- DateFormat
  - date format for the date-related parameters; if not given, yyyy-MM-dd is used (pattern of the SimpleDateFormat class)
- EmfName
  - name of the entity manager factory to be used for queries (valid persistence unit name)
- SingleRun
  - indicates if the execution should be a single run only (true|false)
- NextRun
  - provides the next execution time (valid time expression, e.g. 1d, 5h, etc.)
- OlderThan
  - indicates which errors should be deleted: those older than the given date
- OlderThanPeriod
  - indicates which errors should be deleted: those older than the given time expression (valid time expression, e.g. 1d, 5h, etc.)
- ForProcess
  - indicates that only errors for the given process definition should be deleted
- ForProcessInstance
  - indicates that only errors for the given process instance should be deleted
- ForDeployment
  - indicates that only errors from the given deployment id should be deleted
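To show how the date-related options fit together, here is a small sketch that turns OlderThan or OlderThanPeriod into a cutoff date. It is not the clean up command itself: CleanupCutoffSketch and its methods are made up for this example, the time expression parser only understands a single number followed by d, h, m or s, and giving OlderThan precedence over OlderThanPeriod is an assumption.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashMap;
import java.util.Map;

// Illustration of the date-related options; not the jBPM clean up command.
final class CleanupCutoffSketch {

    /** Parses expressions such as "1d" or "5h" into milliseconds (d, h, m, s only). */
    static long parsePeriodMillis(String expression) {
        String s = expression.trim();
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (s.charAt(s.length() - 1)) {
            case 'd': return value * 24L * 60 * 60 * 1000;
            case 'h': return value * 60L * 60 * 1000;
            case 'm': return value * 60L * 1000;
            case 's': return value * 1000L;
            default: throw new IllegalArgumentException("Unsupported time expression: " + expression);
        }
    }

    /** Returns the date before which errors would be deleted, or null if unrestricted. */
    static Date cutoff(Map<String, String> params, Date now) {
        String olderThan = params.get("OlderThan");
        if (olderThan != null) {
            // DateFormat falls back to yyyy-MM-dd when not given.
            String pattern = params.getOrDefault("DateFormat", "yyyy-MM-dd");
            try {
                return new SimpleDateFormat(pattern).parse(olderThan);
            } catch (ParseException e) {
                throw new IllegalArgumentException("OlderThan does not match " + pattern, e);
            }
        }
        String period = params.get("OlderThanPeriod");
        if (period != null) {
            return new Date(now.getTime() - parsePeriodMillis(period));
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("OlderThanPeriod", "5h");
        System.out.println("Delete errors older than " + cutoff(params, new Date()));
    }
}
```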
Time to see this in action
The screen cast below shows this error handling in action. It also shows the excellent UI support for it, for which I would like to give credit to the team that worked on it - Cristiano, Neus and Rafael.
In the screen cast you'll see a simple process that, based on a variable, either continues as expected or throws an exception. This exception is then handled as an execution error and is available to users/administrators to deal with. In addition, it illustrates the use of auto acknowledge jobs to acknowledge errors based on various conditions. Please be patient, as there are some waiting times in the screen cast while waiting for a job to execute :)
Enjoy and stay tuned for more!!!