
Modern projects for optimizing and automating business processes usually begin with the analysis of large volumes of the Customer's documents, so that as-is business process models can be built from them in a short time. The analyzed documents may include regulations, industry standards, interview protocols, technical specifications and other corporate documents.
The project analyst is therefore given a laborious and, at the same time, routine task for which there are currently no automation tools. An analysis of modern business process modeling tools shows that even well-known applications such as Enterprise Architect, Business Studio and Bizagi Modeler have no mechanisms for building business process models from a textual description.
This article addresses the problem of extracting a BPMN model from a document.
Note that the business process management ( BPM ) market already offers a process mining technology ( Process Mining ). However, unlike the technology described below, a Process Mining system takes as input a database with the execution results of the business process being modeled, not a set of documents with its textual description.
Formulation of the problem
The ideal task can be pictured as a “big red button”: pressing it automatically converts the entire set of documents under analysis into a network of BPMN models of the Customer’s business processes, ready for analysis, optimization and automation.
Solving the problem in this formulation is a matter of the future. We introduce a series of logical and technical constraints for a real pilot task.
Objective: to minimize the effort of building a business process model from its text description while ensuring the completeness and connectedness of the model.
The input is a document in Microsoft Word format in which:
- the text describes one internal business process ( Private Business Process ).
- a Participant takes part in the business process.
- the business process is described at a single level of detail (there are no sub-processes).
The output is an xml file in BPMN 2.0 format that:
- contains a business process model corresponding to the basic descriptive level ( BPMN Descriptive Conformance Sub-Class ).
- opens correctly for editing in Bizagi Modeler.
As a test example, we will use the text description of a widespread process, Incident Management, from the ITIL standard library ( Information Technology Infrastructure Library ). The test case is deliberately taken in English: English has no grammatical cases, which makes it easier to process references ( coreferences ) to the elements of a business process within the pilot task ( this will be discussed in more detail in the second part ).
The output should be an incident management model “no worse than” the flowchart provided in the ITIL library. By “no worse” we mean the completeness and connectedness of the business functions, data, decision conditions and participants of the business process.
Figure 1. A flowchart of the Incident Management process (ITIL v.3 Official Introduction, p.98)
Solution concept
According to the BPMN glossary ( Business Process Model and Notation, version 2.0 ), a business process ( Process ) is represented as "the graph of Flow-elements (a set of activities, events, gateways) and the Sequence Flow relationships that link them into an executable stream."
Definition. By a BPMN graph we mean a finite directed graph ( Graph Theory ) with the following extensions:
- The vertices of the graph correspond to the BPMN-elements of the process ( Flow, Data, Participant ).
- The edges of the graph correspond to the BPMN process connections ( Sequence Flow, Message Flow, Association ).
- Vertices and edges have obligatory attributes: identifier ( id ), name ( name ), comment ( documentation ).
- Required vertex types are elements of the Flow category ( Activity, Event, Gateway ).
- Mandatory edge types are control flow connections ( Sequence Flow ).
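The definition above can be sketched as a small data structure. This is only an illustration under stated assumptions, not the structure used in the project: the class names `Vertex`, `Edge` and `BpmnGraph`, the method names and the sample element names are all hypothetical. Each vertex and edge carries the mandatory attributes (id, name, documentation), and an edge may only connect vertices that already exist.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    id: str
    name: str
    kind: str                 # e.g. "Activity", "Event", "Gateway"
    documentation: str = ""


@dataclass
class Edge:
    id: str
    name: str
    source: str               # id of the source vertex
    target: str               # id of the target vertex
    kind: str = "SequenceFlow"
    documentation: str = ""


@dataclass
class BpmnGraph:
    vertices: dict = field(default_factory=dict)   # vertex id -> Vertex
    edges: list = field(default_factory=list)

    def add_vertex(self, vertex):
        self.vertices[vertex.id] = vertex

    def add_edge(self, edge):
        # An edge may only connect vertices that are already in the graph.
        if edge.source not in self.vertices or edge.target not in self.vertices:
            raise ValueError(f"edge {edge.id} refers to an unknown vertex")
        self.edges.append(edge)

    def successors(self, vertex_id):
        # Follow the directed edges that leave the given vertex.
        return [e.target for e in self.edges if e.source == vertex_id]


# Illustrative element names, not taken from the ITIL text.
graph = BpmnGraph()
graph.add_vertex(Vertex("e1", "Incident reported", "Event"))
graph.add_vertex(Vertex("a1", "Log incident", "Activity"))
graph.add_edge(Edge("f1", "", "e1", "a1"))
print(graph.successors("e1"))  # ['a1']
```

Storing vertices by identifier makes it cheap to reject edges that point at elements missing from the graph, which is one simple way to keep the model connected.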
Statement 1. A textual description of a business process in a document (in natural language) contains the BPMN graph in implicit form.
Statement 2. The task of extracting BPMN models from a document belongs to the class of information extraction tasks on weakly structured machine-readable documents ( Information extraction ), whose main subtasks are named entity recognition, relationship extraction and coreference resolution.
Combining algorithms from graph theory and information extraction, we obtain the following solution steps.
- Document markup with BPMN tags ( to identify process elements ).
- Compiling BPMN tags into a BPMN process model ( to identify process associations ).
- Verification of the BPMN model ( to resolve links ).
- Correction of the BPMN model ( in case of non-compliance of the model with the text description ).
- Export the BPMN model to an xml file ( for converting a BPMN graph to a standard format ).
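The last step in the list, exporting the model to an xml file, can be sketched with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch and not the project's exporter: the function `export_bpmn`, the `TAG_FOR_KIND` mapping, the `targetNamespace` value and the sample element names are assumptions made for illustration. The element and attribute names (`definitions`, `process`, `startEvent`, `task`, `exclusiveGateway`, `endEvent`, `sequenceFlow`, `sourceRef`, `targetRef`) and the namespace come from the BPMN 2.0 schema. Diagram layout information is omitted, and a modeling tool may need it to display the diagram.

```python
import xml.etree.ElementTree as ET

BPMN_NS = "http://www.omg.org/spec/BPMN/20100524/MODEL"

# Hypothetical mapping from vertex kinds to BPMN 2.0 element names.
TAG_FOR_KIND = {
    "StartEvent": "startEvent",
    "Task": "task",
    "ExclusiveGateway": "exclusiveGateway",
    "EndEvent": "endEvent",
}


def export_bpmn(process_id, vertices, flows):
    """Serialize vertices (id, name, kind) and flows (id, source, target) to XML."""
    ET.register_namespace("bpmn", BPMN_NS)
    definitions = ET.Element(
        f"{{{BPMN_NS}}}definitions",
        {"targetNamespace": "http://example.com/bpmn"},  # placeholder value
    )
    process = ET.SubElement(definitions, f"{{{BPMN_NS}}}process", {"id": process_id})
    for vertex_id, name, kind in vertices:
        tag = TAG_FOR_KIND[kind]
        ET.SubElement(process, f"{{{BPMN_NS}}}{tag}", {"id": vertex_id, "name": name})
    for flow_id, source, target in flows:
        ET.SubElement(
            process,
            f"{{{BPMN_NS}}}sequenceFlow",
            {"id": flow_id, "sourceRef": source, "targetRef": target},
        )
    return ET.tostring(definitions, encoding="unicode")


# Illustrative three-element process.
xml_text = export_bpmn(
    "incident_management",
    [
        ("start", "Incident reported", "StartEvent"),
        ("log", "Log incident", "Task"),
        ("end", "Incident closed", "EndEvent"),
    ],
    [("f1", "start", "log"), ("f2", "log", "end")],
)
print(xml_text)
```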
Figure 2. Process diagram of extracting a BPMN model from a document (BPMN Text Extraction)
Solution. Step 1: Markup of the document with BPMN tags
To mark the BPMN elements of the business process in the document, we will use BPMN tags.
Definition. A BPMN tag is a colored text marker with an identifier containing the type of the BPMN element. The name and color of a BPMN tag correspond to a specific category of BPMN element.
Below are the colors, categories and types of BPMN tags, as well as recommendations for marking up the document ( finding the exact rules for identifying BPMN elements is the task of the next stage of the project ).
Table 1. Description of BPMN tags
The general principle for working with BPMN tags: select a piece of text containing a BPMN element and press the button of the corresponding BPMN tag.
For example, to mark a business process, select "INCIDENT MANAGEMENT", then click the <Business Process> button. The background of the selected BPMN element takes the color of the chosen BPMN tag, and a bookmark with the BPMN tag identifier is added to the document's bookmarks.
Figure 3. The menu bar of the BPMN tab (a group of BPMN tags, Edit tags)
The following are the main operations on BPMN tags:
- Add ( BPMN tag ) - adds a new BPMN tag to the bookmarks of the document ( Word Bookmarks ) and marks the selected text with the corresponding color.
- Show / Hide ( Show Tags ) - enables / disables BPMN tags in the text of the document.
- Resize - changes the area of the BPMN tagged text.
- Delete - deletes the BPMN tag (bookmark and marker) from the document.
- Detailed Information ( Details ) - shows detailed information on the BPMN tag (identifier, category, type and text of the BPMN tag).
- Report - shows a statistical report on the number and types of BPMN tags in the active document.
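The operations above can be modeled independently of Word as a small in-memory store. This is a sketch under assumptions, not the add-in's code: `BpmnTag`, `TagStore`, the character-offset representation, the identifier scheme (type plus a running number) and the sample sentence are all hypothetical. Show / Hide is left out because it only affects display.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class BpmnTag:
    tag_id: str         # identifier containing the element type, e.g. "Participant_1"
    category: str
    element_type: str
    start: int          # character offsets of the tagged fragment
    end: int


class TagStore:
    def __init__(self, text):
        self.text = text
        self.tags = {}
        self._next = Counter()   # running number per element type

    def add(self, category, element_type, start, end):
        self._next[element_type] += 1
        tag_id = f"{element_type}_{self._next[element_type]}"
        self.tags[tag_id] = BpmnTag(tag_id, category, element_type, start, end)
        return tag_id

    def resize(self, tag_id, start, end):
        tag = self.tags[tag_id]
        tag.start, tag.end = start, end

    def delete(self, tag_id):
        del self.tags[tag_id]

    def details(self, tag_id):
        tag = self.tags[tag_id]
        return {
            "id": tag.tag_id,
            "category": tag.category,
            "type": tag.element_type,
            "text": self.text[tag.start:tag.end],
        }

    def report(self):
        # Number of tags per element type.
        return Counter(tag.element_type for tag in self.tags.values())


# Made-up sentence used only to exercise the store.
text = "INCIDENT MANAGEMENT. The Service Desk logs the incident."
store = TagStore(text)
start = text.find("Service Desk")
tag_id = store.add("Participant", "Participant", start, start + len("Service Desk"))
print(store.details(tag_id)["text"])  # Service Desk
print(store.report())
```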
Marking up the test document gives the following result.
Figure 4. BPMN markup of the text description of the Incident Management process
Note that the text contains “duplicate” BPMN tags with the same text and color (for example, Service Desk, Problem Management, Incident Record ); these are references to the same process element. The processing of such references ( coreferences ) will be considered at the 2nd step of the solution.
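As a rough illustration of what "the same text and color" means in practice, duplicate tags can be grouped by category and normalized text. This is only a sketch: the function `group_duplicate_tags`, the tag identifiers and the normalization (lower-casing and collapsing whitespace) are assumptions, and it is not necessarily the method used at the 2nd step.

```python
from collections import defaultdict


def group_duplicate_tags(tags):
    """Group (tag_id, category, text) tuples that refer to the same element."""
    groups = defaultdict(list)
    for tag_id, category, text in tags:
        # Same category (color) and same text, ignoring case and extra spaces.
        key = (category, " ".join(text.lower().split()))
        groups[key].append(tag_id)
    return dict(groups)


# Hypothetical tag identifiers; the texts are taken from the example above.
tags = [
    ("Participant_1", "Participant", "Service Desk"),
    ("Data_1", "Data", "Incident Record"),
    ("Participant_2", "Participant", "Service  desk"),
]
groups = group_duplicate_tags(tags)
print(groups[("Participant", "service desk")])  # ['Participant_1', 'Participant_2']
```

Each group then corresponds to a single vertex of the BPMN graph, however many times the element is mentioned in the text.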
To be continued…