
getState/recover size optimization

Open acarrasco opened this issue 3 years ago • 14 comments

Hi @paed01!

We are delighted with bpmn so far, but we have found a bump in the road with our use case… Our implementation lives in AWS Serverless, so after each await action we need to persist the state and recover it later on. The problem is that the workflow state is quite big (around 20 KB for a very simple workflow definition) and that takes time to transmit and store in the database.

We have been trying to clean the state up a little before serializing it, as it seems to contain a lot of data that is internal to the engine (queues, brokers, etc.) and looks transient. But when we tried to recover from the trimmed state it did not work, because the engine relies on that data being there rather than recreating it from scratch.

Can you think of any way to reduce the footprint of the workflow state that still works? We are more than happy to contribute with a PR if you are kind enough to provide us with some guidance :)

Cheers!

acarrasco avatar Jul 13 '21 07:07 acarrasco

Since the engine can handle multiple definitions the source (BPMN) of each definition is also included. This isn't necessary if you only have one definition per instance (recommended). So you can remove source and then feed the engine with the correct source before recover/resume. Check this feature test for inspiration.
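A minimal sketch of the trimming step, assuming the state object exposes a `definitions` array whose entries each carry a `source` property (that layout is an assumption here, so inspect your own `getState()` output first). The example state is a hypothetical stand-in, not real engine output.

```javascript
// Sketch only: the `definitions[].source` layout is an assumption about the
// state object; check the output of getState() in your engine version.

// Return a copy of the state with the BPMN source removed from every definition.
function stripSource(state) {
  return {
    ...state,
    definitions: (state.definitions || []).map(({ source, ...rest }) => rest),
  };
}

// Approximate serialized size in bytes, to compare before and after trimming.
function stateSize(state) {
  return Buffer.byteLength(JSON.stringify(state), 'utf8');
}

// Hypothetical stand-in for a real state object.
const exampleState = {
  name: 'example',
  definitions: [
    { id: 'Definitions_1', source: '<definitions>...</definitions>', execution: {} },
  ],
};

const slim = stripSource(exampleState);
console.log(stateSize(exampleState), '->', stateSize(slim));
```

The trimmed state only makes sense if the engine is given the same BPMN source again before recover/resume, as described above.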

At Onify we actually use a smqp/Queue to push states and then have a separate consumer that persists the state. Better yet would be a proper broker, e.g. SNS/SQS. CQRS usually does the trick but it requires the solution to be idempotent.
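As a rough illustration of the idempotency requirement, here is a sketch in which an in-memory array and a `Map` stand in for the broker and the database. The `instanceId` and `sequence` fields are hypothetical names introduced for the example; this is not how Onify or smqp implement it.

```javascript
// Sketch only: an array and a Map stand in for a real broker and database.
// `instanceId` and `sequence` are hypothetical fields added for illustration.

const store = new Map(); // instanceId -> { sequence, state }

// Idempotent write: applying the same message twice, or an older message
// after a newer one, leaves the store unchanged.
function persist(message) {
  const current = store.get(message.instanceId);
  if (current && current.sequence >= message.sequence) return false;
  store.set(message.instanceId, { sequence: message.sequence, state: message.state });
  return true;
}

// The producer only enqueues; it never waits for the database.
const queue = [];
function publishState(instanceId, sequence, state) {
  queue.push({ instanceId, sequence, state });
}

// The consumer drains the queue separately from workflow execution
// and returns how many messages actually changed the store.
function drain() {
  let written = 0;
  while (queue.length > 0) {
    if (persist(queue.shift())) written += 1;
  }
  return written;
}

publishState('order-1', 1, { step: 'action' });
publishState('order-1', 2, { step: 'waitAction' });
publishState('order-1', 2, { step: 'waitAction' }); // duplicate delivery
const written = drain();
console.log(written, store.get('order-1'));
```

Keying each write on a monotonically increasing sequence number is one simple way to make redelivered messages harmless; a real broker would supply its own delivery and ordering guarantees.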

paed01 avatar Jul 13 '21 10:07 paed01

Is there any way to save only the current state and the current environment variables, and then manually start the engine from this intermediate state? In our case, we don't need to store the historical messages, we have only one definition per instance, and we use the same services and tasks for all definitions (we only change the order of the tasks or the variables, using the same source of tasks).

roberto-naharro avatar Jul 14 '21 09:07 roberto-naharro

And by current state you mean only running activities?

paed01 avatar Jul 14 '21 10:07 paed01

Let me explain.

Imagine this case:

<?xml version="1.0" encoding="UTF-8"?>
<definitions xmlns="http://www.omg.org/spec/BPMN/20100524/MODEL" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <process id="theProcess" isExecutable="true">

        <startEvent id="theStart"/>
        <serviceTask id="action" implementation="${environment.services.action()}"/>
        <userTask id="waitAction"/>
        <endEvent id="theEnd" />

        <sequenceFlow id="toAction" sourceRef="theStart" targetRef="action" />
        <sequenceFlow id="toWait" sourceRef="action" targetRef="waitAction" />
        <sequenceFlow id="toEnd" sourceRef="waitAction" targetRef="theEnd" />

    </process>
</definitions>

We launch the engine from a Lambda function, and the action calls an external service. We have to wait for the external service's response, but we cannot leave the Lambda running, so we define a waitAction. The waitAction stops the engine and saves the state so it can be recovered later (when the service answers through an SQS call). When the SQS call is received, the state is retrieved from the DB and the engine is restored with this state to continue and finish.

The main problem is managing the state: it is huge, and the calls to save and retrieve it from the DB are slow and expensive in space.

In the end, we only need the reference to the sequenceFlow to continue the process, plus the variables stored up to this moment. We have the same services and listeners for all the different possible definitions in our system.

This is why we are wondering how to save a lighter state to continue the engine without having all of the information stored.

roberto-naharro avatar Jul 14 '21 12:07 roberto-naharro

You might consider skipping the engine altogether and just using bpmn-elements Definition. It has a slimmer state and you should be able to trim it to a minimum, though it requires some more work.

paed01 avatar Jul 14 '21 12:07 paed01

I have begun work on slimming the state. First off - the smqp broker.

paed01 avatar Sep 28 '21 07:09 paed01

As far as I can see it is reduced by about 50%.

paed01 avatar Sep 29 '21 19:09 paed01

@acarrasco, any luck with the state size?

paed01 avatar Nov 10 '21 15:11 paed01

> @acarrasco, any luck with the state size?

We've had this issue parked at the bottom of the backlog for a while, but we will definitely give a try to the latest release to see if it gives a nice boost to our marshaling/unmarshaling times 😄

Thanks a lot for the work! 🤗

acarrasco avatar Nov 10 '21 15:11 acarrasco

No worries, electronic backlogs tend to have a long tail.

paed01 avatar Nov 11 '21 10:11 paed01

Even slimmer! - 7a9a0c6

Removed some process- and state properties not needed for recover.

paed01 avatar Jun 21 '23 07:06 paed01

Still too large @acarrasco?

paed01 avatar May 04 '24 08:05 paed01

Ah! forgot to tell you about the new setting disableTrackState. The setting name may not be the best but the effect is that element counters are ignored when getting state.

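import { Engine } from 'bpmn-engine';

// `source` is the BPMN XML string of the definition
const engine = new Engine({
  name: 'state without counters',
  source,
  settings: {
    // element counters are ignored when getting state
    disableTrackState: true,
  },
});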

paed01 avatar May 04 '24 08:05 paed01

We ended up using a custom state machine that was very minimal so it could serialize and initialize fast 😅

acarrasco avatar May 05 '24 08:05 acarrasco

@acarrasco can I close this issue or do we keep it open as a reminder or for sentimental reasons?

paed01 avatar May 10 '24 07:05 paed01

Feel free to close it. Thanks for all the help while we were using your library! 😊


acarrasco avatar May 10 '24 08:05 acarrasco

De nada.

paed01 avatar May 10 '24 08:05 paed01