State migration scenarios

Marlo Ploemen
Hi community,
 
I am looking into the data migration and schema evolution process for stateful streaming jobs. Currently, there is no orchestration support for performing these job evolutions, and no in-job syntax for state migration or schema evolution (this is handled by the separate State Processor API). I am looking for examples (e.g. GitHub repositories) or scenarios of stateful streaming jobs where orchestrating the state evolution process could improve development quality.
 
Best, Marlo

Re: State migration scenarios

Yun Tang
Hi Marlo,

One scenario we're trying to improve is adding or removing a field in a state serializer.
Users might add or remove a field as part of their schema evolution. The State Processor API can handle this with a separate offline job, while state migration handles it when the new job is restarted from the existing state.
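To make the "add one field" case concrete, here is a minimal sketch in plain Java of migrating state when it is restored. It is not the State Processor API or any framework's serializer interface: the class `StateMigrationSketch`, the `CounterState` type, the version tags, and the `-1` default for the added field are all hypothetical names chosen for illustration. The idea it shows is that serialized state carries a schema version, and the new job's reader fills in a default when it meets the old layout.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical sketch: version-tagged state bytes that are migrated on restore. */
public class StateMigrationSketch {

    static final int V1 = 1; // old schema: key, count
    static final int V2 = 2; // new schema: key, count, lastSeenMillis (added field)

    /** State as the new job version sees it. */
    static final class CounterState {
        final String key;
        final long count;
        final long lastSeenMillis; // -1 means "unknown" (assumed default for old state)

        CounterState(String key, long count, long lastSeenMillis) {
            this.key = key;
            this.count = count;
            this.lastSeenMillis = lastSeenMillis;
        }
    }

    /** Writes the old layout, as the previous job version would have done. */
    static byte[] writeV1(String key, long count) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(V1);
            out.writeUTF(key);
            out.writeLong(count);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes the new layout, including the added field. */
    static byte[] writeV2(CounterState state) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(V2);
            out.writeUTF(state.key);
            out.writeLong(state.count);
            out.writeLong(state.lastSeenMillis);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads either layout; old state gets a default for the missing field. */
    static CounterState restore(byte[] snapshot) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(snapshot));
            int version = in.readInt();
            if (version == V1) {
                return new CounterState(in.readUTF(), in.readLong(), -1L);
            }
            if (version == V2) {
                return new CounterState(in.readUTF(), in.readLong(), in.readLong());
            }
            throw new IllegalStateException("Unknown state version: " + version);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // State written by the old job, restored by the new one.
        CounterState restored = restore(writeV1("user-1", 7L));
        // prints: user-1 7 -1
        System.out.println(restored.key + " " + restored.count + " " + restored.lastSeenMillis);
    }
}
```

Removing a field would work the same way in reverse: the reader consumes the old field and discards it. An offline job would apply the same `restore` then `writeV2` step to every entry of a snapshot before the new job starts, instead of doing it lazily at restart.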

Best
Yun Tang
________________________________
From: Marlo Ploemen <[hidden email]>
Sent: Wednesday, June 9, 2021 15:57
To: [hidden email] <[hidden email]>
Subject: State migration scenarios

Re: State migration scenarios

Marlo Ploemen
Hi Yun,

Thanks for your response. If I understand correctly, you want to improve how the underlying data structure of a particular state descriptor evolves? Are there other typical scenarios that you encounter when running a stateful dataflow graph, e.g. merging data structures, or replacing the underlying class with a different one (rather than modifying it)?
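To illustrate what I mean by "merging data structures", here is a rough sketch in plain Java, with no streaming framework involved. Everything in it is hypothetical: the class `OfflineStateMergeSketch`, the two per-key counters (`clicks` and `views`), and the combined `SessionState` type. It treats a snapshot as a simple map from key to value and shows an offline step that folds two separate keyed states into one new state type.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: merging two per-key states into one combined state offline. */
public class OfflineStateMergeSketch {

    /** New combined state type that replaces the two separate counters. */
    static final class SessionState {
        final long clicks;
        final long views;

        SessionState(long clicks, long views) {
            this.clicks = clicks;
            this.views = views;
        }
    }

    /** Builds the new state for every key present in either old state. */
    static Map<String, SessionState> merge(Map<String, Long> clicks, Map<String, Long> views) {
        Set<String> keys = new HashSet<>(clicks.keySet());
        keys.addAll(views.keySet());

        Map<String, SessionState> merged = new HashMap<>();
        for (String key : keys) {
            // A key missing from one of the old states gets 0 (assumed default).
            merged.put(key, new SessionState(
                    clicks.getOrDefault(key, 0L),
                    views.getOrDefault(key, 0L)));
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Long> clicks = new HashMap<>();
        clicks.put("user-1", 3L);
        Map<String, Long> views = new HashMap<>();
        views.put("user-1", 10L);
        views.put("user-2", 4L);

        Map<String, SessionState> merged = merge(clicks, views);
        // prints: 3 10
        System.out.println(merged.get("user-1").clicks + " " + merged.get("user-1").views);
    }
}
```

The part I would like orchestration support for is everything around this function: taking the snapshot, running the transformation, and starting the new job version from the result.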

Best,
Marlo

> On 9 Jun 2021, at 10:10, Yun Tang <[hidden email]> wrote: