Does the job name affect checkpointing?

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Does the job name affect checkpointing?

dan bress
If I deploy a job to Flink called "A"(set my
streamExecutionEnvironment.execute("A"), that checkpoints state.  Then I
cancel "A" and deploy the same job but call it "B", will it pick up A's
state?  Or is checkpointing key'd by the job name?

The reason I ask is I would like a way to reflect the version of the job
that is deployed in the flink UI somewhere, so that I know what is
running.  I put this in the jobname as a quick fix, but I'm concerned it
might affect checkpointing when I deploy a new version.  Is there a good
way to put the version of the job somewhere such that it gets reflected in
the UI?

Thanks!

Dan
Reply | Threaded
Open this post in threaded view
|

Re: Does the job name affect checkpointing?

Ufuk Celebi-2
Hey Dan!

On Thu, Sep 29, 2016 at 10:58 PM, dan bress <[hidden email]> wrote:
> If I deploy a job to Flink called "A"(set my
> streamExecutionEnvironment.execute("A"), that checkpoints state.  Then I
> cancel "A" and deploy the same job but call it "B", will it pick up A's
> state?  Or is checkpointing key'd by the job name?

No, this will not happen at the moment. By default, checkpoints are
scoped to the job and cleaned up on cancellation. You would have to
trigger a savepoint [1] and then resume from that savepoint when
submitting B.

For Flink 1.2, there has just been created an issue [2] to introduce a
cancel variant that automatically triggers a savepoint before
cancelling the job. That will be a little more convenient.

[1] https://ci.apache.org/projects/flink/flink-docs-master/setup/savepoints.html
[2] https://issues.apache.org/jira/browse/FLINK-4717

> The reason I ask is I would like a way to reflect the version of the job
> that is deployed in the flink UI somewhere, so that I know what is
> running.  I put this in the jobname as a quick fix, but I'm concerned it
> might affect checkpointing when I deploy a new version.  Is there a good
> way to put the version of the job somewhere such that it gets reflected in
> the UI?

The job name is the only thing that comes to my mind.

Best,

Ufuk
Reply | Threaded
Open this post in threaded view
|

Re: Does the job name affect checkpointing?

dan bress
Ufuk,
   Thanks for the answer.  In 1.1.X, If I deploy job "Job-V1", run for a
while, trigger a save point, cancel the job, submit job "Job-V2" and resume
save point.  Will Job-V2 understand the savepoint from Job-V1?

   What is the scope of a savepoint?  Who does it belong to?  A Job?  A
TaskManager?  A Job Manager?  Flink as a whole?

Thanks,
Dan

On Fri, Sep 30, 2016 at 1:20 AM Ufuk Celebi <[hidden email]> wrote:

> Hey Dan!
>
> On Thu, Sep 29, 2016 at 10:58 PM, dan bress <[hidden email]> wrote:
> > If I deploy a job to Flink called "A"(set my
> > streamExecutionEnvironment.execute("A"), that checkpoints state.  Then I
> > cancel "A" and deploy the same job but call it "B", will it pick up A's
> > state?  Or is checkpointing key'd by the job name?
>
> No, this will not happen at the moment. By default, checkpoints are
> scoped to the job and cleaned up on cancellation. You would have to
> trigger a savepoint [1] and then resume from that savepoint when
> submitting B.
>
> For Flink 1.2, there has just been created an issue [2] to introduce a
> cancel variant that automatically triggers a savepoint before
> cancelling the job. That will be a little more convenient.
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/setup/savepoints.html
> [2] https://issues.apache.org/jira/browse/FLINK-4717
>
> > The reason I ask is I would like a way to reflect the version of the job
> > that is deployed in the flink UI somewhere, so that I know what is
> > running.  I put this in the jobname as a quick fix, but I'm concerned it
> > might affect checkpointing when I deploy a new version.  Is there a good
> > way to put the version of the job somewhere such that it gets reflected
> in
> > the UI?
>
> The job name is the only thing that comes to my mind.
>
> Best,
>
> Ufuk
>
Reply | Threaded
Open this post in threaded view
|

Re: Does the job name affect checkpointing?

Ufuk Celebi-2
On Fri, Sep 30, 2016 at 7:02 PM, dan bress <[hidden email]> wrote:
> Thanks for the answer.  In 1.1.X, If I deploy job "Job-V1", run for a
> while, trigger a save point, cancel the job, submit job "Job-V2" and resume
> save point.  Will Job-V2 understand the savepoint from Job-V1?

Currently it depends on the type of changes between V2 and V1. If you
don't change the topology, but only the user function code, it should
pick it up. If you plan to change the topology, you should provide IDs
for each operator in order to allow Flink to map old state to the new
job.

>    What is the scope of a savepoint?  Who does it belong to?  A Job?  A
> TaskManager?  A Job Manager?  Flink as a whole?

Savepoints essentially externalize Flink's internal state and are
owned by the user. Currently they contain snapshot meta data (like
checkpoint ID) and pointers to the actual snapshot state, both Flink
internal and your user state. For regular checkpoints this state is
garbage collected automatically whereas savepoints have to be manually
cleaned up.

There is blog post about the general idea of versioning state here:
http://data-artisans.com/how-apache-flink-enables-new-streaming-applications/

If you have further questions, please feel free to ask.