Cron style for checkpoint

Cron style for checkpoint

shuwen zhou
Hi Community,
I would like to know whether there is an existing feature that supports
cron-style checkpointing.
Our data traffic is very heavy at HH:30 of every hour, and we don't want
checkpoints to fall in that time range. A cron expression such as
15,45 * * * * for scheduling checkpoints would be nice. If a checkpoint is
already in progress when the minute reaches 15 or 45, a config value would
decide whether to trigger a new checkpoint or skip it.
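
A minimal sketch of the requested behavior, written in plain Python rather than against any Flink API. The function names and the "trigger"/"skip" values of the overlap setting are hypothetical, and only the minute field of the cron expression is handled.

```python
from datetime import datetime, timedelta


def parse_minute_field(field):
    """Parse a cron minute field such as "15,45" into a sorted list of ints."""
    minutes = sorted({int(part) for part in field.split(",")})
    if any(m < 0 or m > 59 for m in minutes):
        raise ValueError(f"invalid minute field: {field!r}")
    return minutes


def next_trigger(now, minutes):
    """Return the first whole minute strictly after `now` that is in `minutes`."""
    base = now.replace(second=0, microsecond=0)
    for offset in range(1, 61):
        candidate = base + timedelta(minutes=offset)
        if candidate.minute in minutes:
            return candidate
    raise ValueError("minutes must contain at least one value in 0..59")


def should_trigger(in_progress, on_overlap):
    """Decide what to do at a scheduled minute.

    `on_overlap` stands for the hypothetical config value: "trigger" starts
    a new checkpoint even if one is running, "skip" passes on this slot.
    """
    if not in_progress:
        return True
    return on_overlap == "trigger"


# Usage: with the schedule 15,45 the slot after 10:20:47 is 10:45.
schedule = parse_minute_field("15,45")
upcoming = next_trigger(datetime(2019, 11, 21, 10, 20, 47), schedule)
```

Scanning at most 60 minutes ahead keeps the sketch short; a real scheduler would also need the hour, day, and month fields.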

--
Best Wishes,
Shuwen Zhou <http://www.linkedin.com/pub/shuwen-zhou/57/55b/599/>

Re: Cron style for checkpoint

Jiayi Liao
Hi Shuwen,

As far as I know, Flink only supports checkpoints at a fixed interval.

However, I think a more flexible mechanism for triggering checkpoints is worth working on, at least from my perspective, and it need not be limited to cron style. In our business scenario, data traffic usually reaches its daily peak after 20:00, and we want to increase the checkpoint interval during that period; otherwise checkpointing introduces more disk and network IO.
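
The time-dependent interval described here can be sketched as a lookup on the time of day. This is plain Python, not Flink configuration. The 1-minute and 10-minute intervals and the assumption that the peak lasts from 20:00 until midnight are illustrative only.

```python
from datetime import time


def in_window(t, start, end):
    """True if `t` falls in [start, end); the window may wrap past midnight."""
    if start <= end:
        return start <= t < end
    return t >= start or t < end


def checkpoint_interval_ms(t, base_ms=60_000, peak_ms=600_000,
                           peak_start=time(20, 0), peak_end=time(0, 0)):
    """Return a longer interval during the assumed peak window, else the base one."""
    return peak_ms if in_window(t, peak_start, peak_end) else base_ms


# Usage: before the peak the base interval applies, after 20:00 the longer one.
before_peak = checkpoint_interval_ms(time(19, 59))
during_peak = checkpoint_interval_ms(time(20, 30))
```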




Just want to share something about this :)

Best,
Jiayi Liao

In reply to shuwen zhou's message of 2019-11-21 10:20:47.

Re: Cron style for checkpoint

shuwen zhou
Hi Jiayi,
It would be great if Flink offered a user-defined interface that users
could implement to control checkpoint behavior, at least time-related
behavior.
I filed a wish on JIRA [1]; hopefully it describes the idea clearly enough.

[1] https://issues.apache.org/jira/browse/FLINK-14884


In reply to Jiayi Liao's message of Thu, 21 Nov 2019 at 11:40.

--
Best Wishes,
Shuwen Zhou <http://www.linkedin.com/pub/shuwen-zhou/57/55b/599/>

Re: Cron style for checkpoint

Congxian Qiu
Hi

Currently, Flink does not support such a feature. From what you describe,
would setting an appropriate checkpoint timeout solve your problem?

Best,
Congxian


In reply to shuwen zhou's message of Thu, 21 Nov 2019 at 12:06 PM.

Re: Cron style for checkpoint

Yun Tang
Hi Shuwen

Conceptually, checkpoints in Flink behave more like a system mechanism for achieving fault tolerance and are transparent to users. Savepoints, on the other hand, behave more like a user-controlled operation. Could savepoints triggered on a crontab schedule meet your needs?

Best
Yun Tang

In reply to Congxian Qiu's message of Thursday, November 21, 2019 at 2:27 PM.


Re: Cron style for checkpoint

shuwen zhou
Hi Yun and Congxian,
What I actually want is for checkpoints not to be triggered during a
certain time range. Checkpointing would still remain a system mechanism;
it would just avoid being triggered during that range.
Waiting for a checkpoint to time out still wastes CPU and disk IO, since
the checkpoint has already been triggered. I would like to avoid
triggering it in the first place.
I believe a cron style would not break checkpointing's role as a system
mechanism.
Savepoints, on the other hand, are not incremental. Triggering a savepoint
every 10 minutes would waste a lot of disk space, and another script would
be required to remove outdated savepoints. I believe savepoints are meant
for upgrade/restart scenarios.
A cron-style checkpoint schedule config would provide a lot of
flexibility. Thanks.


In reply to Yun Tang's message of Thu, 21 Nov 2019 at 16:28.

--
Best Wishes,
Shuwen Zhou <http://www.linkedin.com/pub/shuwen-zhou/57/55b/599/>

Re: Cron style for checkpoint

Congxian Qiu
Hi

Thanks for your explanation. What you want is to disable periodic
checkpoints during certain time ranges while they run as normal at other
times. Currently, Flink does not support this. Since you've created an
issue for it, we can track it there. For now, if you really need this, you
can change the logic in `CheckpointCoordinator#triggerCheckpoint`.
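
Such a change would amount to placing a time check in front of the trigger call. The sketch below shows that gate in plain Python and is not Flink code: `GatedTrigger`, its parameters, and the blackout window of minutes 25 to 35 around HH:30 are all hypothetical.

```python
from datetime import datetime


class GatedTrigger:
    """Call `trigger` only outside a per-hour blackout window [start, end) in minutes."""

    def __init__(self, trigger, blackout_start, blackout_end):
        if not 0 <= blackout_start < blackout_end <= 60:
            raise ValueError("expected 0 <= blackout_start < blackout_end <= 60")
        self.trigger = trigger
        self.blackout_start = blackout_start
        self.blackout_end = blackout_end
        self.skipped = 0  # number of suppressed trigger attempts

    def maybe_trigger(self, now):
        """Return True if the trigger ran, False if it was suppressed."""
        if self.blackout_start <= now.minute < self.blackout_end:
            self.skipped += 1
            return False
        self.trigger(now)
        return True


# Usage: a stand-in trigger that only records when it fired.
fired = []
gate = GatedTrigger(fired.append, blackout_start=25, blackout_end=35)
gate.maybe_trigger(datetime(2019, 11, 21, 10, 30))  # inside the window: suppressed
gate.maybe_trigger(datetime(2019, 11, 21, 10, 45))  # outside the window: runs
```

Because the gate suppresses the attempt before any work starts, no checkpoint resources are spent during the window, which is the difference from relying on a timeout.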

Best,
Congxian


In reply to shuwen zhou's message of Thu, 21 Nov 2019 at 4:57 PM.