Hi Community,
I would like to know whether there is an existing function that supports cron-style checkpoints. Our data traffic is heavy at HH:30 of every hour, and we don't want a checkpoint to fall within that time range. A cron-like expression such as 15,45 * * * * for scheduling checkpoints would be nice. If a checkpoint is already in progress when the minute is 15 or 45, a config value would decide whether to trigger a new checkpoint or skip it.

--
Best Wishes,
Shuwen Zhou <http://www.linkedin.com/pub/shuwen-zhou/57/55b/599/>
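To make the request concrete, here is a minimal sketch in plain Python of the scheduling rule described above. It is not Flink code and does not use any Flink API: `next_fire_time`, `should_trigger`, and the `on_overlap` setting are hypothetical names chosen for illustration. It only shows how a minute list such as 15,45 maps to the next trigger time, and how a config value could decide what happens when a checkpoint is still running.

```python
from datetime import datetime, timedelta


def next_fire_time(now, minutes=(15, 45)):
    """Return the first whole minute strictly after `now` whose
    minute-of-hour is listed in `minutes` (the cron minute field)."""
    candidate = now.replace(second=0, microsecond=0) + timedelta(minutes=1)
    # Stepping 60 times visits every minute-of-hour exactly once.
    for _ in range(60):
        if candidate.minute in minutes:
            return candidate
        candidate += timedelta(minutes=1)
    raise ValueError("minutes must contain at least one value in 0..59")


def should_trigger(in_progress, on_overlap="skip"):
    """Decide whether to start a checkpoint at a fire time.

    `on_overlap` is a hypothetical config value: "skip" passes when a
    checkpoint is already running, "trigger" starts a new one anyway.
    """
    if on_overlap not in ("skip", "trigger"):
        raise ValueError("on_overlap must be 'skip' or 'trigger'")
    return (not in_progress) or on_overlap == "trigger"
```

With the minute list 15,45, a call at 10:20:47 yields 10:45:00 as the next trigger time, which keeps every trigger away from HH:30.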
Hi Shuwen,
As far as I know, Flink only supports checkpoints with a fixed interval.

However, I think a more flexible mechanism for triggering checkpoints is worth working on, at least from my perspective, and it need not be limited to a cron style. In our business scenario, data traffic usually reaches its daily peak after 20:00, and during that period we want to increase the checkpoint interval; otherwise checkpointing introduces more disk and network IO.

Just wanted to share this :)

Best,
Jiayi Liao
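The time-dependent interval mentioned here could look roughly like the following plain-Python sketch. It is an illustration only, not an existing Flink option: the function name, the two interval values, and the assumption that the peak lasts from 20:00 until midnight are all hypothetical.

```python
from datetime import time


def checkpoint_interval_seconds(now_time, peak_start=time(20, 0),
                                normal=600, peak=1800):
    """Return the checkpoint interval to use at `now_time`.

    Assumption: the peak runs from `peak_start` until midnight, and a
    longer interval during the peak reduces disk and network IO.
    """
    return peak if now_time >= peak_start else normal
```

A scheduler would call this before arming the next timer, so the interval changes as soon as the clock crosses the peak boundary.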
Hi Jiayi,
It would be great if Flink had a user-defined interface that users could implement to control checkpoint behavior, at least for time-related behavior. I filed a wish on JIRA [1]; hopefully it describes the request clearly enough.

[1] https://issues.apache.org/jira/browse/FLINK-14884

--
Best Wishes,
Shuwen Zhou
Hi
Currently, Flink does not support such a feature. From what you describe, would setting an appropriate checkpoint timeout solve your problem?

Best,
Congxian
Hi Shuwen
Conceptually, checkpoints in Flink behave more like a system mechanism for achieving fault tolerance, and they are transparent to users. Savepoints, on the other hand, behave more like a user-controlled operation. Could savepoints triggered from crontab not satisfy your requirement?

Best,
Yun Tang
Hi Yun and Congxian,
What I actually want is for checkpoints not to be triggered during a certain time range. Checkpointing would remain a system mechanism; it would just not be triggered within that range.

Waiting for a checkpoint to time out still wastes CPU and disk IO, because the checkpoint has already been triggered. I would like it not to be triggered in the first place. I don't think a cron style would break checkpointing as a system mechanism.

Savepoints, on the other hand, are not incremental. Triggering a savepoint every 10 minutes would waste a lot of disk space, and another script would be required to remove outdated savepoints. I think savepoints are meant for upgrade/restart scenarios.

A cron-style checkpoint time config would provide a lot of flexibility. Thanks.

--
Best Wishes,
Shuwen Zhou
Hi
Thanks for your explanation. What you want is to disable periodic checkpoints during certain time ranges, while they run as normal at other times. Currently, Flink does not support this. Since you've already created an issue for it, we can track it there. For now, if you really need this, you can change the logic in `CheckpointCoordinator#triggerCheckpoint`.

Best,
Congxian
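As a rough illustration of the kind of check such a change would need, here is a minimal plain-Python sketch of a time-range gate. It does not reproduce Flink's `CheckpointCoordinator` code. The function names and the 25 to 35 minute window around HH:30 are assumptions made for this example only.

```python
from datetime import datetime, timedelta


def in_blackout(now, start_minute=25, end_minute=35):
    """True if the minute-of-hour of `now` lies in [start_minute, end_minute).

    The default window around HH:30 is an assumed value.
    """
    m = now.minute
    if start_minute <= end_minute:
        return start_minute <= m < end_minute
    # The window wraps around the top of the hour, e.g. 55..5.
    return m >= start_minute or m < end_minute


def allowed_triggers(fire_times, start_minute=25, end_minute=35):
    """Drop the periodic trigger times that fall inside the blackout window."""
    return [t for t in fire_times
            if not in_blackout(t, start_minute, end_minute)]


# Usage: a fixed 10-minute interval starting at 10:00.
base = datetime(2019, 11, 21, 10, 0)
planned = [base + timedelta(minutes=10 * i) for i in range(6)]
kept = allowed_triggers(planned)  # the 10:30 trigger is skipped
```

The gate only suppresses the trigger itself, so a skipped checkpoint costs no CPU or disk IO, which is the difference from letting a checkpoint start and then time out.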