According to the conclusion of this paper, https://dl.acm.org/citation.cfm?id=3132750 , Flink's recovery is slower than Spark Streaming's, which would be a problem in large-scale deployments where faults happen frequently. I'd like to know whether this is still a problem. Any advice is appreciated.
-- Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
Hi jiaxl,
The paper you mentioned was published in 2017, so I don't think it has much reference value now; both frameworks have kept evolving since then. At the end of May this year, Flink added local recovery as a major feature of the 1.5 release, which greatly improves recovery speed. Work on state recovery and fault tolerance in Flink has not stopped either. I think you can verify this yourself. Thanks, vino.
In reply to jiaxl's post of 2018-07-24 23:15 GMT+08:00.
As far as I have learned from people who understand this better than I do, barrier alignment might be the only way to get deterministic output.
Any state or outcome observed between barrier alignments needs to be treated with caution (like UDP packets from the network). Currently, alignment is used only for heavyweight checkpointing. Whether the algorithm will be improved and used in other ways, such as auto scaling or secondary task shadowing, is still TBD. Chen
In reply to vino yang's post of Jul 24, 2018, 18:57.
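For readers unfamiliar with the term, the barrier alignment Chen refers to can be illustrated with a minimal Python sketch. This is not Flink's implementation: the `AligningOperator` class, the `BARRIER` marker, the integer records, and the running-sum state are all hypothetical and chosen only for illustration. The idea shown is that once a barrier arrives on one input channel, later records from that channel are held back until the barrier has arrived on every channel, so the snapshot contains exactly the pre-barrier records.

```python
from collections import deque

BARRIER = object()  # hypothetical marker standing in for a checkpoint barrier


class AligningOperator:
    """Toy multi-input operator whose state is a running sum of its records."""

    def __init__(self, num_channels):
        self.num_channels = num_channels
        self.total = 0           # operator state
        self.blocked = set()     # channels whose barrier has already arrived
        self.buffered = deque()  # events held back while aligning
        self.snapshots = []      # state captured at each completed alignment

    def on_event(self, channel, event):
        if channel in self.blocked:
            # The channel is past its barrier: hold the event back so it
            # cannot leak into the snapshot that is still being aligned.
            self.buffered.append((channel, event))
        elif event is BARRIER:
            self.blocked.add(channel)
            if len(self.blocked) == self.num_channels:
                self._complete_checkpoint()
        else:
            self.total += event

    def _complete_checkpoint(self):
        # All barriers have arrived: snapshot, unblock, replay held events.
        self.snapshots.append(self.total)
        self.blocked.clear()
        pending, self.buffered = self.buffered, deque()
        for channel, event in pending:
            self.on_event(channel, event)


def demo():
    op = AligningOperator(num_channels=2)
    stream = [(0, 1), (1, 2), (0, BARRIER), (0, 10),
              (1, 3), (1, BARRIER), (1, 4)]
    for channel, event in stream:
        op.on_event(channel, event)
    return op
```

In `demo()`, the record `10` reaches channel 0 after that channel's barrier, so it is buffered and excluded from the snapshot, while `3` on channel 1 is still pre-barrier and included. The snapshot therefore does not depend on how the two channels happen to interleave, which is the determinism the post describes.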
In reply to this post by vino yang
Hi vino,
Thanks for your quick reply. Since 2017, the Flink developers have done a great job of improving performance, but I couldn't find any papers or blog posts responding to that paper, so I asked here. Before asking, I had been running some experiments with Flink 1.5.1. But as you know, it takes some time to tune the system to its best state before experiments can be done, so I was hoping some experienced developers might have related results to share. Thanks again, jiaxl
Hi jiaxl,
Thanks for verifying! Yes, Flink is growing very fast. There are not many benchmarks or blog posts on this topic yet; after all, the local recovery feature was only released in version 1.5. That was not long ago, and this part is still being improved and is not very mature. Thanks, vino.
In reply to jiaxl's post of 2018-07-25 19:16 GMT+08:00.