Hi everyone,

Thank you for your comments. The mail title was updated and streaming-related concepts were added.

We would like to start a discussion thread on "FLIP-63: Rework table partition support" (design doc: [1]), where we describe how to support partitions in Flink and how to integrate with Hive partitions.

This FLIP addresses:
- Introduce the whole story of partition support.
- Introduce and discuss the DDL for partition support.
- Introduce static and dynamic partition insert.
- Introduce partition pruning.
- Introduce the dynamic partition implementation.
- Introduce FileFormatSink to deal with streaming exactly-once and partition-related logic.

Details can be found in the design document. Looking forward to your feedback. Thank you.

[1] https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=sharing

Best,
Jingsong Lee |
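To make "static and dynamic partition insert" and "partition pruning" from the list above concrete, here is a minimal sketch in plain Python. It is not Flink or Hive code and does not reflect the design document; the class ToyPartitionedTable, the helper partition_path, and the dt/hr partition keys are all hypothetical names chosen for illustration. It assumes Hive-style key=value partition paths.

```python
def partition_path(spec):
    """Render a partition spec as a Hive-style relative path, e.g. dt=2019-09-06/hr=13."""
    return "/".join(f"{key}={value}" for key, value in spec.items())


class ToyPartitionedTable:
    """Hypothetical in-memory stand-in for a partitioned table (illustration only)."""

    def __init__(self, partition_keys):
        self.partition_keys = list(partition_keys)
        self.partitions = {}  # partition path -> (partition spec, stored rows)

    def insert(self, rows, static_spec=None):
        """Static keys come from static_spec; the remaining keys are read from each row (dynamic)."""
        static_spec = static_spec or {}
        for row in rows:
            spec = {
                key: static_spec[key] if key in static_spec else row[key]
                for key in self.partition_keys
            }
            # Partition values live in the path, so they are not stored with the row.
            data = {k: v for k, v in row.items() if k not in spec}
            _, stored = self.partitions.setdefault(partition_path(spec), (spec, []))
            stored.append(data)

    def scan(self, partition_filter=None):
        """Partition pruning: partitions whose spec fails the filter are never read."""
        read, rows = [], []
        for path in sorted(self.partitions):
            spec, stored = self.partitions[path]
            if partition_filter is None or partition_filter(spec):
                read.append(path)
                rows.extend(stored)
        return read, rows


table = ToyPartitionedTable(["dt", "hr"])
# Static insert: the statement fixes every partition value up front.
table.insert([{"user": "a"}, {"user": "b"}], static_spec={"dt": "2019-09-06", "hr": "13"})
# Dynamic insert: partition values are taken from the data itself.
table.insert([
    {"user": "c", "dt": "2019-09-07", "hr": "01"},
    {"user": "d", "dt": "2019-09-07", "hr": "02"},
])
# Pruned scan: only the dt=2019-09-07 partitions are touched.
read, rows = table.scan(lambda spec: spec["dt"] == "2019-09-07")
```

In this toy model, pruning only looks at partition specs, never at the stored rows, which is the reason a partition filter can avoid reading data at all.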
Hi Jingsong,
Thank you for bringing up this discussion. Since I don't have much experience with Flink Table/SQL, I'll ask some questions from the runtime or engine perspective.

> ... where we describe how to partition support in flink and how to integrate to hive partition.

FLIP-27 [1] officially introduces the "partition" concept. The changes in FLIP-27 are not only about the source interface but also about the whole infrastructure. Have you thought about how to integrate your proposal with these changes? Or do you just want to support "partition" in the table layer, with no requirements on the underlying infrastructure?

I have seen a discussion [2] that seems to be an infrastructure requirement for supporting your proposal. So I am concerned that there might be some conflicts between this proposal and FLIP-27.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html

Thanks,
Biao /'bɪ.aʊ/

In reply to: JingsongLee <[hidden email]>, Fri, 6 Sep 2019 at 13:22 |
Hi Biao, thanks for your feedback.

Actually, the runtime source partition is similar to a split: it concerns data reading, parallelism and fault tolerance, which are all runtime concepts. A table partition, by contrast, is only a virtual concept. Users mostly care about which partitions to read and which partitions to write, and they can manage their partitions. One relates to the physical implementation, the other to a logical concept. So I think they are two completely different things.

About [2]: the main problem is how to write data to a catalog file system in stream mode. It is a general problem and has little to do with partitions.

[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Support-notifyOnMaster-for-notifyCheckpointComplete-td32769.html

Best,
Jingsong Lee

In reply to: Biao Liu <[hidden email]>, Tuesday, 10 September 2019, 14:57 (Subject: Re: [DISCUSS] FLIP-63: Rework table partition support) |
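The distinction drawn in this reply, a table partition as a logical unit the user picks versus a split as a runtime unit of parallel reading, can be sketched as follows. This is a toy model in plain Python under stated assumptions: it is not the FLIP-27 API, and Split, plan_splits, assign_round_robin, the file names and the byte sizes are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Split:
    """Hypothetical runtime unit of work: a byte range of one file."""
    path: str
    offset: int
    length: int


def plan_splits(files, split_size):
    """Cut each file (path -> size in bytes) into ranges of at most split_size bytes."""
    splits = []
    for path, size in sorted(files.items()):
        offset = 0
        while offset < size:
            length = min(split_size, size - offset)
            splits.append(Split(path, offset, length))
            offset += length
    return splits


def assign_round_robin(splits, parallelism):
    """Distribute splits over parallel readers; this is a runtime concern only."""
    assignment = [[] for _ in range(parallelism)]
    for index, split in enumerate(splits):
        assignment[index % parallelism].append(split)
    return assignment


# Logical level: the catalog lists partitions, and the user picks one by name.
catalog = {
    "dt=2019-09-10": {"dt=2019-09-10/part-0": 250},
    "dt=2019-09-11": {"dt=2019-09-11/part-0": 100, "dt=2019-09-11/part-1": 120},
}
chosen_files = catalog["dt=2019-09-11"]

# Runtime level: the chosen files are cut into splits and spread over readers.
splits = plan_splits(chosen_files, split_size=64)
assignment = assign_round_robin(splits, parallelism=3)
```

Changing split_size or parallelism changes how the data is read but not which partition was selected, which is the sense in which the two concepts are independent in this sketch.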
Hi Jingsong,
Thanks for explaining. It looks cool!

Thanks,
Biao /'bɪ.aʊ/

In reply to: JingsongLee <[hidden email]>, Wed, 11 Sep 2019 at 11:37 |
+1 to this feature. I left some comments on the Google doc.

Another comment: I think we should reorganize the content when you convert this to a cwiki page. I will have an offline discussion with you about it.

Since this feature seems to be a fairly big effort, I suggest we settle the design doc ASAP and start the vote process.

Best,
Kurt

In reply to: Biao Liu <[hidden email]>, Thu, Sep 12, 2019 at 12:43 PM |
Thanks for your reply and the Google doc comments. The FLIP has been discussed for two weeks now, so I will start a vote thread.

Best,
Jingsong Lee

In reply to: Kurt Young <[hidden email]>, Monday, 16 September 2019, 15:55 |
Thanks for your discussion on the Google document. I have addressed the comments, added a FileSystem connector chapter, and introduced a code prototype for the file system connector to unify the Flink file system and Hive connectors.

Looking forward to your feedback. Thank you.

Best,
Jingsong Lee

In reply to: JingsongLee <[hidden email]>, Wednesday, 18 September 2019, 09:45 |
Hi Jingsong,
Thanks for driving this effort! Besides a few further comments on the Catalog APIs that I just left, it LGTM.

Not sure why, but the voting thread shows up in Gmail in the same thread as the discussion. After addressing all the comments, could you start a new, separate thread so that other people are aware of it?

Thanks,
Bowen

In reply to: JingsongLee <[hidden email]>, Mon, Sep 23, 2019 at 1:25 AM |
Thanks for your review. I will send another vote thread from my Apache email.

Best,
Jingsong Lee

In reply to: Bowen Li <[hidden email]>, Tuesday, 24 September 2019, 03:06 |
After an offline discussion with Jark, the grammar for creating partitioned tables is, for now, limited to the Hive dialect.
The Flink built-in grammar for creating partitioned tables is left for further discussion and will be decided by a vote after some time (it needs more thought).

Best,
Jingsong Lee

------------------------------------------------------------------
From:JingsongLee <[hidden email]>
Send Time:September 24, 2019 (Tuesday) 10:19
To:dev <[hidden email]>
Cc:dev <[hidden email]>
Subject:Re: [DISCUSS] FLIP-63: Rework table partition support

Thanks for your review. I will send another vote thread from my Apache email.

Best,
Jingsong Lee

------------------------------------------------------------------
From:Bowen Li <[hidden email]>
Send Time:September 24, 2019 (Tuesday) 03:06
To:JingsongLee <[hidden email]>
Cc:dev <[hidden email]>
Subject:Re: [DISCUSS] FLIP-63: Rework table partition support

Hi Jingsong,

Thanks for driving this effort! Besides a few further comments on Catalog APIs that I just left, it LGTM.

Not sure why, but the voting thread in Gmail shows up in the same thread as the discussion. After addressing all the comments, could you start a new, separate thread so that other people are aware of it?

Thanks,
Bowen

On Mon, Sep 23, 2019 at 1:25 AM JingsongLee <[hidden email]> wrote:
> Thanks for your discussion on the Google document.
> Comments are addressed. I added a FileSystem connector chapter and
> introduced a code prototype for the file system connector to unify the
> Flink file system and Hive connectors.
>
> Looking forward to your feedback. Thank you.
>
> Best,
> Jingsong Lee
>
> ------------------------------------------------------------------
> From:JingsongLee <[hidden email]>
> Send Time:September 18, 2019 (Wednesday) 09:45
> To:Kurt Young <[hidden email]>; dev <[hidden email]>
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> Thanks for your reply and Google doc comments. It has been discussed
> for two weeks now. I will start a vote thread.
>
> Best,
> Jingsong Lee
>
> ------------------------------------------------------------------
> From:Kurt Young <[hidden email]>
> Send Time:September 16, 2019 (Monday) 15:55
> To:dev <[hidden email]>
> Cc:JingsongLee <[hidden email]>
> Subject:Re: [DISCUSS] FLIP-63: Rework table partition support
>
> +1 to this feature, I left some comments on the Google doc.
>
> Another comment: I think we should reorganize the content when you
> convert this to a cwiki page. I will have some offline discussion
> with you.
>
> Since this feature seems to be a fairly big effort, I suggest we settle
> the design doc ASAP and start the vote process.
>
> Best,
> Kurt