Hi everyone,
Per discussion in the previous thread <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html>, I have created FLIP-72 to kick off a more detailed discussion on the Flink Pulsar connector: https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector In short, the connector has the following features: - Pulsar as a streaming source with exactly-once guarantee. - Sink streaming results to Pulsar with at-least-once semantics. - Build upon Flink new Table API Type system (FLIP-37 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System> ), and can automatically (de)serialize messages with the help of Pulsar schema. - Integrate with Flink new Catalog API (FLIP-30 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs>), which enables the use of Pulsar topics as tables in Table API as well as SQL client. https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u Would love to here your thoughts on this. Best, Yijie |
Hi Yijie,
Per the discussion, maybe you can move pulsar source to 'future work' section in the FLIP for now? Besides, the FLIP seems to be quite rough at the moment, and I'd recommend to add more details . A few questions mainly regarding the proposed pulsar catalog. - Can you provide some background of pulsar schema registry and how it works? - The proposed design of pulsar catalog is very vague now, can you share some details of how a pulsar catalog would work internally? E.g. - which APIs does it support exactly? E.g. I see from your prototype that table creation is supported but not alteration. - is it going to connect to a pulsar schema registry via a http client or a pulsar client, etc - will it be able to handle multiple versions of pulsar, or just one? How is compatibility handles between different Flink-Pulsar versions? - will it support only reading from pulsar schema registry , or both read/write? Will it work end-to-end in Flink SQL for users to create and manipulate a pulsar table such as "CREATE TABLE t WITH PROPERTIES(type=pulsar)" and "DROP TABLE t"? - Is a pulsar topic always gonna be a non-partitioned table? How is a partitioned topic mapped to a Flink table? - How to map Flink's catalog/database namespace to pulsar's multi-tenant namespaces? I'm not very familiar with how multi tenancy works in pulsar, and some background context/use cases may help here too. E.g. - can a pulsar client/consumer/producer be multiple-tenant at the same time? - how does authentication work in pulsar's multi-tenancy and the catalog? asking since I didn't see the proposed pulsar catalog has username/password configs - the FLIP seems propose mapping a pulsar cluster and 'tenant/namespace' respectively to Flink's 'catalog' and 'database'. I wonder whether it totally makes sense, or should we actually map "tenant" to "catalog", and "namespace" to "database"? Cheers, Bowen On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <[hidden email]> wrote: > Hi everyone, > > Per discussion in the previous thread > < > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > >, > I have created FLIP-72 to kick off a more detailed discussion on the Flink > Pulsar connector: > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > In short, the connector has the following features: > > - > > Pulsar as a streaming source with exactly-once guarantee. > - > > Sink streaming results to Pulsar with at-least-once semantics. > - > > Build upon Flink new Table API Type system (FLIP-37 > < > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > ), and can automatically (de)serialize messages with the help of Pulsar > schema. > - > > Integrate with Flink new Catalog API (FLIP-30 > < > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > >), > which enables the use of Pulsar topics as tables in Table API as well as > SQL client. > > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > > Would love to here your thoughts on this. > > Best, > Yijie > |
Hi Bowen,
Thanks for your comments. I'll add catalog details as you suggested. One more question: since we decide to not implement source part of the connector at the moment. What can users do with a Pulsar catalog? Create a table backed by Pulsar and check existing pulsar tables to see their schemas? Drop tables maybe? Best, Yijie On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> wrote: > Hi Yijie, > > Per the discussion, maybe you can move pulsar source to 'future work' > section in the FLIP for now? > > Besides, the FLIP seems to be quite rough at the moment, and I'd recommend > to add more details . > > A few questions mainly regarding the proposed pulsar catalog. > > - Can you provide some background of pulsar schema registry and how it > works? > - The proposed design of pulsar catalog is very vague now, can you > share some details of how a pulsar catalog would work internally? E.g. > - which APIs does it support exactly? E.g. I see from your > prototype that table creation is supported but not alteration. > - is it going to connect to a pulsar schema registry via a http > client or a pulsar client, etc > - will it be able to handle multiple versions of pulsar, or just > one? How is compatibility handles between different Flink-Pulsar versions? > - will it support only reading from pulsar schema registry , or > both read/write? Will it work end-to-end in Flink SQL for users to create > and manipulate a pulsar table such as "CREATE TABLE t WITH > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > - Is a pulsar topic always gonna be a non-partitioned table? How is > a partitioned topic mapped to a Flink table? > - How to map Flink's catalog/database namespace to pulsar's > multi-tenant namespaces? I'm not very familiar with how multi tenancy works > in pulsar, and some background context/use cases may help here too. E.g. > - can a pulsar client/consumer/producer be multiple-tenant at the > same time? > - how does authentication work in pulsar's multi-tenancy and the > catalog? asking since I didn't see the proposed pulsar catalog has > username/password configs > - the FLIP seems propose mapping a pulsar cluster and > 'tenant/namespace' respectively to Flink's 'catalog' and 'database'. I > wonder whether it totally makes sense, or should we actually map "tenant" > to "catalog", and "namespace" to "database"? > > Cheers, > Bowen > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <[hidden email]> > wrote: > >> Hi everyone, >> >> Per discussion in the previous thread >> < >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html >> >, >> I have created FLIP-72 to kick off a more detailed discussion on the Flink >> Pulsar connector: >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector >> >> In short, the connector has the following features: >> >> - >> >> Pulsar as a streaming source with exactly-once guarantee. >> - >> >> Sink streaming results to Pulsar with at-least-once semantics. >> - >> >> Build upon Flink new Table API Type system (FLIP-37 >> < >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System >> > >> ), and can automatically (de)serialize messages with the help of Pulsar >> schema. >> - >> >> Integrate with Flink new Catalog API (FLIP-30 >> < >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs >> >), >> which enables the use of Pulsar topics as tables in Table API as well >> as >> SQL client. >> >> >> >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u >> >> >> Would love to here your thoughts on this. >> >> Best, >> Yijie >> > |
Hi Yijie,
Thanks for the design document. I agree with Bowen that the catalog part needs more details. And I would suggest to separate Pulsar Catalog as another FLIP. IMO, it has little to do with source/sink. Having a separate FLIP can unblock the contribution for sink (or source) and keep the discussion more focus. I also left some comments in the documentation. Thanks, Jark On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email]> wrote: > Hi Bowen, > > Thanks for your comments. I'll add catalog details as you suggested. > > One more question: since we decide to not implement source part of the > connector at the moment. > What can users do with a Pulsar catalog? > Create a table backed by Pulsar and check existing pulsar tables to see > their schemas? Drop tables maybe? > > Best, > Yijie > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> wrote: > > > Hi Yijie, > > > > Per the discussion, maybe you can move pulsar source to 'future work' > > section in the FLIP for now? > > > > Besides, the FLIP seems to be quite rough at the moment, and I'd > recommend > > to add more details . > > > > A few questions mainly regarding the proposed pulsar catalog. > > > > - Can you provide some background of pulsar schema registry and how it > > works? > > - The proposed design of pulsar catalog is very vague now, can you > > share some details of how a pulsar catalog would work internally? E.g. > > - which APIs does it support exactly? E.g. I see from your > > prototype that table creation is supported but not alteration. > > - is it going to connect to a pulsar schema registry via a http > > client or a pulsar client, etc > > - will it be able to handle multiple versions of pulsar, or just > > one? How is compatibility handles between different Flink-Pulsar > versions? > > - will it support only reading from pulsar schema registry , or > > both read/write? Will it work end-to-end in Flink SQL for users to > create > > and manipulate a pulsar table such as "CREATE TABLE t WITH > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > - Is a pulsar topic always gonna be a non-partitioned table? How is > > a partitioned topic mapped to a Flink table? > > - How to map Flink's catalog/database namespace to pulsar's > > multi-tenant namespaces? I'm not very familiar with how multi tenancy > works > > in pulsar, and some background context/use cases may help here too. > E.g. > > - can a pulsar client/consumer/producer be multiple-tenant at the > > same time? > > - how does authentication work in pulsar's multi-tenancy and the > > catalog? asking since I didn't see the proposed pulsar catalog has > > username/password configs > > - the FLIP seems propose mapping a pulsar cluster and > > 'tenant/namespace' respectively to Flink's 'catalog' and > 'database'. I > > wonder whether it totally makes sense, or should we actually map > "tenant" > > to "catalog", and "namespace" to "database"? > > > > Cheers, > > Bowen > > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <[hidden email]> > > wrote: > > > >> Hi everyone, > >> > >> Per discussion in the previous thread > >> < > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > >> >, > >> I have created FLIP-72 to kick off a more detailed discussion on the > Flink > >> Pulsar connector: > >> > >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > >> > >> In short, the connector has the following features: > >> > >> - > >> > >> Pulsar as a streaming source with exactly-once guarantee. > >> - > >> > >> Sink streaming results to Pulsar with at-least-once semantics. > >> - > >> > >> Build upon Flink new Table API Type system (FLIP-37 > >> < > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > >> > > >> ), and can automatically (de)serialize messages with the help of > Pulsar > >> schema. > >> - > >> > >> Integrate with Flink new Catalog API (FLIP-30 > >> < > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > >> >), > >> which enables the use of Pulsar topics as tables in Table API as well > >> as > >> SQL client. > >> > >> > >> > >> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > >> > >> > >> Would love to here your thoughts on this. > >> > >> Best, > >> Yijie > >> > > > |
Hi Yijie,
I also agree with Jark on separating the Catalog part into another FLIP. With FLIP-27[1] also in the air, it is also probably great to split and unblock the sink implementation contribution. I would suggest either putting in a detail implementation plan section in the doc, or (maybe too much separation?) splitting them into different FLIPs. What do you guys think? -- Rong [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: > Hi Yijie, > > Thanks for the design document. I agree with Bowen that the catalog part > needs more details. > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, it has > little to do with source/sink. > Having a separate FLIP can unblock the contribution for sink (or source) > and keep the discussion more focus. > I also left some comments in the documentation. > > Thanks, > Jark > > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email]> > wrote: > > > Hi Bowen, > > > > Thanks for your comments. I'll add catalog details as you suggested. > > > > One more question: since we decide to not implement source part of the > > connector at the moment. > > What can users do with a Pulsar catalog? > > Create a table backed by Pulsar and check existing pulsar tables to see > > their schemas? Drop tables maybe? > > > > Best, > > Yijie > > > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> wrote: > > > > > Hi Yijie, > > > > > > Per the discussion, maybe you can move pulsar source to 'future work' > > > section in the FLIP for now? > > > > > > Besides, the FLIP seems to be quite rough at the moment, and I'd > > recommend > > > to add more details . > > > > > > A few questions mainly regarding the proposed pulsar catalog. > > > > > > - Can you provide some background of pulsar schema registry and how > it > > > works? > > > - The proposed design of pulsar catalog is very vague now, can you > > > share some details of how a pulsar catalog would work internally? > E.g. > > > - which APIs does it support exactly? E.g. I see from your > > > prototype that table creation is supported but not alteration. > > > - is it going to connect to a pulsar schema registry via a http > > > client or a pulsar client, etc > > > - will it be able to handle multiple versions of pulsar, or just > > > one? How is compatibility handles between different Flink-Pulsar > > versions? > > > - will it support only reading from pulsar schema registry , or > > > both read/write? Will it work end-to-end in Flink SQL for users > to > > create > > > and manipulate a pulsar table such as "CREATE TABLE t WITH > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > > - Is a pulsar topic always gonna be a non-partitioned table? How > is > > > a partitioned topic mapped to a Flink table? > > > - How to map Flink's catalog/database namespace to pulsar's > > > multi-tenant namespaces? I'm not very familiar with how multi > tenancy > > works > > > in pulsar, and some background context/use cases may help here too. > > E.g. > > > - can a pulsar client/consumer/producer be multiple-tenant at the > > > same time? > > > - how does authentication work in pulsar's multi-tenancy and the > > > catalog? asking since I didn't see the proposed pulsar catalog > has > > > username/password configs > > > - the FLIP seems propose mapping a pulsar cluster and > > > 'tenant/namespace' respectively to Flink's 'catalog' and > > 'database'. I > > > wonder whether it totally makes sense, or should we actually map > > "tenant" > > > to "catalog", and "namespace" to "database"? > > > > > > Cheers, > > > Bowen > > > > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <[hidden email]> > > > wrote: > > > > > >> Hi everyone, > > >> > > >> Per discussion in the previous thread > > >> < > > >> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > >> >, > > >> I have created FLIP-72 to kick off a more detailed discussion on the > > Flink > > >> Pulsar connector: > > >> > > >> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > >> > > >> In short, the connector has the following features: > > >> > > >> - > > >> > > >> Pulsar as a streaming source with exactly-once guarantee. > > >> - > > >> > > >> Sink streaming results to Pulsar with at-least-once semantics. > > >> - > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > > >> < > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > >> > > > >> ), and can automatically (de)serialize messages with the help of > > Pulsar > > >> schema. > > >> - > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > > >> < > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > >> >), > > >> which enables the use of Pulsar topics as tables in Table API as > well > > >> as > > >> SQL client. > > >> > > >> > > >> > > >> > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > >> > > >> > > >> Would love to here your thoughts on this. > > >> > > >> Best, > > >> Yijie > > >> > > > > > > |
Hi everyone,
Glad to receive your valuable feedbacks. I'd first separate the Pulsar catalog as another doc and show more design and implementation details there. For the current FLIP-72, I would separate it into the sink part for current work and keep the source part as future works until we reach FLIP-27 finals. I also reply to some of the comments in the design doc. I will rewrite the catalog part in regarding to Bowen's advice in both email and comments. Thanks for the help again. Best, Yijie On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> wrote: > Hi Yijie, > > I also agree with Jark on separating the Catalog part into another FLIP. > > With FLIP-27[1] also in the air, it is also probably great to split and > unblock the sink implementation contribution. > I would suggest either putting in a detail implementation plan section in > the doc, or (maybe too much separation?) splitting them into different > FLIPs. What do you guys think? > > -- > Rong > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > > On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: > > > Hi Yijie, > > > > Thanks for the design document. I agree with Bowen that the catalog part > > needs more details. > > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, it > has > > little to do with source/sink. > > Having a separate FLIP can unblock the contribution for sink (or source) > > and keep the discussion more focus. > > I also left some comments in the documentation. > > > > Thanks, > > Jark > > > > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email]> > > wrote: > > > > > Hi Bowen, > > > > > > Thanks for your comments. I'll add catalog details as you suggested. > > > > > > One more question: since we decide to not implement source part of the > > > connector at the moment. > > > What can users do with a Pulsar catalog? > > > Create a table backed by Pulsar and check existing pulsar tables to see > > > their schemas? Drop tables maybe? > > > > > > Best, > > > Yijie > > > > > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> wrote: > > > > > > > Hi Yijie, > > > > > > > > Per the discussion, maybe you can move pulsar source to 'future work' > > > > section in the FLIP for now? > > > > > > > > Besides, the FLIP seems to be quite rough at the moment, and I'd > > > recommend > > > > to add more details . > > > > > > > > A few questions mainly regarding the proposed pulsar catalog. > > > > > > > > - Can you provide some background of pulsar schema registry and > how > > it > > > > works? > > > > - The proposed design of pulsar catalog is very vague now, can you > > > > share some details of how a pulsar catalog would work internally? > > E.g. > > > > - which APIs does it support exactly? E.g. I see from your > > > > prototype that table creation is supported but not alteration. > > > > - is it going to connect to a pulsar schema registry via a http > > > > client or a pulsar client, etc > > > > - will it be able to handle multiple versions of pulsar, or > just > > > > one? How is compatibility handles between different > Flink-Pulsar > > > versions? > > > > - will it support only reading from pulsar schema registry , or > > > > both read/write? Will it work end-to-end in Flink SQL for users > > to > > > create > > > > and manipulate a pulsar table such as "CREATE TABLE t WITH > > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > > > - Is a pulsar topic always gonna be a non-partitioned table? > How > > is > > > > a partitioned topic mapped to a Flink table? > > > > - How to map Flink's catalog/database namespace to pulsar's > > > > multi-tenant namespaces? I'm not very familiar with how multi > > tenancy > > > works > > > > in pulsar, and some background context/use cases may help here > too. > > > E.g. > > > > - can a pulsar client/consumer/producer be multiple-tenant at > the > > > > same time? > > > > - how does authentication work in pulsar's multi-tenancy and > the > > > > catalog? asking since I didn't see the proposed pulsar catalog > > has > > > > username/password configs > > > > - the FLIP seems propose mapping a pulsar cluster and > > > > 'tenant/namespace' respectively to Flink's 'catalog' and > > > 'database'. I > > > > wonder whether it totally makes sense, or should we actually > map > > > "tenant" > > > > to "catalog", and "namespace" to "database"? > > > > > > > > Cheers, > > > > Bowen > > > > > > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > [hidden email]> > > > > wrote: > > > > > > > >> Hi everyone, > > > >> > > > >> Per discussion in the previous thread > > > >> < > > > >> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > > >> >, > > > >> I have created FLIP-72 to kick off a more detailed discussion on the > > > Flink > > > >> Pulsar connector: > > > >> > > > >> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > > >> > > > >> In short, the connector has the following features: > > > >> > > > >> - > > > >> > > > >> Pulsar as a streaming source with exactly-once guarantee. > > > >> - > > > >> > > > >> Sink streaming results to Pulsar with at-least-once semantics. > > > >> - > > > >> > > > >> Build upon Flink new Table API Type system (FLIP-37 > > > >> < > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > >> > > > > >> ), and can automatically (de)serialize messages with the help of > > > Pulsar > > > >> schema. > > > >> - > > > >> > > > >> Integrate with Flink new Catalog API (FLIP-30 > > > >> < > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > > >> >), > > > >> which enables the use of Pulsar topics as tables in Table API as > > well > > > >> as > > > >> SQL client. > > > >> > > > >> > > > >> > > > >> > > > > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > > >> > > > >> > > > >> Would love to here your thoughts on this. > > > >> > > > >> Best, > > > >> Yijie > > > >> > > > > > > > > > > |
Hi everyone,
I've put the catalog part design in separate doc with more details for easier communication. https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing I would love to hear your thoughts on this. Best, Yijie On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <[hidden email]> wrote: > Hi everyone, > > Glad to receive your valuable feedbacks. > > I'd first separate the Pulsar catalog as another doc and show more design > and implementation details there. > > For the current FLIP-72, I would separate it into the sink part for > current work and keep the source part as future works until we reach > FLIP-27 finals. > > I also reply to some of the comments in the design doc. I will rewrite the > catalog part in regarding to Bowen's advice in both email and comments. > > Thanks for the help again. > > Best, > Yijie > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> wrote: > >> Hi Yijie, >> >> I also agree with Jark on separating the Catalog part into another FLIP. >> >> With FLIP-27[1] also in the air, it is also probably great to split and >> unblock the sink implementation contribution. >> I would suggest either putting in a detail implementation plan section in >> the doc, or (maybe too much separation?) splitting them into different >> FLIPs. What do you guys think? >> >> -- >> Rong >> >> [1] >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface >> >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: >> >> > Hi Yijie, >> > >> > Thanks for the design document. I agree with Bowen that the catalog part >> > needs more details. >> > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, it >> has >> > little to do with source/sink. >> > Having a separate FLIP can unblock the contribution for sink (or source) >> > and keep the discussion more focus. >> > I also left some comments in the documentation. >> > >> > Thanks, >> > Jark >> > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email]> >> > wrote: >> > >> > > Hi Bowen, >> > > >> > > Thanks for your comments. I'll add catalog details as you suggested. >> > > >> > > One more question: since we decide to not implement source part of the >> > > connector at the moment. >> > > What can users do with a Pulsar catalog? >> > > Create a table backed by Pulsar and check existing pulsar tables to >> see >> > > their schemas? Drop tables maybe? >> > > >> > > Best, >> > > Yijie >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> wrote: >> > > >> > > > Hi Yijie, >> > > > >> > > > Per the discussion, maybe you can move pulsar source to 'future >> work' >> > > > section in the FLIP for now? >> > > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd >> > > recommend >> > > > to add more details . >> > > > >> > > > A few questions mainly regarding the proposed pulsar catalog. >> > > > >> > > > - Can you provide some background of pulsar schema registry and >> how >> > it >> > > > works? >> > > > - The proposed design of pulsar catalog is very vague now, can >> you >> > > > share some details of how a pulsar catalog would work internally? >> > E.g. >> > > > - which APIs does it support exactly? E.g. I see from your >> > > > prototype that table creation is supported but not alteration. >> > > > - is it going to connect to a pulsar schema registry via a >> http >> > > > client or a pulsar client, etc >> > > > - will it be able to handle multiple versions of pulsar, or >> just >> > > > one? How is compatibility handles between different >> Flink-Pulsar >> > > versions? >> > > > - will it support only reading from pulsar schema registry , >> or >> > > > both read/write? Will it work end-to-end in Flink SQL for >> users >> > to >> > > create >> > > > and manipulate a pulsar table such as "CREATE TABLE t WITH >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? >> > > > - Is a pulsar topic always gonna be a non-partitioned table? >> How >> > is >> > > > a partitioned topic mapped to a Flink table? >> > > > - How to map Flink's catalog/database namespace to pulsar's >> > > > multi-tenant namespaces? I'm not very familiar with how multi >> > tenancy >> > > works >> > > > in pulsar, and some background context/use cases may help here >> too. >> > > E.g. >> > > > - can a pulsar client/consumer/producer be multiple-tenant at >> the >> > > > same time? >> > > > - how does authentication work in pulsar's multi-tenancy and >> the >> > > > catalog? asking since I didn't see the proposed pulsar catalog >> > has >> > > > username/password configs >> > > > - the FLIP seems propose mapping a pulsar cluster and >> > > > 'tenant/namespace' respectively to Flink's 'catalog' and >> > > 'database'. I >> > > > wonder whether it totally makes sense, or should we actually >> map >> > > "tenant" >> > > > to "catalog", and "namespace" to "database"? >> > > > >> > > > Cheers, >> > > > Bowen >> > > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < >> [hidden email]> >> > > > wrote: >> > > > >> > > >> Hi everyone, >> > > >> >> > > >> Per discussion in the previous thread >> > > >> < >> > > >> >> > > >> > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html >> > > >> >, >> > > >> I have created FLIP-72 to kick off a more detailed discussion on >> the >> > > Flink >> > > >> Pulsar connector: >> > > >> >> > > >> >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector >> > > >> >> > > >> In short, the connector has the following features: >> > > >> >> > > >> - >> > > >> >> > > >> Pulsar as a streaming source with exactly-once guarantee. >> > > >> - >> > > >> >> > > >> Sink streaming results to Pulsar with at-least-once semantics. >> > > >> - >> > > >> >> > > >> Build upon Flink new Table API Type system (FLIP-37 >> > > >> < >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System >> > > >> > >> > > >> ), and can automatically (de)serialize messages with the help of >> > > Pulsar >> > > >> schema. >> > > >> - >> > > >> >> > > >> Integrate with Flink new Catalog API (FLIP-30 >> > > >> < >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs >> > > >> >), >> > > >> which enables the use of Pulsar topics as tables in Table API as >> > well >> > > >> as >> > > >> SQL client. >> > > >> >> > > >> >> > > >> >> > > >> >> > > >> > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u >> > > >> >> > > >> >> > > >> Would love to here your thoughts on this. >> > > >> >> > > >> Best, >> > > >> Yijie >> > > >> >> > > > >> > > >> > >> > |
Hi all,
Sorry for the long belated update. I have updated FLIP-27 wiki page with the latest proposals. Some noticeable changes include: 1. A new generic communication mechanism between SplitEnumerator and SourceReader. 2. Some detail API method signature changes. We left a few things out of this FLIP and will address them in separate FLIPs. Including: 1. Per split event time. 2. Event time alignment. 3. Fine grained failover for SplitEnumerator failure. Please let us know if you have any question. Thanks, Jiangjie (Becket) Qin On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <[hidden email]> wrote: > Hi everyone, > > I've put the catalog part design in separate doc with more details for > easier communication. > > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > > I would love to hear your thoughts on this. > > Best, > Yijie > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <[hidden email]> > wrote: > > > Hi everyone, > > > > Glad to receive your valuable feedbacks. > > > > I'd first separate the Pulsar catalog as another doc and show more design > > and implementation details there. > > > > For the current FLIP-72, I would separate it into the sink part for > > current work and keep the source part as future works until we reach > > FLIP-27 finals. > > > > I also reply to some of the comments in the design doc. I will rewrite > the > > catalog part in regarding to Bowen's advice in both email and comments. > > > > Thanks for the help again. > > > > Best, > > Yijie > > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> wrote: > > > >> Hi Yijie, > >> > >> I also agree with Jark on separating the Catalog part into another FLIP. > >> > >> With FLIP-27[1] also in the air, it is also probably great to split and > >> unblock the sink implementation contribution. > >> I would suggest either putting in a detail implementation plan section > in > >> the doc, or (maybe too much separation?) splitting them into different > >> FLIPs. What do you guys think? > >> > >> -- > >> Rong > >> > >> [1] > >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > >> > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: > >> > >> > Hi Yijie, > >> > > >> > Thanks for the design document. I agree with Bowen that the catalog > part > >> > needs more details. > >> > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, > it > >> has > >> > little to do with source/sink. > >> > Having a separate FLIP can unblock the contribution for sink (or > source) > >> > and keep the discussion more focus. > >> > I also left some comments in the documentation. > >> > > >> > Thanks, > >> > Jark > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email]> > >> > wrote: > >> > > >> > > Hi Bowen, > >> > > > >> > > Thanks for your comments. I'll add catalog details as you suggested. > >> > > > >> > > One more question: since we decide to not implement source part of > the > >> > > connector at the moment. > >> > > What can users do with a Pulsar catalog? > >> > > Create a table backed by Pulsar and check existing pulsar tables to > >> see > >> > > their schemas? Drop tables maybe? > >> > > > >> > > Best, > >> > > Yijie > >> > > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> > wrote: > >> > > > >> > > > Hi Yijie, > >> > > > > >> > > > Per the discussion, maybe you can move pulsar source to 'future > >> work' > >> > > > section in the FLIP for now? > >> > > > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd > >> > > recommend > >> > > > to add more details . > >> > > > > >> > > > A few questions mainly regarding the proposed pulsar catalog. > >> > > > > >> > > > - Can you provide some background of pulsar schema registry and > >> how > >> > it > >> > > > works? > >> > > > - The proposed design of pulsar catalog is very vague now, can > >> you > >> > > > share some details of how a pulsar catalog would work > internally? > >> > E.g. > >> > > > - which APIs does it support exactly? E.g. I see from your > >> > > > prototype that table creation is supported but not > alteration. > >> > > > - is it going to connect to a pulsar schema registry via a > >> http > >> > > > client or a pulsar client, etc > >> > > > - will it be able to handle multiple versions of pulsar, or > >> just > >> > > > one? How is compatibility handles between different > >> Flink-Pulsar > >> > > versions? > >> > > > - will it support only reading from pulsar schema registry , > >> or > >> > > > both read/write? Will it work end-to-end in Flink SQL for > >> users > >> > to > >> > > create > >> > > > and manipulate a pulsar table such as "CREATE TABLE t WITH > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > >> > > > - Is a pulsar topic always gonna be a non-partitioned table? > >> How > >> > is > >> > > > a partitioned topic mapped to a Flink table? > >> > > > - How to map Flink's catalog/database namespace to pulsar's > >> > > > multi-tenant namespaces? I'm not very familiar with how multi > >> > tenancy > >> > > works > >> > > > in pulsar, and some background context/use cases may help here > >> too. > >> > > E.g. > >> > > > - can a pulsar client/consumer/producer be multiple-tenant > at > >> the > >> > > > same time? > >> > > > - how does authentication work in pulsar's multi-tenancy and > >> the > >> > > > catalog? asking since I didn't see the proposed pulsar > catalog > >> > has > >> > > > username/password configs > >> > > > - the FLIP seems propose mapping a pulsar cluster and > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' and > >> > > 'database'. I > >> > > > wonder whether it totally makes sense, or should we actually > >> map > >> > > "tenant" > >> > > > to "catalog", and "namespace" to "database"? > >> > > > > >> > > > Cheers, > >> > > > Bowen > >> > > > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > >> [hidden email]> > >> > > > wrote: > >> > > > > >> > > >> Hi everyone, > >> > > >> > >> > > >> Per discussion in the previous thread > >> > > >> < > >> > > >> > >> > > > >> > > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > >> > > >> >, > >> > > >> I have created FLIP-72 to kick off a more detailed discussion on > >> the > >> > > Flink > >> > > >> Pulsar connector: > >> > > >> > >> > > >> > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > >> > > >> > >> > > >> In short, the connector has the following features: > >> > > >> > >> > > >> - > >> > > >> > >> > > >> Pulsar as a streaming source with exactly-once guarantee. > >> > > >> - > >> > > >> > >> > > >> Sink streaming results to Pulsar with at-least-once semantics. > >> > > >> - > >> > > >> > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > >> > > >> < > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > >> > > >> > > >> > > >> ), and can automatically (de)serialize messages with the help > of > >> > > Pulsar > >> > > >> schema. > >> > > >> - > >> > > >> > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > >> > > >> < > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > >> > > >> >), > >> > > >> which enables the use of Pulsar topics as tables in Table API > as > >> > well > >> > > >> as > >> > > >> SQL client. > >> > > >> > >> > > >> > >> > > >> > >> > > >> > >> > > > >> > > >> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > >> > > >> > >> > > >> > >> > > >> Would love to here your thoughts on this. > >> > > >> > >> > > >> Best, > >> > > >> Yijie > >> > > >> > >> > > > > >> > > > >> > > >> > > > |
Thanks Becket the the updating,
But shouldn't this message be posted in FLIP-27 discussion thread[1]? Best, Jark [1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> wrote: > Hi all, > > Sorry for the long belated update. I have updated FLIP-27 wiki page with > the latest proposals. Some noticeable changes include: > 1. A new generic communication mechanism between SplitEnumerator and > SourceReader. > 2. Some detail API method signature changes. > > We left a few things out of this FLIP and will address them in separate > FLIPs. Including: > 1. Per split event time. > 2. Event time alignment. > 3. Fine grained failover for SplitEnumerator failure. > > Please let us know if you have any question. > > Thanks, > > Jiangjie (Becket) Qin > > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <[hidden email]> > wrote: > > > Hi everyone, > > > > I've put the catalog part design in separate doc with more details for > > easier communication. > > > > > > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > > > > I would love to hear your thoughts on this. > > > > Best, > > Yijie > > > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <[hidden email]> > > wrote: > > > > > Hi everyone, > > > > > > Glad to receive your valuable feedbacks. > > > > > > I'd first separate the Pulsar catalog as another doc and show more > design > > > and implementation details there. > > > > > > For the current FLIP-72, I would separate it into the sink part for > > > current work and keep the source part as future works until we reach > > > FLIP-27 finals. > > > > > > I also reply to some of the comments in the design doc. I will rewrite > > the > > > catalog part in regarding to Bowen's advice in both email and comments. > > > > > > Thanks for the help again. > > > > > > Best, > > > Yijie > > > > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> > wrote: > > > > > >> Hi Yijie, > > >> > > >> I also agree with Jark on separating the Catalog part into another > FLIP. > > >> > > >> With FLIP-27[1] also in the air, it is also probably great to split > and > > >> unblock the sink implementation contribution. > > >> I would suggest either putting in a detail implementation plan section > > in > > >> the doc, or (maybe too much separation?) splitting them into different > > >> FLIPs. What do you guys think? > > >> > > >> -- > > >> Rong > > >> > > >> [1] > > >> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: > > >> > > >> > Hi Yijie, > > >> > > > >> > Thanks for the design document. I agree with Bowen that the catalog > > part > > >> > needs more details. > > >> > And I would suggest to separate Pulsar Catalog as another FLIP. IMO, > > it > > >> has > > >> > little to do with source/sink. > > >> > Having a separate FLIP can unblock the contribution for sink (or > > source) > > >> > and keep the discussion more focus. > > >> > I also left some comments in the documentation. > > >> > > > >> > Thanks, > > >> > Jark > > >> > > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <[hidden email] > > > > >> > wrote: > > >> > > > >> > > Hi Bowen, > > >> > > > > >> > > Thanks for your comments. I'll add catalog details as you > suggested. > > >> > > > > >> > > One more question: since we decide to not implement source part of > > the > > >> > > connector at the moment. > > >> > > What can users do with a Pulsar catalog? > > >> > > Create a table backed by Pulsar and check existing pulsar tables > to > > >> see > > >> > > their schemas? Drop tables maybe? > > >> > > > > >> > > Best, > > >> > > Yijie > > >> > > > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> > > wrote: > > >> > > > > >> > > > Hi Yijie, > > >> > > > > > >> > > > Per the discussion, maybe you can move pulsar source to 'future > > >> work' > > >> > > > section in the FLIP for now? > > >> > > > > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd > > >> > > recommend > > >> > > > to add more details . > > >> > > > > > >> > > > A few questions mainly regarding the proposed pulsar catalog. > > >> > > > > > >> > > > - Can you provide some background of pulsar schema registry > and > > >> how > > >> > it > > >> > > > works? > > >> > > > - The proposed design of pulsar catalog is very vague now, > can > > >> you > > >> > > > share some details of how a pulsar catalog would work > > internally? > > >> > E.g. > > >> > > > - which APIs does it support exactly? E.g. I see from your > > >> > > > prototype that table creation is supported but not > > alteration. > > >> > > > - is it going to connect to a pulsar schema registry via a > > >> http > > >> > > > client or a pulsar client, etc > > >> > > > - will it be able to handle multiple versions of pulsar, > or > > >> just > > >> > > > one? How is compatibility handles between different > > >> Flink-Pulsar > > >> > > versions? > > >> > > > - will it support only reading from pulsar schema > registry , > > >> or > > >> > > > both read/write? Will it work end-to-end in Flink SQL for > > >> users > > >> > to > > >> > > create > > >> > > > and manipulate a pulsar table such as "CREATE TABLE t WITH > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > >> > > > - Is a pulsar topic always gonna be a non-partitioned > table? > > >> How > > >> > is > > >> > > > a partitioned topic mapped to a Flink table? > > >> > > > - How to map Flink's catalog/database namespace to pulsar's > > >> > > > multi-tenant namespaces? I'm not very familiar with how multi > > >> > tenancy > > >> > > works > > >> > > > in pulsar, and some background context/use cases may help > here > > >> too. > > >> > > E.g. > > >> > > > - can a pulsar client/consumer/producer be multiple-tenant > > at > > >> the > > >> > > > same time? > > >> > > > - how does authentication work in pulsar's multi-tenancy > and > > >> the > > >> > > > catalog? asking since I didn't see the proposed pulsar > > catalog > > >> > has > > >> > > > username/password configs > > >> > > > - the FLIP seems propose mapping a pulsar cluster and > > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' and > > >> > > 'database'. I > > >> > > > wonder whether it totally makes sense, or should we > actually > > >> map > > >> > > "tenant" > > >> > > > to "catalog", and "namespace" to "database"? > > >> > > > > > >> > > > Cheers, > > >> > > > Bowen > > >> > > > > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > > >> [hidden email]> > > >> > > > wrote: > > >> > > > > > >> > > >> Hi everyone, > > >> > > >> > > >> > > >> Per discussion in the previous thread > > >> > > >> < > > >> > > >> > > >> > > > > >> > > > >> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > >> > > >> >, > > >> > > >> I have created FLIP-72 to kick off a more detailed discussion > on > > >> the > > >> > > Flink > > >> > > >> Pulsar connector: > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > >> > > >> > > >> > > >> In short, the connector has the following features: > > >> > > >> > > >> > > >> - > > >> > > >> > > >> > > >> Pulsar as a streaming source with exactly-once guarantee. > > >> > > >> - > > >> > > >> > > >> > > >> Sink streaming results to Pulsar with at-least-once > semantics. > > >> > > >> - > > >> > > >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > > >> > > >> < > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > >> > > >> > > > >> > > >> ), and can automatically (de)serialize messages with the > help > > of > > >> > > Pulsar > > >> > > >> schema. > > >> > > >> - > > >> > > >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > > >> > > >> < > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > >> > > >> >), > > >> > > >> which enables the use of Pulsar topics as tables in Table > API > > as > > >> > well > > >> > > >> as > > >> > > >> SQL client. > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > > > >> > > > >> > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > >> > > >> > > >> > > >> > > >> > > >> Would love to here your thoughts on this. > > >> > > >> > > >> > > >> Best, > > >> > > >> Yijie > > >> > > >> > > >> > > > > > >> > > > > >> > > > >> > > > > > > |
Yes, you are absolutely right. Cannot believe I posted in the wrong
thread... On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <[hidden email]> wrote: > Thanks Becket the the updating, > > But shouldn't this message be posted in FLIP-27 discussion thread[1]? > > > Best, > Jark > > [1]: > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html > > On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> wrote: > > > Hi all, > > > > Sorry for the long belated update. I have updated FLIP-27 wiki page with > > the latest proposals. Some noticeable changes include: > > 1. A new generic communication mechanism between SplitEnumerator and > > SourceReader. > > 2. Some detail API method signature changes. > > > > We left a few things out of this FLIP and will address them in separate > > FLIPs. Including: > > 1. Per split event time. > > 2. Event time alignment. > > 3. Fine grained failover for SplitEnumerator failure. > > > > Please let us know if you have any question. > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <[hidden email]> > > wrote: > > > > > Hi everyone, > > > > > > I've put the catalog part design in separate doc with more details for > > > easier communication. > > > > > > > > > > > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > > > > > > I would love to hear your thoughts on this. > > > > > > Best, > > > Yijie > > > > > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <[hidden email] > > > > > wrote: > > > > > > > Hi everyone, > > > > > > > > Glad to receive your valuable feedbacks. > > > > > > > > I'd first separate the Pulsar catalog as another doc and show more > > design > > > > and implementation details there. > > > > > > > > For the current FLIP-72, I would separate it into the sink part for > > > > current work and keep the source part as future works until we reach > > > > FLIP-27 finals. > > > > > > > > I also reply to some of the comments in the design doc. I will > rewrite > > > the > > > > catalog part in regarding to Bowen's advice in both email and > comments. > > > > > > > > Thanks for the help again. > > > > > > > > Best, > > > > Yijie > > > > > > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> > > wrote: > > > > > > > >> Hi Yijie, > > > >> > > > >> I also agree with Jark on separating the Catalog part into another > > FLIP. > > > >> > > > >> With FLIP-27[1] also in the air, it is also probably great to split > > and > > > >> unblock the sink implementation contribution. > > > >> I would suggest either putting in a detail implementation plan > section > > > in > > > >> the doc, or (maybe too much separation?) splitting them into > different > > > >> FLIPs. What do you guys think? > > > >> > > > >> -- > > > >> Rong > > > >> > > > >> [1] > > > >> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > > > >> > > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: > > > >> > > > >> > Hi Yijie, > > > >> > > > > >> > Thanks for the design document. I agree with Bowen that the > catalog > > > part > > > >> > needs more details. > > > >> > And I would suggest to separate Pulsar Catalog as another FLIP. > IMO, > > > it > > > >> has > > > >> > little to do with source/sink. > > > >> > Having a separate FLIP can unblock the contribution for sink (or > > > source) > > > >> > and keep the discussion more focus. > > > >> > I also left some comments in the documentation. > > > >> > > > > >> > Thanks, > > > >> > Jark > > > >> > > > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < > [hidden email] > > > > > > >> > wrote: > > > >> > > > > >> > > Hi Bowen, > > > >> > > > > > >> > > Thanks for your comments. I'll add catalog details as you > > suggested. > > > >> > > > > > >> > > One more question: since we decide to not implement source part > of > > > the > > > >> > > connector at the moment. > > > >> > > What can users do with a Pulsar catalog? > > > >> > > Create a table backed by Pulsar and check existing pulsar tables > > to > > > >> see > > > >> > > their schemas? Drop tables maybe? > > > >> > > > > > >> > > Best, > > > >> > > Yijie > > > >> > > > > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> > > > wrote: > > > >> > > > > > >> > > > Hi Yijie, > > > >> > > > > > > >> > > > Per the discussion, maybe you can move pulsar source to > 'future > > > >> work' > > > >> > > > section in the FLIP for now? > > > >> > > > > > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and > I'd > > > >> > > recommend > > > >> > > > to add more details . > > > >> > > > > > > >> > > > A few questions mainly regarding the proposed pulsar catalog. > > > >> > > > > > > >> > > > - Can you provide some background of pulsar schema registry > > and > > > >> how > > > >> > it > > > >> > > > works? > > > >> > > > - The proposed design of pulsar catalog is very vague now, > > can > > > >> you > > > >> > > > share some details of how a pulsar catalog would work > > > internally? > > > >> > E.g. > > > >> > > > - which APIs does it support exactly? E.g. I see from > your > > > >> > > > prototype that table creation is supported but not > > > alteration. > > > >> > > > - is it going to connect to a pulsar schema registry > via a > > > >> http > > > >> > > > client or a pulsar client, etc > > > >> > > > - will it be able to handle multiple versions of pulsar, > > or > > > >> just > > > >> > > > one? How is compatibility handles between different > > > >> Flink-Pulsar > > > >> > > versions? > > > >> > > > - will it support only reading from pulsar schema > > registry , > > > >> or > > > >> > > > both read/write? Will it work end-to-end in Flink SQL > for > > > >> users > > > >> > to > > > >> > > create > > > >> > > > and manipulate a pulsar table such as "CREATE TABLE t > WITH > > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > > >> > > > - Is a pulsar topic always gonna be a non-partitioned > > table? > > > >> How > > > >> > is > > > >> > > > a partitioned topic mapped to a Flink table? > > > >> > > > - How to map Flink's catalog/database namespace to pulsar's > > > >> > > > multi-tenant namespaces? I'm not very familiar with how > multi > > > >> > tenancy > > > >> > > works > > > >> > > > in pulsar, and some background context/use cases may help > > here > > > >> too. > > > >> > > E.g. > > > >> > > > - can a pulsar client/consumer/producer be > multiple-tenant > > > at > > > >> the > > > >> > > > same time? > > > >> > > > - how does authentication work in pulsar's multi-tenancy > > and > > > >> the > > > >> > > > catalog? asking since I didn't see the proposed pulsar > > > catalog > > > >> > has > > > >> > > > username/password configs > > > >> > > > - the FLIP seems propose mapping a pulsar cluster and > > > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' and > > > >> > > 'database'. I > > > >> > > > wonder whether it totally makes sense, or should we > > actually > > > >> map > > > >> > > "tenant" > > > >> > > > to "catalog", and "namespace" to "database"? > > > >> > > > > > > >> > > > Cheers, > > > >> > > > Bowen > > > >> > > > > > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > > > >> [hidden email]> > > > >> > > > wrote: > > > >> > > > > > > >> > > >> Hi everyone, > > > >> > > >> > > > >> > > >> Per discussion in the previous thread > > > >> > > >> < > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > > >> > > >> >, > > > >> > > >> I have created FLIP-72 to kick off a more detailed discussion > > on > > > >> the > > > >> > > Flink > > > >> > > >> Pulsar connector: > > > >> > > >> > > > >> > > >> > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > > >> > > >> > > > >> > > >> In short, the connector has the following features: > > > >> > > >> > > > >> > > >> - > > > >> > > >> > > > >> > > >> Pulsar as a streaming source with exactly-once guarantee. > > > >> > > >> - > > > >> > > >> > > > >> > > >> Sink streaming results to Pulsar with at-least-once > > semantics. > > > >> > > >> - > > > >> > > >> > > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > > > >> > > >> < > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > >> > > >> > > > > >> > > >> ), and can automatically (de)serialize messages with the > > help > > > of > > > >> > > Pulsar > > > >> > > >> schema. > > > >> > > >> - > > > >> > > >> > > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > > > >> > > >> < > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > > >> > > >> >), > > > >> > > >> which enables the use of Pulsar topics as tables in Table > > API > > > as > > > >> > well > > > >> > > >> as > > > >> > > >> SQL client. > > > >> > > >> > > > >> > > >> > > > >> > > >> > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > > >> > > >> > > > >> > > >> > > > >> > > >> Would love to here your thoughts on this. > > > >> > > >> > > > >> > > >> Best, > > > >> > > >> Yijie > > > >> > > >> > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > > > > > |
Hi Yijie,
I took a look at the design doc. LGTM overall, left a few questions. On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <[hidden email]> wrote: > Yes, you are absolutely right. Cannot believe I posted in the wrong > thread... > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <[hidden email]> wrote: > >> Thanks Becket the the updating, >> >> But shouldn't this message be posted in FLIP-27 discussion thread[1]? >> >> >> Best, >> Jark >> >> [1]: >> >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html >> >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> wrote: >> >> > Hi all, >> > >> > Sorry for the long belated update. I have updated FLIP-27 wiki page with >> > the latest proposals. Some noticeable changes include: >> > 1. A new generic communication mechanism between SplitEnumerator and >> > SourceReader. >> > 2. Some detail API method signature changes. >> > >> > We left a few things out of this FLIP and will address them in separate >> > FLIPs. Including: >> > 1. Per split event time. >> > 2. Event time alignment. >> > 3. Fine grained failover for SplitEnumerator failure. >> > >> > Please let us know if you have any question. >> > >> > Thanks, >> > >> > Jiangjie (Becket) Qin >> > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <[hidden email]> >> > wrote: >> > >> > > Hi everyone, >> > > >> > > I've put the catalog part design in separate doc with more details for >> > > easier communication. >> > > >> > > >> > > >> > >> https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing >> > > >> > > I would love to hear your thoughts on this. >> > > >> > > Best, >> > > Yijie >> > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen < >> [hidden email]> >> > > wrote: >> > > >> > > > Hi everyone, >> > > > >> > > > Glad to receive your valuable feedbacks. >> > > > >> > > > I'd first separate the Pulsar catalog as another doc and show more >> > design >> > > > and implementation details there. >> > > > >> > > > For the current FLIP-72, I would separate it into the sink part for >> > > > current work and keep the source part as future works until we reach >> > > > FLIP-27 finals. >> > > > >> > > > I also reply to some of the comments in the design doc. I will >> rewrite >> > > the >> > > > catalog part in regarding to Bowen's advice in both email and >> comments. >> > > > >> > > > Thanks for the help again. >> > > > >> > > > Best, >> > > > Yijie >> > > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> >> > wrote: >> > > > >> > > >> Hi Yijie, >> > > >> >> > > >> I also agree with Jark on separating the Catalog part into another >> > FLIP. >> > > >> >> > > >> With FLIP-27[1] also in the air, it is also probably great to split >> > and >> > > >> unblock the sink implementation contribution. >> > > >> I would suggest either putting in a detail implementation plan >> section >> > > in >> > > >> the doc, or (maybe too much separation?) splitting them into >> different >> > > >> FLIPs. What do you guys think? >> > > >> >> > > >> -- >> > > >> Rong >> > > >> >> > > >> [1] >> > > >> >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface >> > > >> >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> wrote: >> > > >> >> > > >> > Hi Yijie, >> > > >> > >> > > >> > Thanks for the design document. I agree with Bowen that the >> catalog >> > > part >> > > >> > needs more details. >> > > >> > And I would suggest to separate Pulsar Catalog as another FLIP. >> IMO, >> > > it >> > > >> has >> > > >> > little to do with source/sink. >> > > >> > Having a separate FLIP can unblock the contribution for sink (or >> > > source) >> > > >> > and keep the discussion more focus. >> > > >> > I also left some comments in the documentation. >> > > >> > >> > > >> > Thanks, >> > > >> > Jark >> > > >> > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < >> [hidden email] >> > > >> > > >> > wrote: >> > > >> > >> > > >> > > Hi Bowen, >> > > >> > > >> > > >> > > Thanks for your comments. I'll add catalog details as you >> > suggested. >> > > >> > > >> > > >> > > One more question: since we decide to not implement source >> part of >> > > the >> > > >> > > connector at the moment. >> > > >> > > What can users do with a Pulsar catalog? >> > > >> > > Create a table backed by Pulsar and check existing pulsar >> tables >> > to >> > > >> see >> > > >> > > their schemas? Drop tables maybe? >> > > >> > > >> > > >> > > Best, >> > > >> > > Yijie >> > > >> > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <[hidden email]> >> > > wrote: >> > > >> > > >> > > >> > > > Hi Yijie, >> > > >> > > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to >> 'future >> > > >> work' >> > > >> > > > section in the FLIP for now? >> > > >> > > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and >> I'd >> > > >> > > recommend >> > > >> > > > to add more details . >> > > >> > > > >> > > >> > > > A few questions mainly regarding the proposed pulsar catalog. >> > > >> > > > >> > > >> > > > - Can you provide some background of pulsar schema >> registry >> > and >> > > >> how >> > > >> > it >> > > >> > > > works? >> > > >> > > > - The proposed design of pulsar catalog is very vague now, >> > can >> > > >> you >> > > >> > > > share some details of how a pulsar catalog would work >> > > internally? >> > > >> > E.g. >> > > >> > > > - which APIs does it support exactly? E.g. I see from >> your >> > > >> > > > prototype that table creation is supported but not >> > > alteration. >> > > >> > > > - is it going to connect to a pulsar schema registry >> via a >> > > >> http >> > > >> > > > client or a pulsar client, etc >> > > >> > > > - will it be able to handle multiple versions of >> pulsar, >> > or >> > > >> just >> > > >> > > > one? How is compatibility handles between different >> > > >> Flink-Pulsar >> > > >> > > versions? >> > > >> > > > - will it support only reading from pulsar schema >> > registry , >> > > >> or >> > > >> > > > both read/write? Will it work end-to-end in Flink SQL >> for >> > > >> users >> > > >> > to >> > > >> > > create >> > > >> > > > and manipulate a pulsar table such as "CREATE TABLE t >> WITH >> > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? >> > > >> > > > - Is a pulsar topic always gonna be a non-partitioned >> > table? >> > > >> How >> > > >> > is >> > > >> > > > a partitioned topic mapped to a Flink table? >> > > >> > > > - How to map Flink's catalog/database namespace to >> pulsar's >> > > >> > > > multi-tenant namespaces? I'm not very familiar with how >> multi >> > > >> > tenancy >> > > >> > > works >> > > >> > > > in pulsar, and some background context/use cases may help >> > here >> > > >> too. >> > > >> > > E.g. >> > > >> > > > - can a pulsar client/consumer/producer be >> multiple-tenant >> > > at >> > > >> the >> > > >> > > > same time? >> > > >> > > > - how does authentication work in pulsar's >> multi-tenancy >> > and >> > > >> the >> > > >> > > > catalog? asking since I didn't see the proposed pulsar >> > > catalog >> > > >> > has >> > > >> > > > username/password configs >> > > >> > > > - the FLIP seems propose mapping a pulsar cluster and >> > > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' >> and >> > > >> > > 'database'. I >> > > >> > > > wonder whether it totally makes sense, or should we >> > actually >> > > >> map >> > > >> > > "tenant" >> > > >> > > > to "catalog", and "namespace" to "database"? >> > > >> > > > >> > > >> > > > Cheers, >> > > >> > > > Bowen >> > > >> > > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < >> > > >> [hidden email]> >> > > >> > > > wrote: >> > > >> > > > >> > > >> > > >> Hi everyone, >> > > >> > > >> >> > > >> > > >> Per discussion in the previous thread >> > > >> > > >> < >> > > >> > > >> >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html >> > > >> > > >> >, >> > > >> > > >> I have created FLIP-72 to kick off a more detailed >> discussion >> > on >> > > >> the >> > > >> > > Flink >> > > >> > > >> Pulsar connector: >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector >> > > >> > > >> >> > > >> > > >> In short, the connector has the following features: >> > > >> > > >> >> > > >> > > >> - >> > > >> > > >> >> > > >> > > >> Pulsar as a streaming source with exactly-once guarantee. >> > > >> > > >> - >> > > >> > > >> >> > > >> > > >> Sink streaming results to Pulsar with at-least-once >> > semantics. >> > > >> > > >> - >> > > >> > > >> >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 >> > > >> > > >> < >> > > >> > > >> >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System >> > > >> > > >> > >> > > >> > > >> ), and can automatically (de)serialize messages with the >> > help >> > > of >> > > >> > > Pulsar >> > > >> > > >> schema. >> > > >> > > >> - >> > > >> > > >> >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 >> > > >> > > >> < >> > > >> > > >> >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs >> > > >> > > >> >), >> > > >> > > >> which enables the use of Pulsar topics as tables in Table >> > API >> > > as >> > > >> > well >> > > >> > > >> as >> > > >> > > >> SQL client. >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> > > >> > >> > > >> >> > > >> > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u >> > > >> > > >> >> > > >> > > >> >> > > >> > > >> Would love to here your thoughts on this. >> > > >> > > >> >> > > >> > > >> Best, >> > > >> > > >> Yijie >> > > >> > > >> >> > > >> > > > >> > > >> > > >> > > >> > >> > > >> >> > > > >> > > >> > >> > |
Hi Bowen,
I've done updated the design doc, PTAL. Btw the PR for catalog is https://github.com/apache/flink/pull/10455, could you please take a look? Best, Yijie On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <[hidden email]> wrote: > Hi Yijie, > > I took a look at the design doc. LGTM overall, left a few questions. > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <[hidden email]> wrote: > > > Yes, you are absolutely right. Cannot believe I posted in the wrong > > thread... > > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <[hidden email]> wrote: > > > >> Thanks Becket the the updating, > >> > >> But shouldn't this message be posted in FLIP-27 discussion thread[1]? > >> > >> > >> Best, > >> Jark > >> > >> [1]: > >> > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html > >> > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> wrote: > >> > >> > Hi all, > >> > > >> > Sorry for the long belated update. I have updated FLIP-27 wiki page > with > >> > the latest proposals. Some noticeable changes include: > >> > 1. A new generic communication mechanism between SplitEnumerator and > >> > SourceReader. > >> > 2. Some detail API method signature changes. > >> > > >> > We left a few things out of this FLIP and will address them in > separate > >> > FLIPs. Including: > >> > 1. Per split event time. > >> > 2. Event time alignment. > >> > 3. Fine grained failover for SplitEnumerator failure. > >> > > >> > Please let us know if you have any question. > >> > > >> > Thanks, > >> > > >> > Jiangjie (Becket) Qin > >> > > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen < > [hidden email]> > >> > wrote: > >> > > >> > > Hi everyone, > >> > > > >> > > I've put the catalog part design in separate doc with more details > for > >> > > easier communication. > >> > > > >> > > > >> > > > >> > > >> > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > >> > > > >> > > I would love to hear your thoughts on this. > >> > > > >> > > Best, > >> > > Yijie > >> > > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen < > >> [hidden email]> > >> > > wrote: > >> > > > >> > > > Hi everyone, > >> > > > > >> > > > Glad to receive your valuable feedbacks. > >> > > > > >> > > > I'd first separate the Pulsar catalog as another doc and show more > >> > design > >> > > > and implementation details there. > >> > > > > >> > > > For the current FLIP-72, I would separate it into the sink part > for > >> > > > current work and keep the source part as future works until we > reach > >> > > > FLIP-27 finals. > >> > > > > >> > > > I also reply to some of the comments in the design doc. I will > >> rewrite > >> > > the > >> > > > catalog part in regarding to Bowen's advice in both email and > >> comments. > >> > > > > >> > > > Thanks for the help again. > >> > > > > >> > > > Best, > >> > > > Yijie > >> > > > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email]> > >> > wrote: > >> > > > > >> > > >> Hi Yijie, > >> > > >> > >> > > >> I also agree with Jark on separating the Catalog part into > another > >> > FLIP. > >> > > >> > >> > > >> With FLIP-27[1] also in the air, it is also probably great to > split > >> > and > >> > > >> unblock the sink implementation contribution. > >> > > >> I would suggest either putting in a detail implementation plan > >> section > >> > > in > >> > > >> the doc, or (maybe too much separation?) splitting them into > >> different > >> > > >> FLIPs. What do you guys think? > >> > > >> > >> > > >> -- > >> > > >> Rong > >> > > >> > >> > > >> [1] > >> > > >> > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > >> > > >> > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> > wrote: > >> > > >> > >> > > >> > Hi Yijie, > >> > > >> > > >> > > >> > Thanks for the design document. I agree with Bowen that the > >> catalog > >> > > part > >> > > >> > needs more details. > >> > > >> > And I would suggest to separate Pulsar Catalog as another FLIP. > >> IMO, > >> > > it > >> > > >> has > >> > > >> > little to do with source/sink. > >> > > >> > Having a separate FLIP can unblock the contribution for sink > (or > >> > > source) > >> > > >> > and keep the discussion more focus. > >> > > >> > I also left some comments in the documentation. > >> > > >> > > >> > > >> > Thanks, > >> > > >> > Jark > >> > > >> > > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < > >> [hidden email] > >> > > > >> > > >> > wrote: > >> > > >> > > >> > > >> > > Hi Bowen, > >> > > >> > > > >> > > >> > > Thanks for your comments. I'll add catalog details as you > >> > suggested. > >> > > >> > > > >> > > >> > > One more question: since we decide to not implement source > >> part of > >> > > the > >> > > >> > > connector at the moment. > >> > > >> > > What can users do with a Pulsar catalog? > >> > > >> > > Create a table backed by Pulsar and check existing pulsar > >> tables > >> > to > >> > > >> see > >> > > >> > > their schemas? Drop tables maybe? > >> > > >> > > > >> > > >> > > Best, > >> > > >> > > Yijie > >> > > >> > > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li < > [hidden email]> > >> > > wrote: > >> > > >> > > > >> > > >> > > > Hi Yijie, > >> > > >> > > > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to > >> 'future > >> > > >> work' > >> > > >> > > > section in the FLIP for now? > >> > > >> > > > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, > and > >> I'd > >> > > >> > > recommend > >> > > >> > > > to add more details . > >> > > >> > > > > >> > > >> > > > A few questions mainly regarding the proposed pulsar > catalog. > >> > > >> > > > > >> > > >> > > > - Can you provide some background of pulsar schema > >> registry > >> > and > >> > > >> how > >> > > >> > it > >> > > >> > > > works? > >> > > >> > > > - The proposed design of pulsar catalog is very vague > now, > >> > can > >> > > >> you > >> > > >> > > > share some details of how a pulsar catalog would work > >> > > internally? > >> > > >> > E.g. > >> > > >> > > > - which APIs does it support exactly? E.g. I see from > >> your > >> > > >> > > > prototype that table creation is supported but not > >> > > alteration. > >> > > >> > > > - is it going to connect to a pulsar schema registry > >> via a > >> > > >> http > >> > > >> > > > client or a pulsar client, etc > >> > > >> > > > - will it be able to handle multiple versions of > >> pulsar, > >> > or > >> > > >> just > >> > > >> > > > one? How is compatibility handles between different > >> > > >> Flink-Pulsar > >> > > >> > > versions? > >> > > >> > > > - will it support only reading from pulsar schema > >> > registry , > >> > > >> or > >> > > >> > > > both read/write? Will it work end-to-end in Flink SQL > >> for > >> > > >> users > >> > > >> > to > >> > > >> > > create > >> > > >> > > > and manipulate a pulsar table such as "CREATE TABLE t > >> WITH > >> > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > >> > > >> > > > - Is a pulsar topic always gonna be a non-partitioned > >> > table? > >> > > >> How > >> > > >> > is > >> > > >> > > > a partitioned topic mapped to a Flink table? > >> > > >> > > > - How to map Flink's catalog/database namespace to > >> pulsar's > >> > > >> > > > multi-tenant namespaces? I'm not very familiar with how > >> multi > >> > > >> > tenancy > >> > > >> > > works > >> > > >> > > > in pulsar, and some background context/use cases may > help > >> > here > >> > > >> too. > >> > > >> > > E.g. > >> > > >> > > > - can a pulsar client/consumer/producer be > >> multiple-tenant > >> > > at > >> > > >> the > >> > > >> > > > same time? > >> > > >> > > > - how does authentication work in pulsar's > >> multi-tenancy > >> > and > >> > > >> the > >> > > >> > > > catalog? asking since I didn't see the proposed > pulsar > >> > > catalog > >> > > >> > has > >> > > >> > > > username/password configs > >> > > >> > > > - the FLIP seems propose mapping a pulsar cluster and > >> > > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' > >> and > >> > > >> > > 'database'. I > >> > > >> > > > wonder whether it totally makes sense, or should we > >> > actually > >> > > >> map > >> > > >> > > "tenant" > >> > > >> > > > to "catalog", and "namespace" to "database"? > >> > > >> > > > > >> > > >> > > > Cheers, > >> > > >> > > > Bowen > >> > > >> > > > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > >> > > >> [hidden email]> > >> > > >> > > > wrote: > >> > > >> > > > > >> > > >> > > >> Hi everyone, > >> > > >> > > >> > >> > > >> > > >> Per discussion in the previous thread > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > >> > > >> > > >> >, > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed > >> discussion > >> > on > >> > > >> the > >> > > >> > > Flink > >> > > >> > > >> Pulsar connector: > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > >> > > >> > > >> > >> > > >> > > >> In short, the connector has the following features: > >> > > >> > > >> > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Pulsar as a streaming source with exactly-once > guarantee. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Sink streaming results to Pulsar with at-least-once > >> > semantics. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > >> > > >> > > >> > > >> > > >> > > >> ), and can automatically (de)serialize messages with > the > >> > help > >> > > of > >> > > >> > > Pulsar > >> > > >> > > >> schema. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > >> > > >> > > >> >), > >> > > >> > > >> which enables the use of Pulsar topics as tables in > Table > >> > API > >> > > as > >> > > >> > well > >> > > >> > > >> as > >> > > >> > > >> SQL client. > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> Would love to here your thoughts on this. > >> > > >> > > >> > >> > > >> > > >> Best, > >> > > >> > > >> Yijie > >> > > >> > > >> > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > > >> > > > >> > > >> > > > |
Hi Yijie,
There's just one more concern on the yaml configs. Otherwise, I think we should be good to go. Can you update your PR and ensure all tests pass? I can help review and merge in the next couple weeks. Thanks, Bowen On Mon, Dec 23, 2019 at 7:03 PM Yijie Shen <[hidden email]> wrote: > Hi Bowen, > > I've done updated the design doc, PTAL. > Btw the PR for catalog is https://github.com/apache/flink/pull/10455, > could > you please take a look? > > Best, > Yijie > > On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <[hidden email]> wrote: > > > Hi Yijie, > > > > I took a look at the design doc. LGTM overall, left a few questions. > > > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <[hidden email]> wrote: > > > > > Yes, you are absolutely right. Cannot believe I posted in the wrong > > > thread... > > > > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <[hidden email]> wrote: > > > > > >> Thanks Becket the the updating, > > >> > > >> But shouldn't this message be posted in FLIP-27 discussion thread[1]? > > >> > > >> > > >> Best, > > >> Jark > > >> > > >> [1]: > > >> > > >> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html > > >> > > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> wrote: > > >> > > >> > Hi all, > > >> > > > >> > Sorry for the long belated update. I have updated FLIP-27 wiki page > > with > > >> > the latest proposals. Some noticeable changes include: > > >> > 1. A new generic communication mechanism between SplitEnumerator and > > >> > SourceReader. > > >> > 2. Some detail API method signature changes. > > >> > > > >> > We left a few things out of this FLIP and will address them in > > separate > > >> > FLIPs. Including: > > >> > 1. Per split event time. > > >> > 2. Event time alignment. > > >> > 3. Fine grained failover for SplitEnumerator failure. > > >> > > > >> > Please let us know if you have any question. > > >> > > > >> > Thanks, > > >> > > > >> > Jiangjie (Becket) Qin > > >> > > > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen < > > [hidden email]> > > >> > wrote: > > >> > > > >> > > Hi everyone, > > >> > > > > >> > > I've put the catalog part design in separate doc with more details > > for > > >> > > easier communication. > > >> > > > > >> > > > > >> > > > > >> > > > >> > > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > > >> > > > > >> > > I would love to hear your thoughts on this. > > >> > > > > >> > > Best, > > >> > > Yijie > > >> > > > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen < > > >> [hidden email]> > > >> > > wrote: > > >> > > > > >> > > > Hi everyone, > > >> > > > > > >> > > > Glad to receive your valuable feedbacks. > > >> > > > > > >> > > > I'd first separate the Pulsar catalog as another doc and show > more > > >> > design > > >> > > > and implementation details there. > > >> > > > > > >> > > > For the current FLIP-72, I would separate it into the sink part > > for > > >> > > > current work and keep the source part as future works until we > > reach > > >> > > > FLIP-27 finals. > > >> > > > > > >> > > > I also reply to some of the comments in the design doc. I will > > >> rewrite > > >> > > the > > >> > > > catalog part in regarding to Bowen's advice in both email and > > >> comments. > > >> > > > > > >> > > > Thanks for the help again. > > >> > > > > > >> > > > Best, > > >> > > > Yijie > > >> > > > > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <[hidden email] > > > > >> > wrote: > > >> > > > > > >> > > >> Hi Yijie, > > >> > > >> > > >> > > >> I also agree with Jark on separating the Catalog part into > > another > > >> > FLIP. > > >> > > >> > > >> > > >> With FLIP-27[1] also in the air, it is also probably great to > > split > > >> > and > > >> > > >> unblock the sink implementation contribution. > > >> > > >> I would suggest either putting in a detail implementation plan > > >> section > > >> > > in > > >> > > >> the doc, or (maybe too much separation?) splitting them into > > >> different > > >> > > >> FLIPs. What do you guys think? > > >> > > >> > > >> > > >> -- > > >> > > >> Rong > > >> > > >> > > >> > > >> [1] > > >> > > >> > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > > >> > > >> > > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> > > wrote: > > >> > > >> > > >> > > >> > Hi Yijie, > > >> > > >> > > > >> > > >> > Thanks for the design document. I agree with Bowen that the > > >> catalog > > >> > > part > > >> > > >> > needs more details. > > >> > > >> > And I would suggest to separate Pulsar Catalog as another > FLIP. > > >> IMO, > > >> > > it > > >> > > >> has > > >> > > >> > little to do with source/sink. > > >> > > >> > Having a separate FLIP can unblock the contribution for sink > > (or > > >> > > source) > > >> > > >> > and keep the discussion more focus. > > >> > > >> > I also left some comments in the documentation. > > >> > > >> > > > >> > > >> > Thanks, > > >> > > >> > Jark > > >> > > >> > > > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < > > >> [hidden email] > > >> > > > > >> > > >> > wrote: > > >> > > >> > > > >> > > >> > > Hi Bowen, > > >> > > >> > > > > >> > > >> > > Thanks for your comments. I'll add catalog details as you > > >> > suggested. > > >> > > >> > > > > >> > > >> > > One more question: since we decide to not implement source > > >> part of > > >> > > the > > >> > > >> > > connector at the moment. > > >> > > >> > > What can users do with a Pulsar catalog? > > >> > > >> > > Create a table backed by Pulsar and check existing pulsar > > >> tables > > >> > to > > >> > > >> see > > >> > > >> > > their schemas? Drop tables maybe? > > >> > > >> > > > > >> > > >> > > Best, > > >> > > >> > > Yijie > > >> > > >> > > > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li < > > [hidden email]> > > >> > > wrote: > > >> > > >> > > > > >> > > >> > > > Hi Yijie, > > >> > > >> > > > > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to > > >> 'future > > >> > > >> work' > > >> > > >> > > > section in the FLIP for now? > > >> > > >> > > > > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, > > and > > >> I'd > > >> > > >> > > recommend > > >> > > >> > > > to add more details . > > >> > > >> > > > > > >> > > >> > > > A few questions mainly regarding the proposed pulsar > > catalog. > > >> > > >> > > > > > >> > > >> > > > - Can you provide some background of pulsar schema > > >> registry > > >> > and > > >> > > >> how > > >> > > >> > it > > >> > > >> > > > works? > > >> > > >> > > > - The proposed design of pulsar catalog is very vague > > now, > > >> > can > > >> > > >> you > > >> > > >> > > > share some details of how a pulsar catalog would work > > >> > > internally? > > >> > > >> > E.g. > > >> > > >> > > > - which APIs does it support exactly? E.g. I see > from > > >> your > > >> > > >> > > > prototype that table creation is supported but not > > >> > > alteration. > > >> > > >> > > > - is it going to connect to a pulsar schema > registry > > >> via a > > >> > > >> http > > >> > > >> > > > client or a pulsar client, etc > > >> > > >> > > > - will it be able to handle multiple versions of > > >> pulsar, > > >> > or > > >> > > >> just > > >> > > >> > > > one? How is compatibility handles between different > > >> > > >> Flink-Pulsar > > >> > > >> > > versions? > > >> > > >> > > > - will it support only reading from pulsar schema > > >> > registry , > > >> > > >> or > > >> > > >> > > > both read/write? Will it work end-to-end in Flink > SQL > > >> for > > >> > > >> users > > >> > > >> > to > > >> > > >> > > create > > >> > > >> > > > and manipulate a pulsar table such as "CREATE > TABLE t > > >> WITH > > >> > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > >> > > >> > > > - Is a pulsar topic always gonna be a > non-partitioned > > >> > table? > > >> > > >> How > > >> > > >> > is > > >> > > >> > > > a partitioned topic mapped to a Flink table? > > >> > > >> > > > - How to map Flink's catalog/database namespace to > > >> pulsar's > > >> > > >> > > > multi-tenant namespaces? I'm not very familiar with > how > > >> multi > > >> > > >> > tenancy > > >> > > >> > > works > > >> > > >> > > > in pulsar, and some background context/use cases may > > help > > >> > here > > >> > > >> too. > > >> > > >> > > E.g. > > >> > > >> > > > - can a pulsar client/consumer/producer be > > >> multiple-tenant > > >> > > at > > >> > > >> the > > >> > > >> > > > same time? > > >> > > >> > > > - how does authentication work in pulsar's > > >> multi-tenancy > > >> > and > > >> > > >> the > > >> > > >> > > > catalog? asking since I didn't see the proposed > > pulsar > > >> > > catalog > > >> > > >> > has > > >> > > >> > > > username/password configs > > >> > > >> > > > - the FLIP seems propose mapping a pulsar cluster > and > > >> > > >> > > > 'tenant/namespace' respectively to Flink's > 'catalog' > > >> and > > >> > > >> > > 'database'. I > > >> > > >> > > > wonder whether it totally makes sense, or should we > > >> > actually > > >> > > >> map > > >> > > >> > > "tenant" > > >> > > >> > > > to "catalog", and "namespace" to "database"? > > >> > > >> > > > > > >> > > >> > > > Cheers, > > >> > > >> > > > Bowen > > >> > > >> > > > > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > > >> > > >> [hidden email]> > > >> > > >> > > > wrote: > > >> > > >> > > > > > >> > > >> > > >> Hi everyone, > > >> > > >> > > >> > > >> > > >> > > >> Per discussion in the previous thread > > >> > > >> > > >> < > > >> > > >> > > >> > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > >> > > > >> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > >> > > >> > > >> >, > > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed > > >> discussion > > >> > on > > >> > > >> the > > >> > > >> > > Flink > > >> > > >> > > >> Pulsar connector: > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > >> > > >> > > >> > > >> > > >> > > >> In short, the connector has the following features: > > >> > > >> > > >> > > >> > > >> > > >> - > > >> > > >> > > >> > > >> > > >> > > >> Pulsar as a streaming source with exactly-once > > guarantee. > > >> > > >> > > >> - > > >> > > >> > > >> > > >> > > >> > > >> Sink streaming results to Pulsar with at-least-once > > >> > semantics. > > >> > > >> > > >> - > > >> > > >> > > >> > > >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > > >> > > >> > > >> < > > >> > > >> > > >> > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > >> > > >> > > >> > > > >> > > >> > > >> ), and can automatically (de)serialize messages with > > the > > >> > help > > >> > > of > > >> > > >> > > Pulsar > > >> > > >> > > >> schema. > > >> > > >> > > >> - > > >> > > >> > > >> > > >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > > >> > > >> > > >> < > > >> > > >> > > >> > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > >> > > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > >> > > >> > > >> >), > > >> > > >> > > >> which enables the use of Pulsar topics as tables in > > Table > > >> > API > > >> > > as > > >> > > >> > well > > >> > > >> > > >> as > > >> > > >> > > >> SQL client. > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > >> > > > >> > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> > > >> Would love to here your thoughts on this. > > >> > > >> > > >> > > >> > > >> > > >> Best, > > >> > > >> > > >> Yijie > > >> > > >> > > >> > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > > > > >> > > > > >> > > > >> > > > > > > |
Hi everyone,
I've updated the Catalog PR and make all settings small case. And tests are added as well. Hi Bowen, could you please take a look. https://github.com/apache/flink/pull/10455 For the sink part of the connector, I've made a separate PR https://github.com/apache/flink/pull/10875. Could someone help review this? Best, Yijie On Thu, Jan 9, 2020 at 8:44 AM Bowen Li <[hidden email]> wrote: > Hi Yijie, > > There's just one more concern on the yaml configs. Otherwise, I think we > should be good to go. > > Can you update your PR and ensure all tests pass? I can help review and > merge in the next couple weeks. > > Thanks, > Bowen > > > On Mon, Dec 23, 2019 at 7:03 PM Yijie Shen <[hidden email]> > wrote: > > > Hi Bowen, > > > > I've done updated the design doc, PTAL. > > Btw the PR for catalog is https://github.com/apache/flink/pull/10455, > > could > > you please take a look? > > > > Best, > > Yijie > > > > On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <[hidden email]> wrote: > > > > > Hi Yijie, > > > > > > I took a look at the design doc. LGTM overall, left a few questions. > > > > > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <[hidden email]> > wrote: > > > > > > > Yes, you are absolutely right. Cannot believe I posted in the wrong > > > > thread... > > > > > > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <[hidden email]> wrote: > > > > > > > >> Thanks Becket the the updating, > > > >> > > > >> But shouldn't this message be posted in FLIP-27 discussion > thread[1]? > > > >> > > > >> > > > >> Best, > > > >> Jark > > > >> > > > >> [1]: > > > >> > > > >> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html > > > >> > > > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <[hidden email]> > wrote: > > > >> > > > >> > Hi all, > > > >> > > > > >> > Sorry for the long belated update. I have updated FLIP-27 wiki > page > > > with > > > >> > the latest proposals. Some noticeable changes include: > > > >> > 1. A new generic communication mechanism between SplitEnumerator > and > > > >> > SourceReader. > > > >> > 2. Some detail API method signature changes. > > > >> > > > > >> > We left a few things out of this FLIP and will address them in > > > separate > > > >> > FLIPs. Including: > > > >> > 1. Per split event time. > > > >> > 2. Event time alignment. > > > >> > 3. Fine grained failover for SplitEnumerator failure. > > > >> > > > > >> > Please let us know if you have any question. > > > >> > > > > >> > Thanks, > > > >> > > > > >> > Jiangjie (Becket) Qin > > > >> > > > > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen < > > > [hidden email]> > > > >> > wrote: > > > >> > > > > >> > > Hi everyone, > > > >> > > > > > >> > > I've put the catalog part design in separate doc with more > details > > > for > > > >> > > easier communication. > > > >> > > > > > >> > > > > > >> > > > > > >> > > > > >> > > > > > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > > > >> > > > > > >> > > I would love to hear your thoughts on this. > > > >> > > > > > >> > > Best, > > > >> > > Yijie > > > >> > > > > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen < > > > >> [hidden email]> > > > >> > > wrote: > > > >> > > > > > >> > > > Hi everyone, > > > >> > > > > > > >> > > > Glad to receive your valuable feedbacks. > > > >> > > > > > > >> > > > I'd first separate the Pulsar catalog as another doc and show > > more > > > >> > design > > > >> > > > and implementation details there. > > > >> > > > > > > >> > > > For the current FLIP-72, I would separate it into the sink > part > > > for > > > >> > > > current work and keep the source part as future works until we > > > reach > > > >> > > > FLIP-27 finals. > > > >> > > > > > > >> > > > I also reply to some of the comments in the design doc. I will > > > >> rewrite > > > >> > > the > > > >> > > > catalog part in regarding to Bowen's advice in both email and > > > >> comments. > > > >> > > > > > > >> > > > Thanks for the help again. > > > >> > > > > > > >> > > > Best, > > > >> > > > Yijie > > > >> > > > > > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong < > [hidden email] > > > > > > >> > wrote: > > > >> > > > > > > >> > > >> Hi Yijie, > > > >> > > >> > > > >> > > >> I also agree with Jark on separating the Catalog part into > > > another > > > >> > FLIP. > > > >> > > >> > > > >> > > >> With FLIP-27[1] also in the air, it is also probably great to > > > split > > > >> > and > > > >> > > >> unblock the sink implementation contribution. > > > >> > > >> I would suggest either putting in a detail implementation > plan > > > >> section > > > >> > > in > > > >> > > >> the doc, or (maybe too much separation?) splitting them into > > > >> different > > > >> > > >> FLIPs. What do you guys think? > > > >> > > >> > > > >> > > >> -- > > > >> > > >> Rong > > > >> > > >> > > > >> > > >> [1] > > > >> > > >> > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > > > >> > > >> > > > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <[hidden email]> > > > wrote: > > > >> > > >> > > > >> > > >> > Hi Yijie, > > > >> > > >> > > > > >> > > >> > Thanks for the design document. I agree with Bowen that the > > > >> catalog > > > >> > > part > > > >> > > >> > needs more details. > > > >> > > >> > And I would suggest to separate Pulsar Catalog as another > > FLIP. > > > >> IMO, > > > >> > > it > > > >> > > >> has > > > >> > > >> > little to do with source/sink. > > > >> > > >> > Having a separate FLIP can unblock the contribution for > sink > > > (or > > > >> > > source) > > > >> > > >> > and keep the discussion more focus. > > > >> > > >> > I also left some comments in the documentation. > > > >> > > >> > > > > >> > > >> > Thanks, > > > >> > > >> > Jark > > > >> > > >> > > > > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < > > > >> [hidden email] > > > >> > > > > > >> > > >> > wrote: > > > >> > > >> > > > > >> > > >> > > Hi Bowen, > > > >> > > >> > > > > > >> > > >> > > Thanks for your comments. I'll add catalog details as you > > > >> > suggested. > > > >> > > >> > > > > > >> > > >> > > One more question: since we decide to not implement > source > > > >> part of > > > >> > > the > > > >> > > >> > > connector at the moment. > > > >> > > >> > > What can users do with a Pulsar catalog? > > > >> > > >> > > Create a table backed by Pulsar and check existing pulsar > > > >> tables > > > >> > to > > > >> > > >> see > > > >> > > >> > > their schemas? Drop tables maybe? > > > >> > > >> > > > > > >> > > >> > > Best, > > > >> > > >> > > Yijie > > > >> > > >> > > > > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li < > > > [hidden email]> > > > >> > > wrote: > > > >> > > >> > > > > > >> > > >> > > > Hi Yijie, > > > >> > > >> > > > > > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to > > > >> 'future > > > >> > > >> work' > > > >> > > >> > > > section in the FLIP for now? > > > >> > > >> > > > > > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the > moment, > > > and > > > >> I'd > > > >> > > >> > > recommend > > > >> > > >> > > > to add more details . > > > >> > > >> > > > > > > >> > > >> > > > A few questions mainly regarding the proposed pulsar > > > catalog. > > > >> > > >> > > > > > > >> > > >> > > > - Can you provide some background of pulsar schema > > > >> registry > > > >> > and > > > >> > > >> how > > > >> > > >> > it > > > >> > > >> > > > works? > > > >> > > >> > > > - The proposed design of pulsar catalog is very > vague > > > now, > > > >> > can > > > >> > > >> you > > > >> > > >> > > > share some details of how a pulsar catalog would > work > > > >> > > internally? > > > >> > > >> > E.g. > > > >> > > >> > > > - which APIs does it support exactly? E.g. I see > > from > > > >> your > > > >> > > >> > > > prototype that table creation is supported but > not > > > >> > > alteration. > > > >> > > >> > > > - is it going to connect to a pulsar schema > > registry > > > >> via a > > > >> > > >> http > > > >> > > >> > > > client or a pulsar client, etc > > > >> > > >> > > > - will it be able to handle multiple versions of > > > >> pulsar, > > > >> > or > > > >> > > >> just > > > >> > > >> > > > one? How is compatibility handles between > different > > > >> > > >> Flink-Pulsar > > > >> > > >> > > versions? > > > >> > > >> > > > - will it support only reading from pulsar schema > > > >> > registry , > > > >> > > >> or > > > >> > > >> > > > both read/write? Will it work end-to-end in Flink > > SQL > > > >> for > > > >> > > >> users > > > >> > > >> > to > > > >> > > >> > > create > > > >> > > >> > > > and manipulate a pulsar table such as "CREATE > > TABLE t > > > >> WITH > > > >> > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > > > >> > > >> > > > - Is a pulsar topic always gonna be a > > non-partitioned > > > >> > table? > > > >> > > >> How > > > >> > > >> > is > > > >> > > >> > > > a partitioned topic mapped to a Flink table? > > > >> > > >> > > > - How to map Flink's catalog/database namespace to > > > >> pulsar's > > > >> > > >> > > > multi-tenant namespaces? I'm not very familiar with > > how > > > >> multi > > > >> > > >> > tenancy > > > >> > > >> > > works > > > >> > > >> > > > in pulsar, and some background context/use cases may > > > help > > > >> > here > > > >> > > >> too. > > > >> > > >> > > E.g. > > > >> > > >> > > > - can a pulsar client/consumer/producer be > > > >> multiple-tenant > > > >> > > at > > > >> > > >> the > > > >> > > >> > > > same time? > > > >> > > >> > > > - how does authentication work in pulsar's > > > >> multi-tenancy > > > >> > and > > > >> > > >> the > > > >> > > >> > > > catalog? asking since I didn't see the proposed > > > pulsar > > > >> > > catalog > > > >> > > >> > has > > > >> > > >> > > > username/password configs > > > >> > > >> > > > - the FLIP seems propose mapping a pulsar cluster > > and > > > >> > > >> > > > 'tenant/namespace' respectively to Flink's > > 'catalog' > > > >> and > > > >> > > >> > > 'database'. I > > > >> > > >> > > > wonder whether it totally makes sense, or should > we > > > >> > actually > > > >> > > >> map > > > >> > > >> > > "tenant" > > > >> > > >> > > > to "catalog", and "namespace" to "database"? > > > >> > > >> > > > > > > >> > > >> > > > Cheers, > > > >> > > >> > > > Bowen > > > >> > > >> > > > > > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > > > >> > > >> [hidden email]> > > > >> > > >> > > > wrote: > > > >> > > >> > > > > > > >> > > >> > > >> Hi everyone, > > > >> > > >> > > >> > > > >> > > >> > > >> Per discussion in the previous thread > > > >> > > >> > > >> < > > > >> > > >> > > >> > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > > > >> > > >> > > >> >, > > > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed > > > >> discussion > > > >> > on > > > >> > > >> the > > > >> > > >> > > Flink > > > >> > > >> > > >> Pulsar connector: > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > > > >> > > >> > > >> > > > >> > > >> > > >> In short, the connector has the following features: > > > >> > > >> > > >> > > > >> > > >> > > >> - > > > >> > > >> > > >> > > > >> > > >> > > >> Pulsar as a streaming source with exactly-once > > > guarantee. > > > >> > > >> > > >> - > > > >> > > >> > > >> > > > >> > > >> > > >> Sink streaming results to Pulsar with at-least-once > > > >> > semantics. > > > >> > > >> > > >> - > > > >> > > >> > > >> > > > >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > > > >> > > >> > > >> < > > > >> > > >> > > >> > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > >> > > >> > > >> > > > > >> > > >> > > >> ), and can automatically (de)serialize messages > with > > > the > > > >> > help > > > >> > > of > > > >> > > >> > > Pulsar > > > >> > > >> > > >> schema. > > > >> > > >> > > >> - > > > >> > > >> > > >> > > > >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > > > >> > > >> > > >> < > > > >> > > >> > > >> > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > > >> > > >> > > >> >), > > > >> > > >> > > >> which enables the use of Pulsar topics as tables in > > > Table > > > >> > API > > > >> > > as > > > >> > > >> > well > > > >> > > >> > > >> as > > > >> > > >> > > >> SQL client. > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > >> > > > > >> > > > > > > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > > > >> > > >> > > >> > > > >> > > >> > > >> > > > >> > > >> > > >> Would love to here your thoughts on this. > > > >> > > >> > > >> > > > >> > > >> > > >> Best, > > > >> > > >> > > >> Yijie > > > >> > > >> > > >> > > > >> > > >> > > > > > > >> > > >> > > > > > >> > > >> > > > > >> > > >> > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > > > > > > > |
Free forum by Nabble | Edit this page |