Hello everybody,
I started a new FLIP to discuss an HBaseCatalog implementation [1], following the related issue opened by Bowen [2].
I drafted a very simple version of the FLIP just to discuss the critical points (in red) and decide how to proceed.

Best,
Flavio

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-117%3A+HBase+catalog
[2] https://issues.apache.org/jira/browse/FLINK-16575

Thanks, Flavio, for driving this. Personally I am +1 for integrating HBase tables.

I'd like to start a new topic for discussion. It is related to this FLIP but is not its core.
In the FLIP, I can see:
- Does HBase support the concept of partitions..? I don't think so..
- Does HBase support functions? I don't think so..
- Does HBase support statistics? I don't think so..
- Does HBase support views? I don't think so..

The JDBC catalog [1] also has lots of UnsupportedOperationExceptions, and maybe they will come up again for a Confluent catalog.
So many UnsupportedOperationExceptions suggest that the catalog API is not a good fit for these catalogs...
So can we refactor the catalog API? I can see that a lot of catalogs only need to provide table information, without partitions, functions, statistics, views...

CC: @Dawid Wysakowicz <[hidden email]> @Bowen Li <[hidden email]>

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-93%3A+JDBC+catalog+and+Postgres+catalog

Best,
Jingsong Lee

On Sat, Mar 14, 2020 at 7:36 AM Flavio Pompermaier <[hidden email]> wrote:
> Hello everybody,
> I started a new FLIP to discuss an HBaseCatalog implementation [1],
> following the related issue opened by Bowen [2].
> I drafted a very simple version of the FLIP just to discuss the
> critical points (in red) and decide how to proceed.
>
> Best,
> Flavio
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-117%3A+HBase+catalog
> [2] https://issues.apache.org/jira/browse/FLINK-16575

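One possible shape for the refactoring asked about above is to split optional features into separate capability interfaces, so that a catalog which only provides table information never has to throw UnsupportedOperationException. The following is only a minimal sketch under that assumption. All type and method names (TableOnlyCatalog, SupportsViews, HBaseStyleCatalog, RelationalStyleCatalog, CapabilityCheck) are hypothetical and are not part of Flink's Catalog API, and nothing in this thread settles on this design.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical core contract: every catalog can at least list its tables.
// Illustrative only; this is NOT Flink's Catalog interface.
interface TableOnlyCatalog {
    List<String> listTables();
}

// Hypothetical optional capability, implemented only by catalogs that have views.
interface SupportsViews {
    List<String> listViews();
}

// A catalog for a schemaless store implements only the core contract,
// so it contains no method that has to throw UnsupportedOperationException.
final class HBaseStyleCatalog implements TableOnlyCatalog {
    private final List<String> tables;

    HBaseStyleCatalog(List<String> tables) {
        this.tables = Collections.unmodifiableList(new ArrayList<>(tables));
    }

    @Override
    public List<String> listTables() {
        return tables;
    }
}

// A relational-style catalog opts in to the extra capability.
final class RelationalStyleCatalog implements TableOnlyCatalog, SupportsViews {
    private final List<String> tables;
    private final List<String> views;

    RelationalStyleCatalog(List<String> tables, List<String> views) {
        this.tables = Collections.unmodifiableList(new ArrayList<>(tables));
        this.views = Collections.unmodifiableList(new ArrayList<>(views));
    }

    @Override
    public List<String> listTables() {
        return tables;
    }

    @Override
    public List<String> listViews() {
        return views;
    }
}

final class CapabilityCheck {
    private CapabilityCheck() {}

    // Callers probe for the capability instead of catching exceptions;
    // a catalog without views simply yields an empty list.
    static List<String> viewsOf(TableOnlyCatalog catalog) {
        if (catalog instanceof SupportsViews) {
            return ((SupportsViews) catalog).listViews();
        }
        return Collections.emptyList();
    }
}
```

With this split, CapabilityCheck.viewsOf(...) returns an empty list for HBaseStyleCatalog and the registered views for RelationalStyleCatalog. The trade-off is that callers must check for capabilities explicitly rather than relying on one wide interface.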
Hi,
I think the core of the JIRA right now is to investigate whether catalogs for schemaless systems like HBase and Elasticsearch bring practical value to users. I haven't used these SQL connectors before, and thus don't have much to say in this case. Can anyone describe how it would work? Maybe @Yu or @Zheng can chime in.

W.r.t. UnsupportedOperationException, it should be thrown only in targeted getters (e.g. getView(), getFunction()). General listing APIs like listView(), listFunction() should not throw it but just return empty results, so as not to break the user's SQL experience. To deduplicate code, such common implementations can be moved to AbstractCatalog to make the APIs look cleaner. I recall that there was an intention to refactor the catalog API signatures, but I haven't kept up with it.

Bowen

On Sun, Mar 15, 2020 at 10:19 PM Jingsong Li <[hidden email]> wrote:
> Thanks, Flavio, for driving this. Personally I am +1 for integrating HBase tables.
>
> I'd like to start a new topic for discussion. It is related to this FLIP
> but is not its core.
> In the FLIP, I can see:
> - Does HBase support the concept of partitions..? I don't think so..
> - Does HBase support functions? I don't think so..
> - Does HBase support statistics? I don't think so..
> - Does HBase support views? I don't think so..
>
> The JDBC catalog [1] also has lots of UnsupportedOperationExceptions,
> and maybe they will come up again for a Confluent catalog.
> So many UnsupportedOperationExceptions suggest that the catalog API is
> not a good fit for these catalogs...
> So can we refactor the catalog API? I can see that a lot of catalogs
> only need to provide table information, without partitions, functions,
> statistics, views...
>
> CC: @Dawid Wysakowicz <[hidden email]> @Bowen Li <[hidden email]>
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-93%3A+JDBC+catalog+and+Postgres+catalog
>
> Best,
> Jingsong Lee

Thanks for bringing up this discussion, Flavio. And thanks, Bowen, for the ping.

For me, I'm not quite sure whether an HBase catalog fits into the existing Catalog interface. The interface seems to be coupled to the SQL standard rather than to a more general database catalog [1], which is also reflected in the FLIP document, especially in the four questions below:
- Does HBase support the concept of partitions..? I don't think so..
- Does HBase support functions? I don't think so..
- Does HBase support statistics? I don't think so..
- Does HBase support views? I don't think so..

Partitions/Functions/Statistics/Views are all SQL concepts. Since HBase is a non-relational (NoSQL) database [2] [3], I don't think we could easily map these concepts, except for regions to partitions. You may find more concepts (such as functions [4], views [5]) aligned in Phoenix [6], which uses HBase as its backing store, but that's off track for this FLIP.

I'm also not sure whether it is still beneficial if only part of the concepts/methods can be matched/implemented, and I'd like to leave the decision to experts on the Table API/SQL modules.

And to be explicit, I'm just giving some input rather than casting a vote here (none of +1/+0/-1). Hopefully my input helps. Thanks.

Best Regards,
Yu

[1] https://en.wikipedia.org/wiki/Database_catalog
[2] https://en.wikipedia.org/wiki/Apache_HBase
[3] https://www.mail-archive.com/announce@.../msg05739.html
[4] https://phoenix.apache.org/language/functions.html
[5] https://hexdocs.pm/phoenix/views.html
[6] https://en.wikipedia.org/wiki/Apache_Phoenix

On Tue, 17 Mar 2020 at 01:10, Bowen Li <[hidden email]> wrote:
> Hi,
>
> I think the core of the JIRA right now is to investigate whether catalogs
> for schemaless systems like HBase and Elasticsearch bring practical value
> to users. I haven't used these SQL connectors before, and thus don't have
> much to say in this case. Can anyone describe how it would work? Maybe @Yu
> or @Zheng can chime in.
>
> W.r.t. UnsupportedOperationException, it should be thrown only in targeted
> getters (e.g. getView(), getFunction()). General listing APIs like
> listView(), listFunction() should not throw it but just return empty
> results, so as not to break the user's SQL experience. To deduplicate code,
> such common implementations can be moved to AbstractCatalog to make the
> APIs look cleaner. I recall that there was an intention to refactor the
> catalog API signatures, but I haven't kept up with it.
>
> Bowen

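The handling of unsupported operations that Bowen suggests in this thread (targeted getters throw, listing calls return empty results, and the shared defaults live in a common base class) could be sketched roughly as follows. This is a minimal illustration, not Flink code: SketchCatalog, AbstractSketchCatalog and HBaseSketchCatalog are hypothetical names, and the simplified String-based signatures are assumptions that do not match Flink's real Catalog or AbstractCatalog.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical, heavily simplified catalog contract. It is NOT Flink's Catalog
// interface; method names and String return types are assumptions.
interface SketchCatalog {
    List<String> listTables();

    List<String> listViews();

    List<String> listFunctions();

    String getView(String name);

    String getFunction(String name);
}

// Hypothetical base class that holds the shared defaults in one place,
// so concrete catalogs do not repeat them.
abstract class AbstractSketchCatalog implements SketchCatalog {

    // Listing calls return empty results instead of throwing,
    // so generic "list everything" callers keep working.
    @Override
    public List<String> listViews() {
        return Collections.emptyList();
    }

    @Override
    public List<String> listFunctions() {
        return Collections.emptyList();
    }

    // Targeted getters fail loudly, because the caller asked for
    // something this catalog cannot provide.
    @Override
    public String getView(String name) {
        throw new UnsupportedOperationException("Views are not supported: " + name);
    }

    @Override
    public String getFunction(String name) {
        throw new UnsupportedOperationException("Functions are not supported: " + name);
    }
}

// A concrete catalog for a store without views or functions
// only has to implement the table listing.
final class HBaseSketchCatalog extends AbstractSketchCatalog {
    private final List<String> tables;

    HBaseSketchCatalog(List<String> tables) {
        this.tables = Collections.unmodifiableList(new ArrayList<>(tables));
    }

    @Override
    public List<String> listTables() {
        return tables;
    }
}
```

In this sketch, calling listViews() on an HBaseSketchCatalog yields an empty list, while getView("v") throws UnsupportedOperationException. A catalog that does support views would override both methods.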