[DISCUSS] Overwrite and partition inserting support in 1.10

classic Classic list List threaded Threaded
16 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Hi Dev,

After cutting out the branch of 1.10, I tried the following functions of
SQL-CLI and found that it does not support:
- insert overwrite
- PARTITION (partcol1=val1, partcol2=val2 ...)
The SQL pattern is:
INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
partcol2=val2 ...) select_statement1 FROM from_statement;
It is a surprise to me.
The reason is that we only allow these two grammars in hive dialect. And
SQL-CLI does not have an interface to switch dialects.

Because it directly hinders the SQL-CLI's insert syntax in hive integration
and seriously hinders the practicability of SQL-CLI.
And we have introduced these two grammars in FLIP-63 [1] to Flink.
Here are my question:
1.Should we remove hive dialect limitation for these two grammars?
2.Should we fix this in 1.10?

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support

Best,
Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

bowen.li
Hi Jingsong,

Thanks a lot for reporting this issue.

IIRC, we added [INSERT OVERWRITE] and [PARTITION] clauses to support Hive
integration before FLIP-63 was proposed to introduce generic partition
support to Flink. Thus when we added these syntax, we were intentionally
conservative and limited their scope to Hive dialect. @Rui may help to
confirm that. I'm a bit surprised about it as well. As core APIs of FLIP-63
were done, I don't see why we would limit these syntax to Hive dialect
alone. It's just unfortunately that we may have forgot to revisit this
topic and we apparently missed some test cases on SQL CLI side. Sorry for
that.

From a product perspective, SQL CLI is super critical for Flink-Hive
integration and Flink SQL iteself. INSERT OVERWRITE and PARTITION are two
of the most commonly used syntax in Hive, the product wouldn't be useable
without such proper support.

Thus, I think this is a *bug*, we can stop limiting them in Hive dialect,
and should fix it in 1.10. We should also add more test coverage for SQL
CLI to avoid such surprise.

Cheers,
Bowen

On Wed, Dec 11, 2019 at 1:29 AM Jingsong Li <[hidden email]> wrote:

> Hi Dev,
>
> After cutting out the branch of 1.10, I tried the following functions of
> SQL-CLI and found that it does not support:
> - insert overwrite
> - PARTITION (partcol1=val1, partcol2=val2 ...)
> The SQL pattern is:
> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> partcol2=val2 ...) select_statement1 FROM from_statement;
> It is a surprise to me.
> The reason is that we only allow these two grammars in hive dialect. And
> SQL-CLI does not have an interface to switch dialects.
>
> Because it directly hinders the SQL-CLI's insert syntax in hive integration
> and seriously hinders the practicability of SQL-CLI.
> And we have introduced these two grammars in FLIP-63 [1] to Flink.
> Here are my question:
> 1.Should we remove hive dialect limitation for these two grammars?
> 2.Should we fix this in 1.10?
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>
> Best,
> Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jark Wu-2
Thanks Jingsong,

OVERWRITE and PARTITION are very fundamental features for Hive users.
I'm sorry to hear that it doesn't work in SQL CLI.

> Remove hive dialect limitation for these two grammars?
The dialect restriction is introduced on purpose, because OVERWRITE and
PARTITION syntax
are not SQL standard. Even in the discussion of FLIP-63, the community have
different opinion on
whether the partition fields should be declared in schema part or not [1].
The status of FLIP-63 is
that limit the proposed syntax in Hive dialect, and we may propose new
built-in syntax for creating
 partition tables in the future. So I'm -1 to removing dialect limitation.

> Should we fix this in 1.10?
From my point of view, the problem is that users can't switch dialects in
SQL CLI, because it is not
 exposed as a configuration. FLIP-89 [2] proposed to make TableConfig
configurable including the
 dialect configuration, i.g. `table.planner.sql-dialect`. So the only thing
we need to do is introducing
such a configuration, but this is definitely is not a *bug*. However,
considering it is a small change,
and from user perspective, I'm +1 to introducing such a configuration.

Best,
Jark

[1]:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-tp32770p33510.html
[2]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-89%3A+Improve+usability+of+TableConfig

On Thu, 12 Dec 2019 at 03:24, Bowen Li <[hidden email]> wrote:

> Hi Jingsong,
>
> Thanks a lot for reporting this issue.
>
> IIRC, we added [INSERT OVERWRITE] and [PARTITION] clauses to support Hive
> integration before FLIP-63 was proposed to introduce generic partition
> support to Flink. Thus when we added these syntax, we were intentionally
> conservative and limited their scope to Hive dialect. @Rui may help to
> confirm that. I'm a bit surprised about it as well. As core APIs of FLIP-63
> were done, I don't see why we would limit these syntax to Hive dialect
> alone. It's just unfortunately that we may have forgot to revisit this
> topic and we apparently missed some test cases on SQL CLI side. Sorry for
> that.
>
> From a product perspective, SQL CLI is super critical for Flink-Hive
> integration and Flink SQL iteself. INSERT OVERWRITE and PARTITION are two
> of the most commonly used syntax in Hive, the product wouldn't be useable
> without such proper support.
>
> Thus, I think this is a *bug*, we can stop limiting them in Hive dialect,
> and should fix it in 1.10. We should also add more test coverage for SQL
> CLI to avoid such surprise.
>
> Cheers,
> Bowen
>
> On Wed, Dec 11, 2019 at 1:29 AM Jingsong Li <[hidden email]>
> wrote:
>
> > Hi Dev,
> >
> > After cutting out the branch of 1.10, I tried the following functions of
> > SQL-CLI and found that it does not support:
> > - insert overwrite
> > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > The SQL pattern is:
> > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > partcol2=val2 ...) select_statement1 FROM from_statement;
> > It is a surprise to me.
> > The reason is that we only allow these two grammars in hive dialect. And
> > SQL-CLI does not have an interface to switch dialects.
> >
> > Because it directly hinders the SQL-CLI's insert syntax in hive
> integration
> > and seriously hinders the practicability of SQL-CLI.
> > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > Here are my question:
> > 1.Should we remove hive dialect limitation for these two grammars?
> > 2.Should we fix this in 1.10?
> >
> > [1]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >
> > Best,
> > Jingsong Lee
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Rui Li
+1 to fix it in 1.10. If this feature doesn't work via SQL CLI, I guess it
doesn't work for most Hive users.
+0 to remove the dialect check. I don't see much benefit this check can
bring to users, except that it prevents users from accidentally using some
Hive features, which doesn't seem to be very harmful anyway. But I don't
have a strong opinion for it and it probably needs more thorough discussion.


On Thu, Dec 12, 2019 at 7:49 AM Jark Wu <[hidden email]> wrote:

> Thanks Jingsong,
>
> OVERWRITE and PARTITION are very fundamental features for Hive users.
> I'm sorry to hear that it doesn't work in SQL CLI.
>
> > Remove hive dialect limitation for these two grammars?
> The dialect restriction is introduced on purpose, because OVERWRITE and
> PARTITION syntax
> are not SQL standard. Even in the discussion of FLIP-63, the community
> have different opinion on
> whether the partition fields should be declared in schema part or not [1].
> The status of FLIP-63 is
> that limit the proposed syntax in Hive dialect, and we may propose new
> built-in syntax for creating
>  partition tables in the future. So I'm -1 to removing dialect limitation.
>
> > Should we fix this in 1.10?
> From my point of view, the problem is that users can't switch dialects in
> SQL CLI, because it is not
>  exposed as a configuration. FLIP-89 [2] proposed to make TableConfig
> configurable including the
>  dialect configuration, i.g. `table.planner.sql-dialect`. So the only
> thing we need to do is introducing
> such a configuration, but this is definitely is not a *bug*. However,
> considering it is a small change,
> and from user perspective, I'm +1 to introducing such a configuration.
>
> Best,
> Jark
>
> [1]:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-tp32770p33510.html
> [2]:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-89%3A+Improve+usability+of+TableConfig
>
> On Thu, 12 Dec 2019 at 03:24, Bowen Li <[hidden email]> wrote:
>
>> Hi Jingsong,
>>
>> Thanks a lot for reporting this issue.
>>
>> IIRC, we added [INSERT OVERWRITE] and [PARTITION] clauses to support Hive
>> integration before FLIP-63 was proposed to introduce generic partition
>> support to Flink. Thus when we added these syntax, we were intentionally
>> conservative and limited their scope to Hive dialect. @Rui may help to
>> confirm that. I'm a bit surprised about it as well. As core APIs of
>> FLIP-63
>> were done, I don't see why we would limit these syntax to Hive dialect
>> alone. It's just unfortunately that we may have forgot to revisit this
>> topic and we apparently missed some test cases on SQL CLI side. Sorry for
>> that.
>>
>> From a product perspective, SQL CLI is super critical for Flink-Hive
>> integration and Flink SQL iteself. INSERT OVERWRITE and PARTITION are two
>> of the most commonly used syntax in Hive, the product wouldn't be useable
>> without such proper support.
>>
>> Thus, I think this is a *bug*, we can stop limiting them in Hive dialect,
>> and should fix it in 1.10. We should also add more test coverage for SQL
>> CLI to avoid such surprise.
>>
>> Cheers,
>> Bowen
>>
>> On Wed, Dec 11, 2019 at 1:29 AM Jingsong Li <[hidden email]>
>> wrote:
>>
>> > Hi Dev,
>> >
>> > After cutting out the branch of 1.10, I tried the following functions of
>> > SQL-CLI and found that it does not support:
>> > - insert overwrite
>> > - PARTITION (partcol1=val1, partcol2=val2 ...)
>> > The SQL pattern is:
>> > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
>> > partcol2=val2 ...) select_statement1 FROM from_statement;
>> > It is a surprise to me.
>> > The reason is that we only allow these two grammars in hive dialect. And
>> > SQL-CLI does not have an interface to switch dialects.
>> >
>> > Because it directly hinders the SQL-CLI's insert syntax in hive
>> integration
>> > and seriously hinders the practicability of SQL-CLI.
>> > And we have introduced these two grammars in FLIP-63 [1] to Flink.
>> > Here are my question:
>> > 1.Should we remove hive dialect limitation for these two grammars?
>> > 2.Should we fix this in 1.10?
>> >
>> > [1]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>> >
>> > Best,
>> > Jingsong Lee
>> >
>>
>

--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Hi Jark,

> The dialect restriction is introduced on purpose, because OVERWRITE and
PARTITION syntax are not SQL standard.

My understanding is that watermark [1] is also a non-standard grammar. We
can extend SQL standard syntax.

> Even in the discussion of FLIP-63, the community have different opinion
on whether the partition fields should be declared in schema part or not
[1]. The status of FLIP-63 is that limit the proposed syntax in Hive
dialect, and we may propose new built-in syntax for creating partition
tables in the future.

No, I'm not talking about creating partition tables. I'm talking about
inserting tables. FLIP-63 doesn't mean it's only for hive dialect. Flink
itself can need the support of partition table. I'm sure it will further
strengthen the work in 1.11.
I remember that we only had disputes on creating partition tables, so I
moved it to the chapter to be discussed, and limited it to hive dialect,
only creating partition tables, right?

> FLIP-89 proposed to make TableConfig configurable including the dialect
configuration, i.g. `table.planner.sql-dialect`. So the only thing we need
to do is introducing such a configuration.

I'm not sure we should introduce FLIP-89, and dialect has no other function
except for overwrite and partition inserting at present, I think they
should be Flink syntax.

And remove hive dialect limitation is a very small change/fix, Just remove
a few lines of check.

What do you think? (If you think so, after some discussion, we can start a
vote later.)

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-66%3A+Support+Time+Attribute+in+SQL+DDL

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 11:11 AM Rui Li <[hidden email]> wrote:

> +1 to fix it in 1.10. If this feature doesn't work via SQL CLI, I guess it
> doesn't work for most Hive users.
> +0 to remove the dialect check. I don't see much benefit this check can
> bring to users, except that it prevents users from accidentally using some
> Hive features, which doesn't seem to be very harmful anyway. But I don't
> have a strong opinion for it and it probably needs more thorough
> discussion.
>
>
> On Thu, Dec 12, 2019 at 7:49 AM Jark Wu <[hidden email]> wrote:
>
> > Thanks Jingsong,
> >
> > OVERWRITE and PARTITION are very fundamental features for Hive users.
> > I'm sorry to hear that it doesn't work in SQL CLI.
> >
> > > Remove hive dialect limitation for these two grammars?
> > The dialect restriction is introduced on purpose, because OVERWRITE and
> > PARTITION syntax
> > are not SQL standard. Even in the discussion of FLIP-63, the community
> > have different opinion on
> > whether the partition fields should be declared in schema part or not
> [1].
> > The status of FLIP-63 is
> > that limit the proposed syntax in Hive dialect, and we may propose new
> > built-in syntax for creating
> >  partition tables in the future. So I'm -1 to removing dialect
> limitation.
> >
> > > Should we fix this in 1.10?
> > From my point of view, the problem is that users can't switch dialects in
> > SQL CLI, because it is not
> >  exposed as a configuration. FLIP-89 [2] proposed to make TableConfig
> > configurable including the
> >  dialect configuration, i.g. `table.planner.sql-dialect`. So the only
> > thing we need to do is introducing
> > such a configuration, but this is definitely is not a *bug*. However,
> > considering it is a small change,
> > and from user perspective, I'm +1 to introducing such a configuration.
> >
> > Best,
> > Jark
> >
> > [1]:
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-63-Rework-table-partition-support-tp32770p33510.html
> > [2]:
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-89%3A+Improve+usability+of+TableConfig
> >
> > On Thu, 12 Dec 2019 at 03:24, Bowen Li <[hidden email]> wrote:
> >
> >> Hi Jingsong,
> >>
> >> Thanks a lot for reporting this issue.
> >>
> >> IIRC, we added [INSERT OVERWRITE] and [PARTITION] clauses to support
> Hive
> >> integration before FLIP-63 was proposed to introduce generic partition
> >> support to Flink. Thus when we added these syntax, we were intentionally
> >> conservative and limited their scope to Hive dialect. @Rui may help to
> >> confirm that. I'm a bit surprised about it as well. As core APIs of
> >> FLIP-63
> >> were done, I don't see why we would limit these syntax to Hive dialect
> >> alone. It's just unfortunately that we may have forgot to revisit this
> >> topic and we apparently missed some test cases on SQL CLI side. Sorry
> for
> >> that.
> >>
> >> From a product perspective, SQL CLI is super critical for Flink-Hive
> >> integration and Flink SQL iteself. INSERT OVERWRITE and PARTITION are
> two
> >> of the most commonly used syntax in Hive, the product wouldn't be
> useable
> >> without such proper support.
> >>
> >> Thus, I think this is a *bug*, we can stop limiting them in Hive
> dialect,
> >> and should fix it in 1.10. We should also add more test coverage for SQL
> >> CLI to avoid such surprise.
> >>
> >> Cheers,
> >> Bowen
> >>
> >> On Wed, Dec 11, 2019 at 1:29 AM Jingsong Li <[hidden email]>
> >> wrote:
> >>
> >> > Hi Dev,
> >> >
> >> > After cutting out the branch of 1.10, I tried the following functions
> of
> >> > SQL-CLI and found that it does not support:
> >> > - insert overwrite
> >> > - PARTITION (partcol1=val1, partcol2=val2 ...)
> >> > The SQL pattern is:
> >> > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> (partcol1=val1,
> >> > partcol2=val2 ...) select_statement1 FROM from_statement;
> >> > It is a surprise to me.
> >> > The reason is that we only allow these two grammars in hive dialect.
> And
> >> > SQL-CLI does not have an interface to switch dialects.
> >> >
> >> > Because it directly hinders the SQL-CLI's insert syntax in hive
> >> integration
> >> > and seriously hinders the practicability of SQL-CLI.
> >> > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> >> > Here are my question:
> >> > 1.Should we remove hive dialect limitation for these two grammars?
> >> > 2.Should we fix this in 1.10?
> >> >
> >> > [1]
> >> >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >> >
> >> > Best,
> >> > Jingsong Lee
> >> >
> >>
> >
>
> --
> Best regards!
> Rui Li
>


--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Danny Chan
In reply to this post by Jingsong Li
Thanks Jingsong for bring up this discussion ~

After reviewing FLIP-63, it seems that we have made a conclusion for the syntax

- INSERT OVERWRITE ...
- INSERT INTO … PARTITION

Which means that they should not have the Hive dialect limitation, so I’m inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that need to fix.

We did not make a conclusion for the syntax:

- CREATE TABLE … PARTITIONED BY ...

Which means that the behavior of it is under-discussion, so it is okey to be without the HIVE dialect limitation, we do not actually have any table sources/sinks that support such a DDL so for current code base, users should not be influenced by the behaviors change.

So I’m

+1 to remove the hive dialect limitations for INSERT OVERWRITE and INSERT PARTITION
+0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished yet, we better do this until FLIP-89 is resolved.

Best,
Danny Chan
在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:

> Hi Dev,
>
> After cutting out the branch of 1.10, I tried the following functions of
> SQL-CLI and found that it does not support:
> - insert overwrite
> - PARTITION (partcol1=val1, partcol2=val2 ...)
> The SQL pattern is:
> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> partcol2=val2 ...) select_statement1 FROM from_statement;
> It is a surprise to me.
> The reason is that we only allow these two grammars in hive dialect. And
> SQL-CLI does not have an interface to switch dialects.
>
> Because it directly hinders the SQL-CLI's insert syntax in hive integration
> and seriously hinders the practicability of SQL-CLI.
> And we have introduced these two grammars in FLIP-63 [1] to Flink.
> Here are my question:
> 1.Should we remove hive dialect limitation for these two grammars?
> 2.Should we fix this in 1.10?
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>
> Best,
> Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jark Wu-2
Hi jingsong,

Watermark is not a standard syntax, that's why we had a FLIP and long
discussion to add it to
Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
PARTITION syntax to
 Flink's own syntax,  we also need a FLIP or a VOTE, and this may can't
happen soon (we should
hear more people's opinions on this).

Regarding to the sql-dialect configuration, I was not saying to involve the
whole FLIP-89. I mean we can just
start a VOTE to expose it as `table.planner.sql-dialect` and include it in
1.10. The change can be very small, by
adding a ConfigOption and changing the implementation of
TableConfig#getSqlDialect/setSqlDialect. I believe
it is smaller and safer than changing the parser.

Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
should be aware of this.

Best,
Jark



On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]> wrote:

> Thanks Jingsong for bring up this discussion ~
>
> After reviewing FLIP-63, it seems that we have made a conclusion for the
> syntax
>
> - INSERT OVERWRITE ...
> - INSERT INTO … PARTITION
>
> Which means that they should not have the Hive dialect limitation, so I’m
> inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that need
> to fix.
>
> We did not make a conclusion for the syntax:
>
> - CREATE TABLE … PARTITIONED BY ...
>
> Which means that the behavior of it is under-discussion, so it is okey to
> be without the HIVE dialect limitation, we do not actually have any table
> sources/sinks that support such a DDL so for current code base, users
> should not be influenced by the behaviors change.
>
> So I’m
>
> +1 to remove the hive dialect limitations for INSERT OVERWRITE and INSERT
> PARTITION
> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
> yet, we better do this until FLIP-89 is resolved.
>
> Best,
> Danny Chan
> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
> > Hi Dev,
> >
> > After cutting out the branch of 1.10, I tried the following functions of
> > SQL-CLI and found that it does not support:
> > - insert overwrite
> > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > The SQL pattern is:
> > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > partcol2=val2 ...) select_statement1 FROM from_statement;
> > It is a surprise to me.
> > The reason is that we only allow these two grammars in hive dialect. And
> > SQL-CLI does not have an interface to switch dialects.
> >
> > Because it directly hinders the SQL-CLI's insert syntax in hive
> integration
> > and seriously hinders the practicability of SQL-CLI.
> > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > Here are my question:
> > 1.Should we remove hive dialect limitation for these two grammars?
> > 2.Should we fix this in 1.10?
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >
> > Best,
> > Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Hi Jark,

Let's recall FLIP-63,
We supported these syntax in hive dialect at 1.9. All of my reasons for
launching FLIP-63 are to bring partition support to Flink itself.
Not only batch, but also we have the need to stream jobs to write partition
files today, which is also one of our very important application scenarios.

The original intention of FLIP-63 is to bring all partition syntax to
Flink, but in the end you and I have some different opinion in creating
partition table, so our consensus is to leave it in hive dialect, only it.

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:

> Hi jingsong,
>
> Watermark is not a standard syntax, that's why we had a FLIP and long
> discussion to add it to
> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> PARTITION syntax to
>  Flink's own syntax,  we also need a FLIP or a VOTE, and this may can't
> happen soon (we should
> hear more people's opinions on this).
>
> Regarding to the sql-dialect configuration, I was not saying to involve the
> whole FLIP-89. I mean we can just
> start a VOTE to expose it as `table.planner.sql-dialect` and include it in
> 1.10. The change can be very small, by
> adding a ConfigOption and changing the implementation of
> TableConfig#getSqlDialect/setSqlDialect. I believe
> it is smaller and safer than changing the parser.
>
> Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
> should be aware of this.
>
> Best,
> Jark
>
>
>
> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]> wrote:
>
> > Thanks Jingsong for bring up this discussion ~
> >
> > After reviewing FLIP-63, it seems that we have made a conclusion for the
> > syntax
> >
> > - INSERT OVERWRITE ...
> > - INSERT INTO … PARTITION
> >
> > Which means that they should not have the Hive dialect limitation, so I’m
> > inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that
> need
> > to fix.
> >
> > We did not make a conclusion for the syntax:
> >
> > - CREATE TABLE … PARTITIONED BY ...
> >
> > Which means that the behavior of it is under-discussion, so it is okey to
> > be without the HIVE dialect limitation, we do not actually have any table
> > sources/sinks that support such a DDL so for current code base, users
> > should not be influenced by the behaviors change.
> >
> > So I’m
> >
> > +1 to remove the hive dialect limitations for INSERT OVERWRITE and INSERT
> > PARTITION
> > +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
> > yet, we better do this until FLIP-89 is resolved.
> >
> > Best,
> > Danny Chan
> > 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
> > > Hi Dev,
> > >
> > > After cutting out the branch of 1.10, I tried the following functions
> of
> > > SQL-CLI and found that it does not support:
> > > - insert overwrite
> > > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > > The SQL pattern is:
> > > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION (partcol1=val1,
> > > partcol2=val2 ...) select_statement1 FROM from_statement;
> > > It is a surprise to me.
> > > The reason is that we only allow these two grammars in hive dialect.
> And
> > > SQL-CLI does not have an interface to switch dialects.
> > >
> > > Because it directly hinders the SQL-CLI's insert syntax in hive
> > integration
> > > and seriously hinders the practicability of SQL-CLI.
> > > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > > Here are my question:
> > > 1.Should we remove hive dialect limitation for these two grammars?
> > > 2.Should we fix this in 1.10?
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >
> > > Best,
> > > Jingsong Lee
> >
>


--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jark Wu-2
Hi Jingsong,

Thanks for the explanation, I think I misunderstood your point at the
begining.
As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are added
to Flink's
SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
dialect.
However, the current implementation is opposite, INSERT OVERWRITE
and INSERT PARTITION are
under the dialect limitation, but CREATE PARTITION TABLE is not.

So it is indeed a bug which should be fixed in 1.10.

Best,
Jark

On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]> wrote:

> Hi Jark,
>
> Let's recall FLIP-63,
> We supported these syntax in hive dialect at 1.9. All of my reasons for
> launching FLIP-63 are to bring partition support to Flink itself.
> Not only batch, but also we have the need to stream jobs to write partition
> files today, which is also one of our very important application scenarios.
>
> The original intention of FLIP-63 is to bring all partition syntax to
> Flink, but in the end you and I have some different opinion in creating
> partition table, so our consensus is to leave it in hive dialect, only it.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>
> Best,
> Jingsong Lee
>
> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
>
> > Hi jingsong,
> >
> > Watermark is not a standard syntax, that's why we had a FLIP and long
> > discussion to add it to
> > Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> > PARTITION syntax to
> >  Flink's own syntax,  we also need a FLIP or a VOTE, and this may can't
> > happen soon (we should
> > hear more people's opinions on this).
> >
> > Regarding to the sql-dialect configuration, I was not saying to involve
> the
> > whole FLIP-89. I mean we can just
> > start a VOTE to expose it as `table.planner.sql-dialect` and include it
> in
> > 1.10. The change can be very small, by
> > adding a ConfigOption and changing the implementation of
> > TableConfig#getSqlDialect/setSqlDialect. I believe
> > it is smaller and safer than changing the parser.
> >
> > Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
> > should be aware of this.
> >
> > Best,
> > Jark
> >
> >
> >
> > On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]> wrote:
> >
> > > Thanks Jingsong for bring up this discussion ~
> > >
> > > After reviewing FLIP-63, it seems that we have made a conclusion for
> the
> > > syntax
> > >
> > > - INSERT OVERWRITE ...
> > > - INSERT INTO … PARTITION
> > >
> > > Which means that they should not have the Hive dialect limitation, so
> I’m
> > > inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that
> > need
> > > to fix.
> > >
> > > We did not make a conclusion for the syntax:
> > >
> > > - CREATE TABLE … PARTITIONED BY ...
> > >
> > > Which means that the behavior of it is under-discussion, so it is okey
> to
> > > be without the HIVE dialect limitation, we do not actually have any
> table
> > > sources/sinks that support such a DDL so for current code base, users
> > > should not be influenced by the behaviors change.
> > >
> > > So I’m
> > >
> > > +1 to remove the hive dialect limitations for INSERT OVERWRITE and
> INSERT
> > > PARTITION
> > > +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
> > > yet, we better do this until FLIP-89 is resolved.
> > >
> > > Best,
> > > Danny Chan
> > > 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
> > > > Hi Dev,
> > > >
> > > > After cutting out the branch of 1.10, I tried the following functions
> > of
> > > > SQL-CLI and found that it does not support:
> > > > - insert overwrite
> > > > - PARTITION (partcol1=val1, partcol2=val2 ...)
> > > > The SQL pattern is:
> > > > INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> (partcol1=val1,
> > > > partcol2=val2 ...) select_statement1 FROM from_statement;
> > > > It is a surprise to me.
> > > > The reason is that we only allow these two grammars in hive dialect.
> > And
> > > > SQL-CLI does not have an interface to switch dialects.
> > > >
> > > > Because it directly hinders the SQL-CLI's insert syntax in hive
> > > integration
> > > > and seriously hinders the practicability of SQL-CLI.
> > > > And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > > > Here are my question:
> > > > 1.Should we remove hive dialect limitation for these two grammars?
> > > > 2.Should we fix this in 1.10?
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > > >
> > > > Best,
> > > > Jingsong Lee
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Timo Walther-2
Hi Jingsong,

I will also add my opinion here for future discussions:

We had long discussions around SQL syntax in the past (e.g. for
WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
end all parties were happy and we came up with a good long-term solution
that is unlikely to be changed in the near future. IMHO we can differ
from the standard esp. for DDL statements (every SQL vendors has custom
syntax there) but still we should have time to hear many opinions.
However, for DQL and DML statements we need to be even more cautious
because here we enter SQL standard areas.

Happy to discuss a partion syntax for 1.11 and go with the solution that
Jark proposed using a config option.

Thanks,
Timo


On 12.12.19 09:40, Jark Wu wrote:

> Hi Jingsong,
>
> Thanks for the explanation, I think I misunderstood your point at the
> begining.
> As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are added
> to Flink's
> SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> dialect.
> However, the current implementation is opposite, INSERT OVERWRITE
> and INSERT PARTITION are
> under the dialect limitation, but CREATE PARTITION TABLE is not.
>
> So it is indeed a bug which should be fixed in 1.10.
>
> Best,
> Jark
>
> On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]> wrote:
>
>> Hi Jark,
>>
>> Let's recall FLIP-63,
>> We supported these syntax in hive dialect at 1.9. All of my reasons for
>> launching FLIP-63 are to bring partition support to Flink itself.
>> Not only batch, but also we have the need to stream jobs to write partition
>> files today, which is also one of our very important application scenarios.
>>
>> The original intention of FLIP-63 is to bring all partition syntax to
>> Flink, but in the end you and I have some different opinion in creating
>> partition table, so our consensus is to leave it in hive dialect, only it.
>>
>> [1]
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>>
>> Best,
>> Jingsong Lee
>>
>> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
>>
>>> Hi jingsong,
>>>
>>> Watermark is not a standard syntax, that's why we had a FLIP and long
>>> discussion to add it to
>>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
>>> PARTITION syntax to
>>>   Flink's own syntax,  we also need a FLIP or a VOTE, and this may can't
>>> happen soon (we should
>>> hear more people's opinions on this).
>>>
>>> Regarding to the sql-dialect configuration, I was not saying to involve
>> the
>>> whole FLIP-89. I mean we can just
>>> start a VOTE to expose it as `table.planner.sql-dialect` and include it
>> in
>>> 1.10. The change can be very small, by
>>> adding a ConfigOption and changing the implementation of
>>> TableConfig#getSqlDialect/setSqlDialect. I believe
>>> it is smaller and safer than changing the parser.
>>>
>>> Btw, I cc'ed Yu Li and Gary into the discussion, because release managers
>>> should be aware of this.
>>>
>>> Best,
>>> Jark
>>>
>>>
>>>
>>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]> wrote:
>>>
>>>> Thanks Jingsong for bring up this discussion ~
>>>>
>>>> After reviewing FLIP-63, it seems that we have made a conclusion for
>> the
>>>> syntax
>>>>
>>>> - INSERT OVERWRITE ...
>>>> - INSERT INTO … PARTITION
>>>>
>>>> Which means that they should not have the Hive dialect limitation, so
>> I’m
>>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that
>>> need
>>>> to fix.
>>>>
>>>> We did not make a conclusion for the syntax:
>>>>
>>>> - CREATE TABLE … PARTITIONED BY ...
>>>>
>>>> Which means that the behavior of it is under-discussion, so it is okey
>> to
>>>> be without the HIVE dialect limitation, we do not actually have any
>> table
>>>> sources/sinks that support such a DDL so for current code base, users
>>>> should not be influenced by the behaviors change.
>>>>
>>>> So I’m
>>>>
>>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE and
>> INSERT
>>>> PARTITION
>>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
>>>> yet, we better do this until FLIP-89 is resolved.
>>>>
>>>> Best,
>>>> Danny Chan
>>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
>>>>> Hi Dev,
>>>>>
>>>>> After cutting out the branch of 1.10, I tried the following functions
>>> of
>>>>> SQL-CLI and found that it does not support:
>>>>> - insert overwrite
>>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
>>>>> The SQL pattern is:
>>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
>> (partcol1=val1,
>>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
>>>>> It is a surprise to me.
>>>>> The reason is that we only allow these two grammars in hive dialect.
>>> And
>>>>> SQL-CLI does not have an interface to switch dialects.
>>>>>
>>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
>>>> integration
>>>>> and seriously hinders the practicability of SQL-CLI.
>>>>> And we have introduced these two grammars in FLIP-63 [1] to Flink.
>>>>> Here are my question:
>>>>> 1.Should we remove hive dialect limitation for these two grammars?
>>>>> 2.Should we fix this in 1.10?
>>>>>
>>>>> [1]
>>>>>
>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>>>>>
>>>>> Best,
>>>>> Jingsong Lee
>>>>
>>>
>>
>>
>> --
>> Best, Jingsong Lee
>>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Hi Timo,

I am OK if you think they are not bug and they should not be included in
1.10.

I think they have been accepted in FLIP-63. And there is no objection. It
has been more than three months since the discussion of FLIP-63. It's been
six months since Flink added these two syntaxs.

But I can also start discussion and vote thread for FLIP-63 again, to make
sure once again that everyone is happy.

Best,
Jingsong Lee

On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]> wrote:

> Hi Jingsong,
>
> I will also add my opinion here for future discussions:
>
> We had long discussions around SQL syntax in the past (e.g. for
> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
> end all parties were happy and we came up with a good long-term solution
> that is unlikely to be changed in the near future. IMHO we can differ
> from the standard esp. for DDL statements (every SQL vendors has custom
> syntax there) but still we should have time to hear many opinions.
> However, for DQL and DML statements we need to be even more cautious
> because here we enter SQL standard areas.
>
> Happy to discuss a partion syntax for 1.11 and go with the solution that
> Jark proposed using a config option.
>
> Thanks,
> Timo
>
>
> On 12.12.19 09:40, Jark Wu wrote:
> > Hi Jingsong,
> >
> > Thanks for the explanation, I think I misunderstood your point at the
> > begining.
> > As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
> added
> > to Flink's
> > SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> > dialect.
> > However, the current implementation is opposite, INSERT OVERWRITE
> > and INSERT PARTITION are
> > under the dialect limitation, but CREATE PARTITION TABLE is not.
> >
> > So it is indeed a bug which should be fixed in 1.10.
> >
> > Best,
> > Jark
> >
> > On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]>
> wrote:
> >
> >> Hi Jark,
> >>
> >> Let's recall FLIP-63,
> >> We supported these syntax in hive dialect at 1.9. All of my reasons for
> >> launching FLIP-63 are to bring partition support to Flink itself.
> >> Not only batch, but also we have the need to stream jobs to write
> partition
> >> files today, which is also one of our very important application
> scenarios.
> >>
> >> The original intention of FLIP-63 is to bring all partition syntax to
> >> Flink, but in the end you and I have some different opinion in creating
> >> partition table, so our consensus is to leave it in hive dialect, only
> it.
> >>
> >> [1]
> >>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
> >>
> >>> Hi jingsong,
> >>>
> >>> Watermark is not a standard syntax, that's why we had a FLIP and long
> >>> discussion to add it to
> >>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> >>> PARTITION syntax to
> >>>   Flink's own syntax,  we also need a FLIP or a VOTE, and this may
> can't
> >>> happen soon (we should
> >>> hear more people's opinions on this).
> >>>
> >>> Regarding to the sql-dialect configuration, I was not saying to involve
> >> the
> >>> whole FLIP-89. I mean we can just
> >>> start a VOTE to expose it as `table.planner.sql-dialect` and include it
> >> in
> >>> 1.10. The change can be very small, by
> >>> adding a ConfigOption and changing the implementation of
> >>> TableConfig#getSqlDialect/setSqlDialect. I believe
> >>> it is smaller and safer than changing the parser.
> >>>
> >>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> managers
> >>> should be aware of this.
> >>>
> >>> Best,
> >>> Jark
> >>>
> >>>
> >>>
> >>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]> wrote:
> >>>
> >>>> Thanks Jingsong for bring up this discussion ~
> >>>>
> >>>> After reviewing FLIP-63, it seems that we have made a conclusion for
> >> the
> >>>> syntax
> >>>>
> >>>> - INSERT OVERWRITE ...
> >>>> - INSERT INTO … PARTITION
> >>>>
> >>>> Which means that they should not have the Hive dialect limitation, so
> >> I’m
> >>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug” that
> >>> need
> >>>> to fix.
> >>>>
> >>>> We did not make a conclusion for the syntax:
> >>>>
> >>>> - CREATE TABLE … PARTITIONED BY ...
> >>>>
> >>>> Which means that the behavior of it is under-discussion, so it is okey
> >> to
> >>>> be without the HIVE dialect limitation, we do not actually have any
> >> table
> >>>> sources/sinks that support such a DDL so for current code base, users
> >>>> should not be influenced by the behaviors change.
> >>>>
> >>>> So I’m
> >>>>
> >>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE and
> >> INSERT
> >>>> PARTITION
> >>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not finished
> >>>> yet, we better do this until FLIP-89 is resolved.
> >>>>
> >>>> Best,
> >>>> Danny Chan
> >>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
> >>>>> Hi Dev,
> >>>>>
> >>>>> After cutting out the branch of 1.10, I tried the following functions
> >>> of
> >>>>> SQL-CLI and found that it does not support:
> >>>>> - insert overwrite
> >>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
> >>>>> The SQL pattern is:
> >>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> >> (partcol1=val1,
> >>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
> >>>>> It is a surprise to me.
> >>>>> The reason is that we only allow these two grammars in hive dialect.
> >>> And
> >>>>> SQL-CLI does not have an interface to switch dialects.
> >>>>>
> >>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
> >>>> integration
> >>>>> and seriously hinders the practicability of SQL-CLI.
> >>>>> And we have introduced these two grammars in FLIP-63 [1] to Flink.
> >>>>> Here are my question:
> >>>>> 1.Should we remove hive dialect limitation for these two grammars?
> >>>>> 2.Should we fix this in 1.10?
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >>>>>
> >>>>> Best,
> >>>>> Jingsong Lee
> >>>>
> >>>
> >>
> >>
> >> --
> >> Best, Jingsong Lee
> >>
> >
>
>

--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Rui Li
Hi Timo,

I understand we need further discussion about syntax/dialect for 1.11. But
as Jark has pointed out, the current implementation violates the accepted
design of FLIP-63, which IMO qualifies as a bug. Given that it's a bug and
has great impact on the usability of our Hive integration, do you think we
can fix it in 1.10?

On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li <[hidden email]> wrote:

> Hi Timo,
>
> I am OK if you think they are not bug and they should not be included in
> 1.10.
>
> I think they have been accepted in FLIP-63. And there is no objection. It
> has been more than three months since the discussion of FLIP-63. It's been
> six months since Flink added these two syntaxs.
>
> But I can also start discussion and vote thread for FLIP-63 again, to make
> sure once again that everyone is happy.
>
> Best,
> Jingsong Lee
>
> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]> wrote:
>
> > Hi Jingsong,
> >
> > I will also add my opinion here for future discussions:
> >
> > We had long discussions around SQL syntax in the past (e.g. for
> > WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
> > end all parties were happy and we came up with a good long-term solution
> > that is unlikely to be changed in the near future. IMHO we can differ
> > from the standard esp. for DDL statements (every SQL vendors has custom
> > syntax there) but still we should have time to hear many opinions.
> > However, for DQL and DML statements we need to be even more cautious
> > because here we enter SQL standard areas.
> >
> > Happy to discuss a partion syntax for 1.11 and go with the solution that
> > Jark proposed using a config option.
> >
> > Thanks,
> > Timo
> >
> >
> > On 12.12.19 09:40, Jark Wu wrote:
> > > Hi Jingsong,
> > >
> > > Thanks for the explanation, I think I misunderstood your point at the
> > > begining.
> > > As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
> > added
> > > to Flink's
> > > SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> > > dialect.
> > > However, the current implementation is opposite, INSERT OVERWRITE
> > > and INSERT PARTITION are
> > > under the dialect limitation, but CREATE PARTITION TABLE is not.
> > >
> > > So it is indeed a bug which should be fixed in 1.10.
> > >
> > > Best,
> > > Jark
> > >
> > > On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]>
> > wrote:
> > >
> > >> Hi Jark,
> > >>
> > >> Let's recall FLIP-63,
> > >> We supported these syntax in hive dialect at 1.9. All of my reasons
> for
> > >> launching FLIP-63 are to bring partition support to Flink itself.
> > >> Not only batch, but also we have the need to stream jobs to write
> > partition
> > >> files today, which is also one of our very important application
> > scenarios.
> > >>
> > >> The original intention of FLIP-63 is to bring all partition syntax to
> > >> Flink, but in the end you and I have some different opinion in
> creating
> > >> partition table, so our consensus is to leave it in hive dialect, only
> > it.
> > >>
> > >> [1]
> > >>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >>
> > >> Best,
> > >> Jingsong Lee
> > >>
> > >> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
> > >>
> > >>> Hi jingsong,
> > >>>
> > >>> Watermark is not a standard syntax, that's why we had a FLIP and long
> > >>> discussion to add it to
> > >>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> > >>> PARTITION syntax to
> > >>>   Flink's own syntax,  we also need a FLIP or a VOTE, and this may
> > can't
> > >>> happen soon (we should
> > >>> hear more people's opinions on this).
> > >>>
> > >>> Regarding to the sql-dialect configuration, I was not saying to
> involve
> > >> the
> > >>> whole FLIP-89. I mean we can just
> > >>> start a VOTE to expose it as `table.planner.sql-dialect` and include
> it
> > >> in
> > >>> 1.10. The change can be very small, by
> > >>> adding a ConfigOption and changing the implementation of
> > >>> TableConfig#getSqlDialect/setSqlDialect. I believe
> > >>> it is smaller and safer than changing the parser.
> > >>>
> > >>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> > managers
> > >>> should be aware of this.
> > >>>
> > >>> Best,
> > >>> Jark
> > >>>
> > >>>
> > >>>
> > >>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]>
> wrote:
> > >>>
> > >>>> Thanks Jingsong for bring up this discussion ~
> > >>>>
> > >>>> After reviewing FLIP-63, it seems that we have made a conclusion for
> > >> the
> > >>>> syntax
> > >>>>
> > >>>> - INSERT OVERWRITE ...
> > >>>> - INSERT INTO … PARTITION
> > >>>>
> > >>>> Which means that they should not have the Hive dialect limitation,
> so
> > >> I’m
> > >>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug”
> that
> > >>> need
> > >>>> to fix.
> > >>>>
> > >>>> We did not make a conclusion for the syntax:
> > >>>>
> > >>>> - CREATE TABLE … PARTITIONED BY ...
> > >>>>
> > >>>> Which means that the behavior of it is under-discussion, so it is
> okey
> > >> to
> > >>>> be without the HIVE dialect limitation, we do not actually have any
> > >> table
> > >>>> sources/sinks that support such a DDL so for current code base,
> users
> > >>>> should not be influenced by the behaviors change.
> > >>>>
> > >>>> So I’m
> > >>>>
> > >>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE and
> > >> INSERT
> > >>>> PARTITION
> > >>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not
> finished
> > >>>> yet, we better do this until FLIP-89 is resolved.
> > >>>>
> > >>>> Best,
> > >>>> Danny Chan
> > >>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
> > >>>>> Hi Dev,
> > >>>>>
> > >>>>> After cutting out the branch of 1.10, I tried the following
> functions
> > >>> of
> > >>>>> SQL-CLI and found that it does not support:
> > >>>>> - insert overwrite
> > >>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
> > >>>>> The SQL pattern is:
> > >>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> > >> (partcol1=val1,
> > >>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
> > >>>>> It is a surprise to me.
> > >>>>> The reason is that we only allow these two grammars in hive
> dialect.
> > >>> And
> > >>>>> SQL-CLI does not have an interface to switch dialects.
> > >>>>>
> > >>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
> > >>>> integration
> > >>>>> and seriously hinders the practicability of SQL-CLI.
> > >>>>> And we have introduced these two grammars in FLIP-63 [1] to Flink.
> > >>>>> Here are my question:
> > >>>>> 1.Should we remove hive dialect limitation for these two grammars?
> > >>>>> 2.Should we fix this in 1.10?
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >>>>>
> > >>>>> Best,
> > >>>>> Jingsong Lee
> > >>>>
> > >>>
> > >>
> > >>
> > >> --
> > >> Best, Jingsong Lee
> > >>
> > >
> >
> >
>
> --
> Best, Jingsong Lee
>


--
Best regards!
Rui Li
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Timo Walther-2
Hi everyone,

sorry, I was not aware that FLIP-63 already lists a lot of additional
SQL grammar. It was accepted though an official voting process so I
guess we can adopt the listed grammar for Flink SQL.

The only thing that confuses me is the mentioning of `The DDL can like
this (With hive dialect)`, for the next time we should make it more
explicit what belongs to the Flink SQL dialect and what belongs to the
Hive dialect. Also the not intuitive `MSCK REPAIR TABLE` seems strange
to me. What does MSCK stand for? Maybe we should skip stuff like that
for now.

The current problem for the CLI is that it still does not use the
flink-sql-parser. We can support all new syntax in Flink 1.11 once the
CLI architecture has changed there. With the current architecture (using
simple regex), I'm not sure if we can support all the grammar listed in
the FLIP.

Regards,
Timo

On 13.12.19 04:22, Rui Li wrote:

> Hi Timo,
>
> I understand we need further discussion about syntax/dialect for 1.11. But
> as Jark has pointed out, the current implementation violates the accepted
> design of FLIP-63, which IMO qualifies as a bug. Given that it's a bug and
> has great impact on the usability of our Hive integration, do you think we
> can fix it in 1.10?
>
> On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li <[hidden email]> wrote:
>
>> Hi Timo,
>>
>> I am OK if you think they are not bug and they should not be included in
>> 1.10.
>>
>> I think they have been accepted in FLIP-63. And there is no objection. It
>> has been more than three months since the discussion of FLIP-63. It's been
>> six months since Flink added these two syntaxs.
>>
>> But I can also start discussion and vote thread for FLIP-63 again, to make
>> sure once again that everyone is happy.
>>
>> Best,
>> Jingsong Lee
>>
>> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]> wrote:
>>
>>> Hi Jingsong,
>>>
>>> I will also add my opinion here for future discussions:
>>>
>>> We had long discussions around SQL syntax in the past (e.g. for
>>> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in the
>>> end all parties were happy and we came up with a good long-term solution
>>> that is unlikely to be changed in the near future. IMHO we can differ
>>> from the standard esp. for DDL statements (every SQL vendors has custom
>>> syntax there) but still we should have time to hear many opinions.
>>> However, for DQL and DML statements we need to be even more cautious
>>> because here we enter SQL standard areas.
>>>
>>> Happy to discuss a partion syntax for 1.11 and go with the solution that
>>> Jark proposed using a config option.
>>>
>>> Thanks,
>>> Timo
>>>
>>>
>>> On 12.12.19 09:40, Jark Wu wrote:
>>>> Hi Jingsong,
>>>>
>>>> Thanks for the explanation, I think I misunderstood your point at the
>>>> begining.
>>>> As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
>>> added
>>>> to Flink's
>>>> SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
>>>> dialect.
>>>> However, the current implementation is opposite, INSERT OVERWRITE
>>>> and INSERT PARTITION are
>>>> under the dialect limitation, but CREATE PARTITION TABLE is not.
>>>>
>>>> So it is indeed a bug which should be fixed in 1.10.
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]>
>>> wrote:
>>>>
>>>>> Hi Jark,
>>>>>
>>>>> Let's recall FLIP-63,
>>>>> We supported these syntax in hive dialect at 1.9. All of my reasons
>> for
>>>>> launching FLIP-63 are to bring partition support to Flink itself.
>>>>> Not only batch, but also we have the need to stream jobs to write
>>> partition
>>>>> files today, which is also one of our very important application
>>> scenarios.
>>>>>
>>>>> The original intention of FLIP-63 is to bring all partition syntax to
>>>>> Flink, but in the end you and I have some different opinion in
>> creating
>>>>> partition table, so our consensus is to leave it in hive dialect, only
>>> it.
>>>>>
>>>>> [1]
>>>>>
>>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>>>>>
>>>>> Best,
>>>>> Jingsong Lee
>>>>>
>>>>> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
>>>>>
>>>>>> Hi jingsong,
>>>>>>
>>>>>> Watermark is not a standard syntax, that's why we had a FLIP and long
>>>>>> discussion to add it to
>>>>>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
>>>>>> PARTITION syntax to
>>>>>>    Flink's own syntax,  we also need a FLIP or a VOTE, and this may
>>> can't
>>>>>> happen soon (we should
>>>>>> hear more people's opinions on this).
>>>>>>
>>>>>> Regarding to the sql-dialect configuration, I was not saying to
>> involve
>>>>> the
>>>>>> whole FLIP-89. I mean we can just
>>>>>> start a VOTE to expose it as `table.planner.sql-dialect` and include
>> it
>>>>> in
>>>>>> 1.10. The change can be very small, by
>>>>>> adding a ConfigOption and changing the implementation of
>>>>>> TableConfig#getSqlDialect/setSqlDialect. I believe
>>>>>> it is smaller and safer than changing the parser.
>>>>>>
>>>>>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
>>> managers
>>>>>> should be aware of this.
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]>
>> wrote:
>>>>>>
>>>>>>> Thanks Jingsong for bring up this discussion ~
>>>>>>>
>>>>>>> After reviewing FLIP-63, it seems that we have made a conclusion for
>>>>> the
>>>>>>> syntax
>>>>>>>
>>>>>>> - INSERT OVERWRITE ...
>>>>>>> - INSERT INTO … PARTITION
>>>>>>>
>>>>>>> Which means that they should not have the Hive dialect limitation,
>> so
>>>>> I’m
>>>>>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug”
>> that
>>>>>> need
>>>>>>> to fix.
>>>>>>>
>>>>>>> We did not make a conclusion for the syntax:
>>>>>>>
>>>>>>> - CREATE TABLE … PARTITIONED BY ...
>>>>>>>
>>>>>>> Which means that the behavior of it is under-discussion, so it is
>> okey
>>>>> to
>>>>>>> be without the HIVE dialect limitation, we do not actually have any
>>>>> table
>>>>>>> sources/sinks that support such a DDL so for current code base,
>> users
>>>>>>> should not be influenced by the behaviors change.
>>>>>>>
>>>>>>> So I’m
>>>>>>>
>>>>>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE and
>>>>> INSERT
>>>>>>> PARTITION
>>>>>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not
>> finished
>>>>>>> yet, we better do this until FLIP-89 is resolved.
>>>>>>>
>>>>>>> Best,
>>>>>>> Danny Chan
>>>>>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]>,写道:
>>>>>>>> Hi Dev,
>>>>>>>>
>>>>>>>> After cutting out the branch of 1.10, I tried the following
>> functions
>>>>>> of
>>>>>>>> SQL-CLI and found that it does not support:
>>>>>>>> - insert overwrite
>>>>>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
>>>>>>>> The SQL pattern is:
>>>>>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
>>>>> (partcol1=val1,
>>>>>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
>>>>>>>> It is a surprise to me.
>>>>>>>> The reason is that we only allow these two grammars in hive
>> dialect.
>>>>>> And
>>>>>>>> SQL-CLI does not have an interface to switch dialects.
>>>>>>>>
>>>>>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
>>>>>>> integration
>>>>>>>> and seriously hinders the practicability of SQL-CLI.
>>>>>>>> And we have introduced these two grammars in FLIP-63 [1] to Flink.
>>>>>>>> Here are my question:
>>>>>>>> 1.Should we remove hive dialect limitation for these two grammars?
>>>>>>>> 2.Should we fix this in 1.10?
>>>>>>>>
>>>>>>>> [1]
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jingsong Lee
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Best, Jingsong Lee
>>>>>
>>>>
>>>
>>>
>>
>> --
>> Best, Jingsong Lee
>>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Hi Timo,

Thanks for your feedback.

The reason of `The DDL can like this (With hive dialect)` is:
The syntax of creating partition table is controversial, so we think we
should put it aside for the time being to make it invisible to users. Since
we implemented this syntax in 1.9, we decided to put it under hive dialect.
(Subsequent votes will be taken to determine.)
MSCK means metastore check, It is listed in the document mainly for the
integrity of the story. Of course, it can be ignored. I fully agree that it
can be discussed slowly in the future.

>The current problem for the CLI is that it still does not use the
flink-sql-parser. We can support all new syntax in Flink 1.11 once the CLI
architecture has changed there. With the current architecture (using simple
regex), I'm not sure if we can support all the grammar listed in the FLIP.

You are right, we can not support any DDL grammars in FLIP. we should wait
refactor of SQL-CLI in 1.11.
What we can do is to remove the restriction of hiveDialect for the insert
(overwrite and partition) syntax. That's what we've supported, and so far
we haven't received any objections.

Best,
Jingsong Lee

On Fri, Dec 13, 2019 at 4:51 PM Timo Walther <[hidden email]> wrote:

> Hi everyone,
>
> sorry, I was not aware that FLIP-63 already lists a lot of additional
> SQL grammar. It was accepted though an official voting process so I
> guess we can adopt the listed grammar for Flink SQL.
>
> The only thing that confuses me is the mentioning of `The DDL can like
> this (With hive dialect)`, for the next time we should make it more
> explicit what belongs to the Flink SQL dialect and what belongs to the
> Hive dialect. Also the not intuitive `MSCK REPAIR TABLE` seems strange
> to me. What does MSCK stand for? Maybe we should skip stuff like that
> for now.
>
> The current problem for the CLI is that it still does not use the
> flink-sql-parser. We can support all new syntax in Flink 1.11 once the
> CLI architecture has changed there. With the current architecture (using
> simple regex), I'm not sure if we can support all the grammar listed in
> the FLIP.
>
> Regards,
> Timo
>
> On 13.12.19 04:22, Rui Li wrote:
> > Hi Timo,
> >
> > I understand we need further discussion about syntax/dialect for 1.11.
> But
> > as Jark has pointed out, the current implementation violates the accepted
> > design of FLIP-63, which IMO qualifies as a bug. Given that it's a bug
> and
> > has great impact on the usability of our Hive integration, do you think
> we
> > can fix it in 1.10?
> >
> > On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li <[hidden email]>
> wrote:
> >
> >> Hi Timo,
> >>
> >> I am OK if you think they are not bug and they should not be included in
> >> 1.10.
> >>
> >> I think they have been accepted in FLIP-63. And there is no objection.
> It
> >> has been more than three months since the discussion of FLIP-63. It's
> been
> >> six months since Flink added these two syntaxs.
> >>
> >> But I can also start discussion and vote thread for FLIP-63 again, to
> make
> >> sure once again that everyone is happy.
> >>
> >> Best,
> >> Jingsong Lee
> >>
> >> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]>
> wrote:
> >>
> >>> Hi Jingsong,
> >>>
> >>> I will also add my opinion here for future discussions:
> >>>
> >>> We had long discussions around SQL syntax in the past (e.g. for
> >>> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in
> the
> >>> end all parties were happy and we came up with a good long-term
> solution
> >>> that is unlikely to be changed in the near future. IMHO we can differ
> >>> from the standard esp. for DDL statements (every SQL vendors has custom
> >>> syntax there) but still we should have time to hear many opinions.
> >>> However, for DQL and DML statements we need to be even more cautious
> >>> because here we enter SQL standard areas.
> >>>
> >>> Happy to discuss a partion syntax for 1.11 and go with the solution
> that
> >>> Jark proposed using a config option.
> >>>
> >>> Thanks,
> >>> Timo
> >>>
> >>>
> >>> On 12.12.19 09:40, Jark Wu wrote:
> >>>> Hi Jingsong,
> >>>>
> >>>> Thanks for the explanation, I think I misunderstood your point at the
> >>>> begining.
> >>>> As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax are
> >>> added
> >>>> to Flink's
> >>>> SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> >>>> dialect.
> >>>> However, the current implementation is opposite, INSERT OVERWRITE
> >>>> and INSERT PARTITION are
> >>>> under the dialect limitation, but CREATE PARTITION TABLE is not.
> >>>>
> >>>> So it is indeed a bug which should be fixed in 1.10.
> >>>>
> >>>> Best,
> >>>> Jark
> >>>>
> >>>> On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]>
> >>> wrote:
> >>>>
> >>>>> Hi Jark,
> >>>>>
> >>>>> Let's recall FLIP-63,
> >>>>> We supported these syntax in hive dialect at 1.9. All of my reasons
> >> for
> >>>>> launching FLIP-63 are to bring partition support to Flink itself.
> >>>>> Not only batch, but also we have the need to stream jobs to write
> >>> partition
> >>>>> files today, which is also one of our very important application
> >>> scenarios.
> >>>>>
> >>>>> The original intention of FLIP-63 is to bring all partition syntax to
> >>>>> Flink, but in the end you and I have some different opinion in
> >> creating
> >>>>> partition table, so our consensus is to leave it in hive dialect,
> only
> >>> it.
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >>>>>
> >>>>> Best,
> >>>>> Jingsong Lee
> >>>>>
> >>>>> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
> >>>>>
> >>>>>> Hi jingsong,
> >>>>>>
> >>>>>> Watermark is not a standard syntax, that's why we had a FLIP and
> long
> >>>>>> discussion to add it to
> >>>>>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> >>>>>> PARTITION syntax to
> >>>>>>    Flink's own syntax,  we also need a FLIP or a VOTE, and this may
> >>> can't
> >>>>>> happen soon (we should
> >>>>>> hear more people's opinions on this).
> >>>>>>
> >>>>>> Regarding to the sql-dialect configuration, I was not saying to
> >> involve
> >>>>> the
> >>>>>> whole FLIP-89. I mean we can just
> >>>>>> start a VOTE to expose it as `table.planner.sql-dialect` and include
> >> it
> >>>>> in
> >>>>>> 1.10. The change can be very small, by
> >>>>>> adding a ConfigOption and changing the implementation of
> >>>>>> TableConfig#getSqlDialect/setSqlDialect. I believe
> >>>>>> it is smaller and safer than changing the parser.
> >>>>>>
> >>>>>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> >>> managers
> >>>>>> should be aware of this.
> >>>>>>
> >>>>>> Best,
> >>>>>> Jark
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]>
> >> wrote:
> >>>>>>
> >>>>>>> Thanks Jingsong for bring up this discussion ~
> >>>>>>>
> >>>>>>> After reviewing FLIP-63, it seems that we have made a conclusion
> for
> >>>>> the
> >>>>>>> syntax
> >>>>>>>
> >>>>>>> - INSERT OVERWRITE ...
> >>>>>>> - INSERT INTO … PARTITION
> >>>>>>>
> >>>>>>> Which means that they should not have the Hive dialect limitation,
> >> so
> >>>>> I’m
> >>>>>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug”
> >> that
> >>>>>> need
> >>>>>>> to fix.
> >>>>>>>
> >>>>>>> We did not make a conclusion for the syntax:
> >>>>>>>
> >>>>>>> - CREATE TABLE … PARTITIONED BY ...
> >>>>>>>
> >>>>>>> Which means that the behavior of it is under-discussion, so it is
> >> okey
> >>>>> to
> >>>>>>> be without the HIVE dialect limitation, we do not actually have any
> >>>>> table
> >>>>>>> sources/sinks that support such a DDL so for current code base,
> >> users
> >>>>>>> should not be influenced by the behaviors change.
> >>>>>>>
> >>>>>>> So I’m
> >>>>>>>
> >>>>>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE and
> >>>>> INSERT
> >>>>>>> PARTITION
> >>>>>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not
> >> finished
> >>>>>>> yet, we better do this until FLIP-89 is resolved.
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Danny Chan
> >>>>>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]
> >,写道:
> >>>>>>>> Hi Dev,
> >>>>>>>>
> >>>>>>>> After cutting out the branch of 1.10, I tried the following
> >> functions
> >>>>>> of
> >>>>>>>> SQL-CLI and found that it does not support:
> >>>>>>>> - insert overwrite
> >>>>>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
> >>>>>>>> The SQL pattern is:
> >>>>>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> >>>>> (partcol1=val1,
> >>>>>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
> >>>>>>>> It is a surprise to me.
> >>>>>>>> The reason is that we only allow these two grammars in hive
> >> dialect.
> >>>>>> And
> >>>>>>>> SQL-CLI does not have an interface to switch dialects.
> >>>>>>>>
> >>>>>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
> >>>>>>> integration
> >>>>>>>> and seriously hinders the practicability of SQL-CLI.
> >>>>>>>> And we have introduced these two grammars in FLIP-63 [1] to Flink.
> >>>>>>>> Here are my question:
> >>>>>>>> 1.Should we remove hive dialect limitation for these two grammars?
> >>>>>>>> 2.Should we fix this in 1.10?
> >>>>>>>>
> >>>>>>>> [1]
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jingsong Lee
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> Best, Jingsong Lee
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >> --
> >> Best, Jingsong Lee
> >>
> >
> >
>
>

--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Xuefu Z
Thanks all for the healthy discussions. I'd just like to point out a light
difference between standard and standard compatibility. Most of DB vendors
meant the latter  when they claim following a sql standard. However, it
doesn't mean they don't have any syntax beyond the standard grammar.
Extension is very common.

Therefore, I think Flink SQL can also have more freedom in extending the
standard grammar and adopting non-standard grammars such as "INSERT
OVERWRITE" that comes from Hive. While Hive has a lot of good cases like
this, we are also free to reject or put for a specific dialect anything
that's too specific or doesn't make sense in Flink. To me, "NSERT
OVERWRITE" can be adopted, which "MSCK REPAIR TABLE" can be rejected or put
as "Hive dailect".

Lastly, if FLIP is officially accepted, I agree that we should respect and
fix things as bugs accordingly. If there is a second thought or debate on
anything, a vote is required to overturn it. Before that, the original FLIP
holds.

Thanks,
Xuefu


On Fri, Dec 13, 2019 at 1:20 AM Jingsong Li <[hidden email]> wrote:

> Hi Timo,
>
> Thanks for your feedback.
>
> The reason of `The DDL can like this (With hive dialect)` is:
> The syntax of creating partition table is controversial, so we think we
> should put it aside for the time being to make it invisible to users. Since
> we implemented this syntax in 1.9, we decided to put it under hive dialect.
> (Subsequent votes will be taken to determine.)
> MSCK means metastore check, It is listed in the document mainly for the
> integrity of the story. Of course, it can be ignored. I fully agree that it
> can be discussed slowly in the future.
>
> >The current problem for the CLI is that it still does not use the
> flink-sql-parser. We can support all new syntax in Flink 1.11 once the CLI
> architecture has changed there. With the current architecture (using simple
> regex), I'm not sure if we can support all the grammar listed in the FLIP.
>
> You are right, we can not support any DDL grammars in FLIP. we should wait
> refactor of SQL-CLI in 1.11.
> What we can do is to remove the restriction of hiveDialect for the insert
> (overwrite and partition) syntax. That's what we've supported, and so far
> we haven't received any objections.
>
> Best,
> Jingsong Lee
>
> On Fri, Dec 13, 2019 at 4:51 PM Timo Walther <[hidden email]> wrote:
>
> > Hi everyone,
> >
> > sorry, I was not aware that FLIP-63 already lists a lot of additional
> > SQL grammar. It was accepted though an official voting process so I
> > guess we can adopt the listed grammar for Flink SQL.
> >
> > The only thing that confuses me is the mentioning of `The DDL can like
> > this (With hive dialect)`, for the next time we should make it more
> > explicit what belongs to the Flink SQL dialect and what belongs to the
> > Hive dialect. Also the not intuitive `MSCK REPAIR TABLE` seems strange
> > to me. What does MSCK stand for? Maybe we should skip stuff like that
> > for now.
> >
> > The current problem for the CLI is that it still does not use the
> > flink-sql-parser. We can support all new syntax in Flink 1.11 once the
> > CLI architecture has changed there. With the current architecture (using
> > simple regex), I'm not sure if we can support all the grammar listed in
> > the FLIP.
> >
> > Regards,
> > Timo
> >
> > On 13.12.19 04:22, Rui Li wrote:
> > > Hi Timo,
> > >
> > > I understand we need further discussion about syntax/dialect for 1.11.
> > But
> > > as Jark has pointed out, the current implementation violates the
> accepted
> > > design of FLIP-63, which IMO qualifies as a bug. Given that it's a bug
> > and
> > > has great impact on the usability of our Hive integration, do you think
> > we
> > > can fix it in 1.10?
> > >
> > > On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li <[hidden email]>
> > wrote:
> > >
> > >> Hi Timo,
> > >>
> > >> I am OK if you think they are not bug and they should not be included
> in
> > >> 1.10.
> > >>
> > >> I think they have been accepted in FLIP-63. And there is no objection.
> > It
> > >> has been more than three months since the discussion of FLIP-63. It's
> > been
> > >> six months since Flink added these two syntaxs.
> > >>
> > >> But I can also start discussion and vote thread for FLIP-63 again, to
> > make
> > >> sure once again that everyone is happy.
> > >>
> > >> Best,
> > >> Jingsong Lee
> > >>
> > >> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]>
> > wrote:
> > >>
> > >>> Hi Jingsong,
> > >>>
> > >>> I will also add my opinion here for future discussions:
> > >>>
> > >>> We had long discussions around SQL syntax in the past (e.g. for
> > >>> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but in
> > the
> > >>> end all parties were happy and we came up with a good long-term
> > solution
> > >>> that is unlikely to be changed in the near future. IMHO we can differ
> > >>> from the standard esp. for DDL statements (every SQL vendors has
> custom
> > >>> syntax there) but still we should have time to hear many opinions.
> > >>> However, for DQL and DML statements we need to be even more cautious
> > >>> because here we enter SQL standard areas.
> > >>>
> > >>> Happy to discuss a partion syntax for 1.11 and go with the solution
> > that
> > >>> Jark proposed using a config option.
> > >>>
> > >>> Thanks,
> > >>> Timo
> > >>>
> > >>>
> > >>> On 12.12.19 09:40, Jark Wu wrote:
> > >>>> Hi Jingsong,
> > >>>>
> > >>>> Thanks for the explanation, I think I misunderstood your point at
> the
> > >>>> begining.
> > >>>> As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax
> are
> > >>> added
> > >>>> to Flink's
> > >>>> SQL syntax, but CREATE PARTITION TABLE should be limited under Hive
> > >>>> dialect.
> > >>>> However, the current implementation is opposite, INSERT OVERWRITE
> > >>>> and INSERT PARTITION are
> > >>>> under the dialect limitation, but CREATE PARTITION TABLE is not.
> > >>>>
> > >>>> So it is indeed a bug which should be fixed in 1.10.
> > >>>>
> > >>>> Best,
> > >>>> Jark
> > >>>>
> > >>>> On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]>
> > >>> wrote:
> > >>>>
> > >>>>> Hi Jark,
> > >>>>>
> > >>>>> Let's recall FLIP-63,
> > >>>>> We supported these syntax in hive dialect at 1.9. All of my reasons
> > >> for
> > >>>>> launching FLIP-63 are to bring partition support to Flink itself.
> > >>>>> Not only batch, but also we have the need to stream jobs to write
> > >>> partition
> > >>>>> files today, which is also one of our very important application
> > >>> scenarios.
> > >>>>>
> > >>>>> The original intention of FLIP-63 is to bring all partition syntax
> to
> > >>>>> Flink, but in the end you and I have some different opinion in
> > >> creating
> > >>>>> partition table, so our consensus is to leave it in hive dialect,
> > only
> > >>> it.
> > >>>>>
> > >>>>> [1]
> > >>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >>>>>
> > >>>>> Best,
> > >>>>> Jingsong Lee
> > >>>>>
> > >>>>> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]> wrote:
> > >>>>>
> > >>>>>> Hi jingsong,
> > >>>>>>
> > >>>>>> Watermark is not a standard syntax, that's why we had a FLIP and
> > long
> > >>>>>> discussion to add it to
> > >>>>>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE and
> > >>>>>> PARTITION syntax to
> > >>>>>>    Flink's own syntax,  we also need a FLIP or a VOTE, and this
> may
> > >>> can't
> > >>>>>> happen soon (we should
> > >>>>>> hear more people's opinions on this).
> > >>>>>>
> > >>>>>> Regarding to the sql-dialect configuration, I was not saying to
> > >> involve
> > >>>>> the
> > >>>>>> whole FLIP-89. I mean we can just
> > >>>>>> start a VOTE to expose it as `table.planner.sql-dialect` and
> include
> > >> it
> > >>>>> in
> > >>>>>> 1.10. The change can be very small, by
> > >>>>>> adding a ConfigOption and changing the implementation of
> > >>>>>> TableConfig#getSqlDialect/setSqlDialect. I believe
> > >>>>>> it is smaller and safer than changing the parser.
> > >>>>>>
> > >>>>>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> > >>> managers
> > >>>>>> should be aware of this.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Jark
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]>
> > >> wrote:
> > >>>>>>
> > >>>>>>> Thanks Jingsong for bring up this discussion ~
> > >>>>>>>
> > >>>>>>> After reviewing FLIP-63, it seems that we have made a conclusion
> > for
> > >>>>> the
> > >>>>>>> syntax
> > >>>>>>>
> > >>>>>>> - INSERT OVERWRITE ...
> > >>>>>>> - INSERT INTO … PARTITION
> > >>>>>>>
> > >>>>>>> Which means that they should not have the Hive dialect
> limitation,
> > >> so
> > >>>>> I’m
> > >>>>>>> inclined that the behaviors for SQL-CLI is unexpected, or a “bug”
> > >> that
> > >>>>>> need
> > >>>>>>> to fix.
> > >>>>>>>
> > >>>>>>> We did not make a conclusion for the syntax:
> > >>>>>>>
> > >>>>>>> - CREATE TABLE … PARTITIONED BY ...
> > >>>>>>>
> > >>>>>>> Which means that the behavior of it is under-discussion, so it is
> > >> okey
> > >>>>> to
> > >>>>>>> be without the HIVE dialect limitation, we do not actually have
> any
> > >>>>> table
> > >>>>>>> sources/sinks that support such a DDL so for current code base,
> > >> users
> > >>>>>>> should not be influenced by the behaviors change.
> > >>>>>>>
> > >>>>>>> So I’m
> > >>>>>>>
> > >>>>>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE
> and
> > >>>>> INSERT
> > >>>>>>> PARTITION
> > >>>>>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not
> > >> finished
> > >>>>>>> yet, we better do this until FLIP-89 is resolved.
> > >>>>>>>
> > >>>>>>> Best,
> > >>>>>>> Danny Chan
> > >>>>>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]
> > >,写道:
> > >>>>>>>> Hi Dev,
> > >>>>>>>>
> > >>>>>>>> After cutting out the branch of 1.10, I tried the following
> > >> functions
> > >>>>>> of
> > >>>>>>>> SQL-CLI and found that it does not support:
> > >>>>>>>> - insert overwrite
> > >>>>>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
> > >>>>>>>> The SQL pattern is:
> > >>>>>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> > >>>>> (partcol1=val1,
> > >>>>>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
> > >>>>>>>> It is a surprise to me.
> > >>>>>>>> The reason is that we only allow these two grammars in hive
> > >> dialect.
> > >>>>>> And
> > >>>>>>>> SQL-CLI does not have an interface to switch dialects.
> > >>>>>>>>
> > >>>>>>>> Because it directly hinders the SQL-CLI's insert syntax in hive
> > >>>>>>> integration
> > >>>>>>>> and seriously hinders the practicability of SQL-CLI.
> > >>>>>>>> And we have introduced these two grammars in FLIP-63 [1] to
> Flink.
> > >>>>>>>> Here are my question:
> > >>>>>>>> 1.Should we remove hive dialect limitation for these two
> grammars?
> > >>>>>>>> 2.Should we fix this in 1.10?
> > >>>>>>>>
> > >>>>>>>> [1]
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>
> > >>
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > >>>>>>>>
> > >>>>>>>> Best,
> > >>>>>>>> Jingsong Lee
> > >>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>>
> > >>>>> --
> > >>>>> Best, Jingsong Lee
> > >>>>>
> > >>>>
> > >>>
> > >>>
> > >>
> > >> --
> > >> Best, Jingsong Lee
> > >>
> > >
> > >
> >
> >
>
> --
> Best, Jingsong Lee
>


--
Xuefu Zhang

"In Honey We Trust!"
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] Overwrite and partition inserting support in 1.10

Jingsong Li
Thanks all,

As your suggestion, I'd like to do:
- FLIP-63: move "CREATE TABLE ... PARTITIONED BY" and "MSCK REPAIR TABLE"
to further discussion chapter.
- Remove hive dialect limitation for supported "INSERT OVERWRITE" and
"INSERT ... PARTITION(...)".
- Limit "CREATE TABLE ... PARTITIONED BY" to hive dialect.
Thanks everyone again.

Best,
Jingsong Lee

On Sat, Dec 14, 2019 at 2:41 AM Xuefu Z <[hidden email]> wrote:

> Thanks all for the healthy discussions. I'd just like to point out a light
> difference between standard and standard compatibility. Most of DB vendors
> meant the latter  when they claim following a sql standard. However, it
> doesn't mean they don't have any syntax beyond the standard grammar.
> Extension is very common.
>
> Therefore, I think Flink SQL can also have more freedom in extending the
> standard grammar and adopting non-standard grammars such as "INSERT
> OVERWRITE" that comes from Hive. While Hive has a lot of good cases like
> this, we are also free to reject or put for a specific dialect anything
> that's too specific or doesn't make sense in Flink. To me, "NSERT
> OVERWRITE" can be adopted, which "MSCK REPAIR TABLE" can be rejected or put
> as "Hive dailect".
>
> Lastly, if FLIP is officially accepted, I agree that we should respect and
> fix things as bugs accordingly. If there is a second thought or debate on
> anything, a vote is required to overturn it. Before that, the original FLIP
> holds.
>
> Thanks,
> Xuefu
>
>
> On Fri, Dec 13, 2019 at 1:20 AM Jingsong Li <[hidden email]>
> wrote:
>
> > Hi Timo,
> >
> > Thanks for your feedback.
> >
> > The reason of `The DDL can like this (With hive dialect)` is:
> > The syntax of creating partition table is controversial, so we think we
> > should put it aside for the time being to make it invisible to users.
> Since
> > we implemented this syntax in 1.9, we decided to put it under hive
> dialect.
> > (Subsequent votes will be taken to determine.)
> > MSCK means metastore check, It is listed in the document mainly for the
> > integrity of the story. Of course, it can be ignored. I fully agree that
> it
> > can be discussed slowly in the future.
> >
> > >The current problem for the CLI is that it still does not use the
> > flink-sql-parser. We can support all new syntax in Flink 1.11 once the
> CLI
> > architecture has changed there. With the current architecture (using
> simple
> > regex), I'm not sure if we can support all the grammar listed in the
> FLIP.
> >
> > You are right, we can not support any DDL grammars in FLIP. we should
> wait
> > refactor of SQL-CLI in 1.11.
> > What we can do is to remove the restriction of hiveDialect for the insert
> > (overwrite and partition) syntax. That's what we've supported, and so far
> > we haven't received any objections.
> >
> > Best,
> > Jingsong Lee
> >
> > On Fri, Dec 13, 2019 at 4:51 PM Timo Walther <[hidden email]> wrote:
> >
> > > Hi everyone,
> > >
> > > sorry, I was not aware that FLIP-63 already lists a lot of additional
> > > SQL grammar. It was accepted though an official voting process so I
> > > guess we can adopt the listed grammar for Flink SQL.
> > >
> > > The only thing that confuses me is the mentioning of `The DDL can like
> > > this (With hive dialect)`, for the next time we should make it more
> > > explicit what belongs to the Flink SQL dialect and what belongs to the
> > > Hive dialect. Also the not intuitive `MSCK REPAIR TABLE` seems strange
> > > to me. What does MSCK stand for? Maybe we should skip stuff like that
> > > for now.
> > >
> > > The current problem for the CLI is that it still does not use the
> > > flink-sql-parser. We can support all new syntax in Flink 1.11 once the
> > > CLI architecture has changed there. With the current architecture
> (using
> > > simple regex), I'm not sure if we can support all the grammar listed in
> > > the FLIP.
> > >
> > > Regards,
> > > Timo
> > >
> > > On 13.12.19 04:22, Rui Li wrote:
> > > > Hi Timo,
> > > >
> > > > I understand we need further discussion about syntax/dialect for
> 1.11.
> > > But
> > > > as Jark has pointed out, the current implementation violates the
> > accepted
> > > > design of FLIP-63, which IMO qualifies as a bug. Given that it's a
> bug
> > > and
> > > > has great impact on the usability of our Hive integration, do you
> think
> > > we
> > > > can fix it in 1.10?
> > > >
> > > > On Fri, Dec 13, 2019 at 12:24 AM Jingsong Li <[hidden email]
> >
> > > wrote:
> > > >
> > > >> Hi Timo,
> > > >>
> > > >> I am OK if you think they are not bug and they should not be
> included
> > in
> > > >> 1.10.
> > > >>
> > > >> I think they have been accepted in FLIP-63. And there is no
> objection.
> > > It
> > > >> has been more than three months since the discussion of FLIP-63.
> It's
> > > been
> > > >> six months since Flink added these two syntaxs.
> > > >>
> > > >> But I can also start discussion and vote thread for FLIP-63 again,
> to
> > > make
> > > >> sure once again that everyone is happy.
> > > >>
> > > >> Best,
> > > >> Jingsong Lee
> > > >>
> > > >> On Thu, Dec 12, 2019 at 11:35 PM Timo Walther <[hidden email]>
> > > wrote:
> > > >>
> > > >>> Hi Jingsong,
> > > >>>
> > > >>> I will also add my opinion here for future discussions:
> > > >>>
> > > >>> We had long discussions around SQL syntax in the past (e.g. for
> > > >>> WATERMARK or the concept of SYSTEM/TEMPORARY catalog objects) but
> in
> > > the
> > > >>> end all parties were happy and we came up with a good long-term
> > > solution
> > > >>> that is unlikely to be changed in the near future. IMHO we can
> differ
> > > >>> from the standard esp. for DDL statements (every SQL vendors has
> > custom
> > > >>> syntax there) but still we should have time to hear many opinions.
> > > >>> However, for DQL and DML statements we need to be even more
> cautious
> > > >>> because here we enter SQL standard areas.
> > > >>>
> > > >>> Happy to discuss a partion syntax for 1.11 and go with the solution
> > > that
> > > >>> Jark proposed using a config option.
> > > >>>
> > > >>> Thanks,
> > > >>> Timo
> > > >>>
> > > >>>
> > > >>> On 12.12.19 09:40, Jark Wu wrote:
> > > >>>> Hi Jingsong,
> > > >>>>
> > > >>>> Thanks for the explanation, I think I misunderstood your point at
> > the
> > > >>>> begining.
> > > >>>> As FLIP-63 proposed, INSERT OVERWRITE and INSERT PARTITION syntax
> > are
> > > >>> added
> > > >>>> to Flink's
> > > >>>> SQL syntax, but CREATE PARTITION TABLE should be limited under
> Hive
> > > >>>> dialect.
> > > >>>> However, the current implementation is opposite, INSERT OVERWRITE
> > > >>>> and INSERT PARTITION are
> > > >>>> under the dialect limitation, but CREATE PARTITION TABLE is not.
> > > >>>>
> > > >>>> So it is indeed a bug which should be fixed in 1.10.
> > > >>>>
> > > >>>> Best,
> > > >>>> Jark
> > > >>>>
> > > >>>> On Thu, 12 Dec 2019 at 16:35, Jingsong Li <[hidden email]
> >
> > > >>> wrote:
> > > >>>>
> > > >>>>> Hi Jark,
> > > >>>>>
> > > >>>>> Let's recall FLIP-63,
> > > >>>>> We supported these syntax in hive dialect at 1.9. All of my
> reasons
> > > >> for
> > > >>>>> launching FLIP-63 are to bring partition support to Flink itself.
> > > >>>>> Not only batch, but also we have the need to stream jobs to write
> > > >>> partition
> > > >>>>> files today, which is also one of our very important application
> > > >>> scenarios.
> > > >>>>>
> > > >>>>> The original intention of FLIP-63 is to bring all partition
> syntax
> > to
> > > >>>>> Flink, but in the end you and I have some different opinion in
> > > >> creating
> > > >>>>> partition table, so our consensus is to leave it in hive dialect,
> > > only
> > > >>> it.
> > > >>>>>
> > > >>>>> [1]
> > > >>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > > >>>>>
> > > >>>>> Best,
> > > >>>>> Jingsong Lee
> > > >>>>>
> > > >>>>> On Thu, Dec 12, 2019 at 4:05 PM Jark Wu <[hidden email]>
> wrote:
> > > >>>>>
> > > >>>>>> Hi jingsong,
> > > >>>>>>
> > > >>>>>> Watermark is not a standard syntax, that's why we had a FLIP and
> > > long
> > > >>>>>> discussion to add it to
> > > >>>>>> Flink's SQL syntax. I think if we want to add INSERT OVERWRITE
> and
> > > >>>>>> PARTITION syntax to
> > > >>>>>>    Flink's own syntax,  we also need a FLIP or a VOTE, and this
> > may
> > > >>> can't
> > > >>>>>> happen soon (we should
> > > >>>>>> hear more people's opinions on this).
> > > >>>>>>
> > > >>>>>> Regarding to the sql-dialect configuration, I was not saying to
> > > >> involve
> > > >>>>> the
> > > >>>>>> whole FLIP-89. I mean we can just
> > > >>>>>> start a VOTE to expose it as `table.planner.sql-dialect` and
> > include
> > > >> it
> > > >>>>> in
> > > >>>>>> 1.10. The change can be very small, by
> > > >>>>>> adding a ConfigOption and changing the implementation of
> > > >>>>>> TableConfig#getSqlDialect/setSqlDialect. I believe
> > > >>>>>> it is smaller and safer than changing the parser.
> > > >>>>>>
> > > >>>>>> Btw, I cc'ed Yu Li and Gary into the discussion, because release
> > > >>> managers
> > > >>>>>> should be aware of this.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Jark
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> On Thu, 12 Dec 2019 at 11:47, Danny Chan <[hidden email]>
> > > >> wrote:
> > > >>>>>>
> > > >>>>>>> Thanks Jingsong for bring up this discussion ~
> > > >>>>>>>
> > > >>>>>>> After reviewing FLIP-63, it seems that we have made a
> conclusion
> > > for
> > > >>>>> the
> > > >>>>>>> syntax
> > > >>>>>>>
> > > >>>>>>> - INSERT OVERWRITE ...
> > > >>>>>>> - INSERT INTO … PARTITION
> > > >>>>>>>
> > > >>>>>>> Which means that they should not have the Hive dialect
> > limitation,
> > > >> so
> > > >>>>> I’m
> > > >>>>>>> inclined that the behaviors for SQL-CLI is unexpected, or a
> “bug”
> > > >> that
> > > >>>>>> need
> > > >>>>>>> to fix.
> > > >>>>>>>
> > > >>>>>>> We did not make a conclusion for the syntax:
> > > >>>>>>>
> > > >>>>>>> - CREATE TABLE … PARTITIONED BY ...
> > > >>>>>>>
> > > >>>>>>> Which means that the behavior of it is under-discussion, so it
> is
> > > >> okey
> > > >>>>> to
> > > >>>>>>> be without the HIVE dialect limitation, we do not actually have
> > any
> > > >>>>> table
> > > >>>>>>> sources/sinks that support such a DDL so for current code base,
> > > >> users
> > > >>>>>>> should not be influenced by the behaviors change.
> > > >>>>>>>
> > > >>>>>>> So I’m
> > > >>>>>>>
> > > >>>>>>> +1 to remove the hive dialect limitations for INSERT OVERWRITE
> > and
> > > >>>>> INSERT
> > > >>>>>>> PARTITION
> > > >>>>>>> +0 to add yaml dialect conf to SQL-CLI because FLIP-89 is not
> > > >> finished
> > > >>>>>>> yet, we better do this until FLIP-89 is resolved.
> > > >>>>>>>
> > > >>>>>>> Best,
> > > >>>>>>> Danny Chan
> > > >>>>>>> 在 2019年12月11日 +0800 PM5:29,Jingsong Li <[hidden email]
> > > >,写道:
> > > >>>>>>>> Hi Dev,
> > > >>>>>>>>
> > > >>>>>>>> After cutting out the branch of 1.10, I tried the following
> > > >> functions
> > > >>>>>> of
> > > >>>>>>>> SQL-CLI and found that it does not support:
> > > >>>>>>>> - insert overwrite
> > > >>>>>>>> - PARTITION (partcol1=val1, partcol2=val2 ...)
> > > >>>>>>>> The SQL pattern is:
> > > >>>>>>>> INSERT { INTO | OVERWRITE } TABLE tablename1 [PARTITION
> > > >>>>> (partcol1=val1,
> > > >>>>>>>> partcol2=val2 ...) select_statement1 FROM from_statement;
> > > >>>>>>>> It is a surprise to me.
> > > >>>>>>>> The reason is that we only allow these two grammars in hive
> > > >> dialect.
> > > >>>>>> And
> > > >>>>>>>> SQL-CLI does not have an interface to switch dialects.
> > > >>>>>>>>
> > > >>>>>>>> Because it directly hinders the SQL-CLI's insert syntax in
> hive
> > > >>>>>>> integration
> > > >>>>>>>> and seriously hinders the practicability of SQL-CLI.
> > > >>>>>>>> And we have introduced these two grammars in FLIP-63 [1] to
> > Flink.
> > > >>>>>>>> Here are my question:
> > > >>>>>>>> 1.Should we remove hive dialect limitation for these two
> > grammars?
> > > >>>>>>>> 2.Should we fix this in 1.10?
> > > >>>>>>>>
> > > >>>>>>>> [1]
> > > >>>>>>>>
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>
> > > >>
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-63%3A+Rework+table+partition+support
> > > >>>>>>>>
> > > >>>>>>>> Best,
> > > >>>>>>>> Jingsong Lee
> > > >>>>>>>
> > > >>>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> --
> > > >>>>> Best, Jingsong Lee
> > > >>>>>
> > > >>>>
> > > >>>
> > > >>>
> > > >>
> > > >> --
> > > >> Best, Jingsong Lee
> > > >>
> > > >
> > > >
> > >
> > >
> >
> > --
> > Best, Jingsong Lee
> >
>
>
> --
> Xuefu Zhang
>
> "In Honey We Trust!"
>


--
Best, Jingsong Lee