[DISCUSS] FLIP-84: Improve & Refactor API of Table Module

classic Classic list List threaded Threaded
19 messages Options
Reply | Threaded
Open this post in threaded view
|

[DISCUSS] FLIP-84: Improve & Refactor API of Table Module

Terry Wang
Hi everyone,

TableEnvironment has provided two `Table sqlQuery(String sql)` and `void sqlUpdate(String sql)` interfaces to create a table(actually a view here) or describe an update action from one sql string.
But with more use cases come up, there are some fatal shortcomings in current API design. Such as  `sqlUpdate()` don’t support get a return value and buggy support for buffer sql exception and so on.

So I’d like to kick off a discussion on improvement and refactor the api of table module:

google doc: https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing <https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing>
Flip link: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878 <https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878>

In short, it:
        - Discuss buffering sql execute problem
        - Discuss current `sqlQuery/sqlUpdate` and propose two new api
        - Introduce one new `executeBatch` method to support batch sql execute
        - Discuss how SQL CLI should deal with multiple statements

Looking forward to all your guys comments.

Best,
Terry Wang



Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor API of Table Module

Kurt Young
Thanks Terry for bringing this up. TableEnv's interface is really critical
not only
to users, but also for components built upon it like SQL CLI. Your proposal
solved some pain points we currently have, so +1 to the proposal.

I left some comments in the document.

Best,
Kurt


On Thu, Oct 31, 2019 at 10:38 AM Terry Wang <[hidden email]> wrote:

> Hi everyone,
>
> TableEnvironment has provided two `Table sqlQuery(String sql)` and `void
> sqlUpdate(String sql)` interfaces to create a table(actually a view here)
> or describe an update action from one sql string.
> But with more use cases come up, there are some fatal shortcomings in
> current API design. Such as  `sqlUpdate()` don’t support get a return value
> and buggy support for buffer sql exception and so on.
>
> So I’d like to kick off a discussion on improvement and refactor the api
> of table module:
>
> google doc:
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
> <
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
> >
> Flip link:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> <
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> >
>
> In short, it:
>         - Discuss buffering sql execute problem
>         - Discuss current `sqlQuery/sqlUpdate` and propose two new api
>         - Introduce one new `executeBatch` method to support batch sql
> execute
>         - Discuss how SQL CLI should deal with multiple statements
>
> Looking forward to all your guys comments.
>
> Best,
> Terry Wang
>
>
>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor API of Table Module

Kurt Young
cc @Fabian here, thought you might be interesting to review this.

Best,
Kurt


On Thu, Oct 31, 2019 at 1:39 PM Kurt Young <[hidden email]> wrote:

> Thanks Terry for bringing this up. TableEnv's interface is really critical
> not only
> to users, but also for components built upon it like SQL CLI. Your
> proposal
> solved some pain points we currently have, so +1 to the proposal.
>
> I left some comments in the document.
>
> Best,
> Kurt
>
>
> On Thu, Oct 31, 2019 at 10:38 AM Terry Wang <[hidden email]> wrote:
>
>> Hi everyone,
>>
>> TableEnvironment has provided two `Table sqlQuery(String sql)` and `void
>> sqlUpdate(String sql)` interfaces to create a table(actually a view here)
>> or describe an update action from one sql string.
>> But with more use cases come up, there are some fatal shortcomings in
>> current API design. Such as  `sqlUpdate()` don’t support get a return value
>> and buggy support for buffer sql exception and so on.
>>
>> So I’d like to kick off a discussion on improvement and refactor the api
>> of table module:
>>
>> google doc:
>> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
>> <
>> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
>> >
>> Flip link:
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
>> <
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
>> >
>>
>> In short, it:
>>         - Discuss buffering sql execute problem
>>         - Discuss current `sqlQuery/sqlUpdate` and propose two new api
>>         - Introduce one new `executeBatch` method to support batch sql
>> execute
>>         - Discuss how SQL CLI should deal with multiple statements
>>
>> Looking forward to all your guys comments.
>>
>> Best,
>> Terry Wang
>>
>>
>>
>>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor API of Table Module

Jark Wu-2
Hi Terry,

I would suggest to change the title a bit.
For example, "Improve & Refactor TableEnvironment APIs".
Or more specifically, "Improve & Refactor TableEnvironment
execute/sqlQuery/sqlUpdate.. APIs"

Currently, the title is a little wide (there are so many APIs in table
module) .
Make the title more specifically can attract more people who care about it.

Best,
Jark



On Tue, 5 Nov 2019 at 14:51, Kurt Young <[hidden email]> wrote:

> cc @Fabian here, thought you might be interesting to review this.
>
> Best,
> Kurt
>
>
> On Thu, Oct 31, 2019 at 1:39 PM Kurt Young <[hidden email]> wrote:
>
> > Thanks Terry for bringing this up. TableEnv's interface is really
> critical
> > not only
> > to users, but also for components built upon it like SQL CLI. Your
> > proposal
> > solved some pain points we currently have, so +1 to the proposal.
> >
> > I left some comments in the document.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Oct 31, 2019 at 10:38 AM Terry Wang <[hidden email]> wrote:
> >
> >> Hi everyone,
> >>
> >> TableEnvironment has provided two `Table sqlQuery(String sql)` and `void
> >> sqlUpdate(String sql)` interfaces to create a table(actually a view
> here)
> >> or describe an update action from one sql string.
> >> But with more use cases come up, there are some fatal shortcomings in
> >> current API design. Such as  `sqlUpdate()` don’t support get a return
> value
> >> and buggy support for buffer sql exception and so on.
> >>
> >> So I’d like to kick off a discussion on improvement and refactor the api
> >> of table module:
> >>
> >> google doc:
> >>
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
> >> <
> >>
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
> >> >
> >> Flip link:
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> >> <
> >>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
> >> >
> >>
> >> In short, it:
> >>         - Discuss buffering sql execute problem
> >>         - Discuss current `sqlQuery/sqlUpdate` and propose two new api
> >>         - Introduce one new `executeBatch` method to support batch sql
> >> execute
> >>         - Discuss how SQL CLI should deal with multiple statements
> >>
> >> Looking forward to all your guys comments.
> >>
> >> Best,
> >> Terry Wang
> >>
> >>
> >>
> >>
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Terry Wang
Hi Jark,

Thanks for your suggestion!
Change the title and wait for more comments.

Best,
Terry Wang



> 2019年11月6日 15:52,Jark Wu <[hidden email]> 写道:
>
> Hi Terry,
>
> I would suggest to change the title a bit.
> For example, "Improve & Refactor TableEnvironment APIs".
> Or more specifically, "Improve & Refactor TableEnvironment
> execute/sqlQuery/sqlUpdate.. APIs"
>
> Currently, the title is a little wide (there are so many APIs in table
> module) .
> Make the title more specifically can attract more people who care about it.
>
> Best,
> Jark
>
>
>
> On Tue, 5 Nov 2019 at 14:51, Kurt Young <[hidden email]> wrote:
>
>> cc @Fabian here, thought you might be interesting to review this.
>>
>> Best,
>> Kurt
>>
>>
>> On Thu, Oct 31, 2019 at 1:39 PM Kurt Young <[hidden email]> wrote:
>>
>>> Thanks Terry for bringing this up. TableEnv's interface is really
>> critical
>>> not only
>>> to users, but also for components built upon it like SQL CLI. Your
>>> proposal
>>> solved some pain points we currently have, so +1 to the proposal.
>>>
>>> I left some comments in the document.
>>>
>>> Best,
>>> Kurt
>>>
>>>
>>> On Thu, Oct 31, 2019 at 10:38 AM Terry Wang <[hidden email]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>> TableEnvironment has provided two `Table sqlQuery(String sql)` and `void
>>>> sqlUpdate(String sql)` interfaces to create a table(actually a view
>> here)
>>>> or describe an update action from one sql string.
>>>> But with more use cases come up, there are some fatal shortcomings in
>>>> current API design. Such as  `sqlUpdate()` don’t support get a return
>> value
>>>> and buggy support for buffer sql exception and so on.
>>>>
>>>> So I’d like to kick off a discussion on improvement and refactor the api
>>>> of table module:
>>>>
>>>> google doc:
>>>>
>> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
>>>> <
>>>>
>> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit?usp=sharing
>>>>>
>>>> Flip link:
>>>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
>>>> <
>>>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
>>>>>
>>>>
>>>> In short, it:
>>>>        - Discuss buffering sql execute problem
>>>>        - Discuss current `sqlQuery/sqlUpdate` and propose two new api
>>>>        - Introduce one new `executeBatch` method to support batch sql
>>>> execute
>>>>        - Discuss how SQL CLI should deal with multiple statements
>>>>
>>>> Looking forward to all your guys comments.
>>>>
>>>> Best,
>>>> Terry Wang
>>>>
>>>>
>>>>
>>>>
>>

Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

godfreyhe
Hi everyone,

I'd like to resume the discussion for FlIP-84 [0]. I had updated the
document, the mainly changes are:

1. about "`void sqlUpdate(String sql)`" section
  a) change "Optional<ResultTable> executeSql(String sql) throws Exception"
to "ResultTable executeStatement(String statement, String jobName) throws
Exception". The reason is: "statement" is a more general concept than "sql",
e.g. "show xx" is not a sql command (refer to [1]), but is a statement (just
like JDBC). "insert" statement also has return value which is the affected
row count, we can unify the return type to "ResultTable" instead of
"Optional<ResultTable>".
  b) add two sub-interfaces for "ResultTable": "RowResultTable" is used for
non-streaming select statement and will not contain change flag;
"RowWithChangeFlagResultTable" is used for streaming select statement and
will contain change flag.

2) about "Support batch sql execute and explain" section
introduce "DmlBatch" to support both sql and Table API (which is borrowed
from the ideas Dawid mentioned in the slack)

interface TableEnvironment {
    DmlBatch startDmlBatch();
}

interface DmlBatch {
  /**
  * add insert statement to the batch
  */
    void addInsert(String insert);

 /**
  * add Table with given sink name to the batch
  */
    void addInsert(String sinkName, Table table);
   
 /**
  * execute the dml statements as a batch
  */
  ResultTable execute(String jobName) throws Exception
   
  /**
 * Returns the AST and the execution plan to compute the result of the batch
dml statement.
  */
  String explain(boolean extended);
}

3) about "Discuss a parse method for multiple statements execute in SQL CLI"
section
add the pros and cons for each solution

4) update the "Examples" section and "Summary" section based on the above
changes

Please refer the design doc[1] for more details and welcome any feedback.

Bests,
godfreyhe


[0]
https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
[1] https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Jingsong Li
Hi Godfrey,

Thanks for updating. +1 sketchy.

I have no idea to change "sqlQuery" to "fromQuery", I think "sqlQuery" is
OK, It's not that confusing with return values.

Can we change the "DmlBatch" to "Inserts"?  I don't see any other needs.
"Dml" seems a little weird.
It is better to support "Inserts addInsert" too. Users can
"inserts.addInsert().addInsert()...."

I try to match the new interfaces with the old interfaces simply.
- "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
"insertInto".
- "executeStatement" new one, execute all kinds of sqls immediately.
Including old "sqlUpdate(DDLs)".

Best,
Jingsong Lee

On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]> wrote:

> Hi everyone,
>
> I'd like to resume the discussion for FlIP-84 [0]. I had updated the
> document, the mainly changes are:
>
> 1. about "`void sqlUpdate(String sql)`" section
>   a) change "Optional<ResultTable> executeSql(String sql) throws Exception"
> to "ResultTable executeStatement(String statement, String jobName) throws
> Exception". The reason is: "statement" is a more general concept than
> "sql",
> e.g. "show xx" is not a sql command (refer to [1]), but is a statement
> (just
> like JDBC). "insert" statement also has return value which is the affected
> row count, we can unify the return type to "ResultTable" instead of
> "Optional<ResultTable>".
>   b) add two sub-interfaces for "ResultTable": "RowResultTable" is used for
> non-streaming select statement and will not contain change flag;
> "RowWithChangeFlagResultTable" is used for streaming select statement and
> will contain change flag.
>
> 2) about "Support batch sql execute and explain" section
> introduce "DmlBatch" to support both sql and Table API (which is borrowed
> from the ideas Dawid mentioned in the slack)
>
> interface TableEnvironment {
>     DmlBatch startDmlBatch();
> }
>
> interface DmlBatch {
>   /**
>   * add insert statement to the batch
>   */
>     void addInsert(String insert);
>
>  /**
>   * add Table with given sink name to the batch
>   */
>     void addInsert(String sinkName, Table table);
>
>  /**
>   * execute the dml statements as a batch
>   */
>   ResultTable execute(String jobName) throws Exception
>
>   /**
>  * Returns the AST and the execution plan to compute the result of the
> batch
> dml statement.
>   */
>   String explain(boolean extended);
> }
>
> 3) about "Discuss a parse method for multiple statements execute in SQL
> CLI"
> section
> add the pros and cons for each solution
>
> 4) update the "Examples" section and "Summary" section based on the above
> changes
>
> Please refer the design doc[1] for more details and welcome any feedback.
>
> Bests,
> godfreyhe
>
>
> [0]
>
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> [1] https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
>
>
>
> --
> Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
>


--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Jark Wu-2
I agree with Jingsong.

+1 to keep `sqlQuery`, it's clear from the method name and return type that
it accepts a SELECT query and returns a logic representation `Table`.
The `fromQuery` is a little confused users with the `Table from(String
tableName)` method.

Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the purpose of
`DmlBatch` is used to batching insert statements.
Besides, DML terminology is not commonly know among users. So what about
`InsertsBatching startBatchingInserts()` ?

Best,
Jark

On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]> wrote:

> Hi Godfrey,
>
> Thanks for updating. +1 sketchy.
>
> I have no idea to change "sqlQuery" to "fromQuery", I think "sqlQuery" is
> OK, It's not that confusing with return values.
>
> Can we change the "DmlBatch" to "Inserts"?  I don't see any other needs.
> "Dml" seems a little weird.
> It is better to support "Inserts addInsert" too. Users can
> "inserts.addInsert().addInsert()...."
>
> I try to match the new interfaces with the old interfaces simply.
> - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> "insertInto".
> - "executeStatement" new one, execute all kinds of sqls immediately.
> Including old "sqlUpdate(DDLs)".
>
> Best,
> Jingsong Lee
>
> On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]> wrote:
>
> > Hi everyone,
> >
> > I'd like to resume the discussion for FlIP-84 [0]. I had updated the
> > document, the mainly changes are:
> >
> > 1. about "`void sqlUpdate(String sql)`" section
> >   a) change "Optional<ResultTable> executeSql(String sql) throws
> Exception"
> > to "ResultTable executeStatement(String statement, String jobName) throws
> > Exception". The reason is: "statement" is a more general concept than
> > "sql",
> > e.g. "show xx" is not a sql command (refer to [1]), but is a statement
> > (just
> > like JDBC). "insert" statement also has return value which is the
> affected
> > row count, we can unify the return type to "ResultTable" instead of
> > "Optional<ResultTable>".
> >   b) add two sub-interfaces for "ResultTable": "RowResultTable" is used
> for
> > non-streaming select statement and will not contain change flag;
> > "RowWithChangeFlagResultTable" is used for streaming select statement and
> > will contain change flag.
> >
> > 2) about "Support batch sql execute and explain" section
> > introduce "DmlBatch" to support both sql and Table API (which is borrowed
> > from the ideas Dawid mentioned in the slack)
> >
> > interface TableEnvironment {
> >     DmlBatch startDmlBatch();
> > }
> >
> > interface DmlBatch {
> >   /**
> >   * add insert statement to the batch
> >   */
> >     void addInsert(String insert);
> >
> >  /**
> >   * add Table with given sink name to the batch
> >   */
> >     void addInsert(String sinkName, Table table);
> >
> >  /**
> >   * execute the dml statements as a batch
> >   */
> >   ResultTable execute(String jobName) throws Exception
> >
> >   /**
> >  * Returns the AST and the execution plan to compute the result of the
> > batch
> > dml statement.
> >   */
> >   String explain(boolean extended);
> > }
> >
> > 3) about "Discuss a parse method for multiple statements execute in SQL
> > CLI"
> > section
> > add the pros and cons for each solution
> >
> > 4) update the "Examples" section and "Summary" section based on the above
> > changes
> >
> > Please refer the design doc[1] for more details and welcome any feedback.
> >
> > Bests,
> > godfreyhe
> >
> >
> > [0]
> >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > [1] https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> >
> >
> >
> > --
> > Sent from:
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> >
>
>
> --
> Best, Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Kurt Young
Regarding to "fromQuery" is confusing users with "Table from(String
tableName)", I have
a just opposite opinion. I think this "fromXXX" pattern can make users
quite clear when they
want to get a Table from TableEnvironment. Similar interfaces will also
include like "fromElements".

Regarding to the name of DmlBatch, I think it's mainly for
future flexibility, in case we can support
other statement in a single batch. If that happens, the name "Inserts" will
be weird.

Best,
Kurt


On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:

> I agree with Jingsong.
>
> +1 to keep `sqlQuery`, it's clear from the method name and return type that
> it accepts a SELECT query and returns a logic representation `Table`.
> The `fromQuery` is a little confused users with the `Table from(String
> tableName)` method.
>
> Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the purpose of
> `DmlBatch` is used to batching insert statements.
> Besides, DML terminology is not commonly know among users. So what about
> `InsertsBatching startBatchingInserts()` ?
>
> Best,
> Jark
>
> On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]> wrote:
>
> > Hi Godfrey,
> >
> > Thanks for updating. +1 sketchy.
> >
> > I have no idea to change "sqlQuery" to "fromQuery", I think "sqlQuery" is
> > OK, It's not that confusing with return values.
> >
> > Can we change the "DmlBatch" to "Inserts"?  I don't see any other needs.
> > "Dml" seems a little weird.
> > It is better to support "Inserts addInsert" too. Users can
> > "inserts.addInsert().addInsert()...."
> >
> > I try to match the new interfaces with the old interfaces simply.
> > - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> > "insertInto".
> > - "executeStatement" new one, execute all kinds of sqls immediately.
> > Including old "sqlUpdate(DDLs)".
> >
> > Best,
> > Jingsong Lee
> >
> > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]> wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to resume the discussion for FlIP-84 [0]. I had updated the
> > > document, the mainly changes are:
> > >
> > > 1. about "`void sqlUpdate(String sql)`" section
> > >   a) change "Optional<ResultTable> executeSql(String sql) throws
> > Exception"
> > > to "ResultTable executeStatement(String statement, String jobName)
> throws
> > > Exception". The reason is: "statement" is a more general concept than
> > > "sql",
> > > e.g. "show xx" is not a sql command (refer to [1]), but is a statement
> > > (just
> > > like JDBC). "insert" statement also has return value which is the
> > affected
> > > row count, we can unify the return type to "ResultTable" instead of
> > > "Optional<ResultTable>".
> > >   b) add two sub-interfaces for "ResultTable": "RowResultTable" is used
> > for
> > > non-streaming select statement and will not contain change flag;
> > > "RowWithChangeFlagResultTable" is used for streaming select statement
> and
> > > will contain change flag.
> > >
> > > 2) about "Support batch sql execute and explain" section
> > > introduce "DmlBatch" to support both sql and Table API (which is
> borrowed
> > > from the ideas Dawid mentioned in the slack)
> > >
> > > interface TableEnvironment {
> > >     DmlBatch startDmlBatch();
> > > }
> > >
> > > interface DmlBatch {
> > >   /**
> > >   * add insert statement to the batch
> > >   */
> > >     void addInsert(String insert);
> > >
> > >  /**
> > >   * add Table with given sink name to the batch
> > >   */
> > >     void addInsert(String sinkName, Table table);
> > >
> > >  /**
> > >   * execute the dml statements as a batch
> > >   */
> > >   ResultTable execute(String jobName) throws Exception
> > >
> > >   /**
> > >  * Returns the AST and the execution plan to compute the result of the
> > > batch
> > > dml statement.
> > >   */
> > >   String explain(boolean extended);
> > > }
> > >
> > > 3) about "Discuss a parse method for multiple statements execute in SQL
> > > CLI"
> > > section
> > > add the pros and cons for each solution
> > >
> > > 4) update the "Examples" section and "Summary" section based on the
> above
> > > changes
> > >
> > > Please refer the design doc[1] for more details and welcome any
> feedback.
> > >
> > > Bests,
> > > godfreyhe
> > >
> > >
> > > [0]
> > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > [1] https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > >
> > >
> > >
> > > --
> > > Sent from:
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

godfreyhe
hi kurt,jark,jingsong

Regarding to "fromQuery", I agree with kurt. In addition, I think `Table
from(String tableName)` should be renamed to `Table fromCatalog(String
tableName)`.

Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE", "DELETE", and
they can be executed in a same batch in the future. So we can add
"addUpdate" method and "addDelete" method to support them.

Regarding to the "Inserts addInsert", maybe we can add a "DmlBatchBuilder".

open to more discussion

Best,
godfrey



Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:

> Regarding to "fromQuery" is confusing users with "Table from(String
> tableName)", I have
> a just opposite opinion. I think this "fromXXX" pattern can make users
> quite clear when they
> want to get a Table from TableEnvironment. Similar interfaces will also
> include like "fromElements".
>
> Regarding to the name of DmlBatch, I think it's mainly for
> future flexibility, in case we can support
> other statement in a single batch. If that happens, the name "Inserts" will
> be weird.
>
> Best,
> Kurt
>
>
> On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:
>
> > I agree with Jingsong.
> >
> > +1 to keep `sqlQuery`, it's clear from the method name and return type
> that
> > it accepts a SELECT query and returns a logic representation `Table`.
> > The `fromQuery` is a little confused users with the `Table from(String
> > tableName)` method.
> >
> > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the purpose of
> > `DmlBatch` is used to batching insert statements.
> > Besides, DML terminology is not commonly know among users. So what about
> > `InsertsBatching startBatchingInserts()` ?
> >
> > Best,
> > Jark
> >
> > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]>
> wrote:
> >
> > > Hi Godfrey,
> > >
> > > Thanks for updating. +1 sketchy.
> > >
> > > I have no idea to change "sqlQuery" to "fromQuery", I think "sqlQuery"
> is
> > > OK, It's not that confusing with return values.
> > >
> > > Can we change the "DmlBatch" to "Inserts"?  I don't see any other
> needs.
> > > "Dml" seems a little weird.
> > > It is better to support "Inserts addInsert" too. Users can
> > > "inserts.addInsert().addInsert()...."
> > >
> > > I try to match the new interfaces with the old interfaces simply.
> > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> > > "insertInto".
> > > - "executeStatement" new one, execute all kinds of sqls immediately.
> > > Including old "sqlUpdate(DDLs)".
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]>
> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'd like to resume the discussion for FlIP-84 [0]. I had updated the
> > > > document, the mainly changes are:
> > > >
> > > > 1. about "`void sqlUpdate(String sql)`" section
> > > >   a) change "Optional<ResultTable> executeSql(String sql) throws
> > > Exception"
> > > > to "ResultTable executeStatement(String statement, String jobName)
> > throws
> > > > Exception". The reason is: "statement" is a more general concept than
> > > > "sql",
> > > > e.g. "show xx" is not a sql command (refer to [1]), but is a
> statement
> > > > (just
> > > > like JDBC). "insert" statement also has return value which is the
> > > affected
> > > > row count, we can unify the return type to "ResultTable" instead of
> > > > "Optional<ResultTable>".
> > > >   b) add two sub-interfaces for "ResultTable": "RowResultTable" is
> used
> > > for
> > > > non-streaming select statement and will not contain change flag;
> > > > "RowWithChangeFlagResultTable" is used for streaming select statement
> > and
> > > > will contain change flag.
> > > >
> > > > 2) about "Support batch sql execute and explain" section
> > > > introduce "DmlBatch" to support both sql and Table API (which is
> > borrowed
> > > > from the ideas Dawid mentioned in the slack)
> > > >
> > > > interface TableEnvironment {
> > > >     DmlBatch startDmlBatch();
> > > > }
> > > >
> > > > interface DmlBatch {
> > > >   /**
> > > >   * add insert statement to the batch
> > > >   */
> > > >     void addInsert(String insert);
> > > >
> > > >  /**
> > > >   * add Table with given sink name to the batch
> > > >   */
> > > >     void addInsert(String sinkName, Table table);
> > > >
> > > >  /**
> > > >   * execute the dml statements as a batch
> > > >   */
> > > >   ResultTable execute(String jobName) throws Exception
> > > >
> > > >   /**
> > > >  * Returns the AST and the execution plan to compute the result of
> the
> > > > batch
> > > > dml statement.
> > > >   */
> > > >   String explain(boolean extended);
> > > > }
> > > >
> > > > 3) about "Discuss a parse method for multiple statements execute in
> SQL
> > > > CLI"
> > > > section
> > > > add the pros and cons for each solution
> > > >
> > > > 4) update the "Examples" section and "Summary" section based on the
> > above
> > > > changes
> > > >
> > > > Please refer the design doc[1] for more details and welcome any
> > feedback.
> > > >
> > > > Bests,
> > > > godfreyhe
> > > >
> > > >
> > > > [0]
> > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > [1] https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > >
> > > >
> > > >
> > > > --
> > > > Sent from:
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Jark Wu-2
Thanks Kurt and Godfrey for the explanation,

It makes sense to me that renaming `from(tableName)` to
`fromCatalog(tableName)`.
However, I still think `sqlQuery(query)` is clear and works well. Is it
necessary to change it?

We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
`scan(tableName)` and introduced `from(tableName)`,
and now we want to remove them again. Users will feel like the interface is
very unstable, that really frustrates users.
I think we should be cautious to remove interface and only when it is
necessary.

Best,
Jark



On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]> wrote:

> hi kurt,jark,jingsong
>
> Regarding to "fromQuery", I agree with kurt. In addition, I think `Table
> from(String tableName)` should be renamed to `Table fromCatalog(String
> tableName)`.
>
> Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE", "DELETE", and
> they can be executed in a same batch in the future. So we can add
> "addUpdate" method and "addDelete" method to support them.
>
> Regarding to the "Inserts addInsert", maybe we can add a "DmlBatchBuilder".
>
> open to more discussion
>
> Best,
> godfrey
>
>
>
> Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
>
> > Regarding to "fromQuery" is confusing users with "Table from(String
> > tableName)", I have
> > a just opposite opinion. I think this "fromXXX" pattern can make users
> > quite clear when they
> > want to get a Table from TableEnvironment. Similar interfaces will also
> > include like "fromElements".
> >
> > Regarding to the name of DmlBatch, I think it's mainly for
> > future flexibility, in case we can support
> > other statement in a single batch. If that happens, the name "Inserts"
> will
> > be weird.
> >
> > Best,
> > Kurt
> >
> >
> > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:
> >
> > > I agree with Jingsong.
> > >
> > > +1 to keep `sqlQuery`, it's clear from the method name and return type
> > that
> > > it accepts a SELECT query and returns a logic representation `Table`.
> > > The `fromQuery` is a little confused users with the `Table from(String
> > > tableName)` method.
> > >
> > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the purpose
> of
> > > `DmlBatch` is used to batching insert statements.
> > > Besides, DML terminology is not commonly know among users. So what
> about
> > > `InsertsBatching startBatchingInserts()` ?
> > >
> > > Best,
> > > Jark
> > >
> > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]>
> > wrote:
> > >
> > > > Hi Godfrey,
> > > >
> > > > Thanks for updating. +1 sketchy.
> > > >
> > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> "sqlQuery"
> > is
> > > > OK, It's not that confusing with return values.
> > > >
> > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any other
> > needs.
> > > > "Dml" seems a little weird.
> > > > It is better to support "Inserts addInsert" too. Users can
> > > > "inserts.addInsert().addInsert()...."
> > > >
> > > > I try to match the new interfaces with the old interfaces simply.
> > > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> > > > "insertInto".
> > > > - "executeStatement" new one, execute all kinds of sqls immediately.
> > > > Including old "sqlUpdate(DDLs)".
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]>
> > wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'd like to resume the discussion for FlIP-84 [0]. I had updated
> the
> > > > > document, the mainly changes are:
> > > > >
> > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > >   a) change "Optional<ResultTable> executeSql(String sql) throws
> > > > Exception"
> > > > > to "ResultTable executeStatement(String statement, String jobName)
> > > throws
> > > > > Exception". The reason is: "statement" is a more general concept
> than
> > > > > "sql",
> > > > > e.g. "show xx" is not a sql command (refer to [1]), but is a
> > statement
> > > > > (just
> > > > > like JDBC). "insert" statement also has return value which is the
> > > > affected
> > > > > row count, we can unify the return type to "ResultTable" instead of
> > > > > "Optional<ResultTable>".
> > > > >   b) add two sub-interfaces for "ResultTable": "RowResultTable" is
> > used
> > > > for
> > > > > non-streaming select statement and will not contain change flag;
> > > > > "RowWithChangeFlagResultTable" is used for streaming select
> statement
> > > and
> > > > > will contain change flag.
> > > > >
> > > > > 2) about "Support batch sql execute and explain" section
> > > > > introduce "DmlBatch" to support both sql and Table API (which is
> > > borrowed
> > > > > from the ideas Dawid mentioned in the slack)
> > > > >
> > > > > interface TableEnvironment {
> > > > >     DmlBatch startDmlBatch();
> > > > > }
> > > > >
> > > > > interface DmlBatch {
> > > > >   /**
> > > > >   * add insert statement to the batch
> > > > >   */
> > > > >     void addInsert(String insert);
> > > > >
> > > > >  /**
> > > > >   * add Table with given sink name to the batch
> > > > >   */
> > > > >     void addInsert(String sinkName, Table table);
> > > > >
> > > > >  /**
> > > > >   * execute the dml statements as a batch
> > > > >   */
> > > > >   ResultTable execute(String jobName) throws Exception
> > > > >
> > > > >   /**
> > > > >  * Returns the AST and the execution plan to compute the result of
> > the
> > > > > batch
> > > > > dml statement.
> > > > >   */
> > > > >   String explain(boolean extended);
> > > > > }
> > > > >
> > > > > 3) about "Discuss a parse method for multiple statements execute in
> > SQL
> > > > > CLI"
> > > > > section
> > > > > add the pros and cons for each solution
> > > > >
> > > > > 4) update the "Examples" section and "Summary" section based on the
> > > above
> > > > > changes
> > > > >
> > > > > Please refer the design doc[1] for more details and welcome any
> > > feedback.
> > > > >
> > > > > Bests,
> > > > > godfreyhe
> > > > >
> > > > >
> > > > > [0]
> > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > [1]
> https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Sent from:
> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Kurt Young
I don't think we should change `from` to `fromCatalog`, especially `from`
is just
introduced in 1.10. I agree with Jark we should change interface only when
necessary,
e.g. the semantic is broken or confusing. So I'm +1 to keep `sqlQuery` as
it is.

Best,
Kurt


On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:

> Thanks Kurt and Godfrey for the explanation,
>
> It makes sense to me that renaming `from(tableName)` to
> `fromCatalog(tableName)`.
> However, I still think `sqlQuery(query)` is clear and works well. Is it
> necessary to change it?
>
> We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
> `scan(tableName)` and introduced `from(tableName)`,
> and now we want to remove them again. Users will feel like the interface is
> very unstable, that really frustrates users.
> I think we should be cautious to remove interface and only when it is
> necessary.
>
> Best,
> Jark
>
>
>
> On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]> wrote:
>
> > hi kurt,jark,jingsong
> >
> > Regarding to "fromQuery", I agree with kurt. In addition, I think `Table
> > from(String tableName)` should be renamed to `Table fromCatalog(String
> > tableName)`.
> >
> > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE", "DELETE",
> and
> > they can be executed in a same batch in the future. So we can add
> > "addUpdate" method and "addDelete" method to support them.
> >
> > Regarding to the "Inserts addInsert", maybe we can add a
> "DmlBatchBuilder".
> >
> > open to more discussion
> >
> > Best,
> > godfrey
> >
> >
> >
> > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> >
> > > Regarding to "fromQuery" is confusing users with "Table from(String
> > > tableName)", I have
> > > a just opposite opinion. I think this "fromXXX" pattern can make users
> > > quite clear when they
> > > want to get a Table from TableEnvironment. Similar interfaces will also
> > > include like "fromElements".
> > >
> > > Regarding to the name of DmlBatch, I think it's mainly for
> > > future flexibility, in case we can support
> > > other statement in a single batch. If that happens, the name "Inserts"
> > will
> > > be weird.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:
> > >
> > > > I agree with Jingsong.
> > > >
> > > > +1 to keep `sqlQuery`, it's clear from the method name and return
> type
> > > that
> > > > it accepts a SELECT query and returns a logic representation `Table`.
> > > > The `fromQuery` is a little confused users with the `Table
> from(String
> > > > tableName)` method.
> > > >
> > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the
> purpose
> > of
> > > > `DmlBatch` is used to batching insert statements.
> > > > Besides, DML terminology is not commonly know among users. So what
> > about
> > > > `InsertsBatching startBatchingInserts()` ?
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]>
> > > wrote:
> > > >
> > > > > Hi Godfrey,
> > > > >
> > > > > Thanks for updating. +1 sketchy.
> > > > >
> > > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> > "sqlQuery"
> > > is
> > > > > OK, It's not that confusing with return values.
> > > > >
> > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any other
> > > needs.
> > > > > "Dml" seems a little weird.
> > > > > It is better to support "Inserts addInsert" too. Users can
> > > > > "inserts.addInsert().addInsert()...."
> > > > >
> > > > > I try to match the new interfaces with the old interfaces simply.
> > > > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> > > > > "insertInto".
> > > > > - "executeStatement" new one, execute all kinds of sqls
> immediately.
> > > > > Including old "sqlUpdate(DDLs)".
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]>
> > > wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'd like to resume the discussion for FlIP-84 [0]. I had updated
> > the
> > > > > > document, the mainly changes are:
> > > > > >
> > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > >   a) change "Optional<ResultTable> executeSql(String sql) throws
> > > > > Exception"
> > > > > > to "ResultTable executeStatement(String statement, String
> jobName)
> > > > throws
> > > > > > Exception". The reason is: "statement" is a more general concept
> > than
> > > > > > "sql",
> > > > > > e.g. "show xx" is not a sql command (refer to [1]), but is a
> > > statement
> > > > > > (just
> > > > > > like JDBC). "insert" statement also has return value which is the
> > > > > affected
> > > > > > row count, we can unify the return type to "ResultTable" instead
> of
> > > > > > "Optional<ResultTable>".
> > > > > >   b) add two sub-interfaces for "ResultTable": "RowResultTable"
> is
> > > used
> > > > > for
> > > > > > non-streaming select statement and will not contain change flag;
> > > > > > "RowWithChangeFlagResultTable" is used for streaming select
> > statement
> > > > and
> > > > > > will contain change flag.
> > > > > >
> > > > > > 2) about "Support batch sql execute and explain" section
> > > > > > introduce "DmlBatch" to support both sql and Table API (which is
> > > > borrowed
> > > > > > from the ideas Dawid mentioned in the slack)
> > > > > >
> > > > > > interface TableEnvironment {
> > > > > >     DmlBatch startDmlBatch();
> > > > > > }
> > > > > >
> > > > > > interface DmlBatch {
> > > > > >   /**
> > > > > >   * add insert statement to the batch
> > > > > >   */
> > > > > >     void addInsert(String insert);
> > > > > >
> > > > > >  /**
> > > > > >   * add Table with given sink name to the batch
> > > > > >   */
> > > > > >     void addInsert(String sinkName, Table table);
> > > > > >
> > > > > >  /**
> > > > > >   * execute the dml statements as a batch
> > > > > >   */
> > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > >
> > > > > >   /**
> > > > > >  * Returns the AST and the execution plan to compute the result
> of
> > > the
> > > > > > batch
> > > > > > dml statement.
> > > > > >   */
> > > > > >   String explain(boolean extended);
> > > > > > }
> > > > > >
> > > > > > 3) about "Discuss a parse method for multiple statements execute
> in
> > > SQL
> > > > > > CLI"
> > > > > > section
> > > > > > add the pros and cons for each solution
> > > > > >
> > > > > > 4) update the "Examples" section and "Summary" section based on
> the
> > > > above
> > > > > > changes
> > > > > >
> > > > > > Please refer the design doc[1] for more details and welcome any
> > > > feedback.
> > > > > >
> > > > > > Bests,
> > > > > > godfreyhe
> > > > > >
> > > > > >
> > > > > > [0]
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > [1]
> > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Sent from:
> > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best, Jingsong Lee
> > > > >
> > > >
> > >
> >
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Jingsong Li
Hi Kurt and Godfrey,

Thank you for your explanation.

Regarding to the "DmlBatch",
I see there are some description for JDBC Statement.addBatch in the
document.
What do you think about introducing "addBatch" to the TableEnv instead of
introducing a new class?
The advantage is:
- Consistent with JDBC statement.
- Consistent with current interface, what we need do is just modify method
name.

Best,
Jingsong Lee


On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]> wrote:

> I don't think we should change `from` to `fromCatalog`, especially `from`
> is just
> introduced in 1.10. I agree with Jark we should change interface only when
> necessary,
> e.g. the semantic is broken or confusing. So I'm +1 to keep `sqlQuery` as
> it is.
>
> Best,
> Kurt
>
>
> On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:
>
> > Thanks Kurt and Godfrey for the explanation,
> >
> > It makes sense to me that renaming `from(tableName)` to
> > `fromCatalog(tableName)`.
> > However, I still think `sqlQuery(query)` is clear and works well. Is it
> > necessary to change it?
> >
> > We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
> > `scan(tableName)` and introduced `from(tableName)`,
> > and now we want to remove them again. Users will feel like the interface
> is
> > very unstable, that really frustrates users.
> > I think we should be cautious to remove interface and only when it is
> > necessary.
> >
> > Best,
> > Jark
> >
> >
> >
> > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]> wrote:
> >
> > > hi kurt,jark,jingsong
> > >
> > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> `Table
> > > from(String tableName)` should be renamed to `Table fromCatalog(String
> > > tableName)`.
> > >
> > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE", "DELETE",
> > and
> > > they can be executed in a same batch in the future. So we can add
> > > "addUpdate" method and "addDelete" method to support them.
> > >
> > > Regarding to the "Inserts addInsert", maybe we can add a
> > "DmlBatchBuilder".
> > >
> > > open to more discussion
> > >
> > > Best,
> > > godfrey
> > >
> > >
> > >
> > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > >
> > > > Regarding to "fromQuery" is confusing users with "Table from(String
> > > > tableName)", I have
> > > > a just opposite opinion. I think this "fromXXX" pattern can make
> users
> > > > quite clear when they
> > > > want to get a Table from TableEnvironment. Similar interfaces will
> also
> > > > include like "fromElements".
> > > >
> > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > future flexibility, in case we can support
> > > > other statement in a single batch. If that happens, the name
> "Inserts"
> > > will
> > > > be weird.
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:
> > > >
> > > > > I agree with Jingsong.
> > > > >
> > > > > +1 to keep `sqlQuery`, it's clear from the method name and return
> > type
> > > > that
> > > > > it accepts a SELECT query and returns a logic representation
> `Table`.
> > > > > The `fromQuery` is a little confused users with the `Table
> > from(String
> > > > > tableName)` method.
> > > > >
> > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the
> > purpose
> > > of
> > > > > `DmlBatch` is used to batching insert statements.
> > > > > Besides, DML terminology is not commonly know among users. So what
> > > about
> > > > > `InsertsBatching startBatchingInserts()` ?
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <[hidden email]>
> > > > wrote:
> > > > >
> > > > > > Hi Godfrey,
> > > > > >
> > > > > > Thanks for updating. +1 sketchy.
> > > > > >
> > > > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> > > "sqlQuery"
> > > > is
> > > > > > OK, It's not that confusing with return values.
> > > > > >
> > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any other
> > > > needs.
> > > > > > "Dml" seems a little weird.
> > > > > > It is better to support "Inserts addInsert" too. Users can
> > > > > > "inserts.addInsert().addInsert()...."
> > > > > >
> > > > > > I try to match the new interfaces with the old interfaces simply.
> > > > > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)" and
> > > > > > "insertInto".
> > > > > > - "executeStatement" new one, execute all kinds of sqls
> > immediately.
> > > > > > Including old "sqlUpdate(DDLs)".
> > > > > >
> > > > > > Best,
> > > > > > Jingsong Lee
> > > > > >
> > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <[hidden email]>
> > > > wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I had
> updated
> > > the
> > > > > > > document, the mainly changes are:
> > > > > > >
> > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > >   a) change "Optional<ResultTable> executeSql(String sql)
> throws
> > > > > > Exception"
> > > > > > > to "ResultTable executeStatement(String statement, String
> > jobName)
> > > > > throws
> > > > > > > Exception". The reason is: "statement" is a more general
> concept
> > > than
> > > > > > > "sql",
> > > > > > > e.g. "show xx" is not a sql command (refer to [1]), but is a
> > > > statement
> > > > > > > (just
> > > > > > > like JDBC). "insert" statement also has return value which is
> the
> > > > > > affected
> > > > > > > row count, we can unify the return type to "ResultTable"
> instead
> > of
> > > > > > > "Optional<ResultTable>".
> > > > > > >   b) add two sub-interfaces for "ResultTable": "RowResultTable"
> > is
> > > > used
> > > > > > for
> > > > > > > non-streaming select statement and will not contain change
> flag;
> > > > > > > "RowWithChangeFlagResultTable" is used for streaming select
> > > statement
> > > > > and
> > > > > > > will contain change flag.
> > > > > > >
> > > > > > > 2) about "Support batch sql execute and explain" section
> > > > > > > introduce "DmlBatch" to support both sql and Table API (which
> is
> > > > > borrowed
> > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > >
> > > > > > > interface TableEnvironment {
> > > > > > >     DmlBatch startDmlBatch();
> > > > > > > }
> > > > > > >
> > > > > > > interface DmlBatch {
> > > > > > >   /**
> > > > > > >   * add insert statement to the batch
> > > > > > >   */
> > > > > > >     void addInsert(String insert);
> > > > > > >
> > > > > > >  /**
> > > > > > >   * add Table with given sink name to the batch
> > > > > > >   */
> > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > >
> > > > > > >  /**
> > > > > > >   * execute the dml statements as a batch
> > > > > > >   */
> > > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > > >
> > > > > > >   /**
> > > > > > >  * Returns the AST and the execution plan to compute the result
> > of
> > > > the
> > > > > > > batch
> > > > > > > dml statement.
> > > > > > >   */
> > > > > > >   String explain(boolean extended);
> > > > > > > }
> > > > > > >
> > > > > > > 3) about "Discuss a parse method for multiple statements
> execute
> > in
> > > > SQL
> > > > > > > CLI"
> > > > > > > section
> > > > > > > add the pros and cons for each solution
> > > > > > >
> > > > > > > 4) update the "Examples" section and "Summary" section based on
> > the
> > > > > above
> > > > > > > changes
> > > > > > >
> > > > > > > Please refer the design doc[1] for more details and welcome any
> > > > > feedback.
> > > > > > >
> > > > > > > Bests,
> > > > > > > godfreyhe
> > > > > > >
> > > > > > >
> > > > > > > [0]
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > [1]
> > > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Sent from:
> > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best, Jingsong Lee
> > > > > >
> > > > >
> > > >
> > >
> >
>


--
Best, Jingsong Lee
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

godfreyhe
Thanks Kurt and Jark for explanation, I now also think we should make the
TableEnvironment interface more statable and should not change "sqlQuery"
method and "from" method.

Hi Jingsong. Regarding to the "DmlBatch", I totally agree with advantages
of "addBatch" method. However, there are two more questions need to solve:
one is how users write multi-sink programs in a Table API ? and another is
how users explain multi-sink program in both SQL and Table API ?
Currently, "DmlBatch" class can solve those questions. (the main
disadvantages is Inconsistent with the current interface)

Bests,
godfrey

Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:

> Hi Kurt and Godfrey,
>
> Thank you for your explanation.
>
> Regarding to the "DmlBatch",
> I see there are some description for JDBC Statement.addBatch in the
> document.
> What do you think about introducing "addBatch" to the TableEnv instead of
> introducing a new class?
> The advantage is:
> - Consistent with JDBC statement.
> - Consistent with current interface, what we need do is just modify method
> name.
>
> Best,
> Jingsong Lee
>
>
> On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]> wrote:
>
> > I don't think we should change `from` to `fromCatalog`, especially `from`
> > is just
> > introduced in 1.10. I agree with Jark we should change interface only
> when
> > necessary,
> > e.g. the semantic is broken or confusing. So I'm +1 to keep `sqlQuery` as
> > it is.
> >
> > Best,
> > Kurt
> >
> >
> > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:
> >
> > > Thanks Kurt and Godfrey for the explanation,
> > >
> > > It makes sense to me that renaming `from(tableName)` to
> > > `fromCatalog(tableName)`.
> > > However, I still think `sqlQuery(query)` is clear and works well. Is it
> > > necessary to change it?
> > >
> > > We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
> > > `scan(tableName)` and introduced `from(tableName)`,
> > > and now we want to remove them again. Users will feel like the
> interface
> > is
> > > very unstable, that really frustrates users.
> > > I think we should be cautious to remove interface and only when it is
> > > necessary.
> > >
> > > Best,
> > > Jark
> > >
> > >
> > >
> > > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]> wrote:
> > >
> > > > hi kurt,jark,jingsong
> > > >
> > > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> > `Table
> > > > from(String tableName)` should be renamed to `Table
> fromCatalog(String
> > > > tableName)`.
> > > >
> > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> "DELETE",
> > > and
> > > > they can be executed in a same batch in the future. So we can add
> > > > "addUpdate" method and "addDelete" method to support them.
> > > >
> > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > "DmlBatchBuilder".
> > > >
> > > > open to more discussion
> > > >
> > > > Best,
> > > > godfrey
> > > >
> > > >
> > > >
> > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > >
> > > > > Regarding to "fromQuery" is confusing users with "Table from(String
> > > > > tableName)", I have
> > > > > a just opposite opinion. I think this "fromXXX" pattern can make
> > users
> > > > > quite clear when they
> > > > > want to get a Table from TableEnvironment. Similar interfaces will
> > also
> > > > > include like "fromElements".
> > > > >
> > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > future flexibility, in case we can support
> > > > > other statement in a single batch. If that happens, the name
> > "Inserts"
> > > > will
> > > > > be weird.
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]> wrote:
> > > > >
> > > > > > I agree with Jingsong.
> > > > > >
> > > > > > +1 to keep `sqlQuery`, it's clear from the method name and return
> > > type
> > > > > that
> > > > > > it accepts a SELECT query and returns a logic representation
> > `Table`.
> > > > > > The `fromQuery` is a little confused users with the `Table
> > > from(String
> > > > > > tableName)` method.
> > > > > >
> > > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the
> > > purpose
> > > > of
> > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > Besides, DML terminology is not commonly know among users. So
> what
> > > > about
> > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> [hidden email]>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Godfrey,
> > > > > > >
> > > > > > > Thanks for updating. +1 sketchy.
> > > > > > >
> > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> > > > "sqlQuery"
> > > > > is
> > > > > > > OK, It's not that confusing with return values.
> > > > > > >
> > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any
> other
> > > > > needs.
> > > > > > > "Dml" seems a little weird.
> > > > > > > It is better to support "Inserts addInsert" too. Users can
> > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > >
> > > > > > > I try to match the new interfaces with the old interfaces
> simply.
> > > > > > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)"
> and
> > > > > > > "insertInto".
> > > > > > > - "executeStatement" new one, execute all kinds of sqls
> > > immediately.
> > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > >
> > > > > > > Best,
> > > > > > > Jingsong Lee
> > > > > > >
> > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> [hidden email]>
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I had
> > updated
> > > > the
> > > > > > > > document, the mainly changes are:
> > > > > > > >
> > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > >   a) change "Optional<ResultTable> executeSql(String sql)
> > throws
> > > > > > > Exception"
> > > > > > > > to "ResultTable executeStatement(String statement, String
> > > jobName)
> > > > > > throws
> > > > > > > > Exception". The reason is: "statement" is a more general
> > concept
> > > > than
> > > > > > > > "sql",
> > > > > > > > e.g. "show xx" is not a sql command (refer to [1]), but is a
> > > > > statement
> > > > > > > > (just
> > > > > > > > like JDBC). "insert" statement also has return value which is
> > the
> > > > > > > affected
> > > > > > > > row count, we can unify the return type to "ResultTable"
> > instead
> > > of
> > > > > > > > "Optional<ResultTable>".
> > > > > > > >   b) add two sub-interfaces for "ResultTable":
> "RowResultTable"
> > > is
> > > > > used
> > > > > > > for
> > > > > > > > non-streaming select statement and will not contain change
> > flag;
> > > > > > > > "RowWithChangeFlagResultTable" is used for streaming select
> > > > statement
> > > > > > and
> > > > > > > > will contain change flag.
> > > > > > > >
> > > > > > > > 2) about "Support batch sql execute and explain" section
> > > > > > > > introduce "DmlBatch" to support both sql and Table API (which
> > is
> > > > > > borrowed
> > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > >
> > > > > > > > interface TableEnvironment {
> > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > }
> > > > > > > >
> > > > > > > > interface DmlBatch {
> > > > > > > >   /**
> > > > > > > >   * add insert statement to the batch
> > > > > > > >   */
> > > > > > > >     void addInsert(String insert);
> > > > > > > >
> > > > > > > >  /**
> > > > > > > >   * add Table with given sink name to the batch
> > > > > > > >   */
> > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > >
> > > > > > > >  /**
> > > > > > > >   * execute the dml statements as a batch
> > > > > > > >   */
> > > > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > > > >
> > > > > > > >   /**
> > > > > > > >  * Returns the AST and the execution plan to compute the
> result
> > > of
> > > > > the
> > > > > > > > batch
> > > > > > > > dml statement.
> > > > > > > >   */
> > > > > > > >   String explain(boolean extended);
> > > > > > > > }
> > > > > > > >
> > > > > > > > 3) about "Discuss a parse method for multiple statements
> > execute
> > > in
> > > > > SQL
> > > > > > > > CLI"
> > > > > > > > section
> > > > > > > > add the pros and cons for each solution
> > > > > > > >
> > > > > > > > 4) update the "Examples" section and "Summary" section based
> on
> > > the
> > > > > > above
> > > > > > > > changes
> > > > > > > >
> > > > > > > > Please refer the design doc[1] for more details and welcome
> any
> > > > > > feedback.
> > > > > > > >
> > > > > > > > Bests,
> > > > > > > > godfreyhe
> > > > > > > >
> > > > > > > >
> > > > > > > > [0]
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > [1]
> > > > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Sent from:
> > > > > > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > Best, Jingsong Lee
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
> Best, Jingsong Lee
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Benchao Li
Hi Terry,

Thanks for the propose, and sorry for joining the party late.

I have one question about this FLIP:
executeStatement  accepts DML, what if it's a streaming DML ?
    does it submit the job to cluster directly and blocks forever? what's
the behavior for the next statements?

nit: there's a typo in "the table describing the result for each kind of
statement", "*Result Scheam" -> "Result Schema"*


godfrey he <[hidden email]> 于2020年2月18日周二 下午4:41写道:

> Thanks Kurt and Jark for explanation, I now also think we should make the
> TableEnvironment interface more statable and should not change "sqlQuery"
> method and "from" method.
>
> Hi Jingsong. Regarding to the "DmlBatch", I totally agree with advantages
> of "addBatch" method. However, there are two more questions need to solve:
> one is how users write multi-sink programs in a Table API ? and another is
> how users explain multi-sink program in both SQL and Table API ?
> Currently, "DmlBatch" class can solve those questions. (the main
> disadvantages is Inconsistent with the current interface)
>
> Bests,
> godfrey
>
> Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:
>
> > Hi Kurt and Godfrey,
> >
> > Thank you for your explanation.
> >
> > Regarding to the "DmlBatch",
> > I see there are some description for JDBC Statement.addBatch in the
> > document.
> > What do you think about introducing "addBatch" to the TableEnv instead of
> > introducing a new class?
> > The advantage is:
> > - Consistent with JDBC statement.
> > - Consistent with current interface, what we need do is just modify
> method
> > name.
> >
> > Best,
> > Jingsong Lee
> >
> >
> > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]> wrote:
> >
> > > I don't think we should change `from` to `fromCatalog`, especially
> `from`
> > > is just
> > > introduced in 1.10. I agree with Jark we should change interface only
> > when
> > > necessary,
> > > e.g. the semantic is broken or confusing. So I'm +1 to keep `sqlQuery`
> as
> > > it is.
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:
> > >
> > > > Thanks Kurt and Godfrey for the explanation,
> > > >
> > > > It makes sense to me that renaming `from(tableName)` to
> > > > `fromCatalog(tableName)`.
> > > > However, I still think `sqlQuery(query)` is clear and works well. Is
> it
> > > > necessary to change it?
> > > >
> > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we removed
> > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > and now we want to remove them again. Users will feel like the
> > interface
> > > is
> > > > very unstable, that really frustrates users.
> > > > I think we should be cautious to remove interface and only when it is
> > > > necessary.
> > > >
> > > > Best,
> > > > Jark
> > > >
> > > >
> > > >
> > > > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]>
> wrote:
> > > >
> > > > > hi kurt,jark,jingsong
> > > > >
> > > > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> > > `Table
> > > > > from(String tableName)` should be renamed to `Table
> > fromCatalog(String
> > > > > tableName)`.
> > > > >
> > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > "DELETE",
> > > > and
> > > > > they can be executed in a same batch in the future. So we can add
> > > > > "addUpdate" method and "addDelete" method to support them.
> > > > >
> > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > "DmlBatchBuilder".
> > > > >
> > > > > open to more discussion
> > > > >
> > > > > Best,
> > > > > godfrey
> > > > >
> > > > >
> > > > >
> > > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > > >
> > > > > > Regarding to "fromQuery" is confusing users with "Table
> from(String
> > > > > > tableName)", I have
> > > > > > a just opposite opinion. I think this "fromXXX" pattern can make
> > > users
> > > > > > quite clear when they
> > > > > > want to get a Table from TableEnvironment. Similar interfaces
> will
> > > also
> > > > > > include like "fromElements".
> > > > > >
> > > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > > future flexibility, in case we can support
> > > > > > other statement in a single batch. If that happens, the name
> > > "Inserts"
> > > > > will
> > > > > > be weird.
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]>
> wrote:
> > > > > >
> > > > > > > I agree with Jingsong.
> > > > > > >
> > > > > > > +1 to keep `sqlQuery`, it's clear from the method name and
> return
> > > > type
> > > > > > that
> > > > > > > it accepts a SELECT query and returns a logic representation
> > > `Table`.
> > > > > > > The `fromQuery` is a little confused users with the `Table
> > > > from(String
> > > > > > > tableName)` method.
> > > > > > >
> > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK, the
> > > > purpose
> > > > > of
> > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > Besides, DML terminology is not commonly know among users. So
> > what
> > > > > about
> > > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> > [hidden email]>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Godfrey,
> > > > > > > >
> > > > > > > > Thanks for updating. +1 sketchy.
> > > > > > > >
> > > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> > > > > "sqlQuery"
> > > > > > is
> > > > > > > > OK, It's not that confusing with return values.
> > > > > > > >
> > > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any
> > other
> > > > > > needs.
> > > > > > > > "Dml" seems a little weird.
> > > > > > > > It is better to support "Inserts addInsert" too. Users can
> > > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > > >
> > > > > > > > I try to match the new interfaces with the old interfaces
> > simply.
> > > > > > > > - "startInserts -> addInsert" replace old "sqlUpdate(insert)"
> > and
> > > > > > > > "insertInto".
> > > > > > > > - "executeStatement" new one, execute all kinds of sqls
> > > > immediately.
> > > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jingsong Lee
> > > > > > > >
> > > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> > [hidden email]>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I had
> > > updated
> > > > > the
> > > > > > > > > document, the mainly changes are:
> > > > > > > > >
> > > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > > >   a) change "Optional<ResultTable> executeSql(String sql)
> > > throws
> > > > > > > > Exception"
> > > > > > > > > to "ResultTable executeStatement(String statement, String
> > > > jobName)
> > > > > > > throws
> > > > > > > > > Exception". The reason is: "statement" is a more general
> > > concept
> > > > > than
> > > > > > > > > "sql",
> > > > > > > > > e.g. "show xx" is not a sql command (refer to [1]), but is
> a
> > > > > > statement
> > > > > > > > > (just
> > > > > > > > > like JDBC). "insert" statement also has return value which
> is
> > > the
> > > > > > > > affected
> > > > > > > > > row count, we can unify the return type to "ResultTable"
> > > instead
> > > > of
> > > > > > > > > "Optional<ResultTable>".
> > > > > > > > >   b) add two sub-interfaces for "ResultTable":
> > "RowResultTable"
> > > > is
> > > > > > used
> > > > > > > > for
> > > > > > > > > non-streaming select statement and will not contain change
> > > flag;
> > > > > > > > > "RowWithChangeFlagResultTable" is used for streaming select
> > > > > statement
> > > > > > > and
> > > > > > > > > will contain change flag.
> > > > > > > > >
> > > > > > > > > 2) about "Support batch sql execute and explain" section
> > > > > > > > > introduce "DmlBatch" to support both sql and Table API
> (which
> > > is
> > > > > > > borrowed
> > > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > > >
> > > > > > > > > interface TableEnvironment {
> > > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > interface DmlBatch {
> > > > > > > > >   /**
> > > > > > > > >   * add insert statement to the batch
> > > > > > > > >   */
> > > > > > > > >     void addInsert(String insert);
> > > > > > > > >
> > > > > > > > >  /**
> > > > > > > > >   * add Table with given sink name to the batch
> > > > > > > > >   */
> > > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > > >
> > > > > > > > >  /**
> > > > > > > > >   * execute the dml statements as a batch
> > > > > > > > >   */
> > > > > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > > > > >
> > > > > > > > >   /**
> > > > > > > > >  * Returns the AST and the execution plan to compute the
> > result
> > > > of
> > > > > > the
> > > > > > > > > batch
> > > > > > > > > dml statement.
> > > > > > > > >   */
> > > > > > > > >   String explain(boolean extended);
> > > > > > > > > }
> > > > > > > > >
> > > > > > > > > 3) about "Discuss a parse method for multiple statements
> > > execute
> > > > in
> > > > > > SQL
> > > > > > > > > CLI"
> > > > > > > > > section
> > > > > > > > > add the pros and cons for each solution
> > > > > > > > >
> > > > > > > > > 4) update the "Examples" section and "Summary" section
> based
> > on
> > > > the
> > > > > > > above
> > > > > > > > > changes
> > > > > > > > >
> > > > > > > > > Please refer the design doc[1] for more details and welcome
> > any
> > > > > > > feedback.
> > > > > > > > >
> > > > > > > > > Bests,
> > > > > > > > > godfreyhe
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > [0]
> > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > > [1]
> > > > > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Sent from:
> > > > > > > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > Best, Jingsong Lee
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> > --
> > Best, Jingsong Lee
> >
>


--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: [hidden email]; [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

godfreyhe
Hi Benchao,

> I have one question about this FLIP:
> executeStatement  accepts DML, what if it's a streaming DML ?
>    does it submit the job to cluster directly and blocks forever? what's
> the behavior for the next statements?
`executeStatement` is a synchronous method, will execute the statement once
calling this method and return the result until the job is finished.
We will introduce asynchronous method like `executeStatementAsync` in the
future.

> nit: there's a typo in "the table describing the result for each kind of
> statement", "*Result Scheam" -> "Result Schema"*
Thanks for the reminding, I will fix it now.

Bests,
Godfrey

Benchao Li <[hidden email]> 于2020年2月28日周五 下午4:00写道:

> Hi Terry,
>
> Thanks for the propose, and sorry for joining the party late.
>
> I have one question about this FLIP:
> executeStatement  accepts DML, what if it's a streaming DML ?
>     does it submit the job to cluster directly and blocks forever? what's
> the behavior for the next statements?
>
> nit: there's a typo in "the table describing the result for each kind of
> statement", "*Result Scheam" -> "Result Schema"*
>
>
> godfrey he <[hidden email]> 于2020年2月18日周二 下午4:41写道:
>
> > Thanks Kurt and Jark for explanation, I now also think we should make the
> > TableEnvironment interface more statable and should not change "sqlQuery"
> > method and "from" method.
> >
> > Hi Jingsong. Regarding to the "DmlBatch", I totally agree with advantages
> > of "addBatch" method. However, there are two more questions need to
> solve:
> > one is how users write multi-sink programs in a Table API ? and another
> is
> > how users explain multi-sink program in both SQL and Table API ?
> > Currently, "DmlBatch" class can solve those questions. (the main
> > disadvantages is Inconsistent with the current interface)
> >
> > Bests,
> > godfrey
> >
> > Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:
> >
> > > Hi Kurt and Godfrey,
> > >
> > > Thank you for your explanation.
> > >
> > > Regarding to the "DmlBatch",
> > > I see there are some description for JDBC Statement.addBatch in the
> > > document.
> > > What do you think about introducing "addBatch" to the TableEnv instead
> of
> > > introducing a new class?
> > > The advantage is:
> > > - Consistent with JDBC statement.
> > > - Consistent with current interface, what we need do is just modify
> > method
> > > name.
> > >
> > > Best,
> > > Jingsong Lee
> > >
> > >
> > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]> wrote:
> > >
> > > > I don't think we should change `from` to `fromCatalog`, especially
> > `from`
> > > > is just
> > > > introduced in 1.10. I agree with Jark we should change interface only
> > > when
> > > > necessary,
> > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> `sqlQuery`
> > as
> > > > it is.
> > > >
> > > > Best,
> > > > Kurt
> > > >
> > > >
> > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:
> > > >
> > > > > Thanks Kurt and Godfrey for the explanation,
> > > > >
> > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > `fromCatalog(tableName)`.
> > > > > However, I still think `sqlQuery(query)` is clear and works well.
> Is
> > it
> > > > > necessary to change it?
> > > > >
> > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> removed
> > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > and now we want to remove them again. Users will feel like the
> > > interface
> > > > is
> > > > > very unstable, that really frustrates users.
> > > > > I think we should be cautious to remove interface and only when it
> is
> > > > > necessary.
> > > > >
> > > > > Best,
> > > > > Jark
> > > > >
> > > > >
> > > > >
> > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]>
> > wrote:
> > > > >
> > > > > > hi kurt,jark,jingsong
> > > > > >
> > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I think
> > > > `Table
> > > > > > from(String tableName)` should be renamed to `Table
> > > fromCatalog(String
> > > > > > tableName)`.
> > > > > >
> > > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > > "DELETE",
> > > > > and
> > > > > > they can be executed in a same batch in the future. So we can add
> > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > >
> > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > "DmlBatchBuilder".
> > > > > >
> > > > > > open to more discussion
> > > > > >
> > > > > > Best,
> > > > > > godfrey
> > > > > >
> > > > > >
> > > > > >
> > > > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > > > >
> > > > > > > Regarding to "fromQuery" is confusing users with "Table
> > from(String
> > > > > > > tableName)", I have
> > > > > > > a just opposite opinion. I think this "fromXXX" pattern can
> make
> > > > users
> > > > > > > quite clear when they
> > > > > > > want to get a Table from TableEnvironment. Similar interfaces
> > will
> > > > also
> > > > > > > include like "fromElements".
> > > > > > >
> > > > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > > > future flexibility, in case we can support
> > > > > > > other statement in a single batch. If that happens, the name
> > > > "Inserts"
> > > > > > will
> > > > > > > be weird.
> > > > > > >
> > > > > > > Best,
> > > > > > > Kurt
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]>
> > wrote:
> > > > > > >
> > > > > > > > I agree with Jingsong.
> > > > > > > >
> > > > > > > > +1 to keep `sqlQuery`, it's clear from the method name and
> > return
> > > > > type
> > > > > > > that
> > > > > > > > it accepts a SELECT query and returns a logic representation
> > > > `Table`.
> > > > > > > > The `fromQuery` is a little confused users with the `Table
> > > > > from(String
> > > > > > > > tableName)` method.
> > > > > > > >
> > > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK,
> the
> > > > > purpose
> > > > > > of
> > > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > > Besides, DML terminology is not commonly know among users. So
> > > what
> > > > > > about
> > > > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jark
> > > > > > > >
> > > > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> > > [hidden email]>
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Godfrey,
> > > > > > > > >
> > > > > > > > > Thanks for updating. +1 sketchy.
> > > > > > > > >
> > > > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I think
> > > > > > "sqlQuery"
> > > > > > > is
> > > > > > > > > OK, It's not that confusing with return values.
> > > > > > > > >
> > > > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see any
> > > other
> > > > > > > needs.
> > > > > > > > > "Dml" seems a little weird.
> > > > > > > > > It is better to support "Inserts addInsert" too. Users can
> > > > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > > > >
> > > > > > > > > I try to match the new interfaces with the old interfaces
> > > simply.
> > > > > > > > > - "startInserts -> addInsert" replace old
> "sqlUpdate(insert)"
> > > and
> > > > > > > > > "insertInto".
> > > > > > > > > - "executeStatement" new one, execute all kinds of sqls
> > > > > immediately.
> > > > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jingsong Lee
> > > > > > > > >
> > > > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> > > [hidden email]>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I had
> > > > updated
> > > > > > the
> > > > > > > > > > document, the mainly changes are:
> > > > > > > > > >
> > > > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > > > >   a) change "Optional<ResultTable> executeSql(String sql)
> > > > throws
> > > > > > > > > Exception"
> > > > > > > > > > to "ResultTable executeStatement(String statement, String
> > > > > jobName)
> > > > > > > > throws
> > > > > > > > > > Exception". The reason is: "statement" is a more general
> > > > concept
> > > > > > than
> > > > > > > > > > "sql",
> > > > > > > > > > e.g. "show xx" is not a sql command (refer to [1]), but
> is
> > a
> > > > > > > statement
> > > > > > > > > > (just
> > > > > > > > > > like JDBC). "insert" statement also has return value
> which
> > is
> > > > the
> > > > > > > > > affected
> > > > > > > > > > row count, we can unify the return type to "ResultTable"
> > > > instead
> > > > > of
> > > > > > > > > > "Optional<ResultTable>".
> > > > > > > > > >   b) add two sub-interfaces for "ResultTable":
> > > "RowResultTable"
> > > > > is
> > > > > > > used
> > > > > > > > > for
> > > > > > > > > > non-streaming select statement and will not contain
> change
> > > > flag;
> > > > > > > > > > "RowWithChangeFlagResultTable" is used for streaming
> select
> > > > > > statement
> > > > > > > > and
> > > > > > > > > > will contain change flag.
> > > > > > > > > >
> > > > > > > > > > 2) about "Support batch sql execute and explain" section
> > > > > > > > > > introduce "DmlBatch" to support both sql and Table API
> > (which
> > > > is
> > > > > > > > borrowed
> > > > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > > > >
> > > > > > > > > > interface TableEnvironment {
> > > > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > interface DmlBatch {
> > > > > > > > > >   /**
> > > > > > > > > >   * add insert statement to the batch
> > > > > > > > > >   */
> > > > > > > > > >     void addInsert(String insert);
> > > > > > > > > >
> > > > > > > > > >  /**
> > > > > > > > > >   * add Table with given sink name to the batch
> > > > > > > > > >   */
> > > > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > > > >
> > > > > > > > > >  /**
> > > > > > > > > >   * execute the dml statements as a batch
> > > > > > > > > >   */
> > > > > > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > > > > > >
> > > > > > > > > >   /**
> > > > > > > > > >  * Returns the AST and the execution plan to compute the
> > > result
> > > > > of
> > > > > > > the
> > > > > > > > > > batch
> > > > > > > > > > dml statement.
> > > > > > > > > >   */
> > > > > > > > > >   String explain(boolean extended);
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > 3) about "Discuss a parse method for multiple statements
> > > > execute
> > > > > in
> > > > > > > SQL
> > > > > > > > > > CLI"
> > > > > > > > > > section
> > > > > > > > > > add the pros and cons for each solution
> > > > > > > > > >
> > > > > > > > > > 4) update the "Examples" section and "Summary" section
> > based
> > > on
> > > > > the
> > > > > > > > above
> > > > > > > > > > changes
> > > > > > > > > >
> > > > > > > > > > Please refer the design doc[1] for more details and
> welcome
> > > any
> > > > > > > > feedback.
> > > > > > > > > >
> > > > > > > > > > Bests,
> > > > > > > > > > godfreyhe
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > [0]
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > > > [1]
> > > > > > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Sent from:
> > > > > > > > >
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > Best, Jingsong Lee
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > Best, Jingsong Lee
> > >
> >
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: [hidden email]; [hidden email]
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Benchao Li
Hi godfrey,

Thanks for your explanation.

Do we need to clarify this in the FLIP? Maybe this confuses other users as
well.

godfrey he <[hidden email]> 于2020年2月28日周五 下午4:54写道:

> Hi Benchao,
>
> > I have one question about this FLIP:
> > executeStatement  accepts DML, what if it's a streaming DML ?
> >    does it submit the job to cluster directly and blocks forever? what's
> > the behavior for the next statements?
> `executeStatement` is a synchronous method, will execute the statement once
> calling this method and return the result until the job is finished.
> We will introduce asynchronous method like `executeStatementAsync` in the
> future.
>
> > nit: there's a typo in "the table describing the result for each kind of
> > statement", "*Result Scheam" -> "Result Schema"*
> Thanks for the reminding, I will fix it now.
>
> Bests,
> Godfrey
>
> Benchao Li <[hidden email]> 于2020年2月28日周五 下午4:00写道:
>
> > Hi Terry,
> >
> > Thanks for the propose, and sorry for joining the party late.
> >
> > I have one question about this FLIP:
> > executeStatement  accepts DML, what if it's a streaming DML ?
> >     does it submit the job to cluster directly and blocks forever? what's
> > the behavior for the next statements?
> >
> > nit: there's a typo in "the table describing the result for each kind of
> > statement", "*Result Scheam" -> "Result Schema"*
> >
> >
> > godfrey he <[hidden email]> 于2020年2月18日周二 下午4:41写道:
> >
> > > Thanks Kurt and Jark for explanation, I now also think we should make
> the
> > > TableEnvironment interface more statable and should not change
> "sqlQuery"
> > > method and "from" method.
> > >
> > > Hi Jingsong. Regarding to the "DmlBatch", I totally agree with
> advantages
> > > of "addBatch" method. However, there are two more questions need to
> > solve:
> > > one is how users write multi-sink programs in a Table API ? and another
> > is
> > > how users explain multi-sink program in both SQL and Table API ?
> > > Currently, "DmlBatch" class can solve those questions. (the main
> > > disadvantages is Inconsistent with the current interface)
> > >
> > > Bests,
> > > godfrey
> > >
> > > Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:
> > >
> > > > Hi Kurt and Godfrey,
> > > >
> > > > Thank you for your explanation.
> > > >
> > > > Regarding to the "DmlBatch",
> > > > I see there are some description for JDBC Statement.addBatch in the
> > > > document.
> > > > What do you think about introducing "addBatch" to the TableEnv
> instead
> > of
> > > > introducing a new class?
> > > > The advantage is:
> > > > - Consistent with JDBC statement.
> > > > - Consistent with current interface, what we need do is just modify
> > > method
> > > > name.
> > > >
> > > > Best,
> > > > Jingsong Lee
> > > >
> > > >
> > > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]> wrote:
> > > >
> > > > > I don't think we should change `from` to `fromCatalog`, especially
> > > `from`
> > > > > is just
> > > > > introduced in 1.10. I agree with Jark we should change interface
> only
> > > > when
> > > > > necessary,
> > > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> > `sqlQuery`
> > > as
> > > > > it is.
> > > > >
> > > > > Best,
> > > > > Kurt
> > > > >
> > > > >
> > > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]> wrote:
> > > > >
> > > > > > Thanks Kurt and Godfrey for the explanation,
> > > > > >
> > > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > > `fromCatalog(tableName)`.
> > > > > > However, I still think `sqlQuery(query)` is clear and works well.
> > Is
> > > it
> > > > > > necessary to change it?
> > > > > >
> > > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> > removed
> > > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > > and now we want to remove them again. Users will feel like the
> > > > interface
> > > > > is
> > > > > > very unstable, that really frustrates users.
> > > > > > I think we should be cautious to remove interface and only when
> it
> > is
> > > > > > necessary.
> > > > > >
> > > > > > Best,
> > > > > > Jark
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]>
> > > wrote:
> > > > > >
> > > > > > > hi kurt,jark,jingsong
> > > > > > >
> > > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I
> think
> > > > > `Table
> > > > > > > from(String tableName)` should be renamed to `Table
> > > > fromCatalog(String
> > > > > > > tableName)`.
> > > > > > >
> > > > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > > > "DELETE",
> > > > > > and
> > > > > > > they can be executed in a same batch in the future. So we can
> add
> > > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > > >
> > > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > > "DmlBatchBuilder".
> > > > > > >
> > > > > > > open to more discussion
> > > > > > >
> > > > > > > Best,
> > > > > > > godfrey
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > > > > >
> > > > > > > > Regarding to "fromQuery" is confusing users with "Table
> > > from(String
> > > > > > > > tableName)", I have
> > > > > > > > a just opposite opinion. I think this "fromXXX" pattern can
> > make
> > > > > users
> > > > > > > > quite clear when they
> > > > > > > > want to get a Table from TableEnvironment. Similar interfaces
> > > will
> > > > > also
> > > > > > > > include like "fromElements".
> > > > > > > >
> > > > > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > > > > future flexibility, in case we can support
> > > > > > > > other statement in a single batch. If that happens, the name
> > > > > "Inserts"
> > > > > > > will
> > > > > > > > be weird.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Kurt
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]>
> > > wrote:
> > > > > > > >
> > > > > > > > > I agree with Jingsong.
> > > > > > > > >
> > > > > > > > > +1 to keep `sqlQuery`, it's clear from the method name and
> > > return
> > > > > > type
> > > > > > > > that
> > > > > > > > > it accepts a SELECT query and returns a logic
> representation
> > > > > `Table`.
> > > > > > > > > The `fromQuery` is a little confused users with the `Table
> > > > > > from(String
> > > > > > > > > tableName)` method.
> > > > > > > > >
> > > > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong, AFAIK,
> > the
> > > > > > purpose
> > > > > > > of
> > > > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > > > Besides, DML terminology is not commonly know among users.
> So
> > > > what
> > > > > > > about
> > > > > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Jark
> > > > > > > > >
> > > > > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> > > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Godfrey,
> > > > > > > > > >
> > > > > > > > > > Thanks for updating. +1 sketchy.
> > > > > > > > > >
> > > > > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I
> think
> > > > > > > "sqlQuery"
> > > > > > > > is
> > > > > > > > > > OK, It's not that confusing with return values.
> > > > > > > > > >
> > > > > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see
> any
> > > > other
> > > > > > > > needs.
> > > > > > > > > > "Dml" seems a little weird.
> > > > > > > > > > It is better to support "Inserts addInsert" too. Users
> can
> > > > > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > > > > >
> > > > > > > > > > I try to match the new interfaces with the old interfaces
> > > > simply.
> > > > > > > > > > - "startInserts -> addInsert" replace old
> > "sqlUpdate(insert)"
> > > > and
> > > > > > > > > > "insertInto".
> > > > > > > > > > - "executeStatement" new one, execute all kinds of sqls
> > > > > > immediately.
> > > > > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Jingsong Lee
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> > > > [hidden email]>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I
> had
> > > > > updated
> > > > > > > the
> > > > > > > > > > > document, the mainly changes are:
> > > > > > > > > > >
> > > > > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > > > > >   a) change "Optional<ResultTable> executeSql(String
> sql)
> > > > > throws
> > > > > > > > > > Exception"
> > > > > > > > > > > to "ResultTable executeStatement(String statement,
> String
> > > > > > jobName)
> > > > > > > > > throws
> > > > > > > > > > > Exception". The reason is: "statement" is a more
> general
> > > > > concept
> > > > > > > than
> > > > > > > > > > > "sql",
> > > > > > > > > > > e.g. "show xx" is not a sql command (refer to [1]), but
> > is
> > > a
> > > > > > > > statement
> > > > > > > > > > > (just
> > > > > > > > > > > like JDBC). "insert" statement also has return value
> > which
> > > is
> > > > > the
> > > > > > > > > > affected
> > > > > > > > > > > row count, we can unify the return type to
> "ResultTable"
> > > > > instead
> > > > > > of
> > > > > > > > > > > "Optional<ResultTable>".
> > > > > > > > > > >   b) add two sub-interfaces for "ResultTable":
> > > > "RowResultTable"
> > > > > > is
> > > > > > > > used
> > > > > > > > > > for
> > > > > > > > > > > non-streaming select statement and will not contain
> > change
> > > > > flag;
> > > > > > > > > > > "RowWithChangeFlagResultTable" is used for streaming
> > select
> > > > > > > statement
> > > > > > > > > and
> > > > > > > > > > > will contain change flag.
> > > > > > > > > > >
> > > > > > > > > > > 2) about "Support batch sql execute and explain"
> section
> > > > > > > > > > > introduce "DmlBatch" to support both sql and Table API
> > > (which
> > > > > is
> > > > > > > > > borrowed
> > > > > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > > > > >
> > > > > > > > > > > interface TableEnvironment {
> > > > > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > interface DmlBatch {
> > > > > > > > > > >   /**
> > > > > > > > > > >   * add insert statement to the batch
> > > > > > > > > > >   */
> > > > > > > > > > >     void addInsert(String insert);
> > > > > > > > > > >
> > > > > > > > > > >  /**
> > > > > > > > > > >   * add Table with given sink name to the batch
> > > > > > > > > > >   */
> > > > > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > > > > >
> > > > > > > > > > >  /**
> > > > > > > > > > >   * execute the dml statements as a batch
> > > > > > > > > > >   */
> > > > > > > > > > >   ResultTable execute(String jobName) throws Exception
> > > > > > > > > > >
> > > > > > > > > > >   /**
> > > > > > > > > > >  * Returns the AST and the execution plan to compute
> the
> > > > result
> > > > > > of
> > > > > > > > the
> > > > > > > > > > > batch
> > > > > > > > > > > dml statement.
> > > > > > > > > > >   */
> > > > > > > > > > >   String explain(boolean extended);
> > > > > > > > > > > }
> > > > > > > > > > >
> > > > > > > > > > > 3) about "Discuss a parse method for multiple
> statements
> > > > > execute
> > > > > > in
> > > > > > > > SQL
> > > > > > > > > > > CLI"
> > > > > > > > > > > section
> > > > > > > > > > > add the pros and cons for each solution
> > > > > > > > > > >
> > > > > > > > > > > 4) update the "Examples" section and "Summary" section
> > > based
> > > > on
> > > > > > the
> > > > > > > > > above
> > > > > > > > > > > changes
> > > > > > > > > > >
> > > > > > > > > > > Please refer the design doc[1] for more details and
> > welcome
> > > > any
> > > > > > > > > feedback.
> > > > > > > > > > >
> > > > > > > > > > > Bests,
> > > > > > > > > > > godfreyhe
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > [0]
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > > > > [1]
> > > > > > >
> https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Sent from:
> > > > > > > > > >
> > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > Best, Jingsong Lee
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > Best, Jingsong Lee
> > > >
> > >
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: [hidden email]; [hidden email]
> >
>


--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: [hidden email]; [hidden email]
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

godfreyhe
Hi Benchao,

I think the document has contained both parts: the behavior is explained
when introducing `executeStatement` method, and asynchronous execution
methods is explained in the appendix.

Bests,
Godfrey

Benchao Li <[hidden email]> 于2020年2月28日周五 下午10:09写道:

> Hi godfrey,
>
> Thanks for your explanation.
>
> Do we need to clarify this in the FLIP? Maybe this confuses other users as
> well.
>
> godfrey he <[hidden email]> 于2020年2月28日周五 下午4:54写道:
>
> > Hi Benchao,
> >
> > > I have one question about this FLIP:
> > > executeStatement  accepts DML, what if it's a streaming DML ?
> > >    does it submit the job to cluster directly and blocks forever?
> what's
> > > the behavior for the next statements?
> > `executeStatement` is a synchronous method, will execute the statement
> once
> > calling this method and return the result until the job is finished.
> > We will introduce asynchronous method like `executeStatementAsync` in the
> > future.
> >
> > > nit: there's a typo in "the table describing the result for each kind
> of
> > > statement", "*Result Scheam" -> "Result Schema"*
> > Thanks for the reminding, I will fix it now.
> >
> > Bests,
> > Godfrey
> >
> > Benchao Li <[hidden email]> 于2020年2月28日周五 下午4:00写道:
> >
> > > Hi Terry,
> > >
> > > Thanks for the propose, and sorry for joining the party late.
> > >
> > > I have one question about this FLIP:
> > > executeStatement  accepts DML, what if it's a streaming DML ?
> > >     does it submit the job to cluster directly and blocks forever?
> what's
> > > the behavior for the next statements?
> > >
> > > nit: there's a typo in "the table describing the result for each kind
> of
> > > statement", "*Result Scheam" -> "Result Schema"*
> > >
> > >
> > > godfrey he <[hidden email]> 于2020年2月18日周二 下午4:41写道:
> > >
> > > > Thanks Kurt and Jark for explanation, I now also think we should make
> > the
> > > > TableEnvironment interface more statable and should not change
> > "sqlQuery"
> > > > method and "from" method.
> > > >
> > > > Hi Jingsong. Regarding to the "DmlBatch", I totally agree with
> > advantages
> > > > of "addBatch" method. However, there are two more questions need to
> > > solve:
> > > > one is how users write multi-sink programs in a Table API ? and
> another
> > > is
> > > > how users explain multi-sink program in both SQL and Table API ?
> > > > Currently, "DmlBatch" class can solve those questions. (the main
> > > > disadvantages is Inconsistent with the current interface)
> > > >
> > > > Bests,
> > > > godfrey
> > > >
> > > > Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:
> > > >
> > > > > Hi Kurt and Godfrey,
> > > > >
> > > > > Thank you for your explanation.
> > > > >
> > > > > Regarding to the "DmlBatch",
> > > > > I see there are some description for JDBC Statement.addBatch in the
> > > > > document.
> > > > > What do you think about introducing "addBatch" to the TableEnv
> > instead
> > > of
> > > > > introducing a new class?
> > > > > The advantage is:
> > > > > - Consistent with JDBC statement.
> > > > > - Consistent with current interface, what we need do is just modify
> > > > method
> > > > > name.
> > > > >
> > > > > Best,
> > > > > Jingsong Lee
> > > > >
> > > > >
> > > > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]>
> wrote:
> > > > >
> > > > > > I don't think we should change `from` to `fromCatalog`,
> especially
> > > > `from`
> > > > > > is just
> > > > > > introduced in 1.10. I agree with Jark we should change interface
> > only
> > > > > when
> > > > > > necessary,
> > > > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> > > `sqlQuery`
> > > > as
> > > > > > it is.
> > > > > >
> > > > > > Best,
> > > > > > Kurt
> > > > > >
> > > > > >
> > > > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]>
> wrote:
> > > > > >
> > > > > > > Thanks Kurt and Godfrey for the explanation,
> > > > > > >
> > > > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > > > `fromCatalog(tableName)`.
> > > > > > > However, I still think `sqlQuery(query)` is clear and works
> well.
> > > Is
> > > > it
> > > > > > > necessary to change it?
> > > > > > >
> > > > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> > > removed
> > > > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > > > and now we want to remove them again. Users will feel like the
> > > > > interface
> > > > > > is
> > > > > > > very unstable, that really frustrates users.
> > > > > > > I think we should be cautious to remove interface and only when
> > it
> > > is
> > > > > > > necessary.
> > > > > > >
> > > > > > > Best,
> > > > > > > Jark
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he <[hidden email]>
> > > > wrote:
> > > > > > >
> > > > > > > > hi kurt,jark,jingsong
> > > > > > > >
> > > > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I
> > think
> > > > > > `Table
> > > > > > > > from(String tableName)` should be renamed to `Table
> > > > > fromCatalog(String
> > > > > > > > tableName)`.
> > > > > > > >
> > > > > > > > Regarding to the "DmlBatch", DML contains "INSERT", "UPDATE",
> > > > > "DELETE",
> > > > > > > and
> > > > > > > > they can be executed in a same batch in the future. So we can
> > add
> > > > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > > > >
> > > > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > > > "DmlBatchBuilder".
> > > > > > > >
> > > > > > > > open to more discussion
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > godfrey
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > > > > > >
> > > > > > > > > Regarding to "fromQuery" is confusing users with "Table
> > > > from(String
> > > > > > > > > tableName)", I have
> > > > > > > > > a just opposite opinion. I think this "fromXXX" pattern can
> > > make
> > > > > > users
> > > > > > > > > quite clear when they
> > > > > > > > > want to get a Table from TableEnvironment. Similar
> interfaces
> > > > will
> > > > > > also
> > > > > > > > > include like "fromElements".
> > > > > > > > >
> > > > > > > > > Regarding to the name of DmlBatch, I think it's mainly for
> > > > > > > > > future flexibility, in case we can support
> > > > > > > > > other statement in a single batch. If that happens, the
> name
> > > > > > "Inserts"
> > > > > > > > will
> > > > > > > > > be weird.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Kurt
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <[hidden email]>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > I agree with Jingsong.
> > > > > > > > > >
> > > > > > > > > > +1 to keep `sqlQuery`, it's clear from the method name
> and
> > > > return
> > > > > > > type
> > > > > > > > > that
> > > > > > > > > > it accepts a SELECT query and returns a logic
> > representation
> > > > > > `Table`.
> > > > > > > > > > The `fromQuery` is a little confused users with the
> `Table
> > > > > > > from(String
> > > > > > > > > > tableName)` method.
> > > > > > > > > >
> > > > > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong,
> AFAIK,
> > > the
> > > > > > > purpose
> > > > > > > > of
> > > > > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > > > > Besides, DML terminology is not commonly know among
> users.
> > So
> > > > > what
> > > > > > > > about
> > > > > > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Jark
> > > > > > > > > >
> > > > > > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> > > > > [hidden email]>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Godfrey,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for updating. +1 sketchy.
> > > > > > > > > > >
> > > > > > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I
> > think
> > > > > > > > "sqlQuery"
> > > > > > > > > is
> > > > > > > > > > > OK, It's not that confusing with return values.
> > > > > > > > > > >
> > > > > > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't see
> > any
> > > > > other
> > > > > > > > > needs.
> > > > > > > > > > > "Dml" seems a little weird.
> > > > > > > > > > > It is better to support "Inserts addInsert" too. Users
> > can
> > > > > > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > > > > > >
> > > > > > > > > > > I try to match the new interfaces with the old
> interfaces
> > > > > simply.
> > > > > > > > > > > - "startInserts -> addInsert" replace old
> > > "sqlUpdate(insert)"
> > > > > and
> > > > > > > > > > > "insertInto".
> > > > > > > > > > > - "executeStatement" new one, execute all kinds of sqls
> > > > > > > immediately.
> > > > > > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jingsong Lee
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> > > > > [hidden email]>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I'd like to resume the discussion for FlIP-84 [0]. I
> > had
> > > > > > updated
> > > > > > > > the
> > > > > > > > > > > > document, the mainly changes are:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > > > > > >   a) change "Optional<ResultTable> executeSql(String
> > sql)
> > > > > > throws
> > > > > > > > > > > Exception"
> > > > > > > > > > > > to "ResultTable executeStatement(String statement,
> > String
> > > > > > > jobName)
> > > > > > > > > > throws
> > > > > > > > > > > > Exception". The reason is: "statement" is a more
> > general
> > > > > > concept
> > > > > > > > than
> > > > > > > > > > > > "sql",
> > > > > > > > > > > > e.g. "show xx" is not a sql command (refer to [1]),
> but
> > > is
> > > > a
> > > > > > > > > statement
> > > > > > > > > > > > (just
> > > > > > > > > > > > like JDBC). "insert" statement also has return value
> > > which
> > > > is
> > > > > > the
> > > > > > > > > > > affected
> > > > > > > > > > > > row count, we can unify the return type to
> > "ResultTable"
> > > > > > instead
> > > > > > > of
> > > > > > > > > > > > "Optional<ResultTable>".
> > > > > > > > > > > >   b) add two sub-interfaces for "ResultTable":
> > > > > "RowResultTable"
> > > > > > > is
> > > > > > > > > used
> > > > > > > > > > > for
> > > > > > > > > > > > non-streaming select statement and will not contain
> > > change
> > > > > > flag;
> > > > > > > > > > > > "RowWithChangeFlagResultTable" is used for streaming
> > > select
> > > > > > > > statement
> > > > > > > > > > and
> > > > > > > > > > > > will contain change flag.
> > > > > > > > > > > >
> > > > > > > > > > > > 2) about "Support batch sql execute and explain"
> > section
> > > > > > > > > > > > introduce "DmlBatch" to support both sql and Table
> API
> > > > (which
> > > > > > is
> > > > > > > > > > borrowed
> > > > > > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > > > > > >
> > > > > > > > > > > > interface TableEnvironment {
> > > > > > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > interface DmlBatch {
> > > > > > > > > > > >   /**
> > > > > > > > > > > >   * add insert statement to the batch
> > > > > > > > > > > >   */
> > > > > > > > > > > >     void addInsert(String insert);
> > > > > > > > > > > >
> > > > > > > > > > > >  /**
> > > > > > > > > > > >   * add Table with given sink name to the batch
> > > > > > > > > > > >   */
> > > > > > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > > > > > >
> > > > > > > > > > > >  /**
> > > > > > > > > > > >   * execute the dml statements as a batch
> > > > > > > > > > > >   */
> > > > > > > > > > > >   ResultTable execute(String jobName) throws
> Exception
> > > > > > > > > > > >
> > > > > > > > > > > >   /**
> > > > > > > > > > > >  * Returns the AST and the execution plan to compute
> > the
> > > > > result
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > batch
> > > > > > > > > > > > dml statement.
> > > > > > > > > > > >   */
> > > > > > > > > > > >   String explain(boolean extended);
> > > > > > > > > > > > }
> > > > > > > > > > > >
> > > > > > > > > > > > 3) about "Discuss a parse method for multiple
> > statements
> > > > > > execute
> > > > > > > in
> > > > > > > > > SQL
> > > > > > > > > > > > CLI"
> > > > > > > > > > > > section
> > > > > > > > > > > > add the pros and cons for each solution
> > > > > > > > > > > >
> > > > > > > > > > > > 4) update the "Examples" section and "Summary"
> section
> > > > based
> > > > > on
> > > > > > > the
> > > > > > > > > > above
> > > > > > > > > > > > changes
> > > > > > > > > > > >
> > > > > > > > > > > > Please refer the design doc[1] for more details and
> > > welcome
> > > > > any
> > > > > > > > > > feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > Bests,
> > > > > > > > > > > > godfreyhe
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > [0]
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > > > > > [1]
> > > > > > > >
> > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Sent from:
> > > > > > > > > > >
> > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > Best, Jingsong Lee
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > Best, Jingsong Lee
> > > > >
> > > >
> > >
> > >
> > > --
> > >
> > > Benchao Li
> > > School of Electronics Engineering and Computer Science, Peking
> University
> > > Tel:+86-15650713730
> > > Email: [hidden email]; [hidden email]
> > >
> >
>
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking University
> Tel:+86-15650713730
> Email: [hidden email]; [hidden email]
>
Reply | Threaded
Open this post in threaded view
|

Re: [DISCUSS] FLIP-84: Improve & Refactor execute/sqlQuery/sqlUpdate APIS of TableEnvironment

Benchao Li
hi godfrey,

Thanks for your information, I didn't see that part before.
Now the FLIP LGTM, thanks for driving this effort.

godfrey he <[hidden email]> 于2020年3月2日周一 上午9:51写道:

> Hi Benchao,
>
> I think the document has contained both parts: the behavior is explained
> when introducing `executeStatement` method, and asynchronous execution
> methods is explained in the appendix.
>
> Bests,
> Godfrey
>
> Benchao Li <[hidden email]> 于2020年2月28日周五 下午10:09写道:
>
> > Hi godfrey,
> >
> > Thanks for your explanation.
> >
> > Do we need to clarify this in the FLIP? Maybe this confuses other users
> as
> > well.
> >
> > godfrey he <[hidden email]> 于2020年2月28日周五 下午4:54写道:
> >
> > > Hi Benchao,
> > >
> > > > I have one question about this FLIP:
> > > > executeStatement  accepts DML, what if it's a streaming DML ?
> > > >    does it submit the job to cluster directly and blocks forever?
> > what's
> > > > the behavior for the next statements?
> > > `executeStatement` is a synchronous method, will execute the statement
> > once
> > > calling this method and return the result until the job is finished.
> > > We will introduce asynchronous method like `executeStatementAsync` in
> the
> > > future.
> > >
> > > > nit: there's a typo in "the table describing the result for each kind
> > of
> > > > statement", "*Result Scheam" -> "Result Schema"*
> > > Thanks for the reminding, I will fix it now.
> > >
> > > Bests,
> > > Godfrey
> > >
> > > Benchao Li <[hidden email]> 于2020年2月28日周五 下午4:00写道:
> > >
> > > > Hi Terry,
> > > >
> > > > Thanks for the propose, and sorry for joining the party late.
> > > >
> > > > I have one question about this FLIP:
> > > > executeStatement  accepts DML, what if it's a streaming DML ?
> > > >     does it submit the job to cluster directly and blocks forever?
> > what's
> > > > the behavior for the next statements?
> > > >
> > > > nit: there's a typo in "the table describing the result for each kind
> > of
> > > > statement", "*Result Scheam" -> "Result Schema"*
> > > >
> > > >
> > > > godfrey he <[hidden email]> 于2020年2月18日周二 下午4:41写道:
> > > >
> > > > > Thanks Kurt and Jark for explanation, I now also think we should
> make
> > > the
> > > > > TableEnvironment interface more statable and should not change
> > > "sqlQuery"
> > > > > method and "from" method.
> > > > >
> > > > > Hi Jingsong. Regarding to the "DmlBatch", I totally agree with
> > > advantages
> > > > > of "addBatch" method. However, there are two more questions need to
> > > > solve:
> > > > > one is how users write multi-sink programs in a Table API ? and
> > another
> > > > is
> > > > > how users explain multi-sink program in both SQL and Table API ?
> > > > > Currently, "DmlBatch" class can solve those questions. (the main
> > > > > disadvantages is Inconsistent with the current interface)
> > > > >
> > > > > Bests,
> > > > > godfrey
> > > > >
> > > > > Jingsong Li <[hidden email]> 于2020年2月15日周六 下午9:09写道:
> > > > >
> > > > > > Hi Kurt and Godfrey,
> > > > > >
> > > > > > Thank you for your explanation.
> > > > > >
> > > > > > Regarding to the "DmlBatch",
> > > > > > I see there are some description for JDBC Statement.addBatch in
> the
> > > > > > document.
> > > > > > What do you think about introducing "addBatch" to the TableEnv
> > > instead
> > > > of
> > > > > > introducing a new class?
> > > > > > The advantage is:
> > > > > > - Consistent with JDBC statement.
> > > > > > - Consistent with current interface, what we need do is just
> modify
> > > > > method
> > > > > > name.
> > > > > >
> > > > > > Best,
> > > > > > Jingsong Lee
> > > > > >
> > > > > >
> > > > > > On Sat, Feb 15, 2020 at 4:48 PM Kurt Young <[hidden email]>
> > wrote:
> > > > > >
> > > > > > > I don't think we should change `from` to `fromCatalog`,
> > especially
> > > > > `from`
> > > > > > > is just
> > > > > > > introduced in 1.10. I agree with Jark we should change
> interface
> > > only
> > > > > > when
> > > > > > > necessary,
> > > > > > > e.g. the semantic is broken or confusing. So I'm +1 to keep
> > > > `sqlQuery`
> > > > > as
> > > > > > > it is.
> > > > > > >
> > > > > > > Best,
> > > > > > > Kurt
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Feb 15, 2020 at 3:59 PM Jark Wu <[hidden email]>
> > wrote:
> > > > > > >
> > > > > > > > Thanks Kurt and Godfrey for the explanation,
> > > > > > > >
> > > > > > > > It makes sense to me that renaming `from(tableName)` to
> > > > > > > > `fromCatalog(tableName)`.
> > > > > > > > However, I still think `sqlQuery(query)` is clear and works
> > well.
> > > > Is
> > > > > it
> > > > > > > > necessary to change it?
> > > > > > > >
> > > > > > > > We removed `sql(query)` and introduced `sqlQuery(query)`, we
> > > > removed
> > > > > > > > `scan(tableName)` and introduced `from(tableName)`,
> > > > > > > > and now we want to remove them again. Users will feel like
> the
> > > > > > interface
> > > > > > > is
> > > > > > > > very unstable, that really frustrates users.
> > > > > > > > I think we should be cautious to remove interface and only
> when
> > > it
> > > > is
> > > > > > > > necessary.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Jark
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, 13 Feb 2020 at 20:58, godfrey he <
> [hidden email]>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > hi kurt,jark,jingsong
> > > > > > > > >
> > > > > > > > > Regarding to "fromQuery", I agree with kurt. In addition, I
> > > think
> > > > > > > `Table
> > > > > > > > > from(String tableName)` should be renamed to `Table
> > > > > > fromCatalog(String
> > > > > > > > > tableName)`.
> > > > > > > > >
> > > > > > > > > Regarding to the "DmlBatch", DML contains "INSERT",
> "UPDATE",
> > > > > > "DELETE",
> > > > > > > > and
> > > > > > > > > they can be executed in a same batch in the future. So we
> can
> > > add
> > > > > > > > > "addUpdate" method and "addDelete" method to support them.
> > > > > > > > >
> > > > > > > > > Regarding to the "Inserts addInsert", maybe we can add a
> > > > > > > > "DmlBatchBuilder".
> > > > > > > > >
> > > > > > > > > open to more discussion
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > godfrey
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Kurt Young <[hidden email]> 于2020年2月13日周四 下午4:56写道:
> > > > > > > > >
> > > > > > > > > > Regarding to "fromQuery" is confusing users with "Table
> > > > > from(String
> > > > > > > > > > tableName)", I have
> > > > > > > > > > a just opposite opinion. I think this "fromXXX" pattern
> can
> > > > make
> > > > > > > users
> > > > > > > > > > quite clear when they
> > > > > > > > > > want to get a Table from TableEnvironment. Similar
> > interfaces
> > > > > will
> > > > > > > also
> > > > > > > > > > include like "fromElements".
> > > > > > > > > >
> > > > > > > > > > Regarding to the name of DmlBatch, I think it's mainly
> for
> > > > > > > > > > future flexibility, in case we can support
> > > > > > > > > > other statement in a single batch. If that happens, the
> > name
> > > > > > > "Inserts"
> > > > > > > > > will
> > > > > > > > > > be weird.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Kurt
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Feb 13, 2020 at 4:03 PM Jark Wu <
> [hidden email]>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I agree with Jingsong.
> > > > > > > > > > >
> > > > > > > > > > > +1 to keep `sqlQuery`, it's clear from the method name
> > and
> > > > > return
> > > > > > > > type
> > > > > > > > > > that
> > > > > > > > > > > it accepts a SELECT query and returns a logic
> > > representation
> > > > > > > `Table`.
> > > > > > > > > > > The `fromQuery` is a little confused users with the
> > `Table
> > > > > > > > from(String
> > > > > > > > > > > tableName)` method.
> > > > > > > > > > >
> > > > > > > > > > > Regarding to the `DmlBatch`, I agree with Jingsong,
> > AFAIK,
> > > > the
> > > > > > > > purpose
> > > > > > > > > of
> > > > > > > > > > > `DmlBatch` is used to batching insert statements.
> > > > > > > > > > > Besides, DML terminology is not commonly know among
> > users.
> > > So
> > > > > > what
> > > > > > > > > about
> > > > > > > > > > > `InsertsBatching startBatchingInserts()` ?
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > > > On Thu, 13 Feb 2020 at 15:50, Jingsong Li <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Godfrey,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for updating. +1 sketchy.
> > > > > > > > > > > >
> > > > > > > > > > > > I have no idea to change "sqlQuery" to "fromQuery", I
> > > think
> > > > > > > > > "sqlQuery"
> > > > > > > > > > is
> > > > > > > > > > > > OK, It's not that confusing with return values.
> > > > > > > > > > > >
> > > > > > > > > > > > Can we change the "DmlBatch" to "Inserts"?  I don't
> see
> > > any
> > > > > > other
> > > > > > > > > > needs.
> > > > > > > > > > > > "Dml" seems a little weird.
> > > > > > > > > > > > It is better to support "Inserts addInsert" too.
> Users
> > > can
> > > > > > > > > > > > "inserts.addInsert().addInsert()...."
> > > > > > > > > > > >
> > > > > > > > > > > > I try to match the new interfaces with the old
> > interfaces
> > > > > > simply.
> > > > > > > > > > > > - "startInserts -> addInsert" replace old
> > > > "sqlUpdate(insert)"
> > > > > > and
> > > > > > > > > > > > "insertInto".
> > > > > > > > > > > > - "executeStatement" new one, execute all kinds of
> sqls
> > > > > > > > immediately.
> > > > > > > > > > > > Including old "sqlUpdate(DDLs)".
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Jingsong Lee
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Feb 12, 2020 at 11:10 AM godfreyhe <
> > > > > > [hidden email]>
> > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'd like to resume the discussion for FlIP-84 [0].
> I
> > > had
> > > > > > > updated
> > > > > > > > > the
> > > > > > > > > > > > > document, the mainly changes are:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. about "`void sqlUpdate(String sql)`" section
> > > > > > > > > > > > >   a) change "Optional<ResultTable>
> executeSql(String
> > > sql)
> > > > > > > throws
> > > > > > > > > > > > Exception"
> > > > > > > > > > > > > to "ResultTable executeStatement(String statement,
> > > String
> > > > > > > > jobName)
> > > > > > > > > > > throws
> > > > > > > > > > > > > Exception". The reason is: "statement" is a more
> > > general
> > > > > > > concept
> > > > > > > > > than
> > > > > > > > > > > > > "sql",
> > > > > > > > > > > > > e.g. "show xx" is not a sql command (refer to [1]),
> > but
> > > > is
> > > > > a
> > > > > > > > > > statement
> > > > > > > > > > > > > (just
> > > > > > > > > > > > > like JDBC). "insert" statement also has return
> value
> > > > which
> > > > > is
> > > > > > > the
> > > > > > > > > > > > affected
> > > > > > > > > > > > > row count, we can unify the return type to
> > > "ResultTable"
> > > > > > > instead
> > > > > > > > of
> > > > > > > > > > > > > "Optional<ResultTable>".
> > > > > > > > > > > > >   b) add two sub-interfaces for "ResultTable":
> > > > > > "RowResultTable"
> > > > > > > > is
> > > > > > > > > > used
> > > > > > > > > > > > for
> > > > > > > > > > > > > non-streaming select statement and will not contain
> > > > change
> > > > > > > flag;
> > > > > > > > > > > > > "RowWithChangeFlagResultTable" is used for
> streaming
> > > > select
> > > > > > > > > statement
> > > > > > > > > > > and
> > > > > > > > > > > > > will contain change flag.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2) about "Support batch sql execute and explain"
> > > section
> > > > > > > > > > > > > introduce "DmlBatch" to support both sql and Table
> > API
> > > > > (which
> > > > > > > is
> > > > > > > > > > > borrowed
> > > > > > > > > > > > > from the ideas Dawid mentioned in the slack)
> > > > > > > > > > > > >
> > > > > > > > > > > > > interface TableEnvironment {
> > > > > > > > > > > > >     DmlBatch startDmlBatch();
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > interface DmlBatch {
> > > > > > > > > > > > >   /**
> > > > > > > > > > > > >   * add insert statement to the batch
> > > > > > > > > > > > >   */
> > > > > > > > > > > > >     void addInsert(String insert);
> > > > > > > > > > > > >
> > > > > > > > > > > > >  /**
> > > > > > > > > > > > >   * add Table with given sink name to the batch
> > > > > > > > > > > > >   */
> > > > > > > > > > > > >     void addInsert(String sinkName, Table table);
> > > > > > > > > > > > >
> > > > > > > > > > > > >  /**
> > > > > > > > > > > > >   * execute the dml statements as a batch
> > > > > > > > > > > > >   */
> > > > > > > > > > > > >   ResultTable execute(String jobName) throws
> > Exception
> > > > > > > > > > > > >
> > > > > > > > > > > > >   /**
> > > > > > > > > > > > >  * Returns the AST and the execution plan to
> compute
> > > the
> > > > > > result
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > batch
> > > > > > > > > > > > > dml statement.
> > > > > > > > > > > > >   */
> > > > > > > > > > > > >   String explain(boolean extended);
> > > > > > > > > > > > > }
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3) about "Discuss a parse method for multiple
> > > statements
> > > > > > > execute
> > > > > > > > in
> > > > > > > > > > SQL
> > > > > > > > > > > > > CLI"
> > > > > > > > > > > > > section
> > > > > > > > > > > > > add the pros and cons for each solution
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4) update the "Examples" section and "Summary"
> > section
> > > > > based
> > > > > > on
> > > > > > > > the
> > > > > > > > > > > above
> > > > > > > > > > > > > changes
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please refer the design doc[1] for more details and
> > > > welcome
> > > > > > any
> > > > > > > > > > > feedback.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Bests,
> > > > > > > > > > > > > godfreyhe
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > [0]
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://docs.google.com/document/d/19-mdYJjKirh5aXCwq1fDajSaI09BJMMT95wy_YhtuZk/edit
> > > > > > > > > > > > > [1]
> > > > > > > > >
> > > https://www.geeksforgeeks.org/sql-ddl-dql-dml-dcl-tcl-commands/
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > Sent from:
> > > > > > > > > > > >
> > > > > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > Best, Jingsong Lee
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best, Jingsong Lee
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > >
> > > > Benchao Li
> > > > School of Electronics Engineering and Computer Science, Peking
> > University
> > > > Tel:+86-15650713730
> > > > Email: [hidden email]; [hidden email]
> > > >
> > >
> >
> >
> > --
> >
> > Benchao Li
> > School of Electronics Engineering and Computer Science, Peking University
> > Tel:+86-15650713730
> > Email: [hidden email]; [hidden email]
> >
>


--

Benchao Li
School of Electronics Engineering and Computer Science, Peking University
Tel:+86-15650713730
Email: [hidden email]; [hidden email]