Hi dev,
Recently we tested the brand-new SQL Client and the Flink SQL module, and we are amazed at how simple programming for streaming data analysis becomes. However, as far as I know, the SET command is only available in the SQL Client, not in the SQL API.

Although we understand that developers can set TableConfig via the tEnv.getConfig().getConfiguration() API, we hope there could be an API like sqlSet() that allows setting table configurations within SQL statements themselves. This would pave the way for a unified interface for users to write a Flink SQL job without writing any Java or Scala code in a production environment.

Moreover, it would be even better to have an API that automatically detects the type of a SQL statement and chooses the right logic to execute it, instead of manually choosing sqlUpdate or sqlQuery, i.e.

sql("CREATE TABLE abc ( a VARCHAR(10), b BIGINT ) WITH ( 'xxx' = 'yyy' )");
sql("SET table.exec.mini-batch.enabled = 'true'");
sql("INSERT INTO sink SELECT * FROM abc");

Then users could simply write their SQL code in .sql files, and Flink could read them statement by statement, call the sql() method to parse the code, and eventually submit the program to the ExecutionEnvironment and run it on the cluster. This is different from the current SQL Client, whose interactive style of programming is not well suited for production usage.

We would like to know whether these proposals contradict the current plans of the community, or whether there are other issues that should be addressed before implementing such features.

Thanks,
Weike
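For illustration, the driver loop described above could be sketched roughly as follows. This is a hypothetical sketch, not an existing Flink API: the splitStatements() helper and the dispatch on the leading keyword are assumptions about how a universal sql() entry point might route statements to the current sqlQuery/sqlUpdate methods.

```java
import java.util.ArrayList;
import java.util.List;

public class SqlScriptDriver {

    // Naive statement splitter: splits a script on ';' terminators.
    // (A real implementation would have to respect quoted strings and comments.)
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    // Hypothetical universal entry point: inspects the leading keyword and
    // returns the name of the legacy API call it would dispatch to.
    static String dispatch(String statement) {
        String keyword = statement.trim().split("\\s+")[0].toUpperCase();
        switch (keyword) {
            case "SELECT":
                return "sqlQuery";   // queries return a Table
            case "SET":
                return "setConfig";  // would update TableConfig
            default:
                return "sqlUpdate";  // DDL and DML (CREATE, INSERT, ...)
        }
    }

    public static void main(String[] args) {
        String script =
            "CREATE TABLE abc ( a VARCHAR(10), b BIGINT ) WITH ( 'xxx' = 'yyy' );\n"
          + "SET table.exec.mini-batch.enabled = 'true';\n"
          + "INSERT INTO sink SELECT * FROM abc;";
        for (String stmt : splitStatements(script)) {
            System.out.println(dispatch(stmt) + " <- " + stmt);
        }
    }
}
```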
Hi Weike,
Thanks for kicking off this discussion! I cannot agree more with the proposal for a universal sql() method. It confuses and annoys our users a lot to distinguish sqlUpdate/sqlQuery, and even insertInto and so on. IIRC there is an ongoing FLIP [1] dealing with this problem; you can check it out to see whether it fits your requirements.

Besides, for enabling SET in SQL statements, I agree that it would help provide a consistent user experience of using *just* SQL to describe a Flink job. Looking forward to the maintainers' ideas on the possibility and plan.

Best,
tison.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
Hi Weike and Tison,
This is already covered in FLIP-84 [1]: we will propose a new method, executeStatement(String statement), which can execute arbitrary statements, including SET and CREATE. This is in progress [2].

Best,
Jark

[1]: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=134745878
[2]: https://issues.apache.org/jira/browse/FLINK-16366
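As a rough illustration of what handling a SET statement inside such a universal method could involve, the sketch below parses `SET key = 'value'` into a key/value pair that could then be written into TableConfig. The regex and the parse() helper are assumptions made for illustration only; the actual parsing in FLIP-84 lives in Flink's SQL parser, not in a regex.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SetStatementParser {

    // Matches e.g.  SET table.exec.mini-batch.enabled = 'true'
    // Group 1: config key, group 2: value (surrounding quotes optional).
    private static final Pattern SET_PATTERN = Pattern.compile(
        "(?i)^\\s*SET\\s+([\\w.\\-]+)\\s*=\\s*'?([^']*?)'?\\s*$");

    // Returns the {key, value} pair if the statement is a SET command,
    // otherwise an empty Optional so callers can fall through to other handlers.
    static Optional<String[]> parse(String statement) {
        Matcher m = SET_PATTERN.matcher(statement);
        if (m.matches()) {
            return Optional.of(new String[] {m.group(1), m.group(2)});
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        parse("SET table.exec.mini-batch.enabled = 'true'")
            .ifPresent(kv -> System.out.println(kv[0] + " -> " + kv[1]));
    }
}
```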
Hi Tison and all,
Thanks for the timely response. I have carefully examined the aforementioned FLIP-84. As I see it, executeStatement() is akin to our original design of the sql() method, but with more detailed considerations included.

However, it does not cover the SET statement for tuning TableConfig, and the differences among all those environments (TableEnvironment, StreamTableEnvironment, ExecutionEnvironment, StreamExecutionEnvironment, etc.) might still confuse users, especially regarding the effects of the execute() / executeStatement() methods while the old APIs are not yet completely removed. This poses a heavy burden on newcomers and hinders user adoption.

Therefore I believe the Table API needs a further cohesive redesign by improving FLIP-84, or a purely SQL interface that removes the burden of learning all those complex concepts (running Flink streaming or batch programs from SQL files).

Hope to hear any suggestions or questions, thanks :)

Sincerely,
Weike
Hi Weike,
thanks for your feedback. Your use case is definitely on our agenda. The redesign of large parts of the API is still in progress. In the mid-term, most of the SQL Client commands should be present in the SQL API as well, so that platform teams can build their custom logic (REST APIs, etc.) around it.

For pure SQL users, there are discussions about making the SQL Client richer in the future; see:

https://cwiki.apache.org/confluence/display/FLINK/FLIP-91%3A+Support+SQL+Client+Gateway

Regards,
Timo
Hi Timo,
After carefully reading FLIP-91 (SQL Client Gateway), I have found that it still focuses on ad-hoc (or real-time) queries of batch data, which is quite different from the streaming case. I wonder whether we could combine some features of FLIP-84 (the generic, all-purpose executeStatement()) with the JDBC-compliant SQL gateway of FLIP-91 to make submitting long-running streaming SQL jobs feasible.

Just an immature thought, and we would like to know whether the community plans to do so in the foreseeable future :)

Best,
Weike