[DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Timo Walther-2
Hi everyone,

the Table API & SQL documentation was already in very good shape in
Flink 1.8. However, in the past it was mostly presented as an addition
to the DataStream API. As the Table and SQL world is growing quickly,
is stabilizing in its concepts, and is considered another top-level API
and a closed ecosystem, it is time to restructure the docs a little to
reflect the vision of FLIP-32.

Current state:
https://ci.apache.org/projects/flink/flink-docs-master/dev/table/

We would like to propose the following FLIP-60 for a new structure:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685

Looking forward to feedback.

Thanks,

Timo


Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

dwysakowicz
+1 to the idea of restructuring the docs.

My only suggestion: how about moving the
User-Defined-Extensions subpages to the corresponding broader topics?

Sources & Sinks >> Connect to external systems

Catalogs >> Connect to external systems

and then have a Functions section with subsections:

Functions
    |- Built-in functions
    |- User-defined functions


Best,

Dawid

On 30/08/2019 10:59, Timo Walther wrote:

> Hi everyone,
>
> the Table API & SQL documentation was already in a very good shape in
> Flink 1.8. However, in the past it was mostly presented as an addition
> to DataStream API. As the Table and SQL world is growing quickly,
> stabilizes in its concepts, and is considered as another top-level API
> and closed ecosystem, it is time to restructure the docs a little bit
> to represent the vision of FLIP-32.
>
> Current state:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/
>
> We would like to propose the following FLIP-60 for a new structure:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=127405685
>
>
> Looking forward to feedback.
>
> Thanks,
>
> Timo
>
>
>


Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

vino yang
I agree with Dawid's suggestion about functions.

A Functions section that unifies the built-in functions and UDFs would be
better.

Dawid Wysakowicz <[hidden email]> wrote on Fri, Aug 30, 2019 at 7:43 PM:

> +1 to the idea of restructuring the docs.
>
> My only suggestion to consider is how about moving the
> User-Defined-Extensions subpages to corresponding broader topics?
>
> Sources & Sinks >> Connect to external systems
>
> Catalogs >> Connect to external systems
>
> and then have a Functions sections with subsections:
>
> functions
>
>     |- built in functions
>
>     |- user defined functions
>
>
> Best,
>
> Dawid

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Kurt Young
+1 to the general idea, and thanks for driving this. I think the new
structure is clearer than the old one, and I have some suggestions:

1. How about adding an "Architecture & Internals" chapter? This can help
developers or users who want to contribute more to gain a better
understanding of Table. Essentially, with the Blink planner we merged a
lot of code and features but lack proper user and design documents.

2. Add a dedicated "Hive Integration" chapter. We have spent a lot of
effort on integrating Hive, and Hive integration happens in different
areas, such as catalogs, functions, and maybe DDL in the future. I think a
dedicated chapter would make it easier for users who are interested in
this topic to find the information they need.

3. Add a chapter about how to manage, monitor, or tune Table & SQL jobs,
and possibly add something about how to migrate jobs from an old version
to a new version in the future.

Best,
Kurt


On Mon, Sep 2, 2019 at 4:17 PM vino yang <[hidden email]> wrote:

> Agree with Dawid's suggestion about function.
>
> Having a Functions section to unify the built-in function and UDF would be
> better.

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Jark Wu-2
A big +1 to the idea of restructuring the docs. We have received a lot of
complaints from users about the Table & SQL docs.

In general, I think the new structure is very nice.

Regarding moving "User-defined Extensions" to the corresponding broader
topics, I would prefer to keep the current "User-defined Extensions"
section, because it is a more advanced topic than "Connect to external
systems" and "Built-in Functions", and we can mention the common points
(e.g. the pom dependency) in the overview of the Extensions section.
Besides that, I would like to keep "Built-in Functions" as a top-level
section to give it more exposure, and we may split the page further.

I have some other suggestions:

1) Having subpages under "Built-in Functions". For example:

Built-in Functions
 - Mathematical Functions
 - Bit Functions
 - Date and Time Functions
 - Conditional Functions
 - String Functions
 - Aggregate Functions
 - ...

Currently, all the functions are squeezed into one page, which makes the
page bloated.
Meanwhile, I think it would be great to enrich the built-in functions with
argument explanations and clearer examples, like the MySQL [1] and other
database docs.

2) +1 to the "Architecture & Internals" chapter.
We already have a pull request [2] to add a "Streaming Aggregation
Performance Tuning" page, which covers performance tuning tips for
streaming aggregation and the internals.
Maybe we can put it under the internals chapter or a "Performance Tuning"
chapter.

3) How about restructuring the SQL chapter a bit, like this?

SQL
 - Overview
 - Data Manipulation Statements (all operations available in SQL)
 - Data Definition Statements (DDL syntaxes)
 - Pattern Matching

This renames "Full Reference" to "Data Manipulation Statements", which is
better aligned with "Data Definition Statements".


Regards,
Jark

[1]:
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_adddate
[2]: https://github.com/apache/flink/pull/9525





On Mon, 2 Sep 2019 at 17:29, Kurt Young <[hidden email]> wrote:

> +1 to the general idea and thanks for driving this. I think the new
> structure is
> more clear than the old one, and i have some suggestions:
>
> 1. How about adding a "Architecture & Internals" chapter? This can help
> developers
> or users who want to contribute more to have a better understanding about
> Table.
> Essentially with blink planner, we merged a lots of codes and features but
> lack of
> proper user and design documents.
>
> 2. Add a dedicated "Hive Integration" chapter. We spend lots of effort on
> integrating
> hive, and hive integration is happened in different areas, like catalog,
> function and
> maybe ddl in the future. I think a dedicated chapter can make users who are
> interested
> in this topic easier to find the information they need.
>
> 3. Add a chapter about how to manage, monitor or tune the Table & SQL jobs,
> and
> might adding something like how to migrate old version jobs to new version
> in the future.
>
> Best,
> Kurt

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Stephan Ewen
There are also some other efforts to restructure the docs, which so far
have resulted in more quickstarts and more concepts.

IIRC, the goal is to have a big section on concepts for the whole
system: streaming concepts, time, order, etc.
The API docs would then be more of an API-specific reference guide.

Should the Table API concepts then be a section in the overall concepts?


On Tue, Sep 3, 2019 at 5:11 AM Jark Wu <[hidden email]> wrote:

> big +1 to the idea of restructuring the docs. We got a lot of complaints
> from users about the Table & SQL docs.
>
> In general, I think the new structure is very nice.
>
> Regarding to moving "User-defined Extensions" to corresponding broader
> topics, I would prefer current "User-defined Extensions".
> Because it is a more advanced topic than "Connect to external systems" and
> "Builtin Functions", and we can mention the common points (e.g. pom
> dependency) in the overview of the Extensions section.
> Besides that, I would like to keep Builtin Functions as a top-level to make
> it have more exposure and may further split the page.
>
> I have some other suggestions:
>
> 1) Having subpages under "Built-in Functions". For example:
>
> Built-in Functions
>  - Mathematical Functions
>  - Bit Functions
>  - Date and Time Functions
>  - Conditional Functions
>  - String Functions
>  - Aggregate Functions
>  - ...
>
> Currently, all the functions are squeezed in one page. It make the
> page bloated.
> Meanwhile, I think it would be great to enrich the built-in functions with
> argument explanation and more clear examples like MySQL[1] and other
> DataBase docs.
>
> 2) +1 to the "Architecture & Internals" chapter.
> We already have a pull request[2] to add "Streaming Aggregation Performance
> Tuning" page which talks about the performance tuning tips around streaming
> aggregation and the internals.
> Maybe we can put it under the internal chapter or a "Performance Tuning"
> chapter.
>
> 3) How about restructure SQL chapter a bit like this?
>
> SQL
>  - Overview
>  - Data Manipulation Statements (all operations available in SQL)
>  - Data Definition Statements (DDL syntaxes)
>  - Pattern Matching
>
> It renames "Full Reference" to "Data Manipulation Statements" which is more
> align with "Data Definition Statements".
>
>
> Regards,
> Jark
>
> [1]:
>
> https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_adddate
> [2]: https://github.com/apache/flink/pull/9525

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Kurt Young
>>> Should the table API concepts be a section in the overall concepts then?

I would say yes, though not exactly as Table API concepts but rather as
streaming SQL concepts, plus how streaming and batch are unified from SQL's
perspective. This topic has many connections to the underlying streaming
concepts, such as time, watermarks, etc., but it also involves many
relational concepts, which I think definitely need some introduction.

Best,
Kurt


On Mon, Sep 16, 2019 at 6:15 PM Stephan Ewen <[hidden email]> wrote:

> There are also some other efforts to restructure the docs, which have
> resulted until now in more quickstarts and more concepts.
>
> IIRC there is the goal to have a big section on concepts for the whole
> system: streaming concepts, time, order, etc.
> The API docs would be really more about an API specific reference guide
> then.
>
> Should the table API concepts be a section in the overall concepts then?

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Timo Walther-2
Hi all,

thanks for your feedback.

@Stephan: Our efforts will definitely be synced with the general Flink
documentation improvements mentioned in FLIP-42. I also had an offline
discussion with Konstantin about this. Concepts such as streaming
concepts, time, order, etc. should definitely be discussed in only one
place, which is the general Flink concepts section. However, we still
need a dedicated SQL concepts section for SQL-specific topics such as:
- SQL data types
- the concept of time attributes and how they can be used in queries
- planners and temporal tables

I would not move this out of the SQL subsection.

@others: I will update the FLIP with your comments and notify you once the
FLIP is ready for another review.

Thanks,
Timo



On 16.09.19 22:56, Kurt Young wrote:

>>>> Should the table API concepts be a section in the overall concepts then?
> I would say yes, but not exactly as table API concept, but for streaming SQL
> concept, plus how to unify the streaming and batch from SQL's perspective.
> This topic has lots of connection with underlying streaming concepts, also
> time,
> watermark, etc. But also have lots of relational concepts, which I think
> definitely
> needs some introduction.
>
> Best,
> Kurt

Re: [DISCUSS] FLIP-60: Restructure the Table API & SQL documentation

Konstantin Knauf-3
Hi all,

sorry for joining late. Looking at the target structure of our
documentation as discussed in FLIP-42 [1], we will have some sections that
cover both APIs (SQL/Table and DataStream). These include:

* Getting Started
* Concepts
* Deployments
* Operations
* Connectors
* Libraries (CEP, ML)

As this will be a long-term effort, it should not at all block
restructuring the Table API documentation in general. Nevertheless, I
suggest considering moving some content from "Setup & Execution" as well
as "Overview" to the top-level "Getting Started" section. For the top-level
"Getting Started" section, there is already ongoing work towards an
Interactive SQL Playground based on Docker, and it already contains a Table
API Walkthrough. A discussion of "DataStream API vs Table API / SQL
Ecosystem" is very valuable, but it should not be "hidden" in the Table API
documentation. Maybe move it to Concepts or Getting Started instead, with a
link from the landing page of the documentation? I would leave "Connect to
External Systems" in the Table API documentation for now (similar to
the DataStream API) and maybe move it once we create the top-level
"Connectors" section.

Cheers and thanks,

Konstantin

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-42%3A+Rework+Flink+Documentation

On Tue, Sep 17, 2019 at 1:36 AM Timo Walther <[hidden email]> wrote:

> Hi all,
>
> thanks for your feedback.
>
> @Stephan: Our efforts will definitely be synced with the general Flink
> documentation improvements mentioned in FLIP-42. I also had an offline
> discussion with Konstantin about this. Concepts such as streaming
> concepts, time, order, etc. should definitely be discussed only in one
> place which is the general Flink concepts section. However, we still
> need a dedicated SQL concepts section for the specific stuff such as:
> - SQL data types
> - the concept of time attributes and how they can be used in queries
> - planners and temporal tables
>
> I would not move this out of the SQL subsection.
>
> @others: I will update the FLIP with your comments and notify once the
> FLIP is ready for another review.
>
> Thanks,
> Timo

--

Konstantin Knauf | Solutions Architect