[DISCUSS] Add column operations in Table API

jincheng sun
Hi all,

The Table API already offers many table-level operations, such as select,
window, and join, and most functionality can be expressed with them. Things
become difficult, however, when a table has many columns to operate on.

The difficulties fall into two categories:

1. Column modification - Users have to list every column even when only a
few change, e.g. when adding or renaming a column.
2. Column projection - There is no flexible way to express which columns to
select, e.g. when a table has 100 columns but the user only wants columns
1~10 and 20~40.

So, I propose adding the following features to the Table API:

  1. Add the following operators
    - Add/Replace columns
    - Drop columns
    - Rename columns

  2. Add column selection utils
    - columns(...) - select the specified columns
    - -columns(...) - deselect the specified columns
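
As a rough illustration of the intended semantics (not the proposed Flink API itself), the operations listed above can be modeled on a plain Python dict that maps column names to values. The function names, the tuple-based range syntax, and the 1-based inclusive indexing are all assumptions made for this sketch; the Google doc defines the actual design.

```python
from typing import Dict, List, Tuple, Union

Table = Dict[str, list]                 # column name -> values, in column order
Selector = Union[str, Tuple[int, int]]  # a name, or a 1-based inclusive range


def columns(table: Table, *selectors: Selector) -> List[str]:
    """Expand column names and index ranges into a list of column names."""
    names = list(table)
    picked: List[str] = []
    for sel in selectors:
        if isinstance(sel, tuple):
            start, end = sel
            if not 1 <= start <= end <= len(names):
                raise IndexError(f"range {sel} is outside 1..{len(names)}")
            expanded = names[start - 1:end]
        else:
            if sel not in table:
                raise KeyError(sel)
            expanded = [sel]
        for name in expanded:
            if name not in picked:  # ignore columns selected twice
                picked.append(name)
    return picked


def minus_columns(table: Table, *selectors: Selector) -> List[str]:
    """Hypothetical stand-in for -columns(...): everything except the selection."""
    excluded = set(columns(table, *selectors))
    return [name for name in table if name not in excluded]


def select(table: Table, names: List[str]) -> Table:
    """Project the table onto the given column names."""
    return {name: table[name] for name in names}


def add_or_replace_columns(table: Table, **new_columns: list) -> Table:
    """Replace columns that already exist in place; append the others."""
    result = dict(table)
    result.update(new_columns)
    return result


def drop_columns(table: Table, *selectors: Selector) -> Table:
    """Keep every column except the selected ones."""
    return select(table, minus_columns(table, *selectors))


def rename_columns(table: Table, **renames: str) -> Table:
    """Rename columns (old=new) and leave all other columns untouched."""
    return {renames.get(name, name): values for name, values in table.items()}


# A 100-column table, as in the projection example above.
wide = {f"c{i}": [i] for i in range(1, 101)}
projected = select(wide, columns(wide, (1, 10), (20, 40)))
renamed = rename_columns(wide, c1="id")
```

In this model, renaming one column of a 100-column table takes a single argument instead of a 100-item select list, which is the kind of convenience the proposal is after.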

For more details, please check out the Google doc
<https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit?usp=sharing>.
Comments in the Google doc and feedback by email are both welcome!

Regards,
Jincheng

Re: [DISCUSS] Add column operations in Table API

Jark Wu-2
Hi Jincheng,

Thanks for bringing up the detailed design. I think the column operation
feature is a great improvement to Table API.

+1 to the design from my side.

I left only one minor comment in the Google doc.

Looking forward to the feature.

Best,
Jark

On Mon, 11 Mar 2019 at 12:29, jincheng sun <[hidden email]> wrote:

> Hi all,
>
> The Table API already offers many table-level operations, such as select,
> window, and join, and most functionality can be expressed with them. Things
> become difficult, however, when a table has many columns to operate on.
>
> The difficulties fall into two categories:
>
> 1. Column modification - Users have to list every column even when only a
> few change, e.g. when adding or renaming a column.
> 2. Column projection - There is no flexible way to express which columns to
> select, e.g. when a table has 100 columns but the user only wants columns
> 1~10 and 20~40.
>
> So, I propose adding the following features to the Table API:
>
>   1. Add the following operators
>     - Add/Replace columns
>     - Drop columns
>     - Rename columns
>
>   2. Add column selection utils
>     - columns(...) - select the specified columns
>     - -columns(...) - deselect the specified columns
>
> For more details, please check out the Google doc
> <https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit?usp=sharing>.
> Comments in the Google doc and feedback by email are both welcome!
>
> Regards,
> Jincheng

Re: [DISCUSS] Add column operations in Table API

jincheng sun
Hi Jark, thanks for your feedback and for your comments in the Google doc
<https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit?usp=gmail>!

Hi all community contributors, this discussion will remain open until next
Tuesday. If there are no further suggestions by then, I will create a JIRA
issue and start developing the new features.

Best,
Jincheng


On Thu, 14 Mar 2019 at 5:28 PM, Jark Wu <[hidden email]> wrote:

> Hi Jincheng,
>
> Thanks for bringing up the detailed design. I think the column operation
> feature is a great improvement to Table API.
>
> +1 to the design from my side.
>
> I left only one minor comment in the Google doc.
>
> Looking forward to the feature.
>
> Best,
> Jark

Re: [DISCUSS] Add column operations in Table API

Hequn Cheng
Hi Jincheng,

Thanks a lot for starting the discussion.
Better column operations have been requested many times, and the design in
the document gives a complete picture of them. I think it is very good. I
left some minor comments in the document.

Looking forward to the JIRA and the new features.

Best, Hequn


On Fri, Mar 15, 2019 at 2:00 PM jincheng sun <[hidden email]> wrote:

> Hi Jark, thanks for your feedback and for your comments in the Google doc
> <https://docs.google.com/document/d/1tryl6swt1K1pw7yvv5pdvFXSxfrBZ3_OkOObymis2ck/edit?usp=gmail>!
>
> Hi all community contributors, this discussion will remain open until next
> Tuesday. If there are no further suggestions by then, I will create a JIRA
> issue and start developing the new features.
>
> Best,
> Jincheng

Re: [DISCUSS] Add column operations in Table API

jincheng sun
Thanks for your feedback and comments, @Hequn Cheng <[hidden email]>!

I have opened the JIRA issue FLINK-11967
<https://issues.apache.org/jira/browse/FLINK-11967>. Any further suggestions
are welcome in JIRA.

Best,
Jincheng

On Sat, 16 Mar 2019 at 4:03 PM, Hequn Cheng <[hidden email]> wrote:

> Hi Jincheng,
>
> Thanks a lot for starting the discussion.
> Better column operations have been requested many times, and the design in
> the document gives a complete picture of them. I think it is very good. I
> left some minor comments in the document.
>
> Looking forward to the JIRA and the new features.
>
> Best, Hequn