[DISCUSS] Support User-Defined Table Function in PyFlink

Xingbo Huang
Hi all,

Scalar Python UDFs are already supported in the upcoming 1.10 release, and
we’d like to introduce Python UDTFs now. FLIP-58[1] already covers some
aspects of Python UDTFs, but the implementation details have not been
addressed yet. I have drafted a design doc[2] that covers the following
items:

- How to define a Python UDTF.

- The rules introduced for Python UDTFs.

- How to execute a Python UDTF.

Because the implementation relies on Beam's portability framework to
execute Python user-defined table functions, and not all contributors are
familiar with it, I have built a prototype[3].
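
For readers less familiar with table functions, here is a minimal pure-Python sketch of the semantics a UDTF has to provide: each input row yields zero or more output rows, which are then joined back onto that input row. It is not the PyFlink API and not the prototype's code; `split_words`, `lateral_join`, and `pad_width` are hypothetical names used only for illustration.

```python
from typing import Callable, Iterable, Iterator, Tuple


def split_words(line: str) -> Iterator[Tuple[str, int]]:
    """Hypothetical table function: emits one (word, length) row per word."""
    for word in line.split():
        yield word, len(word)


def lateral_join(
    rows: Iterable[tuple],
    table_func: Callable[..., Iterable[tuple]],
    arg_index: int,
    pad_width: int = 0,
) -> Iterator[tuple]:
    """Pair every input row with each row the table function emits for it.

    With pad_width == 0 an input row that produces no output is dropped
    (inner-join behaviour). With pad_width > 0 such a row is kept and
    padded with that many None values (left-outer-join behaviour).
    """
    for row in rows:
        emitted = False
        for out_row in table_func(row[arg_index]):
            emitted = True
            yield row + tuple(out_row)
        if not emitted and pad_width > 0:
            yield row + (None,) * pad_width


if __name__ == "__main__":
    # Assumed sample data: (id, line) rows; the second line has no words.
    rows = [(1, "hello flink"), (2, ""), (3, "udtf")]
    for joined in lateral_join(rows, split_words, arg_index=1):
        print(joined)
```

The point of the sketch is only the one-to-many shape of the function. A scalar UDF returns exactly one value per input row, whereas a table function may return any number of rows, including none.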

Any feedback is welcome.

Best,

Xingbo

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-58%3A+Flink+Python+User-Defined+Stateless+Function+for+Table
[2] https://docs.google.com/document/d/1Pkv5S0geoYQ2ySS5YTTBivJ3hoi-uzLXVQkDVIaR0cE/edit#heading=h.pzeztvig3kg1
[3] https://github.com/HuangXingBo/flink/commits/FLINK-UDTF

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

jincheng sun
Thanks for bringing up this discussion, Xingbo!

The design looks pretty nice to me! This feature, which is mentioned in
FLIP-58, is really needed. So I think it is better to create the JIRA and
open the PR, so that more details can be reviewed there. :)

Best,
Jincheng



On Mon, Feb 3, 2020 at 3:02 PM Xingbo Huang <[hidden email]> wrote:

> [...]

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Hequn Cheng-2
Hi Xingbo,

Thanks a lot for bringing up the discussion. Looks good from my side.
One suggestion beyond the document: it would be nice to avoid Scala code in
the flink-table module, since we would like to get rid of Scala there in
the long term[1][2].

Best, Hequn

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-28%3A+Long-term+goal+of+making+flink-table+Scala-free
[2] https://flink.apache.org/contributing/code-style-and-quality-scala.html


On Tue, Feb 4, 2020 at 9:09 PM jincheng sun <[hidden email]>
wrote:

> [...]

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Xingbo Huang
In reply to this post by jincheng sun
Hi Jincheng,

Thanks for your feedback. We can discuss further details in the JIRA and
PR. :)

Best,
Xingbo

On Tue, Feb 4, 2020 at 9:09 PM jincheng sun <[hidden email]> wrote:

> [...]

Re: [DISCUSS] Support User-Defined Table Function in PyFlink

Xingbo Huang
In reply to this post by Hequn Cheng-2
Hi Hequn,

Thanks for your feedback. Good suggestion. I will avoid Scala code in the
flink-table module.

Best,
Xingbo

On Tue, Feb 4, 2020 at 10:14 PM Hequn Cheng <[hidden email]> wrote:

> [...]