[jira] [Created] (FLINK-17146) Support conversion between PyFlink Table and Pandas DataFrame



Shang Yuanchun (Jira)
Dian Fu created FLINK-17146:
-------------------------------

             Summary: Support conversion between PyFlink Table and Pandas DataFrame
                 Key: FLINK-17146
                 URL: https://issues.apache.org/jira/browse/FLINK-17146
             Project: Flink
          Issue Type: New Feature
          Components: API / Python
            Reporter: Dian Fu
            Assignee: Dian Fu


The Pandas DataFrame is the de facto standard for working with tabular data in the Python community. A PyFlink Table is Flink’s representation of tabular data in Python. It would be useful to support conversion between PyFlink Table and Pandas DataFrame in the PyFlink Table API, which has the following benefits:
 * It lets users switch between PyFlink and Pandas when processing data in Python, running one step on one execution engine and the next step on the other. For example, users who already have a Pandas DataFrame at hand and want to perform an expensive transformation on it could convert it to a PyFlink Table and leverage the power of the Flink engine. Conversely, users could convert a PyFlink Table to a Pandas DataFrame and transform it with the rich functionality provided by the Pandas ecosystem.
 * No intermediate connectors are needed when converting between the two.

More details can be found in [FLIP-120|https://cwiki.apache.org/confluence/display/FLINK/FLIP-120%3A+Support+conversion+between+PyFlink+Table+and+Pandas+DataFrame].
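
The round trip described above could look roughly like the sketch below. It is an illustration only, not the final design: it assumes the method names from_pandas and to_pandas proposed in FLIP-120, a recent PyFlink release with pandas and pyarrow installed, and a hypothetical helper (double_amounts), view name (orders) and column names (id, amount) chosen purely for the example.

```python
# Sketch only. Assumes a recent PyFlink release with pandas and pyarrow
# installed; from_pandas / to_pandas are the method names proposed in FLIP-120.


def double_amounts(t_env, pdf):
    """Round-trip a Pandas DataFrame through Flink, doubling `amount`."""
    # Pandas DataFrame -> PyFlink Table, with no intermediate connector.
    table = t_env.from_pandas(pdf)

    # Run the transformation on the Flink engine.
    # "orders" is a made-up view name used only for this example.
    t_env.create_temporary_view("orders", table)
    result = t_env.sql_query("SELECT id, amount * 2 AS amount FROM orders")

    # PyFlink Table -> Pandas DataFrame, for further work with Pandas.
    return result.to_pandas()


if __name__ == "__main__":
    # Imports live here so the helper above can be read (and stubbed)
    # without a Flink installation.
    import pandas as pd
    from pyflink.table import EnvironmentSettings, TableEnvironment

    t_env = TableEnvironment.create(EnvironmentSettings.in_batch_mode())
    pdf = pd.DataFrame({"id": [1, 2, 3], "amount": [10.0, 20.5, 30.0]})
    print(double_amounts(t_env, pdf))
```

The helper takes the table environment as a parameter so that the same conversion steps work with whichever environment the user has already configured.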



--
This message was sent by Atlassian Jira
(v8.3.4#803005)