[jira] [Commented] (FLINK-947) Add support for "Named Datasets"

Shang Yuanchun (Jira)

    [ https://issues.apache.org/jira/browse/FLINK-947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035692#comment-14035692 ]

Aljoscha Krettek commented on FLINK-947:
----------------------------------------

Yes, they would. Doing it "properly" would require changing how we execute things: having operators that work on raw binary data, and maybe adding support for schemas.

Projections are not strictly necessary, but having them explicit makes the system simpler. Also, right now a check is performed for each mapper to verify that the POJO types the user uses have fields for all the named fields in the dataset.
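
To illustrate the kind of check described above, here is a minimal sketch using plain Java reflection. It is not the actual implementation: the class name NamedFieldCheck, the method missingFields, and the example POJO are hypothetical, and it assumes that a named field matches a declared Java field with the same name (in the class or any superclass).

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of Flink's API.
class NamedFieldCheck {

    // Example POJO standing in for a user type passed to a mapper.
    static class UserTypeIn {
        public int a;
        public String b;
        public double c;
    }

    // Returns the entries of fieldNames for which pojoType
    // (or one of its superclasses) declares no field of that name.
    static List<String> missingFields(Class<?> pojoType, List<String> fieldNames) {
        List<String> missing = new ArrayList<>();
        for (String name : fieldNames) {
            if (!hasField(pojoType, name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    // Walks the class hierarchy looking for a declared field with the given name.
    private static boolean hasField(Class<?> type, String name) {
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            try {
                Field f = c.getDeclaredField(name);
                return f != null;
            } catch (NoSuchFieldException e) {
                // Not declared here; continue with the superclass.
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // All named fields exist on the POJO: prints []
        System.out.println(missingFields(UserTypeIn.class, Arrays.asList("a", "b", "c")));
        // "d" is not a field of the POJO: prints [d]
        System.out.println(missingFields(UserTypeIn.class, Arrays.asList("a", "d")));
    }
}
```

A non-empty result would be the point at which such a check rejects the mapper, before any data is processed.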

> Add support for "Named Datasets"
> --------------------------------
>
>                 Key: FLINK-947
>                 URL: https://issues.apache.org/jira/browse/FLINK-947
>             Project: Flink
>          Issue Type: New Feature
>          Components: Java API
>            Reporter: Aljoscha Krettek
>            Assignee: Aljoscha Krettek
>            Priority: Minor
>
> This would create an API that is a mix between SQL-like declarativity and the power of user-defined functions. Example user code could look like this:
> {code:Java}
> NamedDataSet one = ...
> NamedDataSet two = ...
> NamedDataSet result = one.join(two).where("key").equalTo("otherKey")
>   .project("a", "b", "c")
>   .map( (UserTypeIn in) -> new UserTypeOut(...) )
>   .print();
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)