(DEPRECATED) Apache Flink Mailing List archive.

[jira] [Commented] (FLINK-668) API Proposal - NamedDataSets

Shang Yuanchun (Jira)

Jun 18, 2014; 4:44pm

[ https://issues.apache.org/jira/browse/FLINK-668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14035919#comment-14035919 ]

Markus Holzemer commented on FLINK-668:
---------------------------------------

The discussion on this topic continues in a newer issue (FLINK-947).

> API Proposal - NamedDataSets
> ----------------------------
>
> Key: FLINK-668
> URL: https://issues.apache.org/jira/browse/FLINK-668
> Project: Flink
> Issue Type: Improvement
> Reporter: GitHub Import
> Labels: github-import
> Fix For: pre-apache
>
>
> @StephanEwen, @aljoscha and I were discussing a further stage, or alternative version, of the new Java API that we called NamedDataSets. Instead of dealing with specific types that are checked at compile time, users should be able to operate on fields simply by name. The types would be checked not at compile time but at pre-flight time. That would give a feel more similar to SQL.
> Currently users often have to remember which position in the tuple a specific field has, which gets tedious in larger queries. Using names instead might make this more manageable.
> I have created a first proposal for the syntax that we can use as a basis for discussion:
> ```
> NamedDataSet nds = get3TupleDataSet(env).named("ID", "Number", "Comment");
>
> NamedDataSet join = get3TupleDataSet(env).named("ID", "Number", "Comment");
>
> NamedDataSet join_result = nds.join(join).where("ID").equalTo("ID");
>
> NamedDataSet group_result = nds.groupBy("ID");
> // to apply a udf
> NamedDataSet reduceDs = nds.get("ID", "Number", "Comment").types(Integer.class, Long.class, String.class)
>         .groupBy(1).reduce(new Tuple3Reduce("B-)")).named("ID", "Number", "Comment");
>
> reduceDs.get("ID", "Number", "Comment").types(Integer.class, Long.class, String.class).print();
> env.execute();
> ```
> My current development progress can be looked at here:
> https://github.com/markus-h/stratosphere/compare/named_dataset
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/668
> Created by: [markus-h|https://github.com/markus-h]
> Labels: enhancement, java api, user satisfaction,
> Created at: Tue Apr 08 13:31:59 CEST 2014
> State: open
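
A minimal sketch of the pre-flight check described in the quoted proposal: field names are resolved to tuple positions, and join keys are validated, before any job would run. It is an illustration only and is not taken from the linked branch; `NamedSchema`, `field`, `positionOf`, `typeOf` and `checkJoinKeys` are hypothetical names, and no Flink or Stratosphere API is used.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: maps field names to tuple positions and element types. */
class NamedSchema {
    private final Map<String, Integer> positions = new LinkedHashMap<>();
    private final List<Class<?>> types = new ArrayList<>();

    /** Adds a field; its position is the order in which it was added. */
    NamedSchema field(String name, Class<?> type) {
        if (positions.containsKey(name)) {
            throw new IllegalArgumentException("Duplicate field name: " + name);
        }
        positions.put(name, types.size());
        types.add(type);
        return this;
    }

    /** Resolves a name to its tuple position, failing early on unknown names. */
    int positionOf(String name) {
        Integer pos = positions.get(name);
        if (pos == null) {
            throw new IllegalArgumentException(
                "Unknown field '" + name + "', known fields: " + positions.keySet());
        }
        return pos;
    }

    /** Returns the declared type of a named field. */
    Class<?> typeOf(String name) {
        return types.get(positionOf(name));
    }

    /** Pre-flight check for a join: both keys must exist and have the same type. */
    static void checkJoinKeys(NamedSchema left, String leftKey,
                              NamedSchema right, String rightKey) {
        Class<?> l = left.typeOf(leftKey);
        Class<?> r = right.typeOf(rightKey);
        if (!l.equals(r)) {
            throw new IllegalArgumentException(
                "Join key types differ: " + leftKey + " is " + l.getSimpleName()
                + ", " + rightKey + " is " + r.getSimpleName());
        }
    }

    public static void main(String[] args) {
        // Mirrors named("ID", "Number", "Comment") from the proposal.
        NamedSchema schema = new NamedSchema()
            .field("ID", Integer.class)
            .field("Number", Long.class)
            .field("Comment", String.class);

        System.out.println(schema.positionOf("Number")); // prints 1
        checkJoinKeys(schema, "ID", schema, "ID");       // passes: both keys are Integer
    }
}
```

With this approach, a misspelled name such as `positionOf("Id")` throws an `IllegalArgumentException` while the plan is being built, rather than failing inside a running job.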

--
This message was sent by Atlassian JIRA
(v6.2#6252)