Expression DataSets


Aljoscha Krettek-2
Hi,
I recently did some work on adding support for SQL-like queries on top
of DataSets. (This is tracked as "named datasets" in the JIRA issue:
https://issues.apache.org/jira/browse/FLINK-947?jql=project%20%3D%20FLINK%20AND%20assignee%20%3D%20currentUser()%20AND%20resolution%20%3D%20Unresolved).

I have support for filter, join, grouping and aggregation. I think the
foundation is solid now, but we can still add support for more data
types and more operations in the select expressions.

Please have a look at my branch if you're interested:
https://github.com/aljoscha/flink/tree/linq
You can look at the new Expression ITCases to see which features are
currently available and how the interface is used. There are also two
complete programs: PageRankExpression and TPCHQuery3Expression.

And now at last, a sneak peek at how the new interface is used:

in.group('key).select('key, ('a + 10).avg + " the average", 'a.count)

The 'foo notation is a Scala symbol; I use symbols in the DSL to
reference named fields.
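
As a rough illustration of the idea (a minimal sketch, not the code from
the branch): an implicit conversion can turn a Scala symbol into a node of
an expression tree, so that calls such as .avg or .count build a
description of the query instead of computing anything. Everything named
below (SymbolDslSketch, Expr, Field, Plus, Avg, Count, plus, render) is
hypothetical and exists only for this example. The sketch writes
Symbol("a"), which is the same value as the Scala 2 literal 'a, and uses a
named plus method rather than the + operator shown above.

```scala
import scala.language.implicitConversions

object SymbolDslSketch {
  // Hypothetical expression tree; these are NOT the classes from the linq branch.
  sealed trait Expr {
    def plus(n: Int): Expr = Plus(this, n) // stands in for the DSL's `+`
    def avg: Expr = Avg(this)
    def count: Expr = Count(this)
  }
  case class Field(name: String) extends Expr
  case class Plus(child: Expr, n: Int) extends Expr
  case class Avg(child: Expr) extends Expr
  case class Count(child: Expr) extends Expr

  // Lets a Symbol be used wherever a field reference is expected.
  implicit def symbolToField(s: Symbol): Field = Field(s.name)

  // Turns a tree back into text so its structure is visible.
  def render(e: Expr): String = e match {
    case Field(name)    => name
    case Plus(child, n) => s"${render(child)} + $n"
    case Avg(child)     => s"avg(${render(child)})"
    case Count(child)   => s"count(${render(child)})"
  }

  def main(args: Array[String]): Unit = {
    // Symbol("a") is converted to Field("a") before plus/avg/count are called.
    val average: Expr = Symbol("a").plus(10).avg
    val counted: Expr = Symbol("a").count
    println(render(average)) // prints: avg(a + 10)
    println(render(counted)) // prints: count(a)
  }
}
```

Because the result is a tree rather than a value, a system built this way
could inspect or rewrite the expression before running it.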

Cheers,
Aljoscha

Re: Expression DataSets

Stephan Ewen
Very exciting!

This looks amazing. It almost looks like half a SQL interface ;-)

On Fri, Jan 16, 2015 at 11:04 AM, Aljoscha Krettek <[hidden email]>
wrote:


Re: Expression DataSets

Fabian Hueske-2
This is great!
Will this be exclusive to the Scala API, or are we adding this (or similar)
functionality to the Java API as well?

2015-01-16 17:30 GMT+01:00 Stephan Ewen <[hidden email]>:


Re: Expression DataSets

Till Rohrmann
I agree that this looks awesome. I'm looking forward to writing new jobs
with it.

On Mon, Jan 19, 2015 at 10:33 AM, Fabian Hueske <[hidden email]> wrote:


Re: Expression DataSets

DEVAN M.S.
Great!


Devan M.S. | Research Associate | Cyber Security | AMRITA VISHWA
VIDYAPEETHAM | Amritapuri | Cell +919946535290 |


On Mon, Jan 19, 2015 at 3:25 PM, Till Rohrmann <[hidden email]> wrote:


Re: Expression DataSets

Kostas Tzoumas-2
In reply to this post by Fabian Hueske-2
I think the plan is to add this for both Scala and Java (starting with
Scala).

On Mon, Jan 19, 2015 at 1:33 AM, Fabian Hueske <[hidden email]> wrote:


Re: Expression DataSets

Aljoscha Krettek-2
Yes, Java support is working in my head.

P.S. Sorry for my slow response times. I'm on my winter holidays. 😀
On Jan 19, 2015 5:37 PM, "Kostas Tzoumas" <[hidden email]> wrote:


Re: Expression DataSets

Fabian Hueske
Excellent!  :-)
Have a great time and enjoy the snow!

2015-01-19 19:17 GMT+01:00 Aljoscha Krettek <[hidden email]>:
