grouping and sort grouping with KeySelector

Martin Neumann
Hej,

The data set I'm working on is quite large, so I created a POJO class for it
to make it less messy. I want to do a simple map, group reduce job. But the
groups need to be sorted on a secondary key.

From what I understand I would get that by doing:
dataset.groupBy( key1 ).sortGroup( key2 , Order.ASCENDING )

Is that correct?
And if so, how do I make the same call using a KeySelector? While groupBy can
take a KeySelector, sortGroup can't.

Also can someone explain to me what Order.ANY is about?


cheers Martin

Re: grouping and sort grouping with KeySelector

Fabian Hueske
Hi Martin,

group sorting is currently only possible with field-index keys and not with
KeySelectors.
A workaround would be to use a map to convert your DataSet into
Tuple3<YourPojoType, GroupType, SortType> and do a groupBy(1).sortGroup(2,
Order.ASCENDING).
This is actually also what the KeySelector would do under the hood.
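
To make the shape of that workaround concrete, here is a minimal sketch in plain Java. It is not Flink code and uses no Flink classes: Event, Keyed, extractKeys and groupAndSort are hypothetical names invented for illustration, with Keyed standing in for the Tuple3. It only shows the three steps: copy both keys out of the POJO into positional fields, group on field 1, then sort each group ascending on field 2.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration only; none of these names come from Flink.
class GroupSortSketch {

    // Hypothetical POJO standing in for the real record type.
    static final class Event {
        final String user;     // assumed grouping key
        final long timestamp;  // assumed secondary (sort) key
        final String payload;

        Event(String user, long timestamp, String payload) {
            this.user = user;
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // Stand-in for Tuple3<YourPojoType, GroupType, SortType>.
    static final class Keyed {
        final Event f0;   // field 0: the original POJO
        final String f1;  // field 1: grouping key
        final long f2;    // field 2: sort key

        Keyed(Event f0, String f1, long f2) {
            this.f0 = f0;
            this.f1 = f1;
            this.f2 = f2;
        }
    }

    // Step 1 (the "map"): expose both keys as positional fields.
    static Keyed extractKeys(Event e) {
        return new Keyed(e, e.user, e.timestamp);
    }

    // Steps 2 and 3: group on field 1, then sort each group ascending on field 2.
    static Map<String, List<Event>> groupAndSort(List<Event> input) {
        Map<String, List<Keyed>> groups = new TreeMap<>();
        for (Event e : input) {
            Keyed k = extractKeys(e);
            groups.computeIfAbsent(k.f1, key -> new ArrayList<>()).add(k);
        }

        Map<String, List<Event>> result = new TreeMap<>();
        for (Map.Entry<String, List<Keyed>> entry : groups.entrySet()) {
            List<Keyed> group = entry.getValue();
            group.sort(Comparator.comparingLong((Keyed k) -> k.f2));

            // Unwrap field 0 so the caller sees the original POJOs again.
            List<Event> sorted = new ArrayList<>();
            for (Keyed k : group) {
                sorted.add(k.f0);
            }
            result.put(entry.getKey(), sorted);
        }
        return result;
    }
}
```

In an actual job the same three steps would be the map to a Tuple3 followed by groupBy(1).sortGroup(2, Order.ASCENDING) as described above, with the group reduce function reading the POJO back out of field 0.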

Order.ANY gives the optimizer a bit more freedom to decide whether to sort
ascending or descending. However, I guess this is only relevant in very few
cases.

Best,
Fabian

In reply to Martin Neumann's message of 2014-09-22 16:31 GMT+02:00.