Sorting of fields

Timo Walther-2
Hey,

is it correct that we currently do not support sorting without any
grouping? Two users have asked me this in the last few weeks, and now I
need this functionality as well.


Is it possible to sort e.g. a word count result Tuple2<String, Integer>
by count?


Regards,
Timo
Re: Sorting of fields

Timo Walther-2
Ok, I found an earlier discussion about it. Sorry for the mail.

However, I think this is a very important feature and it should be added
soon.

On 04.02.2015 14:38, Timo Walther wrote:

> Hey,
>
> is it correct that we currently do not support sorting without any
> grouping? Two users have asked me this in the last few weeks, and now I
> need this functionality as well.
>
>
> Is it possible to sort e.g. a word count result Tuple2<String,
> Integer> by count?
>
>
> Regards,
> Timo
>

Re: Sorting of fields

Fabian Hueske-2
I just merged support for local output sorting yesterday :-)
This allows sorting the data before it is handed to the OutputFormat.

It is done like this:
myData.write(myOF).sortLocalOutput(1, Order.ASCENDING);

See the programming guide for details (only in master, not online).

A full sort can be done with a data sink that runs with DOP=1 (degree of
parallelism 1). Full parallel sorting is not supported yet (it requires range
partitioning and data statistics to choose good partition boundaries).
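
To make the difference between a local sort and a full sort concrete, here is a minimal plain-Java sketch. It does not use the Flink API at all: partitions are modeled as in-memory lists, and the class and method names (LocalSortSketch, wc, sortEachPartition, sortWithSinglePartition) are hypothetical and exist only for this illustration.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration (no Flink) of local vs. full sorting of (word, count) pairs. */
class LocalSortSketch {

    /** Hypothetical helper: builds one (word, count) pair. */
    static Map.Entry<String, Integer> wc(String word, int count) {
        return new SimpleImmutableEntry<>(word, count);
    }

    /** Sorts every partition on its own by count, the way a per-task local sort would. */
    static List<List<Map.Entry<String, Integer>>> sortEachPartition(
            List<List<Map.Entry<String, Integer>>> partitions) {
        List<List<Map.Entry<String, Integer>>> result = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> partition : partitions) {
            List<Map.Entry<String, Integer>> sorted = new ArrayList<>(partition);
            sorted.sort(Map.Entry.comparingByValue()); // ascending by count
            result.add(sorted);
        }
        return result;
    }

    /** Full sort: collect everything into one partition (parallelism 1), then sort it. */
    static List<Map.Entry<String, Integer>> sortWithSinglePartition(
            List<List<Map.Entry<String, Integer>>> partitions) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> partition : partitions) {
            all.addAll(partition);
        }
        all.sort(Map.Entry.comparingByValue()); // ascending by count
        return all;
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(
                Arrays.asList(wc("flink", 7), wc("sort", 2)),
                Arrays.asList(wc("data", 5), wc("sink", 1)));

        // Each partition is ordered, but there is no order across partitions.
        System.out.println(sortEachPartition(partitions));
        // One partition, one total order.
        System.out.println(sortWithSinglePartition(partitions));
    }
}
```

With two partitions, the local sort yields two individually ordered lists whose concatenation is not ordered, which is why a total order needs either a single sink task or range partitioning.
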

Best, Fabian

Re: Sorting of fields

Stephan Ewen
Based on this, we should also be able to implement a global top-k, which
has come up as a frequent requirement.
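
One way such a global top-k could be built on top of local sorting, sketched in plain Java rather than against the Flink API: every partition sorts locally and keeps its k largest counts, and a single final step merges those candidates. The class and method names (TopKSketch, wc, localTopK, globalTopK) are hypothetical, and this is only an assumption about how it might work, not a description of an existing implementation.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Plain-Java illustration (no Flink) of a global top-k built from per-partition top-k. */
class TopKSketch {

    /** Hypothetical helper: builds one (word, count) pair. */
    static Map.Entry<String, Integer> wc(String word, int count) {
        return new SimpleImmutableEntry<>(word, count);
    }

    /** Sorts one partition by count, descending, and keeps at most k entries. */
    static List<Map.Entry<String, Integer>> localTopK(
            List<Map.Entry<String, Integer>> partition, int k) {
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(partition);
        sorted.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    /** Merges the local winners in a single step; at most (partitions * k) candidates. */
    static List<Map.Entry<String, Integer>> globalTopK(
            List<List<Map.Entry<String, Integer>>> partitions, int k) {
        List<Map.Entry<String, Integer>> candidates = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> partition : partitions) {
            candidates.addAll(localTopK(partition, k));
        }
        return localTopK(candidates, k);
    }

    public static void main(String[] args) {
        List<List<Map.Entry<String, Integer>>> partitions = Arrays.asList(
                Arrays.asList(wc("flink", 7), wc("sort", 2), wc("job", 4)),
                Arrays.asList(wc("data", 5), wc("sink", 1), wc("task", 6)));
        System.out.println(globalTopK(partitions, 2));
    }
}
```

The final merge only ever sees k entries per partition, so it stays small even when the input is large, which is what makes the single-task last step acceptable here.
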

On Wed, Feb 4, 2015 at 2:55 PM, Fabian Hueske <[hidden email]> wrote:

> I just merged support for local output sorting yesterday :-)
> This allows sorting the data before it is handed to the OutputFormat.
>
> It is done like this:
> myData.write(myOF).sortLocalOutput(1, Order.ASCENDING);
>
> See the programming guide for details (only in master, not online).
>
> A full sort can be done with a data sink that runs with DOP=1 (degree of
> parallelism 1). Full parallel sorting is not supported yet (it requires range
> partitioning and data statistics to choose good partition boundaries).
>
> Best, Fabian