Re: [stratosphere-dev] Grouping by a tuple

Posted by Vyacheslav Zholudev on Jun 12, 2014; 7:46am
URL: http://deprecated-apache-flink-mailing-list-archive.368.s1.nabble.com/Fwd-stratosphere-dev-Grouping-by-a-tuple-tp40p61.html

Hi Robert,

thanks, I will post my future questions to that list.

Regarding your question: When using the Tuples, you don't need to specify a keySelector. It is sufficient to specify the ID(s) of the keys: http://stratosphere-javadocs.github.io/eu/stratosphere/api/java/DataSet.html#groupBy(int...)
So you should be able to do a ".groupBy(0,3,4)"


Actually, my question is about the case where I don't have tuples. Assume I have a DataSet<UserData> ds and I want to invoke ds.groupBy(/* grouping by <userId, sessionId, dayOfTheYear> */). The ideal choice would be to return a comparable tuple from the KeySelector.
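
To make the idea concrete, here is a minimal sketch in plain Java of what such a composite, comparable key could look like. It does not use the Stratosphere API at all: UserData's fields, the SessionDayKey class, and the keyOf and groupByKey helpers are hypothetical names chosen for illustration, and a TreeMap stands in for the grouping step.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class CompositeKeyDemo {

    // Hypothetical stand-in for the UserData type; the field names are assumptions.
    static final class UserData {
        final long userId;
        final String sessionId;
        final int dayOfYear;
        final String payload;

        UserData(long userId, String sessionId, int dayOfYear, String payload) {
            this.userId = userId;
            this.sessionId = sessionId;
            this.dayOfYear = dayOfYear;
            this.payload = payload;
        }
    }

    // Hypothetical composite key over <userId, sessionId, dayOfYear>.
    static final class SessionDayKey implements Comparable<SessionDayKey> {
        final long userId;
        final String sessionId;
        final int dayOfYear;

        SessionDayKey(long userId, String sessionId, int dayOfYear) {
            this.userId = userId;
            this.sessionId = sessionId;
            this.dayOfYear = dayOfYear;
        }

        // Compare field by field, in the order the fields are listed.
        @Override
        public int compareTo(SessionDayKey other) {
            int c = Long.compare(userId, other.userId);
            if (c != 0) return c;
            c = sessionId.compareTo(other.sessionId);
            if (c != 0) return c;
            return Integer.compare(dayOfYear, other.dayOfYear);
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof SessionDayKey)) return false;
            SessionDayKey k = (SessionDayKey) o;
            return userId == k.userId && dayOfYear == k.dayOfYear
                    && sessionId.equals(k.sessionId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(userId, sessionId, dayOfYear);
        }

        @Override
        public String toString() {
            return "(" + userId + ", " + sessionId + ", " + dayOfYear + ")";
        }
    }

    // Plays the role a key selector would play: extract the key from a record.
    static SessionDayKey keyOf(UserData u) {
        return new SessionDayKey(u.userId, u.sessionId, u.dayOfYear);
    }

    // Groups records by the composite key; TreeMap relies on compareTo.
    static Map<SessionDayKey, List<UserData>> groupByKey(List<UserData> data) {
        Map<SessionDayKey, List<UserData>> groups = new TreeMap<>();
        for (UserData u : data) {
            groups.computeIfAbsent(keyOf(u), k -> new ArrayList<>()).add(u);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<UserData> data = Arrays.asList(
                new UserData(1L, "s1", 163, "a"),
                new UserData(1L, "s1", 163, "b"),
                new UserData(2L, "s9", 163, "c"));
        groupByKey(data).forEach((k, v) -> System.out.println(k + " -> " + v.size()));
    }
}
```

Whether a key type like this can be returned from the KeySelector depends on which key types the framework accepts, which is exactly the open question here.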
On a side note, would it be possible to generate a clone method for the tuples? Yesterday I was copying a Tuple13 by hand in a groupReduce function, and it was a pretty long line of code :)
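
For illustration, this is roughly the behavior I have in mind, sketched on a hypothetical array-backed tuple. ArrayTuple and its copy() method are made up for this example and are not the Stratosphere Tuple classes; the copy is shallow, so the field objects themselves are shared.

```java
public class TupleCopyDemo {

    // Hypothetical array-backed tuple, used only to show the desired copy semantics.
    static final class ArrayTuple {
        private final Object[] fields;

        ArrayTuple(Object... fields) {
            // Defensive copy so the tuple owns its own field array.
            this.fields = fields.clone();
        }

        Object getField(int pos) {
            return fields[pos];
        }

        void setField(int pos, Object value) {
            fields[pos] = value;
        }

        int arity() {
            return fields.length;
        }

        // Shallow copy: a new tuple with the same field references.
        ArrayTuple copy() {
            return new ArrayTuple(fields);
        }
    }

    public static void main(String[] args) {
        ArrayTuple original = new ArrayTuple(1L, "s1", 163);
        ArrayTuple copied = original.copy();
        copied.setField(1, "s2"); // changes the copy only
        System.out.println(original.getField(1) + " / " + copied.getField(1));
    }
}
```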

Thanks,
Vyacheslav