Hey everyone,

We needed to assign unique labels as vertex values in Gelly at some point. We got a nice suggestion on how to do that in parallel (implemented in https://github.com/apache/flink/pull/801#issuecomment-110654447).

Now the question is: where should these two functions go? Should they be part of the API? Something like:

    class DataSet<T> {
        public DataSet<Tuple2<Long, T>> zipWithID() {}
    }

Or should they go in flink-contrib? Fabian, Robert and Till seem to be in favour of the second option.

Thanks!
Andra

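For readers who do not want to open the pull request: the thread itself does not describe the scheme, so what follows is only a sketch of one common way to label elements in parallel without any coordination between workers. It uses plain Java lists in place of Flink partitions. The names ZipWithIdSketch, Labeled, labelPartition and zipWithId are illustrative assumptions, not Flink APIs, and the implementation in the PR may well differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java sketch (no Flink dependency) of coordination-free unique IDs. */
final class ZipWithIdSketch {

    /** An element paired with its assigned ID; stands in for Tuple2<Long, T>. */
    static final class Labeled<T> {
        final long id;
        final T value;

        Labeled(long id, T value) {
            this.id = id;
            this.value = value;
        }
    }

    /**
     * Labels the elements of one partition. The k-th element (k = 0, 1, ...)
     * of partition p gets id = k * parallelism + p, so IDs from different
     * partitions cannot collide and no partition needs to know another's size.
     */
    static <T> List<Labeled<T>> labelPartition(List<T> partition, int partitionIndex, int parallelism) {
        List<Labeled<T>> out = new ArrayList<>(partition.size());
        long k = 0;
        for (T value : partition) {
            out.add(new Labeled<>(k * parallelism + partitionIndex, value));
            k++;
        }
        return out;
    }

    /** Labels every partition independently; each call could run on its own worker. */
    static <T> List<Labeled<T>> zipWithId(List<List<T>> partitions) {
        int parallelism = partitions.size();
        List<Labeled<T>> all = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            all.addAll(labelPartition(partitions.get(p), p, parallelism));
        }
        return all;
    }

    public static void main(String[] args) {
        // Three partitions of different sizes (assumed example data).
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b", "c"),
                Arrays.asList("d"),
                Arrays.asList("e", "f"));
        for (Labeled<String> labeled : zipWithId(partitions)) {
            System.out.println(labeled.id + " -> " + labeled.value);
        }
    }
}
```

IDs produced this way are unique but not consecutive when partitions differ in size. Consecutive indices would need an extra pass that counts the elements of each partition first.
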
As Andra said, I would not add it to the API at this point. However, I don't think it should go into a separate Maven module (flink-contrib) that needs to be added as a dependency, but rather into a DataSetUtils class in flink-java.

We can easily add it to the API later, if necessary. We should, however, extend the documentation so that users are aware of DataSetUtils.

Cheers, Fabian

2015-06-10 10:45 GMT+02:00 Andra Lungu <[hidden email]>:
> [...]

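To make the difference at the call site concrete, here is a minimal sketch of the utility-class option. SketchDataSet and SketchDataSetUtils are hypothetical stand-ins written for this illustration, not Flink classes, and the IDs are assigned sequentially only to keep the example short.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a data set type; it deliberately has no zipWithID() method. */
final class SketchDataSet<T> {
    private final List<T> elements;

    SketchDataSet(List<T> elements) {
        this.elements = new ArrayList<>(elements);
    }

    /** Returns a copy of the elements so callers cannot mutate the data set. */
    List<T> collect() {
        return new ArrayList<>(elements);
    }
}

/** Utility class: the operation lives here as a static method, so the core type stays unchanged. */
final class SketchDataSetUtils {
    private SketchDataSetUtils() {
        // no instances; static helpers only
    }

    /** Pairs each element with an ID (sequential here; a parallel scheme could be swapped in). */
    static <T> SketchDataSet<Map.Entry<Long, T>> zipWithID(SketchDataSet<T> input) {
        List<Map.Entry<Long, T>> out = new ArrayList<>();
        long next = 0;
        for (T value : input.collect()) {
            out.add(new SimpleEntry<Long, T>(next++, value));
        }
        return new SketchDataSet<>(out);
    }
}
```

With this layout a caller writes SketchDataSetUtils.zipWithID(data) rather than data.zipWithID(). Promoting the helper to a method on the core type later is then an additive change that keeps existing callers working.
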
+1 for Fabian, but I would very much like to see this as part of the API in the future.

This function would be very useful for FlinkML as well, as we noted in a recent discussion on the mailing list regarding time series datasets.

On Wed, Jun 10, 2015 at 10:56 AM, Fabian Hueske <[hidden email]> wrote:
> [...]

I agree with Theo. I think it’s a nice feature to have as part of the standard API because only few users will be aware of something like DataSetUtils. However, as a first version we can make it part of DataSetUtils.

Cheers,
Till

On Wed, Jun 10, 2015 at 11:52 AM Theodore Vasiloudis <[hidden email]> wrote:
> [...]

Thanks for the replies!

I will add the two methods in a separate DataSetUtils class. Where would you put the documentation for this? I think users should be able to easily access it. This means that, IMO, it shouldn't go in a separate zip page, but rather in the programming guide. Or there could be a link in the DataSet Transformations page pointing to this...

What do you think?

On Wed, Jun 10, 2015 at 12:33 PM, Till Rohrmann <[hidden email]> wrote:
> [...]

Linking from the DataSet Transformations page would be good, IMO.

2015-06-12 17:11 GMT+02:00 Andra Lungu <[hidden email]>:
> [...]

+1 for linking from DataSet's transformations.

On Jun 12, 2015 5:27 PM, "Fabian Hueske" <[hidden email]> wrote:
> [...]