Apache Flink 0.9 ALS API

Apache Flink 0.9 ALS API

Ronny Bräunlich
Hello everybody,

For a university project we are using the current ALS implementation in Flink 0.9, and we were wondering about the API: predict() and fit() require a DataSet[(Int, Int)] and a DataSet[(Int, Int, Double)] respectively, and the range of Int is quite limited.
That is why we wanted to ask you if it wouldn’t be advantageous to change Int to Long, to allow more values.
Please let me know what you think about it.

Cheers,
Ronny
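
To make the range concern concrete: Scala's Int is a 32-bit signed value, so IDs above 2,147,483,647 cannot be represented, and narrowing a larger Long with toInt silently wraps around. The following is a minimal sketch in plain Scala with no Flink dependency; the object name, the helper toIntId and the example ID are made up for illustration and are not part of the ALS API.

```scala
object IntRangeDemo {
  // Largest ID that fits into a 32-bit signed Int.
  val maxIntId: Int = Int.MaxValue

  // A hypothetical 64-bit ID just beyond the Int range.
  val bigId: Long = Int.MaxValue.toLong + 1L

  // Narrowing with toInt keeps only the low 32 bits, so the value wraps around.
  val wrapped: Int = bigId.toInt

  // Safe narrowing: returns None when the ID does not fit into an Int.
  def toIntId(id: Long): Option[Int] =
    if (id >= Int.MinValue && id <= Int.MaxValue) Some(id.toInt) else None

  def main(args: Array[String]): Unit = {
    println(maxIntId)
    println(wrapped)
    println(toIntId(bigId))
  }
}
```

With Int-typed IDs, a data set whose user or item IDs exceed this range would first have to be remapped, which is the extra step that Long IDs would avoid.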

Re: Apache Flink 0.9 ALS API

Felix Neutatz
Hi Ronny,

I agree with you, and I would go even further and generalize it overall, so
that the movieID could be of type Long or Int and the userID of type String.

This would increase usability of the ALS implementation :)

Best regards,
Felix

2015-06-10 11:28 GMT+02:00 Ronny Bräunlich <[hidden email]>:

> Hello everybody,
>
> for a university project we use the current implementation of ALS in Flink
> 0.9 and we were wondering about the API of predict() and fit() requiring a
> DataSet[(Int, Int)] or DataSet[(Int, Int, Double)] respectively, because
> the range of Int is quite limited.
> That is why we wanted to ask you if it wouldn’t be advantageous to change
> Int to Long, to allow more values.
> Please let me know what you think about it.
>
> Cheers,
> Ronny

Re: Apache Flink 0.9 ALS API

Chiwan Park
+1 for generalisation.

@Ronny: Could you create a JIRA issue related to this?

Regards,
Chiwan Park

> On Jun 13, 2015, at 9:07 PM, Felix Neutatz <[hidden email]> wrote:
>
> Hi Ronny,
>
> I agree with you, and I would go even further and generalize it overall, so
> that the movieID could be of type Long or Int and the userID of type String.
>
> This would increase usability of the ALS implementation :)
>
> Best regards,
> Felix

Re: Apache Flink 0.9 ALS API

Till Rohrmann
+1 for longs as IDs.

Not so much in favour of Strings for the user ID because the row index
could also denote the actual item ID if you swap the indices. Furthermore,
you can always add a transformer which assigns unique IDs to names.

Cheers,
Till
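
The transformer idea above can be sketched roughly as follows. This is only an in-memory illustration on plain Scala collections, not Flink code: NameIndexer, buildIndex and encode are hypothetical names, and the sample ratings are invented. On a distributed DataSet the index would have to be built with a distributed operation instead.

```scala
object NameIndexer {
  // Assigns each distinct name a unique Long ID, in order of first appearance.
  def buildIndex(names: Seq[String]): Map[String, Long] =
    names.distinct.zipWithIndex.map { case (name, idx) => name -> idx.toLong }.toMap

  // Rewrites (userName, itemId, rating) triples to (userId, itemId, rating).
  def encode(ratings: Seq[(String, Long, Double)],
             index: Map[String, Long]): Seq[(Long, Long, Double)] =
    ratings.map { case (user, item, rating) => (index(user), item, rating) }

  def main(args: Array[String]): Unit = {
    val ratings = Seq(("alice", 10L, 4.0), ("bob", 10L, 3.5), ("alice", 11L, 5.0))
    val index = buildIndex(ratings.map(_._1))
    encode(ratings, index).foreach(println)
  }
}
```

Keeping the name-to-ID mapping outside the learner means the same step works regardless of whether the names end up as row or column indices.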

On Sat, Jun 13, 2015 at 3:34 PM Chiwan Park <[hidden email]> wrote:

> +1 for generalisation.
>
> @Ronny: Could you create a JIRA issue related to this?
>
> Regards,
> Chiwan Park