Stringable Hashmap keys on POJOs


Stringable Hashmap keys on POJOs

Paris Carbone
Hello,

It seems that Avro fails to serialise POJOs that contain map keys which are neither String nor Stringable<https://apache.googlesource.com/avro/+/40650540dcb8ca8a6b6235de5cdd36c0f6e2eb31/lang/java/avro/src/main/java/org/apache/avro/reflect/ReflectData.java#361>.
E.g., in the example here<https://github.com/senorcarbone/incubator-flink/blob/72b6798b50396c962fc6cea20a2bcdd51eec06f4/flink-examples/flink-java-examples/src/main/java/org/apache/flink/examples/java/testing/LongMapKeyIssueExample.java> I get a compiler exception caused by:

org.apache.avro.AvroTypeException: Map key class not String: class java.lang.Long

Is there any known workaround or recommendation for this, other than using String keys?
I need this for a low-latency, data-intensive streaming use case, so String conversions should be avoided.
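
For reference, the POJO in the linked example boils down to roughly the following (simplified, class and field names changed):

import java.util.HashMap;

public class SensorReading {
    public long sensorId;
    // Avro's ReflectData only accepts String (or @Stringable) map keys,
    // so this field is what triggers the AvroTypeException above.
    public HashMap<Long, Double> measurements = new HashMap<>();

    public SensorReading() {} // Flink POJOs need a public no-arg constructor
}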

Paris



Re: Stringable Hashmap keys on POJOs

Aljoscha Krettek-2
I don't know of any workaround, but maybe Avro should be avoided
altogether for your requirements.

What is the data that you want to move between operations?


Re: Stringable Hashmap keys on POJOs

Stephan Ewen
We are in the midst of replacing Avro in the serialization, so this should
soon be fixed properly.

Until then, you could try to re-package the collection, something like an
array of map entries. Would that be feasible?
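
Roughly something like this, just as a sketch (class and field names are made up):

public class SensorReading {
    public long sensorId;

    // Replaces HashMap<Long, Double> with a plain array of entry POJOs,
    // which sidesteps Avro's String-only restriction on map keys.
    public Entry[] measurements;

    public SensorReading() {}

    public static class Entry {
        public long key;
        public double value;

        public Entry() {}

        public Entry(long key, double value) {
            this.key = key;
            this.value = value;
        }
    }
}

You would rebuild the HashMap on the consuming side if you need map lookups there.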

Stephan

Re: Stringable Hashmap keys on POJOs

Robert Metzger
As possible workarounds, you could
a) implement your own serialization by implementing the "Value" interface, or
b) use Hadoop's MapWritable class
(http://hadoop.apache.org/docs/r2.3.0/api/org/apache/hadoop/io/MapWritable.html).
You have to use Hadoop's LongWritable and IntWritable for the map's key and
value types, but Flink should be able to handle Writables in POJOs.

I would recommend option b).
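
Untested, but roughly along these lines (class and field names are made up):

import org.apache.hadoop.io.DoubleWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapWritable;

public class SensorReading {
    public long sensorId;

    // MapWritable implements Map<Writable, Writable>, so keys and values
    // are wrapped in Writable types instead of plain Long/Double.
    public MapWritable measurements = new MapWritable();

    public SensorReading() {}

    public void addMeasurement(long key, double value) {
        measurements.put(new LongWritable(key), new DoubleWritable(value));
    }
}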


Re: Stringable Hashmap keys on POJOs

Stephan Ewen
If you use Flink's LongValue, StringValue, DoubleValue, you can also use
Flink's MapValue. If you subclass it (without adding any code), it is very
efficient: the subclass carries the type information, and that is how the
key and value types are handled.
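
For example, something like this (the subclass name is made up; no extra code is needed in it):

import org.apache.flink.types.DoubleValue;
import org.apache.flink.types.LongValue;
import org.apache.flink.types.MapValue;

// The empty subclass pins down the key and value types so Flink can
// read them reflectively; all Map behaviour is inherited from MapValue.
public class LongDoubleMapValue extends MapValue<LongValue, DoubleValue> {
    private static final long serialVersionUID = 1L;
}

Then you use it like a normal Map, e.g. map.put(new LongValue(42L), new DoubleValue(0.5)).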


Re: Stringable Hashmap keys on POJOs

Paris Carbone
Great, thanks both for the recommendations!

Paris


Re: Stringable Hashmap keys on POJOs

Paris Carbone
@Aljoscha It will initially be binary data (byte[]) from sensor streams, transformed into HashMaps of numeric types plus some additional metadata.

Paris


Re: Stringable Hashmap keys on POJOs

Aljoscha Krettek-2
Hmm, then what Stephan suggested might indeed be the best solution.
