Using Avro SpecificRecord serialization instead of slower ReflectDatumWriter/GenericDatumWriter


Roshan Naik-2
I'm noticing that Flink spends a very long time inside collect(..) due to Avro serialization that relies on ReflectDatumWriter and GenericDatumWriter. The object being serialized is an Avro object that extends SpecificRecordBase. It is fairly large (~50 KB) and complex.

I'm looking for a way to use SpecificDatumWriter for the serialization instead of the generic/reflection-based writers to speed it up, but I don't see a way to influence that choice.

Re: Using Avro SpecificRecord serialization instead of slower ReflectDatumWriter/GenericDatumWriter

Till Rohrmann
Hi Roshan,

These kinds of questions should be posted to Flink's user mailing list. I've
cross-posted it now.

If you are using Flink's latest version and your type extends
`SpecificRecord`, then Flink's AvroSerializer should use the
`SpecificDatumWriter`. If this is not the case, then this sounds like a
bug. Could you provide us with a few more details about the Flink
version you are using and the actual job you are executing? Ideally, you
could link to a Git repo that contains an example to reproduce the problem.

Cheers,
Till
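
For reference, here is a minimal sketch of what serializing with
`SpecificDatumWriter` looks like in plain Avro, outside of Flink. It is not
Flink's AvroSerializer code. The class `SpecificWriterSketch`, the helper
`serialize`, and the hand-written `Ping` record are hypothetical names used
only for illustration; `Ping` stands in for a class that the Avro compiler
would normally generate.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecord;
import org.apache.avro.specific.SpecificRecordBase;

public class SpecificWriterSketch {

    /** Hypothetical stand-in for an Avro-generated class with one long field. */
    public static class Ping extends SpecificRecordBase {
        public static final Schema SCHEMA$ = new Schema.Parser().parse(
            "{\"type\":\"record\",\"name\":\"Ping\",\"fields\":"
            + "[{\"name\":\"id\",\"type\":\"long\"}]}");

        private long id;

        public Ping() {}

        public Ping(long id) {
            this.id = id;
        }

        @Override
        public Schema getSchema() {
            return SCHEMA$;
        }

        // Index-based access: no reflection is needed to read the field.
        @Override
        public Object get(int field) {
            if (field == 0) {
                return id;
            }
            throw new IndexOutOfBoundsException("Invalid field index: " + field);
        }

        @Override
        public void put(int field, Object value) {
            if (field == 0) {
                id = (Long) value;
                return;
            }
            throw new IndexOutOfBoundsException("Invalid field index: " + field);
        }
    }

    /** Serializes any specific record to Avro binary using SpecificDatumWriter. */
    public static <T extends SpecificRecord> byte[] serialize(T record) throws IOException {
        SpecificDatumWriter<T> writer = new SpecificDatumWriter<>(record.getSchema());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(record, encoder);
        encoder.flush(); // the binary encoder buffers, so flush before reading the bytes
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = serialize(new Ping(1L));
        System.out.println(bytes.length + " byte(s) written");
    }
}
```

Timing a helper like this against the same record written through
`ReflectDatumWriter` could help confirm whether the writer choice is what
makes collect(..) slow.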

On Fri, Aug 30, 2019 at 5:55 AM Roshan Naik <[hidden email]>
wrote:

> I'm noticing that Flink spends a very long time inside collect(..) due to
> Avro serialization that relies on ReflectDatumWriter and
> GenericDatumWriter. The object being serialized is an Avro object that
> extends SpecificRecordBase. It is fairly large (~50 KB) and complex.
>
> I'm looking for a way to use SpecificDatumWriter for the serialization
> instead of the generic/reflection-based writers to speed it up, but I
> don't see a way to influence that choice.