Hi,
While building IT tests which extend MultipleProgramsTestBase, I encountered a problem with serialization: I posted a minimal example here: https://gist.github.com/s1ck/566796df5f35ee1de6f9 This runs fine with LocalEnvironment. However, when executing this in CollectionEnvironment, it leads to the following Exception: Exception in thread "main" com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): java.util.UUID Serialization trace: uuid (ObjectInTuple$ID) at com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) I tried to manually register a UUIDSerializer (which should not be necessary as Flink has a dependency to such "default" serializers), but this did not fix the problem. What I don't understand in general is why the LocalEnvironment and the CollectionEnvironment use different strategies (e.g. in serialization and also in workflow execution). Thanks for your help! Best, Martin |
Hi,
I looked further into the problem and discovered, that the serialization also fails for TestEnvironment and LocalEnvironment if a groupBy operation is involved. Please have a look at the updated Gist https://gist.github.com/s1ck/566796df5f35ee1de6f9 Best, Martin On 27.11.2015 10:20, Martin Junghanns wrote: > Hi, > > While building IT tests which extend MultipleProgramsTestBase, I > encountered a problem with serialization: > > I posted a minimal example here: > https://gist.github.com/s1ck/566796df5f35ee1de6f9 > > This runs fine with LocalEnvironment. However, when executing this in > CollectionEnvironment, it leads to the following Exception: > > Exception in thread "main" com.esotericsoftware.kryo.KryoException: > Class cannot be created (missing no-arg constructor): java.util.UUID > Serialization trace: > uuid (ObjectInTuple$ID) > at > com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) > > > I tried to manually register a UUIDSerializer (which should not be > necessary as Flink has a dependency to such "default" serializers), but > this did not fix the problem. > > What I don't understand in general is why the LocalEnvironment and the > CollectionEnvironment use different strategies (e.g. in serialization > and also in workflow execution). > > Thanks for your help! > > Best, > Martin |
Hi Martin,
I think the problem is that the the WritableSerializer, WritableComparator, ValueComparator, ValueSerializer and the AvroSerializer all use Kryo to copy objects. However, in some cases, e.g. missing no-arg constructor, Kryo is not able to copy the object. In these cases, one should try to copy the objects via serialization as a fallback strategy. That’s a bug in these components. I’ll file an issue and open a PR for it. Thanks a lot for finding this problem Martin! Cheers, Till On Fri, Nov 27, 2015 at 11:59 AM, Martin Junghanns <[hidden email]> wrote: > Hi, > > I looked further into the problem and discovered, that the serialization > also fails for TestEnvironment and LocalEnvironment if a groupBy operation > is involved. > > Please have a look at the updated Gist > > https://gist.github.com/s1ck/566796df5f35ee1de6f9 > > Best, > Martin > > > On 27.11.2015 10:20, Martin Junghanns wrote: > >> Hi, >> >> While building IT tests which extend MultipleProgramsTestBase, I >> encountered a problem with serialization: >> >> I posted a minimal example here: >> https://gist.github.com/s1ck/566796df5f35ee1de6f9 >> >> This runs fine with LocalEnvironment. However, when executing this in >> CollectionEnvironment, it leads to the following Exception: >> >> Exception in thread "main" com.esotericsoftware.kryo.KryoException: >> Class cannot be created (missing no-arg constructor): java.util.UUID >> Serialization trace: >> uuid (ObjectInTuple$ID) >> at >> >> com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) >> >> >> I tried to manually register a UUIDSerializer (which should not be >> necessary as Flink has a dependency to such "default" serializers), but >> this did not fix the problem. >> >> What I don't understand in general is why the LocalEnvironment and the >> CollectionEnvironment use different strategies (e.g. in serialization >> and also in workflow execution). >> >> Thanks for your help! >> >> Best, >> Martin >> > |
The issue is https://issues.apache.org/jira/browse/FLINK-3088.
On Fri, Nov 27, 2015 at 12:21 PM, Till Rohrmann <[hidden email]> wrote: > Hi Martin, > > I think the problem is that the the WritableSerializer, WritableComparator, > ValueComparator, ValueSerializer and the AvroSerializer all use Kryo to > copy objects. However, in some cases, e.g. missing no-arg constructor, > Kryo is not able to copy the object. In these cases, one should try to > copy the objects via serialization as a fallback strategy. That’s a bug in > these components. I’ll file an issue and open a PR for it. > > Thanks a lot for finding this problem Martin! > > Cheers, > Till > > > On Fri, Nov 27, 2015 at 11:59 AM, Martin Junghanns < > [hidden email]> wrote: > >> Hi, >> >> I looked further into the problem and discovered, that the serialization >> also fails for TestEnvironment and LocalEnvironment if a groupBy operation >> is involved. >> >> Please have a look at the updated Gist >> >> https://gist.github.com/s1ck/566796df5f35ee1de6f9 >> >> Best, >> Martin >> >> >> On 27.11.2015 10:20, Martin Junghanns wrote: >> >>> Hi, >>> >>> While building IT tests which extend MultipleProgramsTestBase, I >>> encountered a problem with serialization: >>> >>> I posted a minimal example here: >>> https://gist.github.com/s1ck/566796df5f35ee1de6f9 >>> >>> This runs fine with LocalEnvironment. However, when executing this in >>> CollectionEnvironment, it leads to the following Exception: >>> >>> Exception in thread "main" com.esotericsoftware.kryo.KryoException: >>> Class cannot be created (missing no-arg constructor): java.util.UUID >>> Serialization trace: >>> uuid (ObjectInTuple$ID) >>> at >>> >>> com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) >>> >>> >>> I tried to manually register a UUIDSerializer (which should not be >>> necessary as Flink has a dependency to such "default" serializers), but >>> this did not fix the problem. >>> >>> What I don't understand in general is why the LocalEnvironment and the >>> CollectionEnvironment use different strategies (e.g. in serialization >>> and also in workflow execution). >>> >>> Thanks for your help! >>> >>> Best, >>> Martin >>> >> > |
Hi Till!
Thanks for looking into it. Is it correct to assume that when a user defined class that implements Writable is used in Flink datasets, it's always serialized using Kryo even if the members of that class could be serialized by Flink's own serialization mechanism? Again, thanks. Best, Martin On 27.11.2015 12:36, Till Rohrmann wrote: > The issue is https://issues.apache.org/jira/browse/FLINK-3088. > > On Fri, Nov 27, 2015 at 12:21 PM, Till Rohrmann <[hidden email]> > wrote: > >> Hi Martin, >> >> I think the problem is that the the WritableSerializer, WritableComparator, >> ValueComparator, ValueSerializer and the AvroSerializer all use Kryo to >> copy objects. However, in some cases, e.g. missing no-arg constructor, >> Kryo is not able to copy the object. In these cases, one should try to >> copy the objects via serialization as a fallback strategy. That’s a bug in >> these components. I’ll file an issue and open a PR for it. >> >> Thanks a lot for finding this problem Martin! >> >> Cheers, >> Till >> >> >> On Fri, Nov 27, 2015 at 11:59 AM, Martin Junghanns < >> [hidden email]> wrote: >> >>> Hi, >>> >>> I looked further into the problem and discovered, that the serialization >>> also fails for TestEnvironment and LocalEnvironment if a groupBy operation >>> is involved. >>> >>> Please have a look at the updated Gist >>> >>> https://gist.github.com/s1ck/566796df5f35ee1de6f9 >>> >>> Best, >>> Martin >>> >>> >>> On 27.11.2015 10:20, Martin Junghanns wrote: >>> >>>> Hi, >>>> >>>> While building IT tests which extend MultipleProgramsTestBase, I >>>> encountered a problem with serialization: >>>> >>>> I posted a minimal example here: >>>> https://gist.github.com/s1ck/566796df5f35ee1de6f9 >>>> >>>> This runs fine with LocalEnvironment. However, when executing this in >>>> CollectionEnvironment, it leads to the following Exception: >>>> >>>> Exception in thread "main" com.esotericsoftware.kryo.KryoException: >>>> Class cannot be created (missing no-arg constructor): java.util.UUID >>>> Serialization trace: >>>> uuid (ObjectInTuple$ID) >>>> at >>>> >>>> com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) >>>> >>>> >>>> I tried to manually register a UUIDSerializer (which should not be >>>> necessary as Flink has a dependency to such "default" serializers), but >>>> this did not fix the problem. >>>> >>>> What I don't understand in general is why the LocalEnvironment and the >>>> CollectionEnvironment use different strategies (e.g. in serialization >>>> and also in workflow execution). >>>> >>>> Thanks for your help! >>>> >>>> Best, >>>> Martin >>>> >>> >> > |
Hi Martin,
no, Kryo is actually only used for copying the element. For serialization/deserialization the write and readFields methods are used. I’ve changed it now that if Kryo fails, then it will serialize the object into a byte array and deserialize it from there again in order to copy the element. Cheers, Till On Fri, Nov 27, 2015 at 3:01 PM, Martin Junghanns <[hidden email]> wrote: > Hi Till! > > Thanks for looking into it. Is it correct to assume that when a user > defined class that implements Writable is used in Flink datasets, it's > always serialized using Kryo even if the members of that class could be > serialized by Flink's own serialization mechanism? > > Again, thanks. > > Best, > Martin > > On 27.11.2015 12:36, Till Rohrmann wrote: > > The issue is https://issues.apache.org/jira/browse/FLINK-3088. > > > > On Fri, Nov 27, 2015 at 12:21 PM, Till Rohrmann <[hidden email]> > > wrote: > > > >> Hi Martin, > >> > >> I think the problem is that the the WritableSerializer, > WritableComparator, > >> ValueComparator, ValueSerializer and the AvroSerializer all use Kryo to > >> copy objects. However, in some cases, e.g. missing no-arg constructor, > >> Kryo is not able to copy the object. In these cases, one should try to > >> copy the objects via serialization as a fallback strategy. That’s a bug > in > >> these components. I’ll file an issue and open a PR for it. > >> > >> Thanks a lot for finding this problem Martin! > >> > >> Cheers, > >> Till > >> > >> > >> On Fri, Nov 27, 2015 at 11:59 AM, Martin Junghanns < > >> [hidden email]> wrote: > >> > >>> Hi, > >>> > >>> I looked further into the problem and discovered, that the > serialization > >>> also fails for TestEnvironment and LocalEnvironment if a groupBy > operation > >>> is involved. > >>> > >>> Please have a look at the updated Gist > >>> > >>> https://gist.github.com/s1ck/566796df5f35ee1de6f9 > >>> > >>> Best, > >>> Martin > >>> > >>> > >>> On 27.11.2015 10:20, Martin Junghanns wrote: > >>> > >>>> Hi, > >>>> > >>>> While building IT tests which extend MultipleProgramsTestBase, I > >>>> encountered a problem with serialization: > >>>> > >>>> I posted a minimal example here: > >>>> https://gist.github.com/s1ck/566796df5f35ee1de6f9 > >>>> > >>>> This runs fine with LocalEnvironment. However, when executing this in > >>>> CollectionEnvironment, it leads to the following Exception: > >>>> > >>>> Exception in thread "main" com.esotericsoftware.kryo.KryoException: > >>>> Class cannot be created (missing no-arg constructor): java.util.UUID > >>>> Serialization trace: > >>>> uuid (ObjectInTuple$ID) > >>>> at > >>>> > >>>> > com.esotericsoftware.kryo.Kryo$DefaultInstantiatorStrategy.newInstantiatorOf(Kryo.java:1228) > >>>> > >>>> > >>>> I tried to manually register a UUIDSerializer (which should not be > >>>> necessary as Flink has a dependency to such "default" serializers), > but > >>>> this did not fix the problem. > >>>> > >>>> What I don't understand in general is why the LocalEnvironment and the > >>>> CollectionEnvironment use different strategies (e.g. in serialization > >>>> and also in workflow execution). > >>>> > >>>> Thanks for your help! > >>>> > >>>> Best, > >>>> Martin > >>>> > >>> > >> > > > |
Free forum by Nabble | Edit this page |