Hi all!
The ExecutionConfig has some very old settings: forceAvro() and forceKryo(), which are actually misleadingly named. They cause POJOs to use Avro or Kryo rather than the POJO serializer. I think we do not have a good case any more to use Avro for POJOs. POJOs that are also Avro types go through the Avro serializer anyways. There may be a case to use Kryo for POJOs if you don't like the Flink POJO serializer. I would suggest to remove the "forceAvro()" option completely. For "forceKryo()", I am torn between removing it completely or renaming it to "setUseKryoForPOJOs()". What are the opinion on that out there? Best, Stephan |
Hi Stephan
I prefer to remove 'enableForceKryo' since Kryo serializer does not work out-of-the-box well for schema evolution stories due to its mutable properties, and our built-in POJO serializer has already supported schema evolution. On the other hand, what's the backward compatibility plan for enableForceAvro() and enableForceKryo()? I think if https://issues.apache.org/jira/browse/FLINK-11917 merged, we could support to migrate state which was POJO but serialized using Kryo. Best Yun Tang ________________________________ From: Stephan Ewen <[hidden email]> Sent: Tuesday, March 26, 2019 2:31 To: dev; user Subject: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig Hi all! The ExecutionConfig has some very old settings: forceAvro() and forceKryo(), which are actually misleadingly named. They cause POJOs to use Avro or Kryo rather than the POJO serializer. I think we do not have a good case any more to use Avro for POJOs. POJOs that are also Avro types go through the Avro serializer anyways. There may be a case to use Kryo for POJOs if you don't like the Flink POJO serializer. I would suggest to remove the "forceAvro()" option completely. For "forceKryo()", I am torn between removing it completely or renaming it to "setUseKryoForPOJOs()". What are the opinion on that out there? Best, Stephan |
Compatibility is really important for checkpointed state.
For that, you can always directly specify GenericTypeInfo or AvroTypeInfo if you want to continue to treat a type via Kryo or Avro. Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is implemented, this should happen automatically. On Tue, Mar 26, 2019 at 8:33 AM Yun Tang <[hidden email]> wrote: > Hi Stephan > > I prefer to remove 'enableForceKryo' since Kryo serializer does not work > out-of-the-box well for schema evolution stories due to its mutable > properties, and our built-in POJO serializer has already supported schema > evolution. > > On the other hand, what's the backward compatibility plan for > enableForceAvro() and enableForceKryo()? I think if > https://issues.apache.org/jira/browse/FLINK-11917 merged, we could > support to migrate state which was POJO but serialized using Kryo. > > Best > Yun Tang > ------------------------------ > *From:* Stephan Ewen <[hidden email]> > *Sent:* Tuesday, March 26, 2019 2:31 > *To:* dev; user > *Subject:* [DISCUSS] Remove forceAvro() and forceKryo() from the > ExecutionConfig > > Hi all! > > The ExecutionConfig has some very old settings: forceAvro() and > forceKryo(), which are actually misleadingly named. They cause POJOs to use > Avro or Kryo rather than the POJO serializer. > > I think we do not have a good case any more to use Avro for POJOs. POJOs > that are also Avro types go through the Avro serializer anyways. > > There may be a case to use Kryo for POJOs if you don't like the Flink POJO > serializer. > > I would suggest to remove the "forceAvro()" option completely. > For "forceKryo()", I am torn between removing it completely or renaming it > to "setUseKryoForPOJOs()". > > What are the opinion on that out there? > > Best, > Stephan > > |
Hi Stephan,
I am in favor of renaming forceKryo() instead of removing it, because users might plugin their Protobuf/Thrift serializers via Kryo as advertised in our documentation [1]. For this, Kryo needs to be used for POJO types as well, if I am not mistaken. Cheers, Konstantin [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/custom_serializers.html On Tue, Mar 26, 2019 at 10:03 AM Stephan Ewen <[hidden email]> wrote: > Compatibility is really important for checkpointed state. > For that, you can always directly specify GenericTypeInfo or AvroTypeInfo > if you want to continue to treat a type via Kryo or Avro. > > Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is > implemented, this should happen automatically. > > On Tue, Mar 26, 2019 at 8:33 AM Yun Tang <[hidden email]> wrote: > >> Hi Stephan >> >> I prefer to remove 'enableForceKryo' since Kryo serializer does not work >> out-of-the-box well for schema evolution stories due to its mutable >> properties, and our built-in POJO serializer has already supported schema >> evolution. >> >> On the other hand, what's the backward compatibility plan for >> enableForceAvro() and enableForceKryo()? I think if >> https://issues.apache.org/jira/browse/FLINK-11917 merged, we could >> support to migrate state which was POJO but serialized using Kryo. >> >> Best >> Yun Tang >> ------------------------------ >> *From:* Stephan Ewen <[hidden email]> >> *Sent:* Tuesday, March 26, 2019 2:31 >> *To:* dev; user >> *Subject:* [DISCUSS] Remove forceAvro() and forceKryo() from the >> ExecutionConfig >> >> Hi all! >> >> The ExecutionConfig has some very old settings: forceAvro() and >> forceKryo(), which are actually misleadingly named. They cause POJOs to use >> Avro or Kryo rather than the POJO serializer. >> >> I think we do not have a good case any more to use Avro for POJOs. POJOs >> that are also Avro types go through the Avro serializer anyways. >> >> There may be a case to use Kryo for POJOs if you don't like the Flink >> POJO serializer. >> >> I would suggest to remove the "forceAvro()" option completely. >> For "forceKryo()", I am torn between removing it completely or renaming >> it to "setUseKryoForPOJOs()". >> >> What are the opinion on that out there? >> >> Best, >> Stephan >> >> -- Konstantin Knauf | Solutions Architect +49 160 91394525 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen |
Good point, Konstantin, that makes sense.
On Tue, Mar 26, 2019 at 10:37 AM Konstantin Knauf <[hidden email]> wrote: > Hi Stephan, > > I am in favor of renaming forceKryo() instead of removing it, because users > might plugin their Protobuf/Thrift serializers via Kryo as advertised in > our documentation [1]. For this, Kryo needs to be used for POJO types as > well, if I am not mistaken. > > Cheers, > > Konstantin > > [1] > > https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/custom_serializers.html > > > On Tue, Mar 26, 2019 at 10:03 AM Stephan Ewen <[hidden email]> wrote: > > > Compatibility is really important for checkpointed state. > > For that, you can always directly specify GenericTypeInfo or AvroTypeInfo > > if you want to continue to treat a type via Kryo or Avro. > > > > Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is > > implemented, this should happen automatically. > > > > On Tue, Mar 26, 2019 at 8:33 AM Yun Tang <[hidden email]> wrote: > > > >> Hi Stephan > >> > >> I prefer to remove 'enableForceKryo' since Kryo serializer does not work > >> out-of-the-box well for schema evolution stories due to its mutable > >> properties, and our built-in POJO serializer has already supported > schema > >> evolution. > >> > >> On the other hand, what's the backward compatibility plan for > >> enableForceAvro() and enableForceKryo()? I think if > >> https://issues.apache.org/jira/browse/FLINK-11917 merged, we could > >> support to migrate state which was POJO but serialized using Kryo. > >> > >> Best > >> Yun Tang > >> ------------------------------ > >> *From:* Stephan Ewen <[hidden email]> > >> *Sent:* Tuesday, March 26, 2019 2:31 > >> *To:* dev; user > >> *Subject:* [DISCUSS] Remove forceAvro() and forceKryo() from the > >> ExecutionConfig > >> > >> Hi all! > >> > >> The ExecutionConfig has some very old settings: forceAvro() and > >> forceKryo(), which are actually misleadingly named. They cause POJOs to > use > >> Avro or Kryo rather than the POJO serializer. > >> > >> I think we do not have a good case any more to use Avro for POJOs. POJOs > >> that are also Avro types go through the Avro serializer anyways. > >> > >> There may be a case to use Kryo for POJOs if you don't like the Flink > >> POJO serializer. > >> > >> I would suggest to remove the "forceAvro()" option completely. > >> For "forceKryo()", I am torn between removing it completely or renaming > >> it to "setUseKryoForPOJOs()". > >> > >> What are the opinion on that out there? > >> > >> Best, > >> Stephan > >> > >> > > -- > > Konstantin Knauf | Solutions Architect > > +49 160 91394525 > > <https://www.ververica.com/> > > Follow us @VervericaData > > -- > > Join Flink Forward <https://flink-forward.org/> - The Apache Flink > Conference > > Stream Processing | Event Driven | Real Time > > -- > > Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > > -- > Data Artisans GmbH > Registered at Amtsgericht Charlottenburg: HRB 158244 B > Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen > |
In reply to this post by Konstantin Knauf-3
Hi Konstantin
I think there is no direct relationship between registering chill-protobuf/chill-thrift for Protobuf/Thrift type with Kryo and enforcing POJO to use Kryo. For both Protobuf and Thrift types, they will be extracted as GenericTypeInfo within TypeExtractor which would use Kryoserializer by default. If we do not register chill-protobuf and chill-thrift within kryo as [1] suggested, we would meet unexpected exceptions. However, PojoTypeInfo would only use KryoSerializer when we enableForceKryo(). They are actually two different aspects and use different fields within ExecutionConfig. Best Yun Tang [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/custom_serializers.html ________________________________ From: Konstantin Knauf <[hidden email]> Sent: Tuesday, March 26, 2019 17:37 To: Stephan Ewen Cc: Yun Tang; dev; user Subject: Re: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig Hi Stephan, I am in favor of renaming forceKryo() instead of removing it, because users might plugin their Protobuf/Thrift serializers via Kryo as advertised in our documentation [1]. For this, Kryo needs to be used for POJO types as well, if I am not mistaken. Cheers, Konstantin [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/custom_serializers.html On Tue, Mar 26, 2019 at 10:03 AM Stephan Ewen <[hidden email]<mailto:[hidden email]>> wrote: Compatibility is really important for checkpointed state. For that, you can always directly specify GenericTypeInfo or AvroTypeInfo if you want to continue to treat a type via Kryo or Avro. Alternatively, once https://issues.apache.org/jira/browse/FLINK-11917 is implemented, this should happen automatically. On Tue, Mar 26, 2019 at 8:33 AM Yun Tang <[hidden email]<mailto:[hidden email]>> wrote: Hi Stephan I prefer to remove 'enableForceKryo' since Kryo serializer does not work out-of-the-box well for schema evolution stories due to its mutable properties, and our built-in POJO serializer has already supported schema evolution. On the other hand, what's the backward compatibility plan for enableForceAvro() and enableForceKryo()? I think if https://issues.apache.org/jira/browse/FLINK-11917 merged, we could support to migrate state which was POJO but serialized using Kryo. Best Yun Tang ________________________________ From: Stephan Ewen <[hidden email]<mailto:[hidden email]>> Sent: Tuesday, March 26, 2019 2:31 To: dev; user Subject: [DISCUSS] Remove forceAvro() and forceKryo() from the ExecutionConfig Hi all! The ExecutionConfig has some very old settings: forceAvro() and forceKryo(), which are actually misleadingly named. They cause POJOs to use Avro or Kryo rather than the POJO serializer. I think we do not have a good case any more to use Avro for POJOs. POJOs that are also Avro types go through the Avro serializer anyways. There may be a case to use Kryo for POJOs if you don't like the Flink POJO serializer. I would suggest to remove the "forceAvro()" option completely. For "forceKryo()", I am torn between removing it completely or renaming it to "setUseKryoForPOJOs()". What are the opinion on that out there? Best, Stephan -- Konstantin Knauf | Solutions Architect +49 160 91394525 [https://lh4.googleusercontent.com/1RRzA12SK12Xaowkag-W37QDs5LHrfw4R0tMwVNjKLDKoIu69ld1qtA2hSDn1LSJe9w2THG1A9igK_nXPrNeIqRF87FjbEQoBnZJJgyPXCkKPFYuYc_Vh419P9EOO36ERgdnX5wG]<https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward<https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Data Artisans GmbH | Invalidenstrasse 115, 10115 Berlin, Germany -- Data Artisans GmbH Registered at Amtsgericht Charlottenburg: HRB 158244 B Managing Directors: Dr. Kostas Tzoumas, Dr. Stephan Ewen |
Free forum by Nabble | Edit this page |