[jira] [Created] (FLINK-12304) AvroInputFormat should support schema evolution

Shang Yuanchun (Jira)
John created FLINK-12304:
----------------------------

             Summary: AvroInputFormat should support schema evolution
                 Key: FLINK-12304
                 URL: https://issues.apache.org/jira/browse/FLINK-12304
             Project: Flink
          Issue Type: Bug
          Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile)
    Affects Versions: 1.8.0
            Reporter: John


From the Avro spec:

_A reader of Avro data, whether from an RPC or a file, can always parse that data because its schema is provided. But that schema may not be exactly the schema that was expected. For example, if the data was written with a different version of the software than it is read, then records may have had fields added or removed._
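The field-level rules behind that passage can be sketched without any Avro dependency. This is only a simplified model of record resolution, not the library's implementation: fields are matched by name, a reader field missing from the written data takes the reader's default, and written fields unknown to the reader are dropped. The class name, the map-based representation, and the use of `null` to mean "no default declared" are all assumptions of this sketch; aliases and type promotion are left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class FieldResolutionSketch {

    /**
     * Resolves a written record against the reader's fields.
     * readerDefaults maps each reader field name to its default value,
     * or to null when the field declares no default (a simplification of this sketch).
     */
    static Map<String, Object> resolve(Map<String, Object> written,
                                       Map<String, Object> readerDefaults) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : readerDefaults.entrySet()) {
            String name = field.getKey();
            if (written.containsKey(name)) {
                // Field exists in both schemas: keep the written value.
                result.put(name, written.get(name));
            } else if (field.getValue() != null) {
                // Field was added in the reader's schema: use its default.
                result.put(name, field.getValue());
            } else {
                // Added field without a default: the data cannot be resolved.
                throw new IllegalStateException("no value or default for field: " + name);
            }
        }
        // Fields present only in the written data are never copied, so they are dropped.
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> written = new LinkedHashMap<>();
        written.put("name", "alice");
        written.put("legacyId", 7);      // removed in the newer schema

        Map<String, Object> readerDefaults = new LinkedHashMap<>();
        readerDefaults.put("name", null); // no default declared
        readerDefaults.put("age", -1);    // added in the newer schema, with a default

        System.out.println(resolve(written, readerDefaults));
    }
}
```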

AvroInputFormat should allow the application to supply a reader's schema, to support cases where data was written with an old version of a schema and needs to be read with a newer version.  The reader's schema can have additional fields with defaults so that records written with the old schema can be adapted to the new one.  The underlying Avro Java library supports schema resolution, so adding support in AvroInputFormat should be straightforward.
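For illustration, here is a minimal sketch of the schema resolution the Avro Java library already provides, independent of Flink. It is not the proposed AvroInputFormat change. The `User` record, its `name` and `age` fields, and the class name are hypothetical; the calls used (`Schema.Parser`, `GenericDatumWriter`, and the two-schema `GenericDatumReader` constructor) are part of the Avro generic API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class SchemaEvolutionSketch {

    // Hypothetical old schema that the data was written with.
    static final Schema WRITER_SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"}]}");

    // Hypothetical newer schema: adds an "age" field with a default.
    static final Schema READER_SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"User\",\"fields\":["
        + "{\"name\":\"name\",\"type\":\"string\"},"
        + "{\"name\":\"age\",\"type\":\"int\",\"default\":-1}]}");

    // Serializes one record using only the old schema.
    static byte[] writeWithOldSchema(String name) throws IOException {
        GenericRecord record = new GenericData.Record(WRITER_SCHEMA);
        record.put("name", name);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(WRITER_SCHEMA).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Decodes the bytes, resolving the writer's schema against the reader's schema.
    static GenericRecord readWithNewSchema(byte[] bytes) throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        GenericDatumReader<GenericRecord> reader =
            new GenericDatumReader<>(WRITER_SCHEMA, READER_SCHEMA);
        return reader.read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        GenericRecord user = readWithNewSchema(writeWithOldSchema("alice"));
        // "age" was never written, so it is filled from the reader schema's default.
        System.out.println(user.get("name") + " " + user.get("age"));
    }
}
```

Avro container files carry the writer's schema in their header, so a file-based reader generally only needs the reader's schema to be supplied by the application.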



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)