[jira] [Created] (FLINK-21195) LimitableBulkFormat is invalid when format is orc

Shang Yuanchun (Jira)
sujun created FLINK-21195:
-----------------------------

             Summary: LimitableBulkFormat is invalid when format is orc
                 Key: FLINK-21195
                 URL: https://issues.apache.org/jira/browse/FLINK-21195
             Project: Flink
          Issue Type: Bug
          Components: Connectors / FileSystem
    Affects Versions: 1.12.1
            Reporter: sujun
         Attachments: image-2021-01-29-11-38-17-576.png, image-2021-01-29-11-40-35-087.png, orc_reader_debug.jpg

The ORC reader reads one stripe of data in advance inside the createReader() method (see the constructor of RecordReaderImpl for details), whereas the Parquet reader only starts reading block data when readBatch() is called. As a result, if every ORC file contains only one stripe, LimitableBulkFormat has no effect: the data has already been read by the time the limit is checked.
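The difference can be illustrated with a small stand-alone simulation. This is only a sketch under simplifying assumptions, not Flink or ORC code: the names LimitPrefetchSketch, EagerReader, LazyReader, ReaderFactory and unitsLoadedWithLimit are hypothetical, each "file" holds exactly one stripe/block, and the limit is assumed to be checked only in front of readBatch(), after the reader has already been created.

```java
import java.util.concurrent.atomic.AtomicLong;

// Toy model only: none of these types are Flink or ORC classes.
final class LimitPrefetchSketch {

    /** Counts simulated I/O: how many stripes/blocks were loaded. */
    static final AtomicLong unitsLoaded = new AtomicLong();

    interface Reader {
        /** Returns the number of rows in the next batch, or 0 when exhausted. */
        int readBatch();
    }

    interface ReaderFactory {
        Reader create(int rowsPerFile);
    }

    /** ORC-like behaviour (assumed): the single stripe is loaded at construction. */
    static final class EagerReader implements Reader {
        private final int rows;
        private boolean consumed;

        EagerReader(int rows) {
            this.rows = rows;
            unitsLoaded.incrementAndGet(); // I/O happens in "createReader()"
        }

        @Override
        public int readBatch() {
            if (consumed) {
                return 0;
            }
            consumed = true;
            return rows;
        }
    }

    /** Parquet-like behaviour (assumed): the block is loaded on the first readBatch(). */
    static final class LazyReader implements Reader {
        private final int rows;
        private boolean consumed;

        LazyReader(int rows) {
            this.rows = rows;
        }

        @Override
        public int readBatch() {
            if (consumed) {
                return 0;
            }
            consumed = true;
            unitsLoaded.incrementAndGet(); // I/O happens only here
            return rows;
        }
    }

    /** Creates one reader per file and checks the limit only before readBatch(). */
    static long unitsLoadedWithLimit(ReaderFactory factory, int files, int rowsPerFile, long limit) {
        unitsLoaded.set(0);
        long rowsRead = 0;
        for (int i = 0; i < files; i++) {
            Reader reader = factory.create(rowsPerFile); // an eager reader has already done its I/O
            if (rowsRead >= limit) {
                continue; // limit reached: readBatch() is skipped, but creation was not
            }
            rowsRead += reader.readBatch();
        }
        return unitsLoaded.get();
    }

    public static void main(String[] args) {
        // 5 single-stripe files of 100 rows each, LIMIT 10
        System.out.println("eager: " + unitsLoadedWithLimit(EagerReader::new, 5, 100, 10)); // eager: 5
        System.out.println("lazy:  " + unitsLoadedWithLimit(LazyReader::new, 5, 100, 10));  // lazy:  1
    }
}
```

In this model the lazy reader stops after loading a single block, while the eager reader loads every file's stripe even though the limit is satisfied by the first file, which matches the behaviour described above for single-stripe ORC files.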

 

!orc_reader_debug.jpg!

--
This message was sent by Atlassian Jira
(v8.3.4#803005)