[jira] [Created] (FLINK-19345) Introduce File streaming sink compaction

Shang Yuanchun (Jira)
Jingsong Lee created FLINK-19345:
------------------------------------

             Summary: Introduce File streaming sink compaction
                 Key: FLINK-19345
                 URL: https://issues.apache.org/jira/browse/FLINK-19345
             Project: Flink
          Issue Type: New Feature
          Components: Table SQL / Runtime
            Reporter: Jingsong Lee
            Assignee: Jingsong Lee
             Fix For: 1.12.0


Users often complain that the file streaming sink writes out many small files. Small files degrade file read performance, put extra load on the DFS, and can even affect the stability of the DFS.

Target: 
 * Compact all files generated by this job in a single checkpoint.
 * With compaction, users can set a smaller checkpoint interval, even down to seconds.
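
The first target can be illustrated with a minimal sketch. This is not the design from the linked document or Flink's implementation; it only shows the idea of merging every small file written during one checkpoint into a single file. The function name, the `part-<checkpoint_id>-<subtask>` naming scheme, and the `compacted-<checkpoint_id>` target name are all hypothetical. Plain byte concatenation is assumed, which only works for row-oriented formats such as CSV or JSON lines, not for columnar formats such as Parquet or ORC.

```python
import os
from pathlib import Path


def compact_checkpoint_files(directory: Path, checkpoint_id: int) -> Path:
    """Merge all small files of one checkpoint into a single file.

    Assumption (hypothetical naming): the files written during a
    checkpoint are named part-<checkpoint_id>-<subtask>.
    """
    prefix = f"part-{checkpoint_id}-"
    small_files = sorted(
        p for p in directory.iterdir() if p.name.startswith(prefix)
    )

    target = directory / f"compacted-{checkpoint_id}"
    tmp = directory / f".compacted-{checkpoint_id}.tmp"

    # Write to a hidden temporary file first so that readers never
    # observe a partially written compacted file.
    with tmp.open("wb") as out:
        for small_file in small_files:
            out.write(small_file.read_bytes())

    # Publish the result with a rename, then drop the inputs.
    os.replace(tmp, target)
    for small_file in small_files:
        small_file.unlink()
    return target
```

With a scheme like this, the number of files visible to readers grows with the number of checkpoints rather than with checkpoints multiplied by writer subtasks, which is what would make a checkpoint interval of a few seconds tolerable.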

Document: https://docs.google.com/document/d/1cdlyoqgBq9yJEiHFBziimIoKHapQiEY2-0Tn8IF6G-c/edit?usp=sharing



--
This message was sent by Atlassian Jira
(v8.3.4#803005)