[jira] [Created] (FLINK-11838) Create RecoverableWriter for GCS

Shang Yuanchun (Jira)
Fokko Driesprong created FLINK-11838:
----------------------------------------

             Summary: Create RecoverableWriter for GCS
                 Key: FLINK-11838
                 URL: https://issues.apache.org/jira/browse/FLINK-11838
             Project: Flink
          Issue Type: Improvement
          Components: FileSystems
    Affects Versions: 1.8.0
            Reporter: Fokko Driesprong
            Assignee: Fokko Driesprong


GCS supports resumable uploads, which we can use to create a RecoverableWriter similar to the S3 implementation:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload

After using the Hadoop-compatible interface (https://github.com/apache/flink/pull/7519), we noticed that the current implementation relies heavily on renaming files on commit:
https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259
This is suboptimal on an object store such as GCS, where a rename is typically a copy followed by a delete rather than a cheap metadata operation. Therefore, we would like to implement a more GCS-native RecoverableWriter.
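
To illustrate the idea, here is a minimal sketch of how a resumable upload session could back a recoverable stream without any rename on commit. It is not the proposed implementation and it does not call the GCS API: FakeObjectStore, ResumePoint and SketchRecoverableStream are hypothetical names, and the store is an in-memory stand-in. The sketch also assumes that bytes are only sent to the session on persist(), so the session offset always equals the last checkpointed offset.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for an object store with resumable upload sessions. */
final class FakeObjectStore {
    private final Map<String, ByteArrayOutputStream> sessions = new HashMap<>();
    private final Map<String, String> sessionTargets = new HashMap<>();
    private final Map<String, byte[]> objects = new HashMap<>();
    private int nextId = 0;

    /** Opens an upload session for the final object name; nothing is visible yet. */
    String startSession(String objectName) {
        String id = "session-" + (nextId++);
        sessions.put(id, new ByteArrayOutputStream());
        sessionTargets.put(id, objectName);
        return id;
    }

    /** Appends bytes to the session and returns the new session offset. */
    long append(String sessionId, byte[] data) {
        ByteArrayOutputStream session = sessions.get(sessionId);
        session.write(data, 0, data.length);
        return session.size();
    }

    long committedOffset(String sessionId) {
        return sessions.get(sessionId).size();
    }

    /** Finalizes the session: the object appears under its final name, no rename. */
    void finish(String sessionId) {
        String target = sessionTargets.remove(sessionId);
        objects.put(target, sessions.remove(sessionId).toByteArray());
    }

    boolean exists(String objectName) {
        return objects.containsKey(objectName);
    }

    byte[] read(String objectName) {
        return objects.get(objectName);
    }
}

/** What a checkpoint would need to store in order to resume (hypothetical). */
final class ResumePoint {
    final String sessionId;
    final long offset;

    ResumePoint(String sessionId, long offset) {
        this.sessionId = sessionId;
        this.offset = offset;
    }
}

/** Sketch of a recoverable stream on top of an upload session (hypothetical). */
final class SketchRecoverableStream {
    private final FakeObjectStore store;
    private final String sessionId;
    // Bytes written since the last persist(); lost if the writer fails.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    private SketchRecoverableStream(FakeObjectStore store, String sessionId) {
        this.store = store;
        this.sessionId = sessionId;
    }

    static SketchRecoverableStream open(FakeObjectStore store, String objectName) {
        return new SketchRecoverableStream(store, store.startSession(objectName));
    }

    /** Resumes the same session after a failure, starting from the checkpointed offset. */
    static SketchRecoverableStream recover(FakeObjectStore store, ResumePoint point) {
        if (store.committedOffset(point.sessionId) != point.offset) {
            throw new IllegalStateException("session offset does not match checkpoint");
        }
        return new SketchRecoverableStream(store, point.sessionId);
    }

    void write(byte[] data) {
        pending.write(data, 0, data.length);
    }

    /** Uploads the buffered bytes and returns a point to resume from. */
    ResumePoint persist() {
        long offset = store.append(sessionId, pending.toByteArray());
        pending.reset();
        return new ResumePoint(sessionId, offset);
    }

    /** Uploads any remaining bytes and finalizes the object in place. */
    void commit() {
        persist();
        store.finish(sessionId);
    }
}
```

In this model the object only becomes visible under its final name when the session is finished, so the commit step needs neither a temporary file nor a rename. A real implementation would additionally have to deal with session expiry and with the upload granularity required by the service.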



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)