Fokko Driesprong created FLINK-11838:
----------------------------------------
Summary: Create RecoverableWriter for GCS
Key: FLINK-11838
URL: https://issues.apache.org/jira/browse/FLINK-11838
Project: Flink
Issue Type: Improvement
Components: FileSystems
Affects Versions: 1.8.0
Reporter: Fokko Driesprong
Assignee: Fokko Driesprong

GCS supports resumable uploads, which we can use to create a RecoverableWriter similar to the S3 implementation:
https://cloud.google.com/storage/docs/json_api/v1/how-tos/resumable-upload

After using the Hadoop-compatible interface:
https://github.com/apache/flink/pull/7519

we noticed that the current implementation relies heavily on renaming files on commit:
https://github.com/apache/flink/blob/master/flink-filesystems/flink-hadoop-fs/src/main/java/org/apache/flink/runtime/fs/hdfs/HadoopRecoverableFsDataOutputStream.java#L233-L259

This is suboptimal on an object store such as GCS. Therefore, we would like to implement a more GCS-native RecoverableWriter.
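
To illustrate why resumable uploads fit a recoverable writer, here is a minimal sketch of the bookkeeping involved. It is not taken from Flink or from any GCS client library: the class ResumableUploadSketch and its methods contentRange, nextOffset and planChunks are hypothetical names chosen for this example, and no network calls are made. It assumes the documented resumable-upload behaviour: chunks are sent with a Content-Range header, every chunk except the last is a multiple of 256 KiB, and a 308 response reports the persisted bytes in a Range header of the form "bytes=0-N".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Flink or a GCS SDK. */
class ResumableUploadSketch {

    /** Every chunk except the last must be a multiple of 256 KiB. */
    static final long CHUNK_ALIGNMENT = 256 * 1024;

    /** Builds the Content-Range header for one chunk; totalSize < 0 means "not known yet". */
    static String contentRange(long offset, long length, long totalSize) {
        String total = totalSize < 0 ? "*" : Long.toString(totalSize);
        return "bytes " + offset + "-" + (offset + length - 1) + "/" + total;
    }

    /** Turns the Range header of a 308 response ("bytes=0-N") into the next offset to send. */
    static long nextOffset(String rangeHeader) {
        if (rangeHeader == null || rangeHeader.isEmpty()) {
            return 0L; // no Range header: nothing has been persisted yet
        }
        int dash = rangeHeader.lastIndexOf('-');
        return Long.parseLong(rangeHeader.substring(dash + 1)) + 1;
    }

    /** Plans the Content-Range headers needed to finish an upload from a persisted offset. */
    static List<String> planChunks(long resumeFrom, long totalSize, long chunkSize) {
        if (chunkSize <= 0 || chunkSize % CHUNK_ALIGNMENT != 0) {
            throw new IllegalArgumentException("chunkSize must be a positive multiple of 256 KiB");
        }
        List<String> headers = new ArrayList<>();
        for (long offset = resumeFrom; offset < totalSize; offset += chunkSize) {
            long length = Math.min(chunkSize, totalSize - offset);
            boolean last = offset + length == totalSize;
            // Only the final chunk declares the total size, which completes the object.
            headers.add(contentRange(offset, length, last ? totalSize : -1));
        }
        return headers;
    }
}
```

Under these assumptions, recovery would only need the upload session URI and the persisted offset, and the commit would be the final chunk rather than a rename. For example, planChunks(nextOffset("bytes=0-524287"), 600000, 262144) yields the single header "bytes 524288-599999/600000".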