zhuxiaoshang created FLINK-20538:
------------------------------------

             Summary: sink.rolling-policy.file-size does not work in filesystem connector
                 Key: FLINK-20538
                 URL: https://issues.apache.org/jira/browse/FLINK-20538
             Project: Flink
          Issue Type: Bug
          Components: Connectors / FileSystem
    Affects Versions: 1.11.1
            Reporter: zhuxiaoshang

When I use the SQL filesystem connector to write data to HDFS and set sink.rolling-policy.file-size to 50MB, the setting does not seem to take effect: files larger than 100 MB are still produced.

My table DDL is:

{code:sql}
CREATE TABLE cpc_bd_recall_log_hdfs (
  log_timestamp BIGINT,
  ip STRING,
  `raw` STRING,
  `day` STRING,
  `hour` STRING,
  `minute` STRING
) PARTITIONED BY (`day`, `hour`, `minute`) WITH (
  'connector' = 'filesystem',
  'path' = 'hdfs://xxx/test.db/hdfs_test',
  'format' = 'parquet',
  'parquet.compression' = 'SNAPPY',
  'sink.rolling-policy.file-size' = '50MB',
  'sink.partition-commit.policy.kind' = 'success-file',
  'sink.partition-commit.delay' = '60s'
);
{code}

The HDFS files are:

{code}
0 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/_SUCCESS
-rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2500
-rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-0-2501
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2499
-rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-1-2500
-rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2501
-rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-10-2502
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2500
-rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-11-2501
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2500
-rw-r--r-- 3 hadoop hadoop 122.2 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-12-2501
-rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2499
-rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-13-2500
-rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2500
-rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-14-2501
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2498
-rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-15-2499
-rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2501
-rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-16-2502
-rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2500
-rw-r--r-- 3 hadoop hadoop 122.5 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-17-2501
-rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2500
-rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-18-2501
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2501
-rw-r--r-- 3 hadoop hadoop 121.7 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-19-2502
-rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2499
-rw-r--r-- 3 hadoop hadoop 121.6 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-2-2500
-rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2500
-rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-3-2501
-rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2499
-rw-r--r-- 3 hadoop hadoop 122.1 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-4-2500
-rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2499
-rw-r--r-- 3 hadoop hadoop 121.8 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-5-2500
-rw-r--r-- 3 hadoop hadoop 31.8 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2499
-rw-r--r-- 3 hadoop hadoop 121.5 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-6-2500
-rw-r--r-- 3 hadoop hadoop 31.6 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2500
-rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-7-2501
-rw-r--r-- 3 hadoop hadoop 31.7 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2501
-rw-r--r-- 3 hadoop hadoop 122.0 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-8-2502
-rw-r--r-- 3 hadoop hadoop 31.9 M 2020-12-04 14:55 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2501
-rw-r--r-- 3 hadoop hadoop 121.9 M 2020-12-04 14:56 hdfs://xxx/test.db/hdfs_test/day=2020-12-04/hour=14/minute=55/part-3dca3b00-fd94-4f49-bdf8-a8b65bcfa92c-9-2502
{code}

However, when I dig into the source code, I see that writing an element to a bucket invokes `shouldRollOnEvent` in TableRollingPolicy, so I don't understand how this can happen. Is this a bug, or am I getting something wrong?
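
To make the expectation concrete, here is a minimal sketch of the size check I would expect a rolling policy to perform on each element. It is only an illustration under assumptions: `RollingSizeSketch`, `megabytesToBytes`, and `shouldRoll` are hypothetical names and not Flink's API, and the sketch assumes the policy simply compares the size reported for the in-progress part file against the configured threshold.

```java
// Illustrative sketch only: these names are hypothetical and are not Flink classes.
final class RollingSizeSketch {
    private static final long BYTES_PER_MB = 1024L * 1024L;

    /** Converts a size in megabytes (for example 31.7) to bytes. */
    static long megabytesToBytes(double megabytes) {
        return Math.round(megabytes * BYTES_PER_MB);
    }

    /** Returns true when the reported part file size exceeds the threshold. */
    static boolean shouldRoll(long reportedSizeBytes, long thresholdBytes) {
        return reportedSizeBytes > thresholdBytes;
    }

    public static void main(String[] args) {
        // Threshold taken from 'sink.rolling-policy.file-size' = '50MB'.
        long threshold = megabytesToBytes(50);

        // Two sizes taken from the listing above.
        System.out.println(shouldRoll(megabytesToBytes(31.7), threshold));
        System.out.println(shouldRoll(megabytesToBytes(121.8), threshold));
    }
}
```

Under this assumption, a part file should have been rolled well before reaching the 121.8 M seen in the listing, which is why the observed sizes are surprising.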
--
This message was sent by Atlassian Jira
(v8.3.4#803005)