liupengcheng created FLINK-18852:
------------------------------------
Summary: StreamScan should keep the same parallelism as the input
Key: FLINK-18852
URL: https://issues.apache.org/jira/browse/FLINK-18852
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Affects Versions: 1.11.1
Reporter: liupengcheng
Attachments: image-2020-08-07-21-22-57-843.png
Currently, the parallelism of StreamTableSourceScan/DataStreamScan is not inherited from the upstream input; it is taken from the config instead. I think this is unexpected, because a parallelism set explicitly on the input does not carry over to the scan.
I found this issue through a unit test. Here is an example:
{code:scala}
val env = StreamExecutionEnvironment.getExecutionEnvironment
val tEnv = StreamTableEnvironment.create(env)
StreamITCase.testResults = new mutable.MutableList[String]
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
// env parallelism is set to 4
env.setParallelism(4)

// left and right are the (Long, String) test collections defined elsewhere in the test
// DataSource parallelism is set to 1
val table1 = env.fromCollection(left)
  .setParallelism(1)
  .assignTimestampsAndWatermarks(new TimestampAndWatermarkWithOffset[(Long, String)](0))
  .toTable(tEnv, 'a, 'b)
val table2 = env.fromCollection(right)
  .setParallelism(1)
  .assignTimestampsAndWatermarks(new TimestampAndWatermarkWithOffset[(Long, String)](0))
  .toTable(tEnv, 'a, 'b)
{code}
However, when you run the job and visualize the execution plan, the "from" operator (the StreamScan) has a parallelism of 4, not the parallelism of 1 that was set on its input.
!image-2020-08-07-21-22-57-843.png|thumbnail!
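To make the expected behavior concrete, here is a minimal, self-contained sketch of the two policies. It is only an illustration and not Flink planner code: the names Upstream, ScanParallelism, fromConfig and inherited are hypothetical and were made up for this example.

{code:scala}
// Hypothetical model of an upstream operator; not a Flink class.
final case class Upstream(name: String, parallelism: Int)

object ScanParallelism {
  // Behavior described in this issue: the scan ignores its input
  // and uses the parallelism from the config.
  def fromConfig(input: Upstream, configParallelism: Int): Int =
    configParallelism

  // Behavior proposed in this issue: the scan keeps the parallelism of its input.
  def inherited(input: Upstream, configParallelism: Int): Int =
    input.parallelism

  def main(args: Array[String]): Unit = {
    // Mirrors the example above: the source is set to 1, the env to 4.
    val source = Upstream("fromCollection(left)", parallelism = 1)
    val envParallelism = 4
    println(s"current:  ${fromConfig(source, envParallelism)}") // prints "current:  4"
    println(s"expected: ${inherited(source, envParallelism)}")  // prints "expected: 1"
  }
}
{code}

With the values from the unit test, the first policy gives 4 (what the execution plan shows) and the second gives 1 (what I would expect).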