[jira] [Created] (FLINK-18629) ConnectedStreams#keyBy can not derive key TypeInformation for lambda KeySelectors

Shang Yuanchun (Jira)
Dawid Wysakowicz created FLINK-18629:
----------------------------------------

             Summary: ConnectedStreams#keyBy can not derive key TypeInformation for lambda KeySelectors
                 Key: FLINK-18629
                 URL: https://issues.apache.org/jira/browse/FLINK-18629
             Project: Flink
          Issue Type: Bug
          Components: API / DataStream
    Affects Versions: 1.11.0, 1.10.0, 1.12.0
            Reporter: Dawid Wysakowicz
            Assignee: Dawid Wysakowicz
             Fix For: 1.10.2, 1.12.0, 1.11.2


The following test fails:
{code}
        @Test
        public void testKeyedConnectedStreamsType() {
                StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

                DataStreamSource<Integer> stream1 = env.fromElements(1, 2);
                DataStreamSource<Integer> stream2 = env.fromElements(1, 2);

                // Key both inputs with identity lambdas.
                ConnectedStreams<Integer, Integer> connectedStreams = stream1.connect(stream2)
                        .keyBy(v -> v, v -> v);

                KeyedStream<?, ?> firstKeyedInput = (KeyedStream<?, ?>) connectedStreams.getFirstInput();
                KeyedStream<?, ?> secondKeyedInput = (KeyedStream<?, ?>) connectedStreams.getSecondInput();
                // Expected key type is INT; the lambdas currently yield GenericTypeInfo<Object>.
                assertThat(firstKeyedInput.getKeyType(), equalTo(Types.INT));
                assertThat(secondKeyedInput.getKeyType(), equalTo(Types.INT));
        }
{code}

The problem is that the wildcard key type in the current {{keyBy}} signature is evaluated as {{Object}} for lambdas, which in turn produces {{GenericTypeInfo<Object>}} for any {{KeySelector}} provided as a lambda.
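
To illustrate the effect outside of Flink, here is a minimal plain-Java sketch. It assumes that the key type is derived from the lambda's compiled signature, which the sketch reads through {{SerializedLambda}}. The {{KeySelector}} interface below is a hypothetical stand-in for Flink's, and {{withWildcard}} / {{withTypeParameter}} are illustrative names, not Flink API.

```java
import java.io.Serializable;
import java.lang.invoke.SerializedLambda;
import java.lang.reflect.Method;

public class LambdaKeyTypeDemo {

    /** Hypothetical stand-in for Flink's KeySelector; Serializable so the lambda can be inspected. */
    @FunctionalInterface
    public interface KeySelector<IN, KEY> extends Serializable {
        KEY getKey(IN value) throws Exception;
    }

    /** Wildcard key type: the compiler targets the lambda at KeySelector<Integer, Object>. */
    public static String withWildcard(KeySelector<Integer, ?> selector) {
        return instantiatedSignature(selector);
    }

    /** Method-level type parameter: K is inferred from the lambda body. */
    public static <K> String withTypeParameter(KeySelector<Integer, K> selector) {
        return instantiatedSignature(selector);
    }

    /** Returns the JVM descriptor of the functional method as instantiated for this lambda. */
    private static String instantiatedSignature(Serializable lambda) {
        try {
            // Serializable lambdas carry a synthetic writeReplace() that returns a SerializedLambda.
            Method writeReplace = lambda.getClass().getDeclaredMethod("writeReplace");
            writeReplace.setAccessible(true);
            SerializedLambda serialized = (SerializedLambda) writeReplace.invoke(lambda);
            return serialized.getInstantiatedMethodType();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Not a serializable lambda", e);
        }
    }

    /** Same identity lambda as in the failing test, passed through the wildcard signature. */
    public static String wildcardSignature() {
        return withWildcard(v -> v);
    }

    /** Same identity lambda, passed through the signature with a type parameter. */
    public static String typedSignature() {
        return withTypeParameter(v -> v);
    }

    public static void main(String[] args) {
        System.out.println("wildcard:       " + wildcardSignature());
        System.out.println("type parameter: " + typedSignature());
    }
}
```

With a standard javac, the wildcard variant should report a return type of {{java.lang.Object}} and the type-parameter variant {{java.lang.Integer}}, so only the latter leaves enough information to derive an {{Integer}} key type.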

I suggest changing the method signature to:
{code}
        public <K1, K2> ConnectedStreams<IN1, IN2> keyBy(
                        KeySelector<IN1, K1> keySelector1,
                        KeySelector<IN2, K2> keySelector2)
{code}

This would be a source-compatible change. However, it might break state backend compatibility, because the derived key type information would change. In the meantime, there is a workaround: use the overload that takes the key type explicitly:

{code}
        public <KEY> ConnectedStreams<IN1, IN2> keyBy(
                        KeySelector<IN1, KEY> keySelector1,
                        KeySelector<IN2, KEY> keySelector2,
                        TypeInformation<KEY> keyType)
{code}


