[jira] [Created] (FLINK-19039) Parallel Flink Kafka Consumers compete with each other

classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (FLINK-19039) Parallel Flink Kafka Consumers compete with each other

Shang Yuanchun (Jira)
Ayrat Hudaygulov created FLINK-19039:
----------------------------------------

             Summary: Parallel Flink Kafka Consumers compete with each other
                 Key: FLINK-19039
                 URL: https://issues.apache.org/jira/browse/FLINK-19039
             Project: Flink
          Issue Type: Improvement
          Components: Connectors / Kafka
    Affects Versions: 1.11.1
            Reporter: Ayrat Hudaygulov


If I'll run multiple Flink instances with same consumer group id they will not re-balance partitions with each other, but rather each instance take all partitions, effectively not working in parallel at all, and multiplying amount of messages processed.

 

This is because FlinkKafkaConsumer has its own re-balancing mechanism for current parallelism level and then just calls:

`consumerTmp.assign(newPartitionAssignments){color:#cc7832};{color}`

 

[https://github.com/apache/flink/blob/59714b9d6addb1dbf2171cab937a0e3fec52f2b1/flink-connectors/flink-connector-kafka-0.10/src/main/java/org/apache/flink/streaming/connectors/kafka/internal/KafkaConsumerThread.java#L422]

 

I suppose there has to be a way to fallback to default kafka mechanism of re-balancing to respect consumer group id, but it's not presented in Flink at all.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)