[jira] [Created] (FLINK-17691) Failed to sinkKafka with exactly once mode when transactional.id too long


Shang Yuanchun (Jira)
freezhan created FLINK-17691:
--------------------------------

             Summary: Failed to sinkKafka with exactly once mode when transactional.id too long
                 Key: FLINK-17691
                 URL: https://issues.apache.org/jira/browse/FLINK-17691
             Project: Flink
          Issue Type: Bug
          Components: Connectors / Kafka
    Affects Versions: 1.10.1, 1.10.0
            Reporter: freezhan
         Attachments: image-2020-05-14-20-43-57-414.png, image-2020-05-14-20-45-24-030.png, image-2020-05-14-20-45-59-878.png, image-2020-05-14-21-09-01-906.png, image-2020-05-14-21-16-43-810.png, image-2020-05-14-21-17-09-784.png

When sinking to Kafka in {color:#FF0000}Semantic.EXACTLY_ONCE{color} mode, the Flink Kafka connector producer sets the {color:#FF0000}transactional.id{color} automatically, and any user-defined value is ignored.

 

When the job's operator name is too long, sending fails because the generated transactional.id exceeds the Kafka {color:#FF0000}coordinator_key{color} limit:

!image-2020-05-14-21-09-01-906.png!

 

*The Flink Kafka connector generates the transactional.id automatically as follows:*

 

1. It uses {color:#FF0000}taskName + "-" + operatorUniqueID{color} as the transactional.id prefix (which may be too long):

  getRuntimeContext().getTaskName() + "-" + ((StreamingRuntimeContext) getRuntimeContext()).getOperatorUniqueID()

2. The range of available transactional ids is:

[nextFreeTransactionalId, nextFreeTransactionalId + parallelism * kafkaProducersPoolSize)

!image-2020-05-14-20-43-57-414.png!

  !image-2020-05-14-20-45-24-030.png!

!image-2020-05-14-20-45-59-878.png!
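
The generation policy described above can be sketched roughly as follows. This is an illustration only, not the connector's actual code: the class and method names are hypothetical, and joining the prefix and the numeric id with "-" is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch of the id generation described in this issue.
class TransactionalIdSketch {

    // Prefix as described above: taskName + "-" + operatorUniqueID.
    static String prefix(String taskName, String operatorUniqueId) {
        return taskName + "-" + operatorUniqueId;
    }

    // Builds one id per number in the half-open range
    // [nextFreeTransactionalId, nextFreeTransactionalId + parallelism * poolSize).
    // Appending "-" + number to the prefix is an assumption made for illustration.
    static Set<String> generateIds(String prefix,
                                   long nextFreeTransactionalId,
                                   int parallelism,
                                   int poolSize) {
        Set<String> ids = new LinkedHashSet<>();
        long end = nextFreeTransactionalId + (long) parallelism * poolSize;
        for (long id = nextFreeTransactionalId; id < end; id++) {
            ids.add(prefix + "-" + id);
        }
        return ids;
    }
}
```

Because every id carries the full prefix, a very long task name makes every generated transactional.id equally long.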

 

*Kafka checks the transactional.id as follows:*

 

{color:#FF0000}The string's byte length cannot be larger than Short.MAX_VALUE (32767).{color}

!image-2020-05-14-21-16-43-810.png!

!image-2020-05-14-21-17-09-784.png!
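
A minimal sketch of an equivalent client-side check is shown below. The class and method names are hypothetical, and measuring the length in UTF-8 bytes is an assumption about how the id is encoded; it is not Kafka's own validation code.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-flight check mirroring the limit described above.
class TransactionalIdLengthCheck {

    // Returns true when the id's encoded length is at most Short.MAX_VALUE (32767) bytes.
    // UTF-8 encoding is assumed here.
    static boolean fitsKafkaLimit(String transactionalId) {
        int byteLength = transactionalId.getBytes(StandardCharsets.UTF_8).length;
        return byteLength <= Short.MAX_VALUE;
    }
}
```

A check like this could be run against the generated prefix before the job starts, so the failure shows up early instead of at send time.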

 

*To reproduce this bug, the following conditions must be met:*

 
 # Send messages to Kafka in exactly-once mode.
 # The generated transactional.id prefix (taskName + "-" + operatorUniqueID) is longer than 32767 bytes (this can happen with a very long SQL or window statement).

*I suggest a solution:*

 

     1. Allow users to customize the transactional.id prefix

or

     2. Apply MD5 to the prefix before returning the real transactional.id
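
The second option could look roughly like the sketch below. The class and method names are hypothetical and this is not a proposed patch. It only shows that hashing the prefix yields a fixed-length (32 hex characters) value regardless of how long the task name is.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: replaces a long prefix with its MD5 hex digest.
class HashedPrefixSketch {

    static String md5Hex(String prefix) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(prefix.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                // Two lowercase hex characters per byte.
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform, so this is not expected.
            throw new IllegalStateException("MD5 not available", e);
        }
    }
}
```

One trade-off of hashing is that the resulting transactional.id is no longer human-readable, which makes it harder to map back to a task when debugging on the Kafka side. A user-supplied prefix (option 1) avoids that.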

 

 

 


