Gary Yao created FLINK-14843:
--------------------------------

             Summary: Streaming bucketing end-to-end test can fail with Output hash mismatch
                 Key: FLINK-14843
                 URL: https://issues.apache.org/jira/browse/FLINK-14843
             Project: Flink
          Issue Type: Bug
          Components: Connectors / FileSystem, Tests
    Affects Versions: 1.10.0
         Environment: rev: dcc1330375826b779e4902176bb2473704dabb11
            Reporter: Gary Yao


*Description*

Streaming bucketing end-to-end test ({{test_streaming_bucketing.sh}}) can fail with Output hash mismatch.

{noformat}
Number of running task managers has reached 4.
Job (67212178694f8b2a9bc9d9572567a53f) is running.
Waiting until all values have been produced
Truncating buckets
Number of produced values 26325/60000
Truncating buckets
Number of produced values 31315/60000
Truncating buckets
Number of produced values 36735/60000
Truncating buckets
Number of produced values 40705/60000
Truncating buckets
Number of produced values 46125/60000
Truncating buckets
Number of produced values 51135/60000
Truncating buckets
Number of produced values 56555/60000
Truncating buckets
Number of produced values 61935/60000
Cancelling job 67212178694f8b2a9bc9d9572567a53f.
Cancelled job 67212178694f8b2a9bc9d9572567a53f.
Waiting for job (67212178694f8b2a9bc9d9572567a53f) to reach terminal state CANCELED ...
Job (67212178694f8b2a9bc9d9572567a53f) reached terminal state CANCELED
Job 67212178694f8b2a9bc9d9572567a53f was cancelled, time to verify
FAIL Bucketing Sink: Output hash mismatch.  Got 4e2d1859e41184a38e5bc95090fe9941, expected 01aba5ff77a0ef5e5cf6a727c248bdc3.
head hexdump of actual:
0000000   (   2   ,   1   0   ,   0   ,   S   o   m   e       p   a   y
0000010   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   1
0000020   ,   S   o   m   e       p   a   y   l   o   a   d   .   .   .
0000030   )  \n   (   2   ,   1   0   ,   2   ,   S   o   m   e       p
0000040   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,   1   0
0000050   ,   3   ,   S   o   m   e       p   a   y   l   o   a   d   .
0000060   .   .   )  \n   (   2   ,   1   0   ,   4   ,   S   o   m   e
0000070       p   a   y   l   o   a   d   .   .   .   )  \n   (   2   ,
0000080   1   0   ,   5   ,   S   o   m   e       p   a   y   l   o   a
0000090   d   .   .   .   )  \n   (   2   ,   1   0   ,   6   ,   S   o
00000a0   m   e       p   a   y   l   o   a   d   .   .   .   )  \n   (
00000b0   2   ,   1   0   ,   7   ,   S   o   m   e       p   a   y   l
00000c0   o   a   d   .   .   .   )  \n   (   2   ,   1   0   ,   8   ,
00000d0   S   o   m   e       p   a   y   l   o   a   d   .   .   .   )
00000e0  \n   (   2   ,   1   0   ,   9   ,   S   o   m   e       p   a
00000f0   y   l   o   a   d   .   .   .   )  \n
00000fa
Stopping taskexecutor daemon (pid: 654547) on host gyao-desktop.
Stopping standalonesession daemon (pid: 650368) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 650812) on host gyao-desktop.
Skipping taskexecutor daemon (pid: 651347), because it is not running anymore on gyao-desktop.
Skipping taskexecutor daemon (pid: 651795), because it is not running anymore on gyao-desktop.
Skipping taskexecutor daemon (pid: 652249), because it is not running anymore on gyao-desktop.
Stopping taskexecutor daemon (pid: 653481) on host gyao-desktop.
Stopping taskexecutor daemon (pid: 654099) on host gyao-desktop.
[FAIL] Test script contains errors.
Checking of logs skipped.

[FAIL] 'flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh' failed after 2 minutes and 3 seconds! Test exited with exit code 1
{noformat}

*How to reproduce*

Comment out the delay of 10s after the 1st TM is restarted to provoke the issue:

{code:bash}
echo "Restarting 1 TM"
$FLINK_DIR/bin/taskmanager.sh start
wait_for_number_of_running_tms 4

#sleep 10

echo "Killing 2 TMs"
kill_random_taskmanager
kill_random_taskmanager
wait_for_number_of_running_tms 2
{code}

Command to run the test:

{noformat}
FLINK_DIR=build-target/ flink-end-to-end-tests/run-single-test.sh skip flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh
{noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
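The report says the test "can fail", so a single run of the reproduction command may still pass even with the delay commented out. A minimal sketch for repeating the run until the first failure follows. The helper name {{repeat_until_failure}} and the run count are hypothetical and not part of the Flink test scripts.

```bash
# Hypothetical helper (not part of Flink): run a command up to max_runs
# times and stop at the first failing run.
# Returns 1 if a failure was observed, 0 if every run passed.
repeat_until_failure() {
  max_runs=$1
  shift
  i=1
  while [ "$i" -le "$max_runs" ]; do
    if ! "$@"; then
      echo "Run $i of $max_runs failed"
      return 1
    fi
    i=$((i + 1))
  done
  echo "All $max_runs runs passed"
  return 0
}

# Example invocation with the command from the report (10 is an arbitrary choice):
#   export FLINK_DIR=build-target/
#   repeat_until_failure 10 flink-end-to-end-tests/run-single-test.sh skip \
#     flink-end-to-end-tests/test-scripts/test_streaming_bucketing.sh
```

Stopping at the first failure leaves the output of the failing run as the last thing on the terminal.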