Edward Rojas created FLINK-11337:
------------------------------------
Summary: Incorrect watermark in StreamingFileSink BucketAssigner.Context when used in connected stream
Key: FLINK-11337
URL: https://issues.apache.org/jira/browse/FLINK-11337
Project: Flink
Issue Type: Bug
Components: filesystem-connector
Affects Versions: 1.7.0
Reporter: Edward Rojas
When StreamingFileSink is used as the sink of a connected stream, the "invoke" method of the sink can be called before the "combinedWatermark" is updated with the timestamp of the element currently being processed. As a result, the context contains an incorrect watermark value (Long.MIN_VALUE for the first events in the stream when using "AssignerWithPeriodicWatermarks").
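To illustrate the behaviour described above, here is a simplified plain-Java model, not Flink's actual operator code. It assumes that a two-input operator keeps one watermark per input, both starting at Long.MIN_VALUE, and exposes their minimum as the combined watermark. The class and method names (CombinedWatermarkModel, onWatermark1, onWatermark2) are hypothetical.

```java
public class CombinedWatermarkModel {
    // Assumption: each input starts without a watermark, modelled as Long.MIN_VALUE.
    private long input1Watermark = Long.MIN_VALUE;
    private long input2Watermark = Long.MIN_VALUE;

    // Watermark received on the first input (e.g. the broadcast side).
    public void onWatermark1(long watermark) {
        input1Watermark = Math.max(input1Watermark, watermark);
    }

    // Watermark received on the second input (e.g. the data side).
    public void onWatermark2(long watermark) {
        input2Watermark = Math.max(input2Watermark, watermark);
    }

    // The value a sink context would observe in this model: the slower input wins.
    public long combinedWatermark() {
        return Math.min(input1Watermark, input2Watermark);
    }

    public static void main(String[] args) {
        CombinedWatermarkModel operator = new CombinedWatermarkModel();

        // The broadcast side always reports the maximum watermark.
        operator.onWatermark1(Long.MAX_VALUE);

        // An element arriving now, before the data side has emitted any
        // periodic watermark, still observes Long.MIN_VALUE.
        System.out.println("first element sees: " + operator.combinedWatermark());

        // Once the data side emits a watermark, the combined value advances.
        operator.onWatermark2(1_000L);
        System.out.println("later element sees: " + operator.combinedWatermark());
    }
}
```

In this model the first elements are handled while the data side has not yet emitted a watermark, which matches the Long.MIN_VALUE observed in the sink context.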
I reproduced this using a broadcast stream connected to a data stream. The broadcast stream uses a custom timestamp extractor that always returns Watermark.MAX_VALUE, as is done in a training example here:
https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataartisans/flinktraining/solutions/datastream_java/broadcast/OngoingRidesSolution.java#L143
This is problematic because the watermark cannot be used reliably to compute the bucket id based on event time.
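One possible way to guard against this in a custom bucket assigner is sketched below in plain Java, with no Flink dependency. It is an illustration under stated assumptions, not a confirmed fix: the class EventTimeBucketSketch and its methods are hypothetical, the two arguments stand in for the values a BucketAssigner.Context would provide via currentWatermark() and timestamp(), and the hourly UTC bucket format is chosen only for the example.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class EventTimeBucketSketch {
    // Assumed bucket layout for this example: one bucket per hour, in UTC.
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd--HH").withZone(ZoneOffset.UTC);

    // Picks the time used for bucketing. When the watermark has not been
    // initialised yet (Long.MIN_VALUE), fall back to the element timestamp.
    public static long chooseBucketTime(long currentWatermark, Long elementTimestamp) {
        if (currentWatermark != Long.MIN_VALUE) {
            return currentWatermark;
        }
        if (elementTimestamp != null) {
            return elementTimestamp;
        }
        throw new IllegalStateException("Neither a watermark nor an element timestamp is available");
    }

    // Formats epoch milliseconds as an hourly bucket id.
    public static String bucketId(long epochMillis) {
        return HOURLY.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        // Watermark not yet set: the element timestamp (01:00 UTC) decides the bucket.
        System.out.println(bucketId(chooseBucketTime(Long.MIN_VALUE, 3_600_000L)));
        // Watermark available (02:00 UTC): it is used directly.
        System.out.println(bucketId(chooseBucketTime(7_200_000L, 3_600_000L)));
    }
}
```

The fallback only avoids the uninitialised value; it does not change when the watermark itself is updated.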
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)