Manas Kale created FLINK-16039:
----------------------------------
Summary: Add API method to get last element in session window
Key: FLINK-16039
URL: https://issues.apache.org/jira/browse/FLINK-16039
Project: Flink
Issue Type: Improvement
Components: API / DataStream
Affects Versions: 1.10.0
Reporter: Manas Kale
Consider the events:
[1, event], [2, event]
where the first element is the event timestamp in seconds and the second element is the event code/name.
Also assume that an event-time session window with inactivityGap = 2 seconds is applied to this stream.
When the first event arrives, a session window [1,1] should be created.
When the second event arrives, a new session window [2,2] should be created. Since timestamp 2 falls within firstWindowTimestamp + inactivityGap = 3, [2,2] should be merged into the existing window to form the session window [1,2], and [2,2] should be deleted.
This is my understanding of how session windows are created. *Please correct me if I am wrong.*
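The merge behaviour described above can be sketched in plain Java, with no Flink dependency. NaiveSessionizer and Session are hypothetical names introduced only for illustration. The sketch assumes timestamps arrive in ascending order and treats an event exactly inactivityGap after the previous one as part of the same session; both are assumptions, not something stated in this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not a Flink class: models the session semantics described above. */
final class NaiveSessionizer {

    /** Closed interval [start, end] in seconds, where end is the last event's timestamp. */
    static final class Session {
        final long start;
        final long end;

        Session(long start, long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    /** Builds sessions from ascending timestamps; gap is the inactivity gap in seconds. */
    static List<Session> sessionize(long[] timestamps, long gap) {
        List<Session> sessions = new ArrayList<>();
        for (long ts : timestamps) {
            // Every event first gets its own window [ts, ts].
            Session candidate = new Session(ts, ts);
            Session last = sessions.isEmpty() ? null : sessions.get(sessions.size() - 1);
            if (last != null && candidate.start <= last.end + gap) {
                // Within the gap: merge into the open session and drop the candidate.
                sessions.set(sessions.size() - 1, new Session(last.start, candidate.end));
            } else {
                sessions.add(candidate);
            }
        }
        return sessions;
    }

    public static void main(String[] args) {
        System.out.println(sessionize(new long[] {1, 2}, 2)); // prints [[1,2]]
    }
}
```

With the two events from the example, the result is the single session [1,2]; a later event at, say, 10 seconds would open a second session [10,10].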
However, Flink does not follow this definition. If I call the getEnd() method of the TimeWindow class, I get back _timestamp + inactivityGap_.
For the above example, after processing the first element, I would get 1 + 2 = 3 seconds as the window "end".
The actual window end should be 1, the timestamp of the last event in the session window.
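The arithmetic can be made concrete with a small model in plain Java. It assumes, as I understand the current behaviour, that each event is assigned a half-open window [timestamp, timestamp + inactivityGap) and that merging two windows takes the smallest start and the largest end. SessionWindowEnd and its methods are hypothetical names, and seconds are used to match the example.

```java
/** Hypothetical model, not Flink code: windows are long[] {start, end}, half-open, in seconds. */
final class SessionWindowEnd {

    /** Window assigned to a single event: [ts, ts + gap). */
    static long[] windowFor(long ts, long gap) {
        return new long[] {ts, ts + gap};
    }

    /** Smallest window covering both inputs, used when two session windows merge. */
    static long[] cover(long[] a, long[] b) {
        return new long[] {Math.min(a[0], b[0]), Math.max(a[1], b[1])};
    }

    /** With a fixed gap, the last event's timestamp is the window end minus the gap. */
    static long lastEventTimestamp(long[] window, long gap) {
        return window[1] - gap;
    }

    public static void main(String[] args) {
        long gap = 2;
        long[] first = windowFor(1, gap);
        long[] merged = cover(first, windowFor(2, gap));
        System.out.println(first[1]);                        // prints 3
        System.out.println(merged[1]);                       // prints 4
        System.out.println(lastEventTimestamp(merged, gap)); // prints 2
    }
}
```

Under these assumptions the last event's timestamp can be recovered as end minus inactivityGap, but only when the gap is fixed and known to the caller, which is part of why a dedicated method would be more convenient.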
A solution would be to change the "end" definition of all windows, but I suppose this would be a breaking change and would need some debate.
Therefore, I propose an interim solution: add a new API method that keeps track of the last element added to the session window.
If there is agreement on this, I would like to start drafting a change document and implementing it.
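As a rough illustration of what "keeping track of the last element" could involve, here is a minimal sketch in plain Java. LastElementTracker is a hypothetical name and is not a proposed Flink signature; it only shows the bookkeeping. It tracks the largest timestamp seen and supports merging two trackers, since session windows themselves are merged.

```java
/** Hypothetical accumulator, not part of Flink: remembers the latest event timestamp seen. */
final class LastElementTracker {

    private long lastTimestamp = Long.MIN_VALUE; // sentinel meaning "no element yet"

    /** Records one element's timestamp; out-of-order arrivals do not move the value backwards. */
    void add(long timestamp) {
        lastTimestamp = Math.max(lastTimestamp, timestamp);
    }

    /** Combines this tracker with another, as needed when two session windows merge. */
    LastElementTracker merge(LastElementTracker other) {
        LastElementTracker result = new LastElementTracker();
        result.lastTimestamp = Math.max(this.lastTimestamp, other.lastTimestamp);
        return result;
    }

    boolean isEmpty() {
        return lastTimestamp == Long.MIN_VALUE;
    }

    /** Timestamp of the last element; throws if nothing has been added. */
    long getLastTimestamp() {
        if (isEmpty()) {
            throw new IllegalStateException("no element has been added");
        }
        return lastTimestamp;
    }

    public static void main(String[] args) {
        LastElementTracker a = new LastElementTracker();
        a.add(1);
        LastElementTracker b = new LastElementTracker();
        b.add(2);
        System.out.println(a.merge(b).getLastTimestamp()); // prints 2
    }
}
```

For the example stream, the tracker for the merged session reports 2, the timestamp of the last event, independent of the inactivity gap.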
--
This message was sent by Atlassian Jira
(v8.3.4#803005)