The contradiction between event time and natural time from EventTimeTrigger


邵志鹏
Dear Flink users,


Reading EventTimeTrigger.java, I see a contradiction between event time and natural (wall-clock) time: the window that is open when the stream is interrupted, and the last window at the end of the stream, cannot be output in "real time".


From the default EventTimeTrigger source code, I found that only the onElement method (which checks the watermark) and the onEventTime method can return TriggerResult.FIRE.
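A minimal, dependency-free Java sketch of the trigger logic described above. It is a simplified model for one window, not the Flink source: the names EventTimeTriggerSketch and SketchTriggerResult are hypothetical, and the watermark is passed in as a plain long instead of being read from a trigger context.

```java
// Simplified stand-in for Flink's TriggerResult; only the two values used here.
enum SketchTriggerResult { CONTINUE, FIRE }

// Hypothetical model of an event-time trigger for a single window.
final class EventTimeTriggerSketch {
    private final long windowMaxTimestamp;          // last timestamp that belongs to the window
    private long registeredTimer = Long.MIN_VALUE;  // Long.MIN_VALUE means "no timer set"

    EventTimeTriggerSketch(long windowMaxTimestamp) {
        this.windowMaxTimestamp = windowMaxTimestamp;
    }

    // Called for every element assigned to the window.
    SketchTriggerResult onElement(long currentWatermark) {
        if (windowMaxTimestamp <= currentWatermark) {
            // The watermark has already passed the window end: fire immediately.
            return SketchTriggerResult.FIRE;
        }
        // Otherwise remember an event-time timer for the window end and keep waiting.
        registeredTimer = windowMaxTimestamp;
        return SketchTriggerResult.CONTINUE;
    }

    // Called when the watermark passes a registered event-time timer.
    SketchTriggerResult onEventTime(long timerTimestamp) {
        return timerTimestamp == windowMaxTimestamp
                ? SketchTriggerResult.FIRE
                : SketchTriggerResult.CONTINUE;
    }

    // Processing-time callbacks never fire the window in this model.
    SketchTriggerResult onProcessingTime(long wallClockMillis) {
        return SketchTriggerResult.CONTINUE;
    }

    boolean hasTimer() {
        return registeredTimer != Long.MIN_VALUE;
    }
}
```

In this model the only paths to FIRE are an element that arrives after the watermark has passed the window end, or an event-time timer callback, which itself depends on the watermark advancing.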


Therefore, the default EventTimeTrigger assumes a data stream that never stops; only then are its results "correct" in real time. If the gap between two event times is larger than the window size, or if no new data arrives before the end time of the window is reached (the stream is interrupted, so neither onElement nor onEventTime is called), then the output of the latest window is delayed and is not real-time. In other words, if I use event time for a bounded window aggregation on a stream, I depend on data from the near future: once the stream is interrupted, the result is no longer real-time, and I must wait until data flows again.


The window result that was not output in time is only emitted once new data arrives.
(onProcessingTime is never called once event time is used, so modifying onProcessingTime has no effect. Its Javadoc says: "Called when a processing-time timer that was set using the trigger context fires.")


So, to get a "real-time" TriggerResult, I see only these options:
1. Use processing time.
2. Guarantee that the event-time gaps in the stream are small, that events ideally arrive in order, and that the stream is never interrupted. (If this can be guaranteed, one might as well use processing time directly. What, then, is the point of using event time and watermarks in production, and how can the timeliness and accuracy of the results be tested? I am sorry, I got confused by some Flink streaming SQL examples about time windows.)
3. Add a new implementation or improvement: the trigger fires as soon as natural time reaches the window end time determined by assignWindows, or reaches that time plus the watermark interval.
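Option 3 could be sketched roughly as follows. This is a dependency-free Java model, not a Flink Trigger: the class WallClockFallbackTriggerSketch and its methods are hypothetical, and it assumes that event timestamps are epoch milliseconds comparable to the wall clock. In Flink itself, a custom Trigger would presumably register both an event-time timer and a processing-time timer through the trigger context to get a similar effect.

```java
// Hypothetical model of a trigger for one window that fires on whichever comes
// first: the watermark passing the window end, or the wall clock passing the
// window end plus a grace period.
final class WallClockFallbackTriggerSketch {
    private final long windowMaxTimestamp; // last event timestamp inside the window
    private final long wallClockDeadline;  // wall-clock time at which we stop waiting
    private boolean fired;

    // windowEnd is exclusive; graceMillis is the extra wall-clock wait (an assumption).
    WallClockFallbackTriggerSketch(long windowEnd, long graceMillis) {
        this.windowMaxTimestamp = windowEnd - 1;
        this.wallClockDeadline = windowEnd + graceMillis;
    }

    // Called whenever the watermark advances; returns true if the window fires now.
    boolean onWatermark(long watermark) {
        return fireOnce(watermark >= windowMaxTimestamp);
    }

    // Called periodically with the current wall-clock time; returns true if the window fires now.
    boolean onWallClock(long nowMillis) {
        return fireOnce(nowMillis >= wallClockDeadline);
    }

    // Ensures the window result is emitted at most once.
    private boolean fireOnce(boolean conditionMet) {
        if (fired || !conditionMet) {
            return false;
        }
        fired = true;
        return true;
    }
}
```

The trade-off in this sketch is that a wall-clock firing may happen before all elements of the window have arrived, so timeliness is bought at the cost of completeness.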


This concerns real-time results regardless of the window type (Tumble, Hop, or Session), that is, the moment at which TriggerResult.FIRE is returned...
The watermark has advanced, but there is no more data, and then the window trigger stalls...


I don't know whether my understanding is correct; any pointers would be appreciated.


Thanks.



Re: The contradiction between event time and natural time from EventTimeTrigger

Rong Rong
Hi Zhipeng,

Please see my explanation below:

> From the default EventTimeTrigger source code, I found that only the
> onElement method (which checks the watermark) and the onEventTime method
> can return TriggerResult.FIRE.
> Therefore, the default EventTimeTrigger assumes a data stream that never
> stops; only then are its results "correct" in real time. If the gap
> between two event times is larger than the window size, or if no new data
> arrives before the end time of the window is reached (the stream is
> interrupted, so neither onElement nor onEventTime is called), then the
> output of the latest window is delayed and is not real-time. In other
> words, if I use event time for a bounded window aggregation on a stream, I
> depend on data from the near future: once the stream is interrupted, the
> result is no longer real-time, and I must wait until data flows again.


Yes and no:
1. If no element falls into a specific window at all, the window is never
created, so there is nothing to fire.
2. When the first element arrives in a specific window (assuming no late
arrival), the window is created along with an event-time timer. So even if
no further elements arrive, the existing window (with at least one element)
will still fire promptly.
3. However, since this is an event-time trigger, the watermark has to
advance before the internal timer service can "activate" the registered
timer.
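The three points above can be illustrated with a small simulation. This is a sketch in plain Java under simplifying assumptions (tumbling windows, no late elements, one timer per window); EventTimeWindowSimulation and its methods are hypothetical names and not part of Flink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical simulation of tumbling event-time windows driven by a watermark.
final class EventTimeWindowSimulation {
    private final long windowSize;
    // Window start -> element count. A window exists only once it holds an element.
    private final TreeMap<Long, Integer> openWindows = new TreeMap<>();
    private long watermark = Long.MIN_VALUE;

    EventTimeWindowSimulation(long windowSize) {
        this.windowSize = windowSize;
    }

    // Assigns the element to its tumbling window, creating the window if needed.
    void onElement(long timestamp) {
        long start = timestamp - Math.floorMod(timestamp, windowSize);
        openWindows.merge(start, 1, Integer::sum);
    }

    // Advances the watermark and returns the start timestamps of the windows that fire.
    List<Long> advanceWatermark(long newWatermark) {
        watermark = Math.max(watermark, newWatermark); // watermarks never move backwards
        List<Long> fired = new ArrayList<>();
        // A window fires once the watermark reaches its last timestamp (start + size - 1).
        while (!openWindows.isEmpty()
                && openWindows.firstKey() + windowSize - 1 <= watermark) {
            fired.add(openWindows.pollFirstEntry().getKey());
        }
        return fired;
    }

    int openWindowCount() {
        return openWindows.size();
    }
}
```

With a window size of 1000, an element at timestamp 100 opens the window starting at 0. That window stays open while the watermark is below 999 and fires as soon as the watermark reaches 999, whether or not further elements arrive. Ranges that never received an element produce no window and therefore no output.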

Regarding point #3: I could be wrong on this, but if the source function
does not advance the watermark at all unless an element is received from
the external data source, then yes, the window will probably be stuck.


> 1. Use processing time.
> 2. Guarantee that the event-time gaps in the stream are small, that events
> ideally arrive in order, and that the stream is never interrupted. (If
> this can be guaranteed, one might as well use processing time directly.
> What, then, is the point of using event time and watermarks in
> production, and how can the timeliness and accuracy of the results be
> tested? I am sorry, I got confused by some Flink streaming SQL examples
> about time windows.)
> 3. Add a new implementation or improvement: the trigger fires as soon as
> natural time reaches the window end time determined by assignWindows, or
> reaches that time plus the watermark interval.


Regarding:
#1: The problem you described does not occur with processing time, because
nothing prevents the internal timer from advancing for processing-time
triggers/timers: they use the system time, which always advances.
#2/#3: These are not needed, as long as you guarantee that the watermark
advances promptly.
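One way to make the watermark advance even when no elements arrive is to let it follow the wall clock once the source has been idle for a while. Below is a minimal sketch in plain Java, assuming event timestamps are epoch milliseconds; IdleAwareWatermarkSketch, maxOutOfOrderness, and idleTimeout are hypothetical names. In Flink, logic like this would typically sit in a periodic watermark assigner that is polled at a fixed interval.

```java
// Hypothetical watermark generator that keeps advancing while the source is idle.
final class IdleAwareWatermarkSketch {
    private final long maxOutOfOrderness; // tolerated event-time disorder, in ms
    private final long idleTimeout;       // wall-clock silence before the source counts as idle, in ms
    private long maxEventTimestamp = Long.MIN_VALUE;
    private long lastEventWallClock;
    private long lastEmitted = Long.MIN_VALUE;
    private boolean seenEvent;

    IdleAwareWatermarkSketch(long maxOutOfOrderness, long idleTimeout) {
        this.maxOutOfOrderness = maxOutOfOrderness;
        this.idleTimeout = idleTimeout;
    }

    // Records an event's timestamp and the wall-clock time at which it arrived.
    void onEvent(long eventTimestamp, long nowMillis) {
        maxEventTimestamp = Math.max(maxEventTimestamp, eventTimestamp);
        lastEventWallClock = nowMillis;
        seenEvent = true;
    }

    // Polled periodically; returns the current watermark, which never decreases.
    long currentWatermark(long nowMillis) {
        if (!seenEvent) {
            return Long.MIN_VALUE; // nothing known about event time yet
        }
        long candidate = maxEventTimestamp - maxOutOfOrderness;
        long idleFor = nowMillis - lastEventWallClock;
        if (idleFor > idleTimeout) {
            // Source is idle: move the watermark forward by the extra idle time.
            candidate += idleFor - idleTimeout;
        }
        lastEmitted = Math.max(lastEmitted, candidate);
        return lastEmitted;
    }
}
```

The price of this approach is that elements arriving after an idle period may fall behind the already-advanced watermark and be treated as late, so idleTimeout should be chosen with the expected gaps in the stream in mind.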

I am not sure my explanation is the most accurate one, so if anyone has
more insight, please share your thoughts :-)

Thanks,
Rong




On Sat, Apr 27, 2019 at 10:19 PM 邵志鹏 <[hidden email]> wrote:
