(DEPRECATED) Apache Flink Mailing List archive.

[DISCUSS] FLIP-9: Trigger DSL

Classic

List

Threaded

16 messages Options

Kostas Kloudas

[DISCUSS] FLIP-9: Trigger DSL

Hi all!

I've created a FLIP for the trigger DSL. This is the triggers
that we want Apache Flink to support out-of-the-box. This proposal
builds on various discussions on the mailing list and aims at
serving as a base for further ones.

https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL <https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL>

FLIP-9 provides a description of the triggers Flink already offers,
the new that we think should be added, how the APIs could look like,
some discussion on the implementation implications and some ideas
on how to implement them.

There is also a shared document giving a bit more insight on the implementation
implications. Feel free to read but please keep the discussion in the mailing list.

https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing <https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing>

I would like to start working on an the implementation next week.

Let the discussion begin!

Kostas

Ufuk Celebi-2

Re: [DISCUSS] FLIP-9: Trigger DSL

Hey Kostas! Thanks for sharing the documents. I think it makes sense
to merge the two documents by moving the Google doc contents to the
Wiki. I think they form one unit.

On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
<[hidden email]> wrote:

> Hi all!
>
> I've created a FLIP for the trigger DSL. This is the triggers
> that we want Apache Flink to support out-of-the-box. This proposal
> builds on various discussions on the mailing list and aims at
> serving as a base for further ones.
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL <https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL>
>
> FLIP-9 provides a description of the triggers Flink already offers,
> the new that we think should be added, how the APIs could look like,
> some discussion on the implementation implications and some ideas
> on how to implement them.
>
> There is also a shared document giving a bit more insight on the implementation
> implications. Feel free to read but please keep the discussion in the mailing list.
>
> https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing <https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing>
>
> I would like to start working on an the implementation next week.
>
> Let the discussion begin!
>
> Kostas
>
>

Kostas Kloudas

Re: [DISCUSS] FLIP-9: Trigger DSL

Thanks for the feedback Ufuk!
I will do that.

> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <[hidden email]> wrote:
>
> Hey Kostas! Thanks for sharing the documents. I think it makes sense
> to merge the two documents by moving the Google doc contents to the
> Wiki. I think they form one unit.
>
> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
> <[hidden email]> wrote:
>> Hi all!
>>
>> I've created a FLIP for the trigger DSL. This is the triggers
>> that we want Apache Flink to support out-of-the-box. This proposal
>> builds on various discussions on the mailing list and aims at
>> serving as a base for further ones.
>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL <https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL>
>>
>> FLIP-9 provides a description of the triggers Flink already offers,
>> the new that we think should be added, how the APIs could look like,
>> some discussion on the implementation implications and some ideas
>> on how to implement them.
>>
>> There is also a shared document giving a bit more insight on the implementation
>> implications. Feel free to read but please keep the discussion in the mailing list.
>>
>> https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing <https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing>
>>
>> I would like to start working on an the implementation next week.
>>
>> Let the discussion begin!
>>
>> Kostas
>>
>>

Till Rohrmann

Re: [DISCUSS] FLIP-9: Trigger DSL

Thanks Aljioscha

In fact the onFire() method proposed in the JIRA is also included in the
FLIP. The onCleanup() I agree that it would be a nice addition as it makes
the API more complete. Now we have an onMerge(), an onFire() and
onCleanup() which allow a trigger to react to every milestone in the
lifespan of a window.

Kostas

> On Aug 17, 2016, at 5:31 PM, Aljoscha Krettek <[hidden email]> wrote:
>
> Hi,
> I opened this Jira which should help in implementing the Trigger DSL but is
> also independent in that it just enhances the range of things that can be
> done with a Trigger:
> https://issues.apache.org/jira/browse/FLINK-4415
>
> Cheers,
> Aljoscha
>
> On Wed, 17 Aug 2016 at 14:38 Jark Wu <[hidden email]> wrote:
>
>> Hi Aljoscha, Kostas, thanks for your detailed explanation. It makes sense.
>>
>> According to the discarding and accumulating, the FLIP says “the mode of
>> parent trigger overwrites that of its children”. That means Trigger decide
>> whether to discard window contents after firing , right ? But I find the
>> origin google doc[1] proposed the Trigger only decide whether to fire or
>> not while the purging behavior is determined by a setting on
>> WindowedStream. Such as :
>>
>> datastream.keyBy(0)
>> .window(windowAssigner)
>> .trigger(compositeTrigger)
>> .accumulating()
>>
>>
>> [1]
>> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u
>> <
>> https://docs.google.com/document/d/1Xp-YBf87vLTduYSivgqWVEMjYUmkA-hyb4muX3KRl08/edit#heading=h.e40hqtu6za6u
>>>
>>
>> - Jark Wu
>>
>>> 在 2016年8月17日，下午6:12，Aljoscha Krettek <[hidden email]> 写道：
>>>
>>> Hi,
>>> I think that would blow up state since there can be several triggers that
>>> need this kind of state, Any and All come to mind, possibly. If each of
>>> those keeps state that's at least a byte per trigger. If the finished
>> state
>>> were kept centrally by the TriggerRunner it would just be one byte for
>>> everything, in most cases.
>>>
>>> As I said, in some cases keeping that extra bit can be avoided. For
>>> example, if you have Repeat.forever(Some.trigger()) you know that the
>>> finished bit will always be false and so you don't keep any state in the
>>> TriggerRunner. If every trigger manually does that bookkeeping you remove
>>> that possibility while increasing complexity in each Trigger
>> implementation.
>>>
>>> Cheers,
>>> Aljoscha
>>>
>>> On Wed, 17 Aug 2016 at 12:05 Kostas Kloudas <[hidden email]
>>>
>>> wrote:
>>>
>>>> Hi Aljoscha,
>>>>
>>>> On the Repeat.? addition, I think that each trigger will have to have
>>>> its own implementation, e.g. the CountTrigger should just set a dummy
>>>> value in the counter in order to know if it should fire again or not.
>>>>
>>>> In other case, we will have to add more state and this can lead to
>>>> significant
>>>> performance degradation, as in most cases this state has to be checked
>> on
>>>> every element.
>>>>
>>>> Another potential solution, which I am not sure if it covers all cases,
>>>> could
>>>> be to have a State abstraction like CompositeState, apart from the
>>>> Value, List, Reduce, Fold, which can fetch more than one types of state
>>>> with one round trip to the backend. Imagine having the “counter" and the
>>>> “canceled” states in the same entry in the backend and always fetch them
>>>> together. This can lead to zero additional cost for the extra state.
>>>>
>>>> What do you think?
>>>>
>>>> Kostas
>>>>
>>>>> On Aug 17, 2016, at 11:57 AM, Aljoscha Krettek <[hidden email]>
>>>> wrote:
>>>>>
>>>>> Regarding Repeat.forever() and the default being to not repeat. The
>>>> simple
>>>>> reason is that Beam (née Google Dataflow) provides basically the same
>>>> thing
>>>>> with their trigger DSL and that their triggers behave like this. I
>> think
>>>> it
>>>>> would not be beneficial to have the same feature in two systems in that
>>>>> space where the behavior is the opposite. That would make it confusing
>>>> for
>>>>> users.
>>>>>
>>>>> On the implementation side, I think in most cases you need to have a
>> way
>>>> of
>>>>> telling when triggers are finished or not anyways. There could be a
>>>> central
>>>>> component in the TriggerRunner that has a finished bit for every
>> trigger
>>>> in
>>>>> the tree. In most cases this would be a simple byte. Triggers could set
>>>> and
>>>>> query this finished bit. In some cases, where you know that triggers
>> can
>>>>> never finish you could have a dummy implementation of the finished set
>>>> that
>>>>> does not store any state and always returns false when queried.
>>>>>
>>>>> On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <[hidden email]>
>>>> wrote:
>>>>>
>>>>>> Kostas already nicely explained this!
>>>>>>
>>>>>> I just want to give some theoretical background. I see the underlying
>>>> idea
>>>>>> of triggers similar to predicates, i.e.
>>>>>>
>>>>
>> "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)"
>>>>>> translates to a predicate "(E and ET) or WT" (where E is a predicate
>>>> that
>>>>>> is true when we are in early phase, ET is the early trigger and WT is
>>>> the
>>>>>> watermark trigger). The other trigger translates to "(!E and LT) or
>> WT",
>>>>>> i.e. it triggers if we're not early and LT is true or if the watermark
>>>>>> trigger is true. If we combine the two we get:
>>>>>>
>>>>>> ((E and ET) or WT) and ((!E and LT) or WT)
>>>>>>
>>>>>> now we can eliminate the two parts with E and !E because they can
>> never
>>>> be
>>>>>> true and are in an "or":
>>>>>>
>>>>>> WT and WT
>>>>>>
>>>>>> which yield just "WT".
>>>>>>
>>>>>> Hope that makes sense to you.
>>>>>>
>>>>>> Cheers,
>>>>>> Aljoscha
>>>>>>
>>>>>>
>>>>>> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas <
>>>> [hidden email]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hello Jark Wu,
>>>>>>>
>>>>>>> Both of them will work in the new DSL. The idea is that there should
>>>> be no
>>>>>>> restrictions on the combinations one can do.
>>>>>>>
>>>>>>> Coming to what does the early and the late trigger do, the early
>>>> trigger
>>>>>>> will
>>>>>>> be responsible for specifying when the trigger should fire in the
>>>> period
>>>>>>> between
>>>>>>> the beginning of the window and the time when the watermark passes
>> the
>>>> end
>>>>>>> of the window. The late trigger takes over after the watermark passes
>>>> the
>>>>>>> end of
>>>>>>> the window, and specifies when the trigger should fire in the period
>>>>>>> between the
>>>>>>> endOfWindow and endOfWindow + allowedLateness.
>>>>>>>
>>>>>>> So in the case of the:
>>>>>>> All(EventTimeTrigger.afterEndOfWindow()
>>>>>>> .withEarlyTrigger(earlyFiringTrigger),
>>>>>>> EventTimeTrigger.afterEndOfWindow()
>>>>>>> .withLateTrigger(lateFiringTrigger))
>>>>>>>
>>>>>>> The trigger will only fire at the end of the window, as this is the
>>>> only
>>>>>>> time both
>>>>>>> triggers will say FIRE.
>>>>>>>
>>>>>>> Although the above will work, the example that you gave is a nice one
>>>> as
>>>>>>> it
>>>>>>> degenerates to an:
>>>>>>>
>>>>>>> EventTimeTrigger.afterEndOfWindow()
>>>>>>>
>>>>>>> Detecting this and giving the simplest trigger for the job can lead
>> to
>>>>>>> further
>>>>>>> optimizations, as it can for example reduce the amount of state the
>>>>>>> trigger has to keep.
>>>>>>>
>>>>>>> That would actually be a very nice addition to have as in some cases
>> it
>>>>>>> can lead
>>>>>>> to performance improvements.
>>>>>>>
>>>>>>> Thanks for the feedback!
>>>>>>>
>>>>>>> Kostas
>>>>>>>
>>>>>>>> On Aug 17, 2016, at 4:36 AM, Jark Wu <[hidden email]>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> It’s a cool design, I really like it ! I have two questions here.
>>>>>>>>
>>>>>>>> The first is whether do we have the complex composite triggers, i.e.
>>>>>>> nested All and Any. Such as :
>>>>>>>>
>>>>>>>> Any(
>>>>>>>> All(trigger1, trigger2),
>>>>>>>> Any(trigger3, trigger4)
>>>>>>>> )
>>>>>>>>
>>>>>>>> Can the above code work?
>>>>>>>>
>>>>>>>> Another question is ： In composite triggers, what’s the behavior of
>>>>>>> withEarlyTrigger and withLateTrigger ? For example,
>>>>>>>>
>>>>>>>> All(EventTimeTrigger.afterEndOfWindow()
>>>>>>>> .withEarlyTrigger(earlyFiringTrigger),
>>>>>>>> EventTimeTrigger.afterEndOfWindow()
>>>>>>>> .withLateTrigger(lateFiringTrigger))
>>>>>>>>
>>>>>>>> Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both
>>>>>>> work ?
>>>>>>>>
>>>>>>>>
>>>>>>>> - Jark Wu
>>>>>>>>
>>>>>>>>> 在 2016年8月17日，上午12:24，Kostas Kloudas <[hidden email]>
>>>> 写道：
>>>>>>>>>
>>>>>>>>> Hi Aljoscha,
>>>>>>>>>
>>>>>>>>> Thanks for the feedback!
>>>>>>>>>
>>>>>>>>> It is a nice feature to have. The reason it is not included in the
>>>> FLIP
>>>>>>>>> is that I have not seen somebody asking for something similar in
>> the
>>>>>>>>> mailing list.
>>>>>>>>>
>>>>>>>>> A point that I have to add is that it seems (from the user ML) that
>>>>>>>>> most of the times users expect the “Repeated.forever” behavior to
>>>>>>>>> be the default.
>>>>>>>>>
>>>>>>>>> Given this, I would say that we should make this the default and
>>>>>>>>> add something like “Repeat.Once” option which will just let the
>>>> trigger
>>>>>>>>> fire once, e.g. the first time the counter reaches 5 in your
>> example,
>>>>>>>>> and then stop.
>>>>>>>>>
>>>>>>>>> In other case, the trigger specification may become too verbose,
>>>>>>>>> as the user will have to write the “Repeat.forever” for all child
>>>>>>> triggers.
>>>>>>>>>
>>>>>>>>> What do you think?
>>>>>>>>>
>>>>>>>>> Kostas
>>>>>>>>>
>>>>>>>>>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek <
>> [hidden email]>
>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Ah, I just read the document again and noticed that it might be
>> good
>>>>>>> to
>>>>>>>>>> differentiate between repeatable triggers and non-repeating
>>>> triggers.
>>>>>>> I'm
>>>>>>>>>> proposing to make most triggers non-repeating with the addition
>> of a
>>>>>>>>>> trigger that makes other triggers repeatable.
>>>>>>>>>>
>>>>>>>>>> Example Non-Repeating:
>>>>>>>>>> EventTimeTrigger.pastEndOfWindow()
>>>>>>>>>> .withEarlyFiring(CountTrigger.of(5))
>>>>>>>>>>
>>>>>>>>>> this gives me an early firing once I got 5 elements and then an
>>>>>>> on-time
>>>>>>>>>> firing once the watermark passes the end of the window.
>>>>>>>>>>
>>>>>>>>>> Example with Repeating:
>>>>>>>>>> EventTimeTrigger.pastEndOfWindow()
>>>>>>>>>> .withEarlyFiring(Repeated.forever(CountTrigger.of(5)))
>>>>>>>>>>
>>>>>>>>>> this gives me early firings whenever I see 5 new elements plus the
>>>>>>>>>> watermark firing.
>>>>>>>>>>
>>>>>>>>>> What do you think?
>>>>>>>>>>
>>>>>>>>>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas <
>>>>>>> [hidden email]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Till!
>>>>>>>>>>>
>>>>>>>>>>> Kostas
>>>>>>>>>>>
>>>>>>>>>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann <
>> [hidden email]>
>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Cool design doc Klou. It's well described with a lot of
>> details. I
>>>>>>> like
>>>>>>>>>>> it
>>>>>>>>>>>> a lot :-) +1 for implementing the trigger DSL.
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>> Till
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas <
>>>>>>>>>>> [hidden email]
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for the feedback Ufuk!
>>>>>>>>>>>>> I will do that.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <[hidden email]>
>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes
>>>>>>> sense
>>>>>>>>>>>>>> to merge the two documents by moving the Google doc contents
>> to
>>>>>>> the
>>>>>>>>>>>>>> Wiki. I think they form one unit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
>>>>>>>>>>>>>> <[hidden email]> wrote:
>>>>>>>>>>>>>>> Hi all!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've created a FLIP for the trigger DSL. This is the triggers
>>>>>>>>>>>>>>> that we want Apache Flink to support out-of-the-box. This
>>>>>>> proposal
>>>>>>>>>>>>>>> builds on various discussions on the mailing list and aims at
>>>>>>>>>>>>>>> serving as a base for further ones.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL
>>>>>>>>>>>>> <
>>>>>>>
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9:+Trigger+DSL>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FLIP-9 provides a description of the triggers Flink already
>>>>>>> offers,
>>>>>>>>>>>>>>> the new that we think should be added, how the APIs could
>> look
>>>>>>> like,
>>>>>>>>>>>>>>> some discussion on the implementation implications and some
>>>> ideas
>>>>>>>>>>>>>>> on how to implement them.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> There is also a shared document giving a bit more insight on
>>>> the
>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>> implications. Feel free to read but please keep the
>> discussion
>>>>>>> in the
>>>>>>>>>>>>> mailing list.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/
>>>>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing <
>>>>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/
>>>>>>>>>>>>> 1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would like to start working on an the implementation next
>>>> week.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let the discussion begin!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kostas
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>
>>>>
>>
>>