Parallel CEP

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Parallel CEP

dhanuka ranasinghe
Hi All,

Is there way to run CEP function parallel. Currently CEP run only sequentially


.

Cheers,
Dhanuka

--
Nothing Impossible,Creativity is more important than knowledge.
Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

Dian Fu-2
Hi Dhanuka,

In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information.

Regards,
Dian

[1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns

> 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email]> 写道:
>
> Hi All,
>
> Is there way to run CEP function parallel. Currently CEP run only sequentially
>
> <flink-CEP.png>
> .
>
> Cheers,
> Dhanuka
>
> --
> Nothing Impossible,Creativity is more important than knowledge.

Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

dhanuka ranasinghe
Hi Dian,

I tried that but then kafkaproducer only produce to single partition and
only single flink host working while rest not contribute for processing . I
will share the code and screenshot

Cheers
Dhanuka

On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] wrote:

> Hi Dhanuka,
>
> In order to make the CEP operator to run parallel, the input stream should
> be KeyedStream. You can refer [1] for detailed information.
>
> Regards,
> Dian
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>
> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email]>
> 写道:
> >
> > Hi All,
> >
> > Is there way to run CEP function parallel. Currently CEP run only
> sequentially
> >
> > <flink-CEP.png>
> > .
> >
> > Cheers,
> > Dhanuka
> >
> > --
> > Nothing Impossible,Creativity is more important than knowledge.
>
>
Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

Dian Fu-2
Whether using KeyedStream depends on the logic of your job, i.e, whether you are looking for patterns for some partitions, i.e, patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned.

Regards,
Dian

> 在 2019年1月24日,下午12:37,dhanuka ranasinghe <[hidden email]> 写道:
>
> Hi Dian,
>
> I tried that but then kafkaproducer only produce to single partition and only single flink host working while rest not contribute for processing . I will share the code and screenshot
>
> Cheers
> Dhanuka
>
> On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] <mailto:[hidden email]> wrote:
> Hi Dhanuka,
>
> In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information.
>
> Regards,
> Dian
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns <https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns>
>
> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
> >
> > Hi All,
> >
> > Is there way to run CEP function parallel. Currently CEP run only sequentially
> >
> > <flink-CEP.png>
> > .
> >
> > Cheers,
> > Dhanuka
> >
> > --
> > Nothing Impossible,Creativity is more important than knowledge.
>

Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

dhanuka ranasinghe
Hi Dian,

Thanks for the explanation. Please find the screen shot and source code for above mention use case. And in main issue is though I use KeyedStream , parallelism not apply properly.
Only one host is processing messages.

Flink-CEP-Keyby.png

Cheers,
Dhanuka

On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <[hidden email]> wrote:
Whether using KeyedStream depends on the logic of your job, i.e, whether you are looking for patterns for some partitions, i.e, patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned.

Regards,
Dian 

在 2019年1月24日,下午12:37,dhanuka ranasinghe <[hidden email]> 写道:

Hi Dian,

I tried that but then kafkaproducer only produce to single partition and only single flink host working while rest not contribute for processing . I will share the code and screenshot

Cheers 
Dhanuka

On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] wrote:
Hi Dhanuka,

In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information.

Regards,
Dian

[1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns

> 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email]> 写道:
>
> Hi All,
>
> Is there way to run CEP function parallel. Currently CEP run only sequentially
>
> <flink-CEP.png>
> .
>
> Cheers,
> Dhanuka
>
> --
> Nothing Impossible,Creativity is more important than knowledge.




--
Nothing Impossible,Creativity is more important than knowledge.

FlinkCEP.java (13K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

Dian Fu-2
Hi Dhanuka,

Does the KeySelector of Event::getTriggerID generate the same key for all the inputs or only generate very few key values and these key values happen to be hashed to the same downstream operator? You can print the results of Event::getTriggerID to check if it's that case.

Regards,
Dian

> 在 2019年1月24日,下午2:08,dhanuka ranasinghe <[hidden email]> 写道:
>
> Hi Dian,
>
> Thanks for the explanation. Please find the screen shot and source code for above mention use case. And in main issue is though I use KeyedStream , parallelism not apply properly.
> Only one host is processing messages.
>
> <Flink-CEP-Keyby.png>
>
> Cheers,
> Dhanuka
>
> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <[hidden email] <mailto:[hidden email]>> wrote:
> Whether using KeyedStream depends on the logic of your job, i.e, whether you are looking for patterns for some partitions, i.e, patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned.
>
> Regards,
> Dian
>
>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
>>
>> Hi Dian,
>>
>> I tried that but then kafkaproducer only produce to single partition and only single flink host working while rest not contribute for processing . I will share the code and screenshot
>>
>> Cheers
>> Dhanuka
>>
>> On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] <mailto:[hidden email]> wrote:
>> Hi Dhanuka,
>>
>> In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information.
>>
>> Regards,
>> Dian
>>
>> [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns <https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns>
>>
>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
>> >
>> > Hi All,
>> >
>> > Is there way to run CEP function parallel. Currently CEP run only sequentially
>> >
>> > <flink-CEP.png>
>> > .
>> >
>> > Cheers,
>> > Dhanuka
>> >
>> > --
>> > Nothing Impossible,Creativity is more important than knowledge.
>>
>
>
>
> --
> Nothing Impossible,Creativity is more important than knowledge.
> <FlinkCEP.java>

Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

dhanuka ranasinghe
In this example key will be same. I am using 1 million messages with same
key for performance testing. But still I want to process them parallel.
Can't I use Split function and get a SplitStream for that purpose?

On Thu, Jan 24, 2019 at 2:49 PM Dian Fu <[hidden email]> wrote:

> Hi Dhanuka,
>
> Does the KeySelector of Event::getTriggerID generate the same key for all
> the inputs or only generate very few key values and these key values happen
> to be hashed to the same downstream operator? You can print the results of
> Event::getTriggerID to check if it's that case.
>
> Regards,
> Dian
>
> 在 2019年1月24日,下午2:08,dhanuka ranasinghe <[hidden email]> 写道:
>
> Hi Dian,
>
> Thanks for the explanation. Please find the screen shot and source code
> for above mention use case. And in main issue is though I use KeyedStream ,
> parallelism not apply properly.
> Only one host is processing messages.
>
> <Flink-CEP-Keyby.png>
>
> Cheers,
> Dhanuka
>
> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <[hidden email]> wrote:
>
>> Whether using KeyedStream depends on the logic of your job, i.e, whether
>> you are looking for patterns for some partitions, i.e, patterns for a
>> particular user. If so, you should partition the input data before the CEP
>> operator. Otherwise, the input data should not be partitioned.
>>
>> Regards,
>> Dian
>>
>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe <[hidden email]> 写道:
>>
>> Hi Dian,
>>
>> I tried that but then kafkaproducer only produce to single partition and
>> only single flink host working while rest not contribute for processing . I
>> will share the code and screenshot
>>
>> Cheers
>> Dhanuka
>>
>> On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] wrote:
>>
>>> Hi Dhanuka,
>>>
>>> In order to make the CEP operator to run parallel, the input stream
>>> should be KeyedStream. You can refer [1] for detailed information.
>>>
>>> Regards,
>>> Dian
>>>
>>> [1]:
>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>>>
>>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email]>
>>> 写道:
>>> >
>>> > Hi All,
>>> >
>>> > Is there way to run CEP function parallel. Currently CEP run only
>>> sequentially
>>> >
>>> > <flink-CEP.png>
>>> > .
>>> >
>>> > Cheers,
>>> > Dhanuka
>>> >
>>> > --
>>> > Nothing Impossible,Creativity is more important than knowledge.
>>>
>>>
>>
>
> --
> Nothing Impossible,Creativity is more important than knowledge.
> <FlinkCEP.java>
>
>
>

--
Nothing Impossible,Creativity is more important than knowledge.
Reply | Threaded
Open this post in threaded view
|

Re: Parallel CEP

Dian Fu-2
I'm afraid you cannot do that. The inputs having the same key should be processed by the same CEP operator. Otherwise the results will be nondeterministic and also be wrong.

Regards,
Dian

> 在 2019年1月24日,下午2:56,dhanuka ranasinghe <[hidden email]> 写道:
>
> In this example key will be same. I am using 1 million messages with same key for performance testing. But still I want to process them parallel. Can't I use Split function and get a SplitStream for that purpose?
>
> On Thu, Jan 24, 2019 at 2:49 PM Dian Fu <[hidden email] <mailto:[hidden email]>> wrote:
> Hi Dhanuka,
>
> Does the KeySelector of Event::getTriggerID generate the same key for all the inputs or only generate very few key values and these key values happen to be hashed to the same downstream operator? You can print the results of Event::getTriggerID to check if it's that case.
>
> Regards,
> Dian
>
>> 在 2019年1月24日,下午2:08,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
>>
>> Hi Dian,
>>
>> Thanks for the explanation. Please find the screen shot and source code for above mention use case. And in main issue is though I use KeyedStream , parallelism not apply properly.
>> Only one host is processing messages.
>>
>> <Flink-CEP-Keyby.png>
>>
>> Cheers,
>> Dhanuka
>>
>> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <[hidden email] <mailto:[hidden email]>> wrote:
>> Whether using KeyedStream depends on the logic of your job, i.e, whether you are looking for patterns for some partitions, i.e, patterns for a particular user. If so, you should partition the input data before the CEP operator. Otherwise, the input data should not be partitioned.
>>
>> Regards,
>> Dian
>>
>>> 在 2019年1月24日,下午12:37,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
>>>
>>> Hi Dian,
>>>
>>> I tried that but then kafkaproducer only produce to single partition and only single flink host working while rest not contribute for processing . I will share the code and screenshot
>>>
>>> Cheers
>>> Dhanuka
>>>
>>> On Thu, 24 Jan 2019, 12:31 Dian Fu <[hidden email] <mailto:[hidden email]> wrote:
>>> Hi Dhanuka,
>>>
>>> In order to make the CEP operator to run parallel, the input stream should be KeyedStream. You can refer [1] for detailed information.
>>>
>>> Regards,
>>> Dian
>>>
>>> [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns <https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns>
>>>
>>> > 在 2019年1月24日,上午10:18,dhanuka ranasinghe <[hidden email] <mailto:[hidden email]>> 写道:
>>> >
>>> > Hi All,
>>> >
>>> > Is there way to run CEP function parallel. Currently CEP run only sequentially
>>> >
>>> > <flink-CEP.png>
>>> > .
>>> >
>>> > Cheers,
>>> > Dhanuka
>>> >
>>> > --
>>> > Nothing Impossible,Creativity is more important than knowledge.
>>>
>>
>>
>>
>> --
>> Nothing Impossible,Creativity is more important than knowledge.
>> <FlinkCEP.java>
>
>
>
> --
> Nothing Impossible,Creativity is more important than knowledge.