One source is much slower than the other when joining historical data

刘建刚
      When consuming historical data in a join operator with event time,
reading from one source is much slower than reading from the other. As a
result, the join operator buffers a large amount of data from the faster
source while it waits for the slower source.
      How can I reduce the difference in consumption speed between the two
sources?
Re: One source is much slower than the other when joining historical data

Konstantin Knauf-3
Hi,

This topic has recently been discussed a lot in the community under the
name "Event Time Alignment/Synchronization" [1,2]. These discussions should
provide a starting point.
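
To illustrate the general idea behind event-time alignment, here is a toy
simulation in plain Python. It is only a sketch under simplifying
assumptions and does not use any Flink API: the function name
simulate_join, the per-tick read rates, and the max_drift parameter are all
hypothetical and chosen just for this example. Both sources replay events
with timestamps 0, 1, 2, ..., and the join can only release fast-side
records once the slow source has reached their timestamp. With max_drift
set, the fast source pauses whenever it is more than max_drift ahead of the
slow one.

```python
from typing import Optional


def simulate_join(fast_rate: int, slow_rate: int, total_events: int,
                  max_drift: Optional[int] = None) -> int:
    """Return the peak number of fast-side records buffered by the join.

    Toy model, not Flink code. Each tick the slow source advances by
    `slow_rate` timestamps and the fast source by up to `fast_rate`.
    If `max_drift` is given, the fast source may not read further than
    `max_drift` timestamps ahead of the slow source (alignment).
    """
    fast_pos = 0  # next timestamp the fast source will emit
    slow_pos = 0  # next timestamp the slow source will emit
    peak = 0
    while slow_pos < total_events:
        # The slow source reads at its own pace.
        slow_pos = min(total_events, slow_pos + slow_rate)

        # The fast source reads at its pace, optionally throttled so it
        # stays within max_drift of the slow source's progress.
        limit = total_events
        if max_drift is not None:
            limit = min(limit, slow_pos + max_drift)
        fast_pos = min(limit, fast_pos + fast_rate)

        # Records the join must hold until the slow side catches up.
        buffered = max(0, fast_pos - slow_pos)
        peak = max(peak, buffered)
    return peak


unaligned = simulate_join(fast_rate=100, slow_rate=10, total_events=10_000)
aligned = simulate_join(fast_rate=100, slow_rate=10, total_events=10_000,
                        max_drift=500)
print(unaligned, aligned)  # prints "9000 500"
```

In this model the unaligned join buffers up to 9000 records, while the
aligned one never holds more than max_drift (500) records. The total replay
time is the same in both cases, since the slow source is the bottleneck
either way.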

Cheers,

Konstantin

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Sharing-state-between-subtasks-td24489.html
[2] https://issues.apache.org/jira/browse/FLINK-10886



On Wed, Feb 27, 2019 at 3:03 AM 刘建刚 <[hidden email]> wrote:

>       When consuming historical data in a join operator with event time,
> reading from one source is much slower than reading from the other. As a
> result, the join operator buffers a large amount of data from the faster
> source while it waits for the slower source.
>       How can I reduce the difference in consumption speed between the two
> sources?
>


--

Konstantin Knauf | Solutions Architect
