Recommended module for connector ratelimiting?

Lakshmi Gururaja Rao
Hi all,

The PR that I've been iterating on w.r.t. Kafka ratelimiting
(https://github.com/apache/flink/pull/7679) is now in a state where a lot of
the rate limiting logic is generic and does not apply specifically to
Kafka. I'd thus like to move it to a module that is outside of
flink-connector-kafka-0.9/. Looking at some of the other common modules,
it seems like org.apache.flink.api.java
<https://github.com/lyft/flink/tree/master/flink-core/src/main/java/org/apache/flink/api/java>
may be a good option. Thoughts or suggestions?

Thanks,
Lakshmi

Re: Recommended module for connector ratelimiting?

Thomas Weise
Since we want to be able to use the rate limiter for multiple connectors
and potentially elsewhere, plus it is user facing, this seems to be the
right area. Perhaps org.apache.flink.api.common.io specifically?


On Mon, Feb 25, 2019 at 5:40 PM Lakshmi Gururaja Rao <[hidden email]>
wrote:

> Hi all,
>
> The PR that I've been iterating on w.r.t. Kafka ratelimiting
> (https://github.com/apache/flink/pull/7679) is now in a state where a lot of
> the rate limiting logic is generic and does not apply specifically to
> Kafka. I'd thus like to move it to a module that is outside of
> flink-connector-kafka-0.9/. Looking at some of the other common modules,
> it seems like org.apache.flink.api.java
> <https://github.com/lyft/flink/tree/master/flink-core/src/main/java/org/apache/flink/api/java>
> may be a good option. Thoughts or suggestions?
>
> Thanks,
> Lakshmi

Re: Recommended module for connector ratelimiting?

Roshan Naik
Sorry, I have not looked into it much, but it occurred to me that it would be great to have it as a "Throttling Operator" that can be applied anywhere after a source and before a sink. Each parallel instance of it can operate at the specified rate limit divided by the number of parallel instances of this operator.
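
As a rough illustration of the even split described above (a sketch, not code from the PR): each parallel instance gets the global limit divided by the operator's parallelism. The class and method names below are hypothetical. In a Flink rich function the parallelism would typically be read from `getRuntimeContext().getNumberOfParallelSubtasks()`; here it is passed in as a plain argument so the example runs on its own.

```java
/** Illustration only: splits a global rate limit evenly across parallel operator instances. */
final class SubtaskRateSplit {

    private SubtaskRateSplit() {}

    /**
     * Returns the rate each parallel instance may use, assuming an even split.
     *
     * @param globalRatePerSecond desired total rate across all instances
     * @param parallelism number of parallel instances of the throttling operator
     */
    static double perSubtaskRate(long globalRatePerSecond, int parallelism) {
        if (globalRatePerSecond <= 0 || parallelism <= 0) {
            throw new IllegalArgumentException("rate and parallelism must be positive");
        }
        return (double) globalRatePerSecond / parallelism;
    }

    /** Minimum spacing between records, in nanoseconds, for one instance at the given rate. */
    static long nanosBetweenRecords(double perSubtaskRate) {
        return (long) (1_000_000_000L / perSubtaskRate);
    }

    public static void main(String[] args) {
        // Hypothetical numbers: 1000 records/s in total, operator parallelism of 4.
        double rate = perSubtaskRate(1_000, 4);
        System.out.println("per-instance rate: " + rate + " records/s");
        System.out.println("spacing: " + nanosBetweenRecords(rate) + " ns");
    }
}
```

An even split only matches the global limit when records are spread evenly across instances. With skew, busy instances are capped while idle ones leave part of the budget unused.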
On Tuesday, February 26, 2019, 7:39:06 PM PST, Thomas Weise <[hidden email]> wrote:

> Since we want to be able to use the rate limiter for multiple connectors
> and potentially elsewhere, plus it is user facing, this seems to be the
> right area. Perhaps org.apache.flink.api.common.io specifically?



Re: Recommended module for connector ratelimiting?

Lakshmi Gururaja Rao
@Thomas Weise <[hidden email]> that seems reasonable to me. I'll
create a JIRA to track this.

@Roshan
The idea of a ratelimiting operator did come up before
<https://lists.apache.org/thread.html/8140b759ba83f33a22d809887fd2d711f5ffe7069c888eb9b1142272@%3Cdev.flink.apache.org%3E>.
The main concerns around that approach were:
1) Checkpoint barriers not flowing through (as a ratelimiting operator
would introduce backpressure).
2) If the operator is downstream of the source, the data is already
deserialized at that point and the byte count will thus not be accurate.
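
To make concern 2 concrete, here is a minimal sketch of a byte-based limiter that is charged with the length of the raw record before any deserialization happens. It is not the implementation from the PR: `ByteRateLimiterSketch` and `reserve` are hypothetical names, and a token bucket is used only as one common way to meter bytes. The clock is passed in explicitly so the behavior is easy to reason about.

```java
import java.nio.charset.StandardCharsets;

/** Illustration only: a byte-based token bucket charged with the size of the raw record. */
final class ByteRateLimiterSketch {

    private final long bytesPerSecond;
    private double availableBytes;
    private long lastRefillNanos;

    ByteRateLimiterSketch(long bytesPerSecond, long nowNanos) {
        if (bytesPerSecond <= 0) {
            throw new IllegalArgumentException("bytesPerSecond must be positive");
        }
        this.bytesPerSecond = bytesPerSecond;
        this.availableBytes = bytesPerSecond; // start with one second of budget
        this.lastRefillNanos = nowNanos;
    }

    /**
     * Charges the bucket for rawBytes and returns how long the caller should
     * wait, in nanoseconds, before emitting the record.
     */
    long reserve(int rawBytes, long nowNanos) {
        // Refill for the elapsed time, capped at one second of budget.
        double refill = (nowNanos - lastRefillNanos) / 1e9 * bytesPerSecond;
        availableBytes = Math.min(bytesPerSecond, availableBytes + refill);
        lastRefillNanos = nowNanos;

        // The balance may go negative; that debt is paid off by waiting.
        availableBytes -= rawBytes;
        if (availableBytes >= 0) {
            return 0L;
        }
        return (long) (-availableBytes / bytesPerSecond * 1e9);
    }

    public static void main(String[] args) throws InterruptedException {
        ByteRateLimiterSketch limiter = new ByteRateLimiterSketch(1_024, System.nanoTime());
        byte[][] rawRecords = {
            "first record".getBytes(StandardCharsets.UTF_8),
            new byte[2_048],
        };
        for (byte[] raw : rawRecords) {
            // Charge the serialized size here, while the record is still raw bytes.
            long waitNanos = limiter.reserve(raw.length, System.nanoTime());
            Thread.sleep(waitNanos / 1_000_000L, (int) (waitNanos % 1_000_000L));
            // Stand-in for deserialization, which happens only after throttling.
            String value = new String(raw, StandardCharsets.UTF_8);
            System.out.println("emitted " + value.length() + " chars from "
                    + raw.length + " raw bytes after waiting " + waitNanos + " ns");
        }
    }
}
```

A downstream operator would only see the deserialized `String` and could not recover `raw.length` reliably, which is the inaccuracy described in point 2.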

On Wed, Feb 27, 2019 at 10:59 PM Roshan Naik <[hidden email]>
wrote:

> Sorry, I have not looked into it much, but it occurred to me that it would be
> great to have it as a "Throttling Operator" that can be applied anywhere
> after a source and before a sink. Each parallel instance of it can operate
> at the specified rate limit divided by the number of parallel instances of
> this operator.

Re: Recommended module for connector ratelimiting?

Thomas Weise
Lakshmi,

I would prefer we move the ratelimiter interface to the appropriate
location before we merge the PR.

Thanks,
Thomas


On Thu, Feb 28, 2019 at 5:29 PM Lakshmi Gururaja Rao <[hidden email]> wrote:

> @Thomas Weise <[hidden email]> that seems reasonable to me. I'll
> create a JIRA to track this.
>
> @Roshan
> The idea of a ratelimiting operator did come up before
> <https://lists.apache.org/thread.html/8140b759ba83f33a22d809887fd2d711f5ffe7069c888eb9b1142272@%3Cdev.flink.apache.org%3E>.
> The main concerns around that approach were:
> 1) Checkpoint barriers not flowing through (as a ratelimiting operator
> would introduce backpressure).
> 2) If the operator is downstream of the source, the data is already
> deserialized at that point and the byte count will thus not be accurate.
>