Flink JDBCOutputFormat - Flush last batch enhancement

Swapnil Chougule
Hi Team,

Can the JDBCOutputFormat connector handle one more case: writing out the
last batch (whose record count may be smaller than the batch interval)
without closing the JDBC connection?

In my streaming project, I update a JDBC sink (MySQL) after every window.

Case: say I have 450 queries to execute against MySQL with a batch interval
of 100.
400 queries are executed in 4 batches (4 x 100).
The last 50 queries stay pending in the batch and wait for the next 50
queries from the next window.
If the window size is 5 minutes, it takes another 4-5 minutes for those
last 50 queries to be reflected in MySQL.

Can we have functionality in JDBCOutputFormat to flush the last batch to the
JDBC sink while keeping the same DB connection open?
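
To make the case concrete, here is a minimal sketch of the behaviour described above. It is a toy stand-in, not the Flink class: `BatchingWriterSketch` and its `flush()` method are hypothetical names, and the database call is only simulated by counters.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a batching JDBC writer (hypothetical, not Flink's JDBCOutputFormat). */
class BatchingWriterSketch {
    private final int batchInterval;
    private final List<String> pending = new ArrayList<>();
    private int executedBatches = 0;
    private int writtenRecords = 0;

    BatchingWriterSketch(int batchInterval) {
        this.batchInterval = batchInterval;
    }

    /** Buffers one record and executes the batch once it reaches the batch interval. */
    void writeRecord(String record) {
        pending.add(record);
        if (pending.size() >= batchInterval) {
            flush();
        }
    }

    /** Executes whatever is buffered; the (imaginary) connection stays open. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // A real writer would call PreparedStatement.executeBatch() at this point.
        writtenRecords += pending.size();
        executedBatches++;
        pending.clear();
    }

    int pendingCount() { return pending.size(); }
    int executedBatches() { return executedBatches; }
    int writtenRecords() { return writtenRecords; }

    public static void main(String[] args) {
        BatchingWriterSketch writer = new BatchingWriterSketch(100);
        for (int i = 0; i < 450; i++) {
            writer.writeRecord("query-" + i);
        }
        // 4 full batches have been executed, 50 records are still waiting.
        System.out.println("batches=" + writer.executedBatches()
                + " pending=" + writer.pendingCount());

        // The requested enhancement: push out the partial batch at the end of the window.
        writer.flush();
        System.out.println("batches=" + writer.executedBatches()
                + " pending=" + writer.pendingCount());
    }
}
```

Without the explicit `flush()` at the end of the window, the 50 buffered records would only be written once 50 more records arrive.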

Thanks,
Swapnil

Re: Flink JDBCOutputFormat - Flush last batch enhancement

Stephan Ewen
Hi!

I am not sure I understand what you want to do, but here are some comments:

  - There is no "batching" in Flink's streaming API, so I am not sure what
you are referring to with the "last batch".
  - JDBC connections are not closed between windows, they remain open as
long as the operator is open.

Thanks,
Stephan


(In reply to Swapnil Chougule's message of Mon, Sep 26, 2016 at 9:29 AM.)

Re: Flink JDBCOutputFormat - Flush last batch enhancement

Chesnay Schepler-3
The JDBCOutputFormat writes records in batches; that's what he is
referring to.

(In reply to Stephan Ewen's message of 26.09.2016 11:48.)


Re: Flink JDBCOutputFormat - Flush last batch enhancement

Swapnil Chougule
Hi Stephan/Chesnay,

I used JDBCOutputFormat from the batch connectors for my streaming use
case because I could not find a JDBC connector among the streaming connectors.
If there isn't one, could we have a JDBC connector for streaming use cases?

Thanks,
Swapnil

(In reply to Chesnay Schepler's message of Mon, Sep 26, 2016 at 3:32 PM.)

Re: Flink JDBCOutputFormat - Flush last batch enhancement

Chesnay Schepler-3
Hello Swapnil,

setting the batch interval should be pretty much equivalent to having a
streaming jdbc connector.

Regards,
Chesnay

(In reply to Swapnil Chougule's message of 26.09.2016 13:21.)


Re: Flink JDBCOutputFormat - Flush last batch enhancement

Chesnay Schepler-3
* setting the batch interval _to 1_
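
As a rough illustration of what a batch interval of 1 implies, the sketch below counts executed batches and leftover records for the 450-record case from the original question. It is a plain simulation with hypothetical names (`BatchIntervalComparison`, `simulate`), not connector code. In the connector itself the interval is assumed to be set through the builder's `setBatchInterval`.

```java
/** Compares batch intervals by simulation (hypothetical helper, not part of Flink). */
class BatchIntervalComparison {

    /** Returns {executed batches, records still buffered} after the given number of writes. */
    static int[] simulate(int records, int batchInterval) {
        int buffered = 0;
        int batches = 0;
        for (int i = 0; i < records; i++) {
            buffered++;
            if (buffered >= batchInterval) {
                // The buffer is full: the batch is executed and the buffer emptied.
                batches++;
                buffered = 0;
            }
        }
        return new int[] {batches, buffered};
    }

    public static void main(String[] args) {
        int[] interval100 = simulate(450, 100);
        int[] interval1 = simulate(450, 1);
        System.out.println("interval 100: batches=" + interval100[0]
                + " pending=" + interval100[1]);
        System.out.println("interval 1:   batches=" + interval1[0]
                + " pending=" + interval1[1]);
    }
}
```

With an interval of 1 nothing is ever left pending, at the price of one database round trip per record (450 instead of 4 in this example).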

(In reply to Chesnay Schepler's message of 26.09.2016 15:25.)