Flink memory management in table api

Amol S - iProgrammer
Hello folks,

I am using the Flink Table API to join multiple tables and create a single
table from them. I have a few questions.

1. How long will the query maintain partial results per key, and how does it
maintain the state of each key?

2. If it maintains state in memory, then the memory will grow continuously
and lead to memory overhead.

3. How much RAM does a server need to handle 10,000 incoming records per
second with an average size of 3 KB?

-----------------------------------------------
*Amol Suryawanshi*
Java Developer
[hidden email]


*iProgrammer Solutions Pvt. Ltd.*



*Office 103, 104, 1st Floor Pride Portal,Shivaji Housing Society,
Bahiratwadi,Near Hotel JW Marriott, Off Senapati Bapat Road, Pune - 411016,
MH, INDIA.**Phone: +91 9689077510 | Skype: amols_iprogrammer*
www.iprogrammer.com <[hidden email]>
------------------------------------------------

Re: Flink memory management in table api

Fabian Hueske-2
Hi Amol,

The memory consumption depends on the query/operation that you are doing.
Time-based operations like group-window-aggregations,
over-window-aggregations, or window-joins can automatically clean up their
state once data is no longer needed.
Operations such as non-windowed aggregations or joins have to persist all
data forever in state to guarantee absolute correctness.
However, you can also configure an idle state retention time [1] to remove
state that has not been accessed for a certain time.
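
To make the trade-off concrete, here is a toy model of idle state retention in
plain Java. It is only a sketch and does not use Flink: the class
IdleStateSketch, its methods, and the single retention threshold are
hypothetical names chosen for illustration, and Flink's own cleanup logic may
differ. The idea it shows is that every key remembers when it was last
accessed, and keys that stay idle longer than the retention time are dropped.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of idle state retention. NOT Flink code; all names are hypothetical.
final class IdleStateSketch {

    // Per-key partial result (a running count) plus the time of the last access.
    private static final class Entry {
        long count;
        long lastAccessMillis;
    }

    private final Map<String, Entry> state = new HashMap<>();
    private final long retentionMillis;

    IdleStateSketch(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    // Increment the running count for a key and refresh its last-access time.
    long update(String key, long nowMillis) {
        Entry e = state.computeIfAbsent(key, k -> new Entry());
        e.count++;
        e.lastAccessMillis = nowMillis;
        return e.count;
    }

    // Drop every key that has been idle for longer than the retention time.
    void cleanUp(long nowMillis) {
        state.values().removeIf(e -> nowMillis - e.lastAccessMillis > retentionMillis);
    }

    int size() {
        return state.size();
    }

    public static void main(String[] args) {
        IdleStateSketch s = new IdleStateSketch(1_000);
        s.update("a", 0);
        s.update("b", 0);
        s.update("a", 900);   // "a" is accessed again and stays active
        s.cleanUp(1_500);     // "b" has been idle for 1500 ms and is removed
        System.out.println(s.size());
        // "b" returns after its state was dropped, so its count restarts.
        System.out.println(s.update("b", 1_600));
    }
}
```

Note that the returning key "b" starts counting from scratch. This is the price
of bounded state: once a key's state has been removed, results for that key are
no longer exact.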

Regarding your questions:

1) Unless you configure the idle state retention time, state is kept as
long as needed to guarantee correctness, potentially forever.
2) Queries use Flink's regular state features, i.e., you can configure the
RocksDBStateBackend to manage state on disk.
3) This depends on your query and the distribution of your data.
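
For a feel of the numbers in question 3, the raw input volume can be worked out
with simple arithmetic. The sketch below is plain Java, not a Flink sizing
formula: the class name IngestVolumeSketch is hypothetical, and it assumes
1 KB = 1024 bytes. It only computes how many bytes arrive; how much of that ends
up in state still depends on the query and the data distribution, and
serialization or state backend overhead is not included.

```java
// Back-of-envelope input volume. NOT a Flink sizing formula; names are hypothetical.
final class IngestVolumeSketch {

    static final long RECORDS_PER_SECOND = 10_000;  // from the question
    static final long BYTES_PER_RECORD = 3 * 1024;  // 3 KB, assuming 1 KB = 1024 bytes

    // Raw bytes arriving within the given number of seconds.
    static long bytesIn(long seconds) {
        return RECORDS_PER_SECOND * BYTES_PER_RECORD * seconds;
    }

    // Convert bytes to GiB (1 GiB = 1024^3 bytes).
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.printf("per second: %d bytes%n", bytesIn(1));
        System.out.printf("per hour:   %.1f GiB%n", toGiB(bytesIn(3_600)));
        System.out.printf("per day:    %.1f GiB%n", toGiB(bytesIn(86_400)));
    }
}
```

Under these assumptions the input is about 30 MB per second, or roughly 103 GiB
per hour. A query that has to keep every record forever would therefore outgrow
RAM quickly, which is why idle state retention or a disk-based state backend
matters for non-windowed joins.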

Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/table/streaming.html#idle-state-retention-time

In reply to Amol S - iProgrammer (2018-07-04 7:46 GMT+02:00)


Re: Flink memory management in table api

Amol S - iProgrammer
Hello Fabian,

Thanks for your quick response.

According to the above conversation, Flink will persist state forever for
non-windowed operations. I want to know how Flink persists the state, e.g. in
a database, on a file system, or in memory.

In reply to Fabian Hueske (Wed, 4 Jul 2018 at 2:12 PM)


Re: Flink memory management in table api

Fabian Hueske-2
State is maintained in the configured state backend [1].

Best, Fabian

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_backends.html

In reply to Amol S - iProgrammer (2018-07-04 11:38 GMT+02:00)
