Memory Management in Streaming mode

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Memory Management in Streaming mode

Shen Li
Hi,

I have a question regarding memory management in the Streaming mode. The
documents state that the memory management module is only used in the Batch
mode, and Flink Streaming directly operates on JVM heap. Then, what if the
volume of data in a window becomes too large to fit in the heap? Will Flink
spill some data to disk or will it trigger an OOM exception?

Thanks,

Shen
Reply | Threaded
Open this post in threaded view
|

Re: Memory Management in Streaming mode

Robert Metzger
Hi Shen,

What you've read in the documentation is correct. The managed memory is
only used by the batch operators.
For streaming, we have different state backends with different
characteristics.
The FsStateBackend keeps the state (including window contents) on the heap
and makes backups to the (distrib) file system.
The RocksDB state backend keeps the data mostly on the local disk and makes
backups to a (distrib) file system.

So window state exceeding the available memory in the cluster will not
cause OOM exceptions when using the RocksDB state backend.

Regards,
Robert


On Mon, Aug 29, 2016 at 5:53 PM, Shen Li <[hidden email]> wrote:

> Hi,
>
> I have a question regarding memory management in the Streaming mode. The
> documents state that the memory management module is only used in the Batch
> mode, and Flink Streaming directly operates on JVM heap. Then, what if the
> volume of data in a window becomes too large to fit in the heap? Will Flink
> spill some data to disk or will it trigger an OOM exception?
>
> Thanks,
>
> Shen
>