Flink Memory Usage

Pedro Elias
Hi,

I have Flink running in two Docker containers, one for the job manager and
one for the task manager, with the configuration below.

64GB RAM machine
200 GB SSD used only by RocksDB

Flink's memory configuration is as follows:

jobmanager.heap.mb: 3072
taskmanager.heap.mb: 53248
taskmanager.memory.fraction: 0.7

I have a very large and heavy job running on this server. The problem is
that the task manager tries to take more memory than is defined in the
configuration and eventually crashes the server, even though the heap never
reaches its maximum. The last memory log before the crash shows:

Memory usage stats: [HEAP: 44432/53248/53248 MB, NON HEAP: 157/160/-1 MB
(used/committed/max)]

But the memory used by the task manager container is close to 64 GB.
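
To put numbers on this, here is a quick back-of-the-envelope check, written
as a small Python sketch. It uses only the values quoted above and assumes
that "64GB" means 65536 MB; the helper name headroom_mb is just
illustrative.

```python
# Values copied from the configuration and the last memory log above.
TOTAL_RAM_MB = 64 * 1024        # assumption: "64GB" means 65536 MB
JOBMANAGER_HEAP_MB = 3072       # jobmanager.heap.mb
TASKMANAGER_HEAP_MB = 53248     # taskmanager.heap.mb
NON_HEAP_COMMITTED_MB = 160     # NON HEAP committed, from the log line


def headroom_mb(total_mb, *reserved_mb):
    """Return the memory left after subtracting every reserved amount."""
    return total_mb - sum(reserved_mb)


configured_heap_mb = JOBMANAGER_HEAP_MB + TASKMANAGER_HEAP_MB
left_mb = headroom_mb(TOTAL_RAM_MB, configured_heap_mb, NON_HEAP_COMMITTED_MB)

print(f"configured heaps: {configured_heap_mb} MB "
      f"({configured_heap_mb / TOTAL_RAM_MB:.1%} of RAM)")
print(f"left for everything outside the JVM heaps: {left_mb} MB")
```

So the two heaps alone are configured to take about 86% of the machine,
which leaves roughly 9 GB for everything that is not JVM heap, the operating
system included.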


I have some questions about Flink's memory usage:


1. Shouldn't the sum of the job manager memory and the task manager memory
account for all the memory allocated by Flink?  Am I missing any
configuration?

2. How can I keep the server running in this scenario?

3. I thought that RocksDB would solve this by keeping state on disk, but
that did not happen.

4. In the past, I have seen Flink take a 3 GB checkpoint while initially
allocating 35 GB of RAM. Where does this extra memory come from?


Can anyone help me, please?

Thanks in advance.

Pedro Luis