Questions around MemoryStateBackend size limits

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Questions around MemoryStateBackend size limits

vishalovercome
The documentation says:

*The size of each individual state is by default limited to 5 MB. This value
can be increased in the constructor of the MemoryStateBackend.*

1. I want to know what would happen if a job ended up adding elements to a
state variable causing its size to exceed 5MB. There are other questions I
have around this topic.
2. The doc mentions that akka frame size is the upper limit on state size.
What would happen if we were to exceed that as well? Would it cause the
program to fail or would it only affect checkpointing (as communication
between job manager and task manager would breakdown)
3. If the size is within 5MB but the size of the checkpoint (additions,
removals, updates) were to be greater than 5MB (or the akka frame size),
then would the checkpoint fail?

It will also help if you could provide this information within the
documentation itself.



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/
Reply | Threaded
Open this post in threaded view
|

Re: Questions around MemoryStateBackend size limits

Yun Tang
Hi

+ user mail list

The limit max state size is because we would send the checkpointed data as a byte array and send it back to jobmanager. If the checkpointed byte stream is too large, the job manager would meet the risk of out-of-memory-error.
 If you want to use heap-based state-backend, you could use FsStateBackend which actually share the same code as MemoryStateBackend but only checkpoint to external file system storage.


  1.  If the checkpoint stream size is over limit 5MB, the checkpoint phase on task side would fail due to the size check [1]
  2.  If the checkpoint stream over the limit, task would fail to report successful checkpoint message to JM [2]
  3.  Actually, we only care the checkpoint stream size instead of the state size for this field.

[1] https://github.com/apache/flink/blob/1c78ab397de524836fd69c6218b1122aa387c251/flink-runtime/src/main/java/org/apache/flink/runtime/state/memory/MemCheckpointStreamFactory.java#L62-L69
[2] https://github.com/apache/flink/blob/1c78ab397de524836fd69c6218b1122aa387c251/flink-runtime/src/main/java/org/apache/flink/runtime/state/TaskStateManagerImpl.java#L117-L122

Best
Yun Tang
________________________________
From: vishalovercome <[hidden email]>
Sent: Tuesday, June 2, 2020 13:58
To: [hidden email] <[hidden email]>
Subject: Questions around MemoryStateBackend size limits

The documentation says:

*The size of each individual state is by default limited to 5 MB. This value
can be increased in the constructor of the MemoryStateBackend.*

1. I want to know what would happen if a job ended up adding elements to a
state variable causing its size to exceed 5MB. There are other questions I
have around this topic.
2. The doc mentions that akka frame size is the upper limit on state size.
What would happen if we were to exceed that as well? Would it cause the
program to fail or would it only affect checkpointing (as communication
between job manager and task manager would breakdown)
3. If the size is within 5MB but the size of the checkpoint (additions,
removals, updates) were to be greater than 5MB (or the akka frame size),
then would the checkpoint fail?

It will also help if you could provide this information within the
documentation itself.



--
Sent from: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/