[jira] [Created] (FLINK-14618) Give more detailed debug information on akka framesize exception

Shang Yuanchun (Jira)
Jacob Sevart created FLINK-14618:
------------------------------------

             Summary: Give more detailed debug information on akka framesize exception
                 Key: FLINK-14618
                 URL: https://issues.apache.org/jira/browse/FLINK-14618
             Project: Flink
          Issue Type: Improvement
          Components: Documentation, Runtime / Network
    Affects Versions: 1.6.3
            Reporter: Jacob Sevart


I'm hitting the akka framesize limit in production with some regularity, often when the job has been running for a long time and we try to deploy or restart it. I suspect it's checkpoint-related, because clearing the checkpoint allows the job to start up.

The [guidance|https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html] says:
{quote}If Flink fails because messages exceed this limit, then you should increase it.
{quote}
The [error message|https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/rpc/akka/AkkaInvocationHandler.java#L270] is not very helpful towards that end. How large does it need to be? How do I know whether increasing the size will fix it, or if the message is unreasonably large due to a bug?

I'd like to modify the exception message to report the value of size. 
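
As a rough illustration of the kind of message I have in mind, here is a minimal standalone sketch. It is not the actual code in AkkaInvocationHandler: the class FramesizeCheck, the helper checkFramesize, and its parameters are hypothetical names made up for this example, and the exact wording of the message is only a suggestion.

```java
import java.io.IOException;

public class FramesizeCheck {

    /**
     * Hypothetical helper (not part of Flink): rejects a payload that is larger
     * than the configured limit and reports both numbers in the exception.
     *
     * @param methodName       name of the RPC method being invoked (assumed to be known to the caller)
     * @param payloadSize      size of the serialized invocation in bytes
     * @param maximumFramesize configured framesize limit in bytes
     */
    public static void checkFramesize(String methodName, long payloadSize, long maximumFramesize)
            throws IOException {
        if (payloadSize > maximumFramesize) {
            throw new IOException(String.format(
                    "The rpc invocation size %d exceeds the maximum akka framesize %d (method: %s).",
                    payloadSize, maximumFramesize, methodName));
        }
    }

    public static void main(String[] args) {
        try {
            // Example values only: a 12 MiB payload against a 10 MiB limit.
            checkFramesize("exampleMethod", 12L * 1024 * 1024, 10L * 1024 * 1024);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With both the actual size and the limit in the message, it becomes possible to tell whether a modest increase of the limit would be enough, or whether the payload is far larger than anything reasonable and likely points to a bug.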



--
This message was sent by Atlassian Jira
(v8.3.4#803005)