flink 1.11 class loading question

flink 1.11 class loading question

chenqin
Hi there,

We have been running Flink 1.11.2 in production with a large-scale setup. The job runs fine for a couple of days and then ends up in a restart loop caused by YARN killing containers that exceed their memory limit. We did not observe this when running against 1.9.1 with the same settings.
Here is the JVM environment passed to both the 1.11 and the 1.9.1 job:


env.java.opts.taskmanager: '-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5 -XX:InitiatingHeapOccupancyPercent=45 -XX:NewRatio=1 -XX:+PrintClassHistogram -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:<LOG_DIR>/gc.log'
env.java.opts.jobmanager: '-XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5 -XX:InitiatingHeapOccupancyPercent=45 -XX:NewRatio=1 -XX:+PrintClassHistogram -XX:+PrintGCDateStamps -XX:+PrintGCDetails -XX:+PrintGCApplicationStoppedTime -Xloggc:<LOG_DIR>/gc.log'

After a preliminary investigation, we believe this is not related to JVM heap usage or a GC issue. However, we observed that JVM non-heap usage on some containers keeps rising as the job falls into the restart loop, as shown below.

[image: image.png]

From a configuration perspective, we would like to understand how the task manager handles class loading (and unloading?) when we set include-user-jar to first. Are there suggestions on how we can better understand how the new memory model introduced in 1.10 affects this issue?


cluster.evenly-spread-out-slots: true
zookeeper.sasl.disable: true
yarn.per-job-cluster.include-user-jar: first
yarn.properties-file.location: /usr/local/hadoop/etc/hadoop/ 


Thanks,
Chen


Re: flink 1.11 class loading question

Till Rohrmann
Hi Chen,

You are right that Flink changed its memory model with Flink 1.10. The memory
model is now better defined and stricter. You can find information about
it here [1]. For some pointers towards potential problems, please take a
look at [2].

What you need to do is figure out where the non-heap memory is allocated.
Maybe you are using a library that leaks memory, or maybe your code requires
more non-heap memory than you have configured the system with.
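One way to get a first idea of where the non-heap memory goes is the JVM's
native memory tracking, for example by extending the options you already pass
(note that NMT adds some overhead and only covers JVM-internal categories such
as Metaspace and thread stacks, not memory that native libraries allocate
directly):

env.java.opts.taskmanager: '<existing options> -XX:NativeMemoryTracking=summary'

You can then dump a summary on the container host with:
jcmd <taskmanager-pid> VM.native_memory summary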

If you are using the per-job mode on YARN and have not set
yarn.per-job-cluster.include-user-jar: disabled, then you should not have
any classloader leak problems, because the user code should be part of the
system classpath.

If you set yarn.per-job-cluster.include-user-jar: disabled, then the
TaskExecutor will create a user-code classloader and keep it for as long as
the TaskExecutor still has slots allocated for the job.
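As a starting point for bounding the non-heap parts, these are the main
TaskManager memory options of the new model; the values below are purely
illustrative and would need to be adapted to your container size:

taskmanager.memory.process.size: 8g            # total budget of the container
taskmanager.memory.task.off-heap.size: 512m    # direct/native memory for user code and libraries
taskmanager.memory.jvm-metaspace.size: 256m    # class metadata; grows if classloaders leak
taskmanager.memory.jvm-overhead.fraction: 0.1  # thread stacks, code cache, etc.

Comparing the rising non-heap usage against these limits should tell you which
budget is being exceeded.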

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/memory/mem_setup_tm.html
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/memory/mem_trouble.html

Cheers,
Till


Re: flink 1.11 class loading question

chenqin
Hi Till,

We did some investigation and found that this memory usage points to the
RocksDB state backend running on managed memory; so far we have only seen the
problem with the RocksDB state backend on managed memory. We followed the
suggestion in [1] and disabled managed memory for RocksDB, and so far we are
not seeing the issue.
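Concretely, the workaround from [1] boils down to this flink-conf.yaml setting
(a sketch of what we applied):

state.backend.rocksdb.memory.managed: false

With that set, RocksDB sizes its block caches and write buffers from its own
options instead of being capped by Flink's managed memory, so the container
needs enough headroom outside the managed budget.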

I feel this might be a major bug, since we run Flink 1.11.2 with the RocksDB
state backend on managed memory in multiple large production jobs and can
consistently reproduce the YARN kill after a job has been running for a period
of time.

[1]
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Debugging-quot-Container-is-running-beyond-physical-memory-limits-quot-on-YARN-for-a-long-running-stb-td38227.html






Re: flink 1.11 class loading question

Till Rohrmann
Hi Chen,

With Flink 1.10, RocksDB started using Flink's managed memory [1]. This is
meant to prevent RocksDB from exceeding the memory limits of a
process/container. Unfortunately, this is not yet perfect due to a problem
in RocksDB [2], so RocksDB can still exceed the managed memory budget. What
you could do is configure a higher off-heap size for your tasks via
taskmanager.memory.task.off-heap.size to compensate for this.
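For example (the number here is only a placeholder; how much headroom you need
depends on how far RocksDB overshoots its budget in your job):

taskmanager.memory.task.off-heap.size: 512m

Note that this is carved out of the total process size, so the task heap
shrinks accordingly unless you also increase taskmanager.memory.process.size.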

I am also pulling in Yu Li, who can tell you more about the current
limitations of the memory control for RocksDB.

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/state/state_backends.html#memory-management
[2] https://issues.apache.org/jira/browse/FLINK-15532

Cheers,
Till
