Hi,

I am getting "OutOfMemoryError: Metaspace" in a batch task that writes into ClickHouse. The attached *.java file is my task code.

After 12 tasks have run, the 13th task fails, and the exception is always "OutOfMemoryError: Metaspace" (see "task-failed.png").

I configured -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/path/to/hprofFile and obtained an hprof dump. After analysing the dump, I think the error may not be caused by my user code, so I am asking here for help to confirm whether the memory leak is caused by Flink.

The attached file "java_pid29294.hprof" is the dump file.

Thanks.

Large attachment sent from QQ Mail:
java_pid29294.hprof (81.44M, expires 2020-09-23 16:05)
Download page: http://mail.qq.com/cgi-bin/ftnExs_download?t=exs_ftn_download&k=71666339b499dbc2277cdd0e13640117095206000051525e14545b010349075a0a504e01570607155c020200015550080d55570e3577335258100266450d570a00545a0d1b0c434a56006304&code=9fc95d38
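To double-check that the -XX flags mentioned above actually reached the JVM that runs the task (and not only some other JVM), the option values can be read back at runtime. The following is a minimal sketch, assuming a HotSpot-based JVM. The class name HeapDumpOptionCheck is made up for illustration, and where to call it from (for example the output format's open() method) is an assumption.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper, not part of the posted task code.
public class HeapDumpOptionCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    public static String vmOption(String name) {
        // Only available on HotSpot-based JVMs (assumption stated in the text above).
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if this JVM does not know the option.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

If the first value is not "true" inside the task's JVM, the flags were probably set on a different process than the one that runs out of metaspace.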
Hi,

thanks for reaching out to the community. Could you share a few more details about the cluster setup (session cluster or per-job cluster deployment) and the Flink version, and maybe also share the logs with us? Sharing your user code and the libraries you are using can also help in figuring out what is going wrong.

Cheers,
Till

On Mon, Aug 24, 2020 at 10:22 AM 耿延杰 <[hidden email]> wrote:
> Hi,
>
> I catch "OutOfMemoryError: Metaspace" on Batch Task When Write into
> Clickhouse.
> [...]
In reply to this post by 耿延杰
Additional info:
The exception info on the Flink Manager page:

Caused by: java.lang.OutOfMemoryError: Metaspace
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:66)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    at org.apache.http.impl.client.CloseableHttpClient.determineTarget(CloseableHttpClient.java:93)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
    at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:94)
    at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:80)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:47)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:29)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at org.apache.flink.api.java.io.jdbc.AbstractJDBCOutputFormat.establishConnection(AbstractJDBCOutputFormat.java:68)
    at com.xx.xx.xx.ClickHouseJDBCOutputFormat.open(ClickHouseJDBCOutputFormat.java:53)
    at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:205)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)

------------------ Original message ------------------
From: "耿延杰" <[hidden email]>
Sent: Monday, August 24, 2020, 4:20 PM
To: "dev" <[hidden email]>
Subject: OutOfMemoryError: Metaspace on Batch Task When Write into Clickhouse

[...]
It still fails after every 12 tasks, and the exception stack of the failed tasks differs. For example, this is the exception info of the most recent failed task:

Caused by: java.lang.OutOfMemoryError: Metaspace
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at org.apache.flink.util.ChildFirstClassLoader.loadClass(ChildFirstClassLoader.java:66)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    at org.apache.http.impl.client.CloseableHttpClient.determineTarget(CloseableHttpClient.java:93)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.getInputStream(ClickHouseStatementImpl.java:614)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:117)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:100)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:95)
    at ru.yandex.clickhouse.ClickHouseStatementImpl.executeQuery(ClickHouseStatementImpl.java:90)
    at ru.yandex.clickhouse.ClickHouseConnectionImpl.initTimeZone(ClickHouseConnectionImpl.java:94)
    at ru.yandex.clickhouse.ClickHouseConnectionImpl.<init>(ClickHouseConnectionImpl.java:80)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:55)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:47)
    at ru.yandex.clickhouse.ClickHouseDriver.connect(ClickHouseDriver.java:29)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:270)
    at org.apache.flink.api.java.io.jdbc.AbstractJDBCOutputFormat.establishConnection(AbstractJDBCOutputFormat.java:68)
    at com.xxx.clickhouse.ClickHouseJDBCOutputFormat.open(ClickHouseJDBCOutputFormat.java:53)
    at org.apache.flink.runtime.operators.DataSinkTask.invoke(DataSinkTask.java:205)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)

This differs from the exception info in my last email, so analysing the dump file is the key.

------------------ Original message ------------------
From: "耿延杰" <[hidden email]>
Sent: Monday, August 24, 2020, 4:33 PM
To: "dev" <[hidden email]>
Subject: Re: OutOfMemoryError: Metaspace on Batch Task When Write into Clickhouse

[...]
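Since the failure shows up after a fixed number of task runs, it can help to log how much metaspace is in use each time a task starts, to see whether usage grows from run to run. The following is a minimal sketch using the standard java.lang.management API. The class name MetaspaceReport is hypothetical, and the pool name "Metaspace" is what HotSpot JVMs report (an assumption for other JVMs).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of the posted task code.
public class MetaspaceReport {

    /** Returns a one-line summary of the Metaspace pool, or a fallback message. */
    public static String describeMetaspace() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!pool.getName().contains("Metaspace")) {
                continue; // skip heap pools, code cache, compressed class space, ...
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            long max = usage.getMax(); // a negative value means no limit is defined
            String maxText = max < 0 ? "unlimited" : (max / 1024) + " KiB";
            return String.format("%s: used=%d KiB, committed=%d KiB, max=%s",
                    pool.getName(), usage.getUsed() / 1024,
                    usage.getCommitted() / 1024, maxText);
        }
        return "no Metaspace pool reported by this JVM";
    }

    public static void main(String[] args) {
        System.out.println(describeMetaspace());
    }
}
```

Logging this line from the task (for example at the start of open(), which is an assumption about where it fits) would show whether "used" climbs steadily across the 12 successful runs.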
Hi,

could you share the Flink cluster logs with us? This would help answer a lot of questions about your setup and the Flink version you are using. Thanks a lot!

Cheers,
Till

On Mon, Aug 24, 2020 at 10:48 AM 耿延杰 <[hidden email]> wrote:
> Still failed after every 12 tasks.
> And the exception stack of failed tasks is different.
> [...]
What could also cause the problem is that the metaspace memory budget is configured too tightly. Here is a pointer to increasing the metaspace size [1].

[1] https://ci.apache.org/projects/flink/flink-docs-master/ops/memory/mem_trouble.html#outofmemoryerror-metaspace

Cheers,
Till

On Mon, Aug 24, 2020 at 1:49 PM Till Rohrmann <[hidden email]> wrote:
> Hi,
>
> could you share with us the Flink cluster logs? This would help answering
> a lot of questions around your setup and the Flink version you are using.
> Thanks a lot!
> [...]
The heap dump did not show anything too suspicious. The only thing I noticed is that there are 13 ChildFirstClassLoaders whereas there are only 6 Task instances in the heap dump. Are you running all 13 tasks on the same TaskExecutor?

Cheers,
Till

On Mon, Aug 24, 2020 at 2:01 PM Till Rohrmann <[hidden email]> wrote:
> What could also cause the problem is that the metaspace memory budget is
> configured too tightly. Here is a pointer to increasing the metaspace size
> [1].
> [...]
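The thread does not establish why more ChildFirstClassLoaders than Task instances are alive. One pattern that can keep a user-code class loader reachable, offered here only as a possibility and not as a diagnosis, is a JDBC driver that was loaded by that class loader and is still registered with java.sql.DriverManager. The following is a minimal sketch of deregistering such drivers. The class name JdbcDriverCleanup and the idea of calling it from the output format's close() method are assumptions, not part of the posted code.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Collections;

// Hypothetical helper, not part of the posted task code.
public class JdbcDriverCleanup {

    /**
     * Deregisters every JDBC driver whose class was loaded by the given
     * class loader and returns how many drivers were removed.
     */
    public static int deregisterDriversOf(ClassLoader loader) {
        int removed = 0;
        // Copy the enumeration first so that deregistering cannot disturb the iteration.
        // Note: DriverManager only lists drivers visible to the caller's class loader.
        for (Driver driver : Collections.list(DriverManager.getDrivers())) {
            if (driver.getClass().getClassLoader() == loader) {
                try {
                    DriverManager.deregisterDriver(driver);
                    removed++;
                } catch (SQLException e) {
                    throw new IllegalStateException("Could not deregister " + driver, e);
                }
            }
        }
        return removed;
    }
}
```

A possible call site, again as an assumption, would be JdbcDriverCleanup.deregisterDriversOf(getClass().getClassLoader()) after the connection has been closed. Other references, such as threads started by a client library, can also pin a class loader, so this alone may not be enough.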