xu created FLINK-12097:
-------------------------- Summary: Flink SQL hangs in StreamTableEnvironment.sqlUpdate, keeps executing and seems never stop, finally lead to java.lang.OutOfMemoryError: GC overhead limit exceeded Key: FLINK-12097 URL: https://issues.apache.org/jira/browse/FLINK-12097 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.7.2 Reporter: xu Attachments: DSL.txt Hi Experts, There is a Flink application(Version 1.7.2) which is written in Flink SQL, and the SQL in the application is quite long, consists of about 10 tables, 1500 lines in total. When executing, I found it is hanged in StreamTableEnvironment.sqlUpdate, keep executing some code about calcite and the memory usage keeps grown up, after several minutes java.lang.OutOfMemoryError: GC overhead limit exceeded is got. I get some thread dumps: at org.apache.calcite.plan.volcano.RuleQueue.popMatch(RuleQueue.java:475) at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:640) at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339) at org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:373) at org.apache.flink.table.api.TableEnvironment.optimizeLogicalPlan(TableEnvironment.scala:292) at org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:812) at org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) at org.apache.flink.table.api.StreamTableEnvironment.writeToSink(StreamTableEnvironment.scala:344) at org.apache.flink.table.api.TableEnvironment.insertInto(TableEnvironment.scala:879) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:817) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777) at java.io.PrintWriter.write(PrintWriter.java:473) at org.apache.calcite.rel.AbstractRelNode$1.explain_(AbstractRelNode.java:415) at org.apache.calcite.rel.externalize.RelWriterImpl.done(RelWriterImpl.java:156) at org.apache.calcite.rel.AbstractRelNode.explain(AbstractRelNode.java:312) at org.apache.calcite.rel.AbstractRelNode.computeDigest(AbstractRelNode.java:420) at org.apache.calcite.rel.AbstractRelNode.recomputeDigest(AbstractRelNode.java:356) at org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:350) at org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1484) at org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:859) at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:879) at org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:1755) at org.apache.calcite.plan.volcano.VolcanoRuleCall.transformTo(VolcanoRuleCall.java:135) at org.apache.calcite.plan.RelOptRuleCall.transformTo(RelOptRuleCall.java:234) at org.apache.calcite.rel.convert.ConverterRule.onMatch(ConverterRule.java:141) at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212) at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:646) at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339) at org.apache.flink.table.api.TableEnvironment.runVolcanoPlanner(TableEnvironment.scala:373) at org.apache.flink.table.api.TableEnvironment.optimizeLogicalPlan(TableEnvironment.scala:292) at org.apache.flink.table.api.StreamTableEnvironment.optimize(StreamTableEnvironment.scala:812) at org.apache.flink.table.api.StreamTableEnvironment.translate(StreamTableEnvironment.scala:860) at org.apache.flink.table.api.StreamTableEnvironment.writeToSink(StreamTableEnvironment.scala:344) at org.apache.flink.table.api.TableEnvironment.insertInto(TableEnvironment.scala:879) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:817) at org.apache.flink.table.api.TableEnvironment.sqlUpdate(TableEnvironment.scala:777) *Both point to some code about calcite.* And also I get the heap dump, found that there are *5703373 RexCall instances, and 5909525 String instances, 5909692 char[] instances ,**size is 6.8G*. I wonder why there are so many RexCall instances here, why it keeps on executing some calcite code and seems never stop. |char[]| |5,909,692 (16.4%)|6,873,430,938 (84.3%)| |java.lang.String| |5,909,525 (16.4%)|165,466,700 (2%)| |org.apache.calcite.rex.RexLocalRef| |5,901,313 (16.4%)|259,657,772 (3.2%)| |org.apache.flink.calcite.shaded.com.google.common.collect.RegularImmutableList| |5,739,479 (15.9%)|229,579,160 (2.8%)| |java.lang.Object[]| |5,732,702 (15.9%)|279,902,336 (3.4%)| |org.apache.calcite.rex.RexCall| |5,703,373 (15.8%)|273,761,904 (3.4%)| Because the SQL is a bit long, I put the SQL in the attachment. Also the SQL is written in DSL I invented. But it is translated to Flink SQL later. Look forward to your reply. Thanks a lot. Best Henry -- This message was sent by Atlassian JIRA (v7.6.3#76005) |
Free forum by Nabble | Edit this page |