Jiayi Liao created FLINK-20044:
----------------------------------
Summary: Disposal of RocksDB could last forever
Key: FLINK-20044
URL: https://issues.apache.org/jira/browse/FLINK-20044
Project: Flink
Issue Type: Bug
Components: Runtime / State Backends
Affects Versions: 1.9.0
Reporter: Jiayi Liao
The task cannot fail on its own because it is stuck in the disposal of RocksDB, which also affects the job. I have seen this several times in recent months; most of the errors were caused by a broken disk. Even so, I think Flink should handle this situation more gracefully on its side.
{code:java}
"LookUp_Join -> Sink_Unnamed (898/1777)- execution # 4" #411 prio=5 os_prio=0 tid=0x00007fc9b0286800 nid=0xff6fc runnable [0x00007fc966cfc000]
java.lang.Thread.State: RUNNABLE
at org.rocksdb.RocksDB.disposeInternal(Native Method)
at org.rocksdb.RocksObject.disposeInternal(RocksObject.java:37)
at org.rocksdb.AbstractImmutableNativeReference.close(AbstractImmutableNativeReference.java:57)
at org.apache.flink.util.IOUtils.closeQuietly(IOUtils.java:263)
at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.dispose(RocksDBKeyedStateBackend.java:349)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.dispose(AbstractStreamOperator.java:371)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.dispose(AbstractUdfStreamOperator.java:124)
at org.apache.flink.streaming.runtime.tasks.StreamTask.disposeAllOperators(StreamTask.java:618)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:517)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:733)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:539)
at java.lang.Thread.run(Thread.java:748)
{code}
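As an illustration only (this is not a proposed patch, and nothing here is existing Flink API): one way to keep the task thread from blocking indefinitely would be to run the close call on a separate daemon thread and stop waiting after a timeout. The class name `TimedClose`, the method `closeWithTimeout`, and the timeout value are hypothetical. Note that a thread stuck in native code cannot be interrupted, so this sketch only lets the caller move on; the stuck thread itself is leaked.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not part of Flink: closes a resource but stops waiting after a timeout. */
final class TimedClose {

    /**
     * Runs resource.close() on a daemon thread and waits at most timeoutMillis.
     * Returns true if close() finished in time, false if it timed out or failed.
     */
    static boolean closeWithTimeout(AutoCloseable resource, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<?> future = executor.submit(() -> {
            resource.close();
            return null;
        });
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The close call is still running; we only stop waiting for it.
            return false;
        } catch (Exception e) {
            // close() threw, or the waiting thread was interrupted.
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
            }
            return false;
        } finally {
            // Interrupts the worker if possible; has no effect on a thread blocked in native code.
            executor.shutdownNow();
        }
    }
}
```

A caller could then log and continue with the remaining cleanup when `closeWithTimeout(db, 30_000)` returns false, instead of staying in `dispose()` forever. Whether abandoning a native handle like this is safe for RocksDB is an open question that this sketch does not answer.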
--
This message was sent by Atlassian Jira
(v8.3.4#803005)