Gyula Fora created FLINK-6291:
--------------------------------- Summary: Internal Timer service cannot be "removed" Key: FLINK-6291 URL: https://issues.apache.org/jira/browse/FLINK-6291 Project: Flink Issue Type: Bug Components: State Backends, Checkpointing, Streaming Affects Versions: 1.2.0 Reporter: Gyula Fora Currently it is not possible to register an internal timer service in one job and remove it after a savepoint as a nullpointer exception is thrown in the next savepoint: Caused by: java.lang.Exception: Could not write timer service of MyOperator (17/60) to checkpoint state stream. at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:418) at com.king.rbea.backend.operators.scriptexecution.RBEAOperator.snapshotState(RBEAOperator.java:327) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:357) ... 13 more Caused by: java.lang.NullPointerException at org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(HeapInternalTimerService.java:294) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:414) ... 15 more The timer serializer is null in this case as the timer service has never been started properly. We should probably discard the timers for the services that are not reregistered after restore so we can get rid of the state completely. -- This message was sent by Atlassian JIRA (v6.3.15#6346) |
Hi everybody,
I think I'm in the same issue above described in https://issues.apache.org/jira/browse/FLINK-6291 . Flink1-6.4 I have had this savepoint with a timer service belonging to a process function. When I restore a new job w/o the former process function ti fails in the following way. What is a valuable workaround for this? at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:641) at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:586) at org.apache.flink.streaming.runtime.io.BarrierBuffer.notifyCheckpoint(BarrierBuffer.java:396) at org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:292) at org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:200) at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209) at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) at java.lang.Thread.run(Thread.java:748) Caused by: java.lang.NullPointerException at org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.snapshotTimersForKeyGroup(InternalTimerServiceImpl.java:264) at org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy.write(InternalTimerServiceSerializationProxy.java:90) at org.apache.flink.streaming.api.operators.InternalTimeServiceManager.snapshotStateForKeyGroup(InternalTimeServiceManager.java:139) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:452) ... 15 more Really thank you, Andrea Il giorno mar 11 apr 2017 alle ore 10:49 Gyula Fora (JIRA) <[hidden email]> ha scritto: > Gyula Fora created FLINK-6291: > --------------------------------- > > Summary: Internal Timer service cannot be "removed" > Key: FLINK-6291 > URL: https://issues.apache.org/jira/browse/FLINK-6291 > Project: Flink > Issue Type: Bug > Components: State Backends, Checkpointing, Streaming > Affects Versions: 1.2.0 > Reporter: Gyula Fora > > > Currently it is not possible to register an internal timer service in one > job and remove it after a savepoint as a nullpointer exception is thrown in > the next savepoint: > > Caused by: java.lang.Exception: Could not write timer service of > MyOperator (17/60) to checkpoint state stream. > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:418) > at > com.king.rbea.backend.operators.scriptexecution.RBEAOperator.snapshotState(RBEAOperator.java:327) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:357) > ... 13 more > Caused by: java.lang.NullPointerException > at > org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(HeapInternalTimerService.java:294) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:414) > ... 15 more > > The timer serializer is null in this case as the timer service has never > been started properly. > > We should probably discard the timers for the services that are not > reregistered after restore so we can get rid of the state completely. > > > > -- > This message was sent by Atlassian JIRA > (v6.3.15#6346) > -- *Andrea Spina* Head of R&D @ Radicalbit Srl Via Giovanni Battista Pirelli 11, 20124, Milano - IT |
Free forum by Nabble | Edit this page |