[jira] [Created] (FLINK-6291) Internal Timer service cannot be "removed"

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

[jira] [Created] (FLINK-6291) Internal Timer service cannot be "removed"

Shang Yuanchun (Jira)
Gyula Fora created FLINK-6291:
---------------------------------

             Summary: Internal Timer service cannot be "removed"
                 Key: FLINK-6291
                 URL: https://issues.apache.org/jira/browse/FLINK-6291
             Project: Flink
          Issue Type: Bug
          Components: State Backends, Checkpointing, Streaming
    Affects Versions: 1.2.0
            Reporter: Gyula Fora


Currently it is not possible to register an internal timer service in one job and remove it after a savepoint as a nullpointer exception is thrown in the next savepoint:

Caused by: java.lang.Exception: Could not write timer service of MyOperator (17/60) to checkpoint state stream.
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:418)
        at com.king.rbea.backend.operators.scriptexecution.RBEAOperator.snapshotState(RBEAOperator.java:327)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:357)
        ... 13 more
Caused by: java.lang.NullPointerException
        at org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(HeapInternalTimerService.java:294)
        at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:414)
        ... 15 more

The timer serializer is null in this case as the timer service has never been started properly.

We should probably discard the timers for the services that are not reregistered after restore so we can get rid of the state completely.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
Reply | Threaded
Open this post in threaded view
|

Re: [jira] [Created] (FLINK-6291) Internal Timer service cannot be "removed"

Andrea Spina
Hi everybody,
I think I'm in the same issue above described in
https://issues.apache.org/jira/browse/FLINK-6291 . Flink1-6.4
I have had this savepoint with a timer service belonging to a process
function. When I restore a new job w/o the former process function ti fails
in the following way.
What is a valuable workaround for this?

        at
org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:641)
        at
org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:586)
        at
org.apache.flink.streaming.runtime.io.BarrierBuffer.notifyCheckpoint(BarrierBuffer.java:396)
        at
org.apache.flink.streaming.runtime.io.BarrierBuffer.processBarrier(BarrierBuffer.java:292)
        at
org.apache.flink.streaming.runtime.io.BarrierBuffer.getNextNonBlocked(BarrierBuffer.java:200)
        at
org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209)
        at
org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:105)
        at
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
        at
org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.snapshotTimersForKeyGroup(InternalTimerServiceImpl.java:264)
        at
org.apache.flink.streaming.api.operators.InternalTimerServiceSerializationProxy.write(InternalTimerServiceSerializationProxy.java:90)
        at
org.apache.flink.streaming.api.operators.InternalTimeServiceManager.snapshotStateForKeyGroup(InternalTimeServiceManager.java:139)
        at
org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:452)
        ... 15 more

Really thank you,
Andrea

Il giorno mar 11 apr 2017 alle ore 10:49 Gyula Fora (JIRA) <[hidden email]>
ha scritto:

> Gyula Fora created FLINK-6291:
> ---------------------------------
>
>              Summary: Internal Timer service cannot be "removed"
>                  Key: FLINK-6291
>                  URL: https://issues.apache.org/jira/browse/FLINK-6291
>              Project: Flink
>           Issue Type: Bug
>           Components: State Backends, Checkpointing, Streaming
>     Affects Versions: 1.2.0
>             Reporter: Gyula Fora
>
>
> Currently it is not possible to register an internal timer service in one
> job and remove it after a savepoint as a nullpointer exception is thrown in
> the next savepoint:
>
> Caused by: java.lang.Exception: Could not write timer service of
> MyOperator (17/60) to checkpoint state stream.
>         at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:418)
>         at
> com.king.rbea.backend.operators.scriptexecution.RBEAOperator.snapshotState(RBEAOperator.java:327)
>         at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:357)
>         ... 13 more
> Caused by: java.lang.NullPointerException
>         at
> org.apache.flink.streaming.api.operators.HeapInternalTimerService.snapshotTimersForKeyGroup(HeapInternalTimerService.java:294)
>         at
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:414)
>         ... 15 more
>
> The timer serializer is null in this case as the timer service has never
> been started properly.
>
> We should probably discard the timers for the services that are not
> reregistered after restore so we can get rid of the state completely.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.15#6346)
>


--
*Andrea Spina*
Head of R&D @ Radicalbit Srl
Via Giovanni Battista Pirelli 11, 20124, Milano - IT