[jira] [Created] (FLINK-1843) Job History gets cleared too fast


Shang Yuanchun (Jira)
Maximilian Michels created FLINK-1843:
-----------------------------------------

             Summary: Job History gets cleared too fast
                 Key: FLINK-1843
                 URL: https://issues.apache.org/jira/browse/FLINK-1843
             Project: Flink
          Issue Type: Bug
          Components: JobManager
    Affects Versions: 0.9
            Reporter: Maximilian Michels
             Fix For: 0.9


As per FLINK-1442, the JobManager stores the archived ExecutionGraph behind a SoftReference. At least for local setups, this mechanism doesn't seem to work properly. There are two issues:

- The history gets cleared too fast
- The history gets cleared in a non-sequential fashion, i.e. arbitrary old ExecutionGraphs are discarded

To solve these problems we might

- Store the least recent ExecutionGraphs behind a SoftReference
- Store the most recent ExecutionGraphs without a SoftReference

That way, we can save memory while keeping the latest history available to the user. We might introduce a configuration variable that lets the user specify the number of ExecutionGraphs to hold in memory. The remaining ones can be stored behind SoftReferences.
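
A minimal sketch of the proposal, not the actual JobManager code: the class name BoundedHistory, its methods, and the maxStrong parameter are hypothetical and stand in for the archive and the suggested configuration variable. The newest entries are held by ordinary references, and an entry is demoted to a SoftReference only once it falls out of that window, so the garbage collector can reclaim only the older part of the history.

```java
import java.lang.ref.SoftReference;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical archive: the newest maxStrong entries stay strongly
 * reachable; older entries are demoted to SoftReferences.
 * V stands in for an archived ExecutionGraph, K for its job ID.
 */
final class BoundedHistory<K, V> {
    private final int maxStrong; // assumed to come from a config value
    // Insertion-ordered, so iteration starts at the eldest entry.
    private final LinkedHashMap<K, V> recent = new LinkedHashMap<>();
    private final Map<K, SoftReference<V>> older = new LinkedHashMap<>();

    BoundedHistory(int maxStrong) {
        if (maxStrong < 0) {
            throw new IllegalArgumentException("maxStrong must be >= 0");
        }
        this.maxStrong = maxStrong;
    }

    /** Archives a finished job and demotes entries beyond the window. */
    synchronized void archive(K id, V graph) {
        older.remove(id);
        recent.remove(id);      // re-insert so the entry becomes the newest
        recent.put(id, graph);
        Iterator<Map.Entry<K, V>> it = recent.entrySet().iterator();
        while (recent.size() > maxStrong && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            older.put(eldest.getKey(), new SoftReference<>(eldest.getValue()));
            it.remove();
        }
    }

    /** Returns the archived value, or null if unknown or already collected. */
    synchronized V get(K id) {
        V strong = recent.get(id);
        if (strong != null) {
            return strong;
        }
        SoftReference<V> ref = older.get(id);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            older.remove(id);   // referent was reclaimed by the GC
        }
        return value;
    }

    synchronized int stronglyHeld() {
        return recent.size();
    }

    synchronized boolean isSoftlyHeld(K id) {
        return older.containsKey(id);
    }
}
```

With maxStrong set to 2, archiving three jobs leaves the two newest strongly held and only the oldest behind a SoftReference. The order in which the JVM clears the soft entries is still up to the garbage collector, so this bounds which part of the history can disappear rather than making clearing strictly sequential.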



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Re: [jira] [Created] (FLINK-1843) Job History gets cleared too fast

till.rohrmann
This also happens for cluster setups.

On Wed, Apr 8, 2015 at 11:29 AM, Maximilian Michels (JIRA) <[hidden email]>
wrote:

> Maximilian Michels created FLINK-1843:
> [...]