(DEPRECATED) Apache Flink Mailing List archive.

Accessing TM metrics

Gyula Fóra

Accessing TM metrics

Hey guys,

I am trying to look at the throughput of my Flink Streaming job over time.
Is there any way to extract this information from the dashboard, or is it
only possible to view the cumulative statistics at given points in time?

Also I am wondering whether there is any info about the latency in the
metrics somewhere.

Cheers,
Gyula

Stephan Ewen

Re: Accessing TM metrics

You probably need to calculate the throughput yourself at this point, from
the accumulated number of records. You can periodically poll the following
URLs via HTTP GET:

- /jobs/<jobid> : This gives you the aggregate number of records / bytes
per JobVertex
- /jobs/<jobid>/vertices/<vertexid> : This gives you accumulated records /
bytes for subtasks
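
The polling approach described above could be sketched roughly as follows in Python. This is only an illustration: the base URL, the JSON field names (`vertices`, `name`, `metrics`, `write-records`), and all function names are assumptions that are not confirmed anywhere in this thread, so check them against the actual response of your Flink version.

```python
import json
import time
from urllib.request import urlopen


def vertex_counts(job_json, field="write-records"):
    """Extract the cumulative record count per JobVertex.

    ASSUMPTION: the response looks like
    {"vertices": [{"name": ..., "metrics": {"write-records": N}}, ...]}.
    """
    return {
        v["name"]: v.get("metrics", {}).get(field, 0)
        for v in job_json.get("vertices", [])
    }


def rates(prev_counts, curr_counts, elapsed_s):
    """Turn two cumulative snapshots into records/second per vertex."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return {
        name: (count - prev_counts.get(name, 0)) / elapsed_s
        for name, count in curr_counts.items()
    }


def poll_throughput(base_url, job_id, interval_s=5.0):
    """Yield per-vertex records/second, one dict per polling interval."""
    prev = None  # (timestamp, counts) of the previous snapshot
    while True:
        with urlopen(f"{base_url}/jobs/{job_id}") as resp:  # HTTP GET
            counts = vertex_counts(json.load(resp))
        now = time.monotonic()
        if prev is not None:
            yield rates(prev[1], counts, now - prev[0])
        prev = (now, counts)
        time.sleep(interval_s)
```

Iterating over `poll_throughput("http://localhost:8081", "<jobid>")` would then produce one rate estimate per interval; both arguments are placeholders. The rate is just the difference of two cumulative counts divided by the elapsed time, so a shorter interval gives finer but noisier numbers.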

There is no latency metric right now. Latency is quite tricky to assess in
general. It needs timestamps attached at the sources and compared against
the clock at the sinks. So far, no problem, but this assumes that source and
sink clocks are closely in sync. If they are off by a few milliseconds, then
measurements of low latencies are already quite far off. We may decide to
accept that inaccuracy, or try to correct it a bit by letting the JobManager
broadcast its clock offsets and having the TaskManagers offset theirs.
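
To make the clock problem concrete, here is a small Python sketch of how a few milliseconds of skew distort a low latency, and how a known offset could be subtracted out. The function names and the convention "offset = local clock minus reference clock" are assumptions made for illustration. This is not an existing Flink mechanism.

```python
def observed_latency_ms(source_ts_ms, sink_ts_ms):
    """Naive latency: sink clock reading minus the source timestamp."""
    return sink_ts_ms - source_ts_ms


def corrected_latency_ms(source_ts_ms, sink_ts_ms,
                         source_offset_ms, sink_offset_ms):
    """Map both readings onto the reference clock before subtracting.

    ASSUMPTION: each offset is (local clock - reference clock) in ms.
    """
    return (sink_ts_ms - sink_offset_ms) - (source_ts_ms - source_offset_ms)


# A record leaves the source at reference time 1000 ms and reaches the
# sink at reference time 1002 ms, so the true latency is 2 ms.
source_ts = 1000 + 0  # source clock agrees with the reference
sink_ts = 1002 + 3    # sink clock runs 3 ms ahead

print(observed_latency_ms(source_ts, sink_ts))         # prints 5
print(corrected_latency_ms(source_ts, sink_ts, 0, 3))  # prints 2
```

With a true latency of 2 ms, a 3 ms skew more than doubles the reported value, which is why small offsets matter so much for low latencies.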

For experiments, we wrote special jobs where we could sample the records
that, after two re-partitionings, return to the same JVM, so we would not
have clock misalignment. We are still thinking about good ways to build a
general-purpose latency measurement mechanism.
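
The same-process sampling idea can be illustrated with a short Python sketch. It is not the code of the experiment jobs mentioned above: the class and method names are hypothetical, and a single Python process stands in for the JVM. The point is only that both timestamps come from one local monotonic clock, so no clock alignment is needed.

```python
import time


class LatencySampler:
    """Time every N-th record using a single local clock."""

    def __init__(self, sample_every=1000, clock=time.monotonic_ns):
        self.sample_every = sample_every
        self.clock = clock      # injectable so it can be replaced in tests
        self.in_flight = {}     # record id -> emit time in ns
        self.latencies_ns = []  # completed round-trip measurements
        self._seen = 0

    def on_emit(self, record_id):
        """Call when a record leaves; only every N-th record is tracked."""
        self._seen += 1
        if self._seen % self.sample_every == 0:
            self.in_flight[record_id] = self.clock()

    def on_return(self, record_id):
        """Call when a record arrives back in this same process."""
        start = self.in_flight.pop(record_id, None)
        if start is not None:  # unsampled records are ignored
            self.latencies_ns.append(self.clock() - start)
```

Sampling only a fraction of the records keeps the bookkeeping overhead small, at the cost of fewer measurements.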

If you have any ideas there, let me know!

Greetings,
Stephan

(In reply to Gyula Fóra's message of Sat, Nov 7, 2015 at 7:39 PM.)

Gyula Fóra

Re: Accessing TM metrics

Thanks Stephan, this should work for now :)

You are right, latency is quite tricky. I don't have any better ideas
either, but I will definitely let you know if I come up with any.

Gyula

(In reply to Stephan Ewen's message of Sat, Nov 7, 2015, 21:58.)