[jira] [Commented] (FLINK-939) Distribute required JAR files with separate service

Shang Yuanchun (Jira)

    [ https://issues.apache.org/jira/browse/FLINK-939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14032147#comment-14032147 ]

Stephan Ewen commented on FLINK-939:
------------------------------------

Daniel, I'd be very happy to have you on board for this :-) The current state of the discussion is the following:

We plan to switch from the custom (Hadoop-inspired) RPC to Akka. The reason is that we want a fast RPC that also works asynchronously (with futures), so that we can get away from the polling that happens in several places. The polling latencies in the client, and when asking for the address of a remote endpoint that is deployed lazily, currently eat up most of the local execution time in our tests. Akka seems to be a good fit for that. The actor system also handles the heartbeats between nodes and lets you listen for failures and delays.
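
To illustrate the difference between polling and a future-based call, here is a minimal sketch. It uses plain java.util.concurrent rather than the Akka API, and the names lookupEndpoint, resolve, and the returned address format are hypothetical, chosen only for illustration. The caller registers a continuation that runs as soon as the result is available, so no sleep-and-retry interval adds to the latency.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class AsyncLookupSketch {

    // Hypothetical stand-in for a remote call that resolves the address of a
    // lazily deployed endpoint. A real RPC layer would complete the future
    // when the response message arrives.
    static CompletableFuture<String> lookupEndpoint(ExecutorService pool, String taskId) {
        return CompletableFuture.supplyAsync(() -> "host-for-" + taskId + ":6123", pool);
    }

    // Chains a continuation onto the lookup instead of polling for the result.
    static String resolve(String taskId) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return lookupEndpoint(pool, taskId)
                    .thenApply(address -> "connected to " + address)
                    // Block with a timeout only at the very end, for the demo.
                    .get(5, TimeUnit.SECONDS);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolve("task-1"));
    }
}
```

With polling, the caller would instead ask repeatedly at a fixed interval, and on average half of that interval is wasted per request.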

Asterios Katsifodimos ([hidden email], https://github.com/asteriosk) has been working on this for the past days/weeks.

The restriction in Akka is the maximum frame size of messages. We are looking into different options to get around that. A "download" service for large blobs is one option. I personally would like to avoid a DFS dependency, because that would mean more configuration (currently it runs very nicely out of the box) and more latency (which we are trying to get down at the moment).
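
As a rough sketch of what the transfer side of such a "download" service could do, the following copies a blob through a small fixed-size buffer, so that memory use does not grow with the size of the JAR. This is an assumption about one possible design, not the planned implementation; the class name, the method copyInChunks, and the 64 KiB chunk size are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class BlobCopySketch {

    // Hypothetical chunk size. The point is only that it is fixed and small
    // compared to the blob, unlike a buffer sized to hold the whole file.
    static final int CHUNK_SIZE = 64 * 1024;

    // Streams all bytes from in to out through one reusable buffer and
    // returns the number of bytes copied.
    static long copyInChunks(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[CHUNK_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for a JAR file and a network socket.
        byte[] fakeJar = new byte[200_000];
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyInChunks(new ByteArrayInputStream(fakeJar), sink);
        System.out.println("bytes copied: " + copied);
    }
}
```

In a real service the input would be a file stream on the sender and the output a socket stream, so neither side needs to hold the full JAR on the heap.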



> Distribute required JAR files with separate service
> ---------------------------------------------------
>
>                 Key: FLINK-939
>                 URL: https://issues.apache.org/jira/browse/FLINK-939
>             Project: Flink
>          Issue Type: Improvement
>            Reporter: Ufuk Celebi
>            Assignee: Daniel Warneke
>
> Currently, required user JAR files are distributed via the RPC service in {{JobGraph.writeRequiredJarFiles(DataOutput, AbstractJobVertex[])}}. The RPC service then tries to allocate a buffer on the client-side heap to write the on-disk JAR, which [can lead to problems|https://github.com/apache/incubator-flink/pull/18].
> This should be replaced with a separate service.



--
This message was sent by Atlassian JIRA
(v6.2#6252)