I need to load a file that can be around 10 GB and later process it with Hadrian, the Java implementation of PFA, the Portable Format for Analytics (https://github.com/opendatagroup/hadrian). I would like to execute this transformation step on a specific task manager of the cluster, since I don't want to load 10 GB on every task manager node. Unfortunately, Hadrian cannot be executed in a distributed way.

So my question is: is there a way to do some routing with Flink so that this particular transformation step always executes on the same task manager node?

Perhaps my approach is completely wrong, so if anybody has any suggestions I would be more than happy to hear them :)

Thanks
Any ideas?
Looks close to what I am currently doing. There is no need to copy the big file onto every node:

- Copy it onto one node only.
- Read the data and send it to a Kafka cluster using a KafkaProducer object.
- Consume it with KafkaIO (in case it's a Beam app).
- Deploy to the node where the JobManager is running; it will then be executed in a distributed fashion across all nodes.

If that helps, I can privately show you how the logistics and the code may look, plus a loooooooooooot of tricks I have learned the hard way LOL! Thanks.
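As a rough illustration of the "read the data and send it" step (an assumption-laden sketch: the real version would call `org.apache.kafka.clients.producer.KafkaProducer`, which is replaced here by a plain list so the example runs standalone, and the topic name would be yours to pick), a large file can be streamed in fixed-size chunks rather than loaded whole:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

public class ChunkedSend {
    // Reads the stream in fixed-size chunks so a multi-GB file never has
    // to fit in memory at once. In the real Kafka version, each chunk
    // would become one producer record, e.g.
    //   producer.send(new ProducerRecord<>(topic, chunk));
    static List<byte[]> toChunks(InputStream in, int chunkSize) throws IOException {
        List<byte[]> chunks = new ArrayList<>();
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            byte[] chunk = new byte[n];
            System.arraycopy(buf, 0, chunk, 0, n); // copy only the bytes actually read
            chunks.add(chunk);
        }
        return chunks;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "pretend this is a 10 GB PFA document".getBytes();
        List<byte[]> chunks = toChunks(new ByteArrayInputStream(data), 8);
        System.out.println("sent " + chunks.size() + " chunks");
    }
}
```

The consumer side then reassembles or processes the chunks wherever Kafka delivers them, which is what makes the downstream step distributable.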
Amir

From: Mariano Gonzalez <[hidden email]>
To: [hidden email]
Sent: Thursday, September 29, 2016 3:07 PM
Subject: Re: Use specific Task Manager for heavy computations
I think what you're suggesting is to load the large file into Kafka, which will replicate it and make it available to all nodes. However, that is not what I want.

What I want is to run a specific transformation step on a specific TaskManager.
In reply to this post by Mariano Gonzalez
Hi Mariano,
currently, there is nothing available in Flink to execute an operation on a specific machine.

Regards,
Robert
So far, we have not introduced location constraints.
The reason is that this goes a bit against the paradigm of location transparency, which is necessary for failover, dynamically adjusting parallelism (a feature being worked on), etc.