Hello Everyone,
I have followed the steps in the link below to install and run Apache Flink on a multi-node cluster:
http://data-flair.training/blogs/install-run-deploy-flink-multi-node-cluster/
I used flink-1.3.2-bin-hadoop27-scala_2.10.tgz for the installation.

Using the command
"bin/flink run /home/root1/NAI/Tools/BEAM/Flink_Cluster/rama/flink/examples/streaming/WordCount.jar"
I was able to run WordCount, but where can I see which input was used and what output was generated?

And how can I specify the input and output paths?

I'm trying to understand how WordCount works on a multi-node cluster. Any suggestions would help my further understanding.

Thanks & Regards,
Ramanji.
Hi Ramanji,
you can find the source code of the examples here:
https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/WordCount.java

A general introduction to how cluster execution works can be found here:
https://ci.apache.org/projects/flink/flink-docs-release-1.4/concepts/programming-model.html#programs-and-dataflows
https://ci.apache.org/projects/flink/flink-docs-release-1.4/concepts/runtime.html

It might also be helpful to have a look at the web interface, which can show you a nice graph of the job.

I hope this helps. Feel free to ask further questions.

Regards,
Timo
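For quick reference, the core of the linked example looks roughly like the sketch below. This is a condensed, self-contained approximation rather than the exact source: the real example keeps the tokenizer in a separate class and falls back to a built-in Shakespeare text when --input is not given, while this sketch uses an inline tokenizer and a one-line fallback.

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.utils.ParameterTool;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.util.Collector;

    public class WordCountSketch {
        public static void main(String[] args) throws Exception {
            // --input and --output are parsed from the command-line arguments
            final ParameterTool params = ParameterTool.fromArgs(args);
            final StreamExecutionEnvironment env =
                    StreamExecutionEnvironment.getExecutionEnvironment();

            // without --input, the job runs on built-in sample text
            DataStream<String> text = params.has("input")
                    ? env.readTextFile(params.get("input"))
                    : env.fromElements("to be or not to be that is the question");

            DataStream<Tuple2<String, Integer>> counts = text
                    .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                        @Override
                        public void flatMap(String line, Collector<Tuple2<String, Integer>> out) {
                            // split each line into lowercase words and emit (word, 1) pairs
                            for (String word : line.toLowerCase().split("\\W+")) {
                                if (!word.isEmpty()) {
                                    out.collect(new Tuple2<>(word, 1));
                                }
                            }
                        }
                    })
                    .keyBy(0)   // group by the word
                    .sum(1);    // sum the counts per word

            // without --output, results go to the TaskManagers' stdout (see their logs)
            if (params.has("output")) {
                counts.writeAsText(params.get("output"));
            } else {
                counts.print();
            }

            env.execute("Streaming WordCount sketch");
        }
    }

So when the jar is run without --input and --output, the input is the built-in sample text and the output ends up in the TaskManager logs; passing the two options lets you control both paths explicitly.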
Thank you Timo.
root1@root1-HP-EliteBook-840-G2:~/NAI/Tools/BEAM/Flink_Cluster/rama/flink$ ./bin/flink run ./examples/streaming/WordCount.jar --input file:///home/root1/hamlet.txt --output file:///home/root1/wordcount_out

Execution of the WordCount jar gives this error:

08/07/2017 18:03:16     Source: Custom File Source(1/1) switched to FAILED
java.io.FileNotFoundException: The provided file path file:/home/root1/hamlet.txt does not exist.
        at org.apache.flink.streaming.api.functions.source.ContinuousFileMonitoringFunction.run(ContinuousFileMonitoringFunction.java:192)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
        at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:55)
        at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:95)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:748)
Make sure that the file exists and is accessible from all Flink task managers.
Hi Timo,
The problem is resolved after copying the input file to all task managers.

Where should the output file be generated? On the JobManager or on a TaskManager?

Where can I see the execution logs to understand how the word count was done on each task manager?

By the way, is there any option to overwrite the output?

08/07/2017 19:27:00     Keyed Aggregation -> Sink: Unnamed(1/1) switched to FAILED
java.io.IOException: File or directory already exists. Existing files and directories are not overwritten in NO_OVERWRITE mode. Use OVERWRITE mode to overwrite existing files and directories.
        at org.apache.flink.core.fs.FileSystem.initOutPathLocalFS(FileSystem.java:763)
        at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.initOutPathLocalFS(SafetyNetWrapperFileSystem.java:135)
        at org.apache.flink.api.common.io.FileOutputFormat.open(FileOutputFormat.java:231)
        at org.apache.flink.api.java.io.TextOutputFormat.open(TextOutputFormat.java:78)
        at org.apache.flink.streaming.api.functions.sink.OutputFormatSinkFunction.open(OutputFormatSinkFunction.java:61)
        at org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:36)
        at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.open(AbstractUdfStreamOperator.java:111)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:376)
        at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
        at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702)
        at java.lang.Thread.run(Thread.java:745)
Flink is distributed software for clusters. You need something like a distributed file system, so that the input and output files can be accessed from all nodes.

Each TM has a log directory where the execution logs are stored.

You can set additional properties on the output format by importing the example code into your IDE and adapting it.
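For the overwrite question in particular: if you import the example into your IDE and build your own copy, the text sink accepts a write mode. A minimal, hedged sketch of that change follows; the input data and the output path are placeholders standing in for the real WordCount pipeline.

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.core.fs.FileSystem;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class OverwriteSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // stand-in for the (word, count) stream that WordCount produces
            DataStream<Tuple2<String, Integer>> counts =
                    env.fromElements(Tuple2.of("hello", 1), Tuple2.of("world", 1));

            // OVERWRITE replaces the default NO_OVERWRITE behaviour that caused the
            // IOException above, so re-running the job replaces the previous output
            counts.writeAsText("file:///home/root1/wordcount_out",
                    FileSystem.WriteMode.OVERWRITE)
                  .setParallelism(1);   // optional: produce a single output file

            env.execute("overwrite sketch");
        }
    }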
Hi Timo,
How can I make the files accessible across the TMs?

Thanks & Regards,
Ramanji.
Hi,
like Timo said, you need a distributed file system, e.g. HDFS.

Best regards,
Felix
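Once the files live in a distributed file system, the example code needs no changes, because Flink picks the file system implementation from the URI scheme of the path. The sketch below assumes a namenode reachable at namenode-host:9000; the host, port, and paths are placeholders and must match your cluster's fs.defaultFS.

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class HdfsPathsSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // fully qualified HDFS URIs: every TaskManager resolves them against the
            // same namenode, so no per-node copies of the file are needed
            DataStream<String> text =
                    env.readTextFile("hdfs://namenode-host:9000/user/wordcount_input.txt");

            text.writeAsText("hdfs://namenode-host:9000/user/wordcount_output.txt");

            env.execute("HDFS paths sketch");
        }
    }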
Yes Felix,
I have created the input and output files in HDFS:
http://localhost:50070/explorer.html#/user

But how can I access them?

bin/flink run ./examples/streaming/WordCount.jar --input hdfs:///user/wordcount_input.txt --output hdfs:///user/wordcount_output.txt
Basically, I have added the input and output files to HDFS. But how do I specify the input file on the command line to run the WordCount program?

bin/flink run ./examples/streaming/WordCount.jar --input hdfs:///user/wordcount_input.txt --output hdfs:///user/wordcount_output.txt