[Dev] Issue related to using Flink DataSet<T> methods


Pawan Manishka Gunarathna
Hi,

I have implemented Flink's InputFormat interface for my data source.
It uses our own data type, *Record*. My class looks as follows:

public class DASInputFormat implements InputFormat<Record,DASInputSplit> {
}

When I call the print() method, my console shows the Flink execution,
but nothing is printed. How can I read/print the available records in
my data source table?

-----------------------------------------------------------------------------------

ExecutionEnvironment environment = ExecutionEnvironment.getExecutionEnvironment();
DASInputFormat dasInputFormat = new DASInputFormat(1, "SAMPLETABLE1", 2, null,
        Long.MIN_VALUE, Long.MAX_VALUE, 0, -1);
DataSet<Record> dasRecords = environment.createInput(dasInputFormat);
dasRecords.print();

Thanks,
Pawan
--

*Pawan Gunaratne*
*Mob: +94 770373556*

Re: [Dev] Issue related to using Flink DataSet<T> methods

xingcanc
Hi Pawan,

In Flink, most of the methods for DataSet (including print()) just add
operators to the plan without actually running it. If the DASInputFormat has
no errors, you can run the plan by calling environment.execute().

Best,
Xingcan


Re: [Dev] Issue related to using Flink DataSet<T> methods

Pawan Manishka Gunarathna
Hi,

So how can I read the available records of my data source? I saw in some
examples that the print() method prints the available data of a
data source (such as files).

Thanks,
Pawan



Re: [Dev] Issue related to using Flink DataSet<T> methods

Fabian Hueske-2
Hi Pawan,

In the DataSet API, DataSet.print() triggers the execution (you do not
need to call ExecutionEnvironment.execute()).
The DataSet is printed to the standard output of the process that submits
the program. This only works for small DataSets.
In general, print() should only be used when developing jobs.

You can also use DataSet.printOnTaskManager(), which writes to the standard
output of the TaskManager processes, usually to .out files in the ./log folder.
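
To make the difference concrete, here is a toy model in plain Java. It does
not use Flink's classes: ToyEnvironment and ToyDataSet are hypothetical names
invented for this sketch, and it collects output in a list instead of writing
to any log file. It only mimics the behaviour described above: print() reads
the source right away, while printOnTaskManager(), as far as I know, only
registers a sink that is written once execute() is called.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

class EagerVsDeferredSinks {

    /** Hypothetical stand-in for an execution environment; not a Flink class. */
    static class ToyEnvironment {
        private final List<Runnable> deferredSinks = new ArrayList<>();
        /** Everything that any sink has emitted so far, in order. */
        final List<String> log = new ArrayList<>();

        <T> ToyDataSet<T> createInput(Supplier<List<T>> source) {
            return new ToyDataSet<>(this, source);
        }

        /** Runs every sink that was registered but has not run yet. */
        void execute() {
            for (Runnable sink : deferredSinks) {
                sink.run();
            }
            deferredSinks.clear();
        }
    }

    /** Hypothetical stand-in for a data set backed by a re-readable source. */
    static class ToyDataSet<T> {
        private final ToyEnvironment env;
        private final Supplier<List<T>> source;

        ToyDataSet(ToyEnvironment env, Supplier<List<T>> source) {
            this.env = env;
            this.source = source;
        }

        /** Eager: reads the source and emits immediately. */
        void print() {
            for (T record : source.get()) {
                env.log.add(String.valueOf(record));
            }
        }

        /** Deferred: only registers the sink; nothing is read until execute(). */
        void printOnTaskManager(String prefix) {
            env.deferredSinks.add(() -> {
                for (T record : source.get()) {
                    env.log.add(prefix + "> " + record);
                }
            });
        }
    }

    public static void main(String[] args) {
        ToyEnvironment env = new ToyEnvironment();
        ToyDataSet<String> records = env.createInput(() -> Arrays.asList("r1", "r2"));

        records.print();                  // emits r1 and r2 right away
        records.printOnTaskManager("tm"); // emits nothing yet
        env.execute();                    // the deferred sink runs here

        env.log.forEach(System.out::println);
    }
}
```

In this model, an empty result from print() means the source itself returned
no records, which is why the input format is the next thing to check.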

Best, Fabian


Re: [Dev] Issue related to using Flink DataSet<T> methods

xingcanc
Hi Pawan,

@Fabian is right; I mistakenly assumed this was the streaming environment.
Sorry about that.

What do you mean by `read the available records of my datasource`? How did
you implement the nextRecord() method in DASInputFormat?
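
The reason for asking: as far as I understand, a source task opens a split,
calls nextRecord() for as long as reachedEnd() returns false, and then closes
the format. If reachedEnd() returns true right after open(), the job runs but
print() shows nothing. Below is a minimal plain-Java sketch of that read loop.
SimpleInputFormat is a simplified, hypothetical interface written for this
example (it is not Flink's InputFormat), InMemoryTableFormat merely stands in
for DASInputFormat, and the records are plain Strings rather than Record.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class InputFormatReadLoop {

    /** Simplified, hypothetical subset of what an input format exposes. */
    interface SimpleInputFormat<T> {
        void open(int splitNumber);
        boolean reachedEnd();
        T nextRecord(T reuse);
        void close();
    }

    /** In-memory format serving the rows of one table (illustration only). */
    static class InMemoryTableFormat implements SimpleInputFormat<String> {
        private final List<String> rows;
        private Iterator<String> cursor;

        InMemoryTableFormat(List<String> rows) {
            this.rows = rows;
        }

        @Override
        public void open(int splitNumber) {
            // A real format would connect to the table and seek to the split here.
            cursor = rows.iterator();
        }

        @Override
        public boolean reachedEnd() {
            // Must stay false while records remain, otherwise nothing is read.
            return cursor == null || !cursor.hasNext();
        }

        @Override
        public String nextRecord(String reuse) {
            return cursor.next();
        }

        @Override
        public void close() {
            cursor = null;
        }
    }

    /** Reads one split with the open / reachedEnd / nextRecord / close loop. */
    static <T> List<T> readSplit(SimpleInputFormat<T> format, int splitNumber) {
        List<T> out = new ArrayList<>();
        format.open(splitNumber);
        try {
            while (!format.reachedEnd()) {
                T record = format.nextRecord(null);
                if (record != null) {
                    out.add(record);
                }
            }
        } finally {
            format.close();
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("row-1", "row-2", "row-3");
        for (String row : readSplit(new InMemoryTableFormat(rows), 0)) {
            System.out.println(row);
        }
    }
}
```

Running a loop like readSplit() directly against DASInputFormat, outside of
Flink, is a quick way to check whether it returns any records at all.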

Best,
Xingcan



Re: [Dev] Issue related to using Flink DataSet<T> methods

Pawan Manishka Gunarathna
Hi,

Thanks @Fabian and @Xingcan for the explanation.

@Xingcan, what I mean is that I have a data analytics server that holds
*data tables*. My initial requirement is to build a client connector for Flink
to access those *data tables*. I started by implementing the Flink InputFormat
interface, and that is done. Before doing any processing with Flink, I
need to read the table data. That is what I meant.

My nextRecord() method returns our own data type, called *Record*. The
following are some sample Records from my data table.

------------------------------------------------------------

TID: 1 Table Name: SAMPLETABLE1 ID: 818ab992-fdd6-4eea-7838-73dfbe5b3b0e Timestamp: 1488359699780 Values: {STR2=Steve, STR1=Smith}
TID: 1 Table Name: SAMPLETABLE1 ID: dfe3a6c6-af82-e7f5-bfdd-4ac22f832464 Timestamp: 1488359699780 Values: {STR2=Nuwan, STR1=Pawan}
TID: 1 Table Name: SAMPLETABLE1 ID: 2933d3d9-fff7-94ed-ded4-ef023baa3783 Timestamp: 1488359699780 Values: {STR2=Gayle, STR1=Kohli}

------------------------------------------------------------


Thanks,
Pawan
