HBase 0.98 addon for Flink 0.8

HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
As suggested by Fabian, I moved the discussion to this mailing list.

I think what still needs to be discussed is how to retrigger the build on Travis (I don't have an account) and whether the PR can be integrated.

Maybe what I can do is move the HBase example into the test package (right now it is in the main folder) so that it forces Travis to rebuild. I'll do it within a couple of hours.

Another thing I forgot to mention: the HBase extension is now compatible with both Hadoop 1 and 2.

Best,
Flavio

Re: HBase 0.98 addon for Flink 0.8

Fabian Hueske
You can also set up Travis to build your own GitHub repositories by linking it to your GitHub account. That way Travis can build all your branches (and you can also trigger rebuilds if something fails). I'm not sure whether we can manually retrigger builds on the Apache repository.

Support for Hadoop 1 and 2 is indeed a very good addition :-)

For the discussion about the PR itself, I would need a bit more time to become more familiar with HBase. I also don't have an HBase setup available here. Maybe somebody else in the community who was involved with a previous version of the HBase connector could comment on your question.

Best, Fabian


Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
Indeed, this time the build was successful :)


Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
Just one last thing: I removed the HbaseDataSink because I think it was using the old APIs. Can someone help me update that class?


Re: HBase 0.98 addon for Flink 0.8

Stephan Ewen
You do not really need an HBase data sink. You can call "DataSet.output(new HBaseOutputFormat())".

Stephan
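To illustrate the pattern suggested here, the sketch below mimics how a dataset can be pushed through an output format acting as a sink. This is an illustrative, self-contained sketch only: the `SimpleOutputFormat` interface and `FakeHBaseOutputFormat` class are simplified stand-ins invented for this example, not the real Flink or HBase APIs.

```java
import java.util.ArrayList;
import java.util.List;

class OutputFormatSketch {
    // Simplified stand-in for an output format that receives records one by one.
    interface SimpleOutputFormat<T> {
        void open();
        void writeRecord(T record);
        void close();
    }

    // Hypothetical HBase-style sink: a real connector would issue mutations
    // against a table here instead of collecting records in memory.
    static class FakeHBaseOutputFormat implements SimpleOutputFormat<String> {
        final List<String> written = new ArrayList<>();
        public void open() { /* connect to the table here */ }
        public void writeRecord(String record) { written.add(record); }
        public void close() { /* flush and release the connection here */ }
    }

    // Mimics the DataSet.output(...) call from the thread: every element of
    // the dataset is handed to the output format between open() and close().
    static <T> void output(List<T> dataSet, SimpleOutputFormat<T> format) {
        format.open();
        for (T element : dataSet) {
            format.writeRecord(element);
        }
        format.close();
    }

    public static void main(String[] args) {
        FakeHBaseOutputFormat sink = new FakeHBaseOutputFormat();
        output(List.of("row1", "row2"), sink);
        System.out.println(sink.written.size());
    }
}
```

The point of the pattern is that no dedicated sink class is needed: any output format can be plugged into the generic output call.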

Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
Ah ok, perfect! That was the reason why I removed it :)


Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
Maybe that's something I could add to the HBase example, and it could be better documented in the wiki.

Since we're talking about the wiki: I was looking at the Java API guide (http://flink.incubator.apache.org/docs/0.6-incubating/java_api_guide.html), and the link to the KMeans example is not working (where it says "For a complete example program, have a look at KMeans Algorithm").

Best,
Flavio

On Mon, Nov 3, 2014 at 9:12 AM, Flavio Pompermaier <[hidden email]>
wrote:

> Ah ok, perfect! That was the reason why I removed it :)
>
> On Mon, Nov 3, 2014 at 9:10 AM, Stephan Ewen <[hidden email]> wrote:
>
>> You do not really need a HBase data sink. You can call "DataSet.output(new
>> HBaseOutputFormat())"
>>
>> Stephan
>> Am 02.11.2014 23:05 schrieb "Flavio Pompermaier" <[hidden email]>:
>>
>> > Just one last thing..I removed the HbaseDataSink because I think it was
>> > using the old APIs..can someone help me in updating that class?
>> >
>> > On Sun, Nov 2, 2014 at 10:55 AM, Flavio Pompermaier <
>> [hidden email]>
>> > wrote:
>> >
>> > > Indeed this time the build has been successful :)
>> > >
>> > > On Sun, Nov 2, 2014 at 10:29 AM, Fabian Hueske <[hidden email]>
>> > wrote:
>> > >
>> > >> You can also setup Travis to build your own Github repositories by
>> > linking
>> > >> it to your Github account. That way Travis can build all your
>> branches
>> > >> (and
>> > >> you can also trigger rebuilds if something fails).
>> > >> Not sure if we can manually trigger retrigger builds on the Apache
>> > >> repository.
>> > >>
>> > >> Support for Hadoop 1 and 2 is indeed a very good addition :-)
>> > >>
>> > >> For the discusion about the PR itself, I would need a bit more time
>> to
>> > >> become more familiar with HBase. I do also not have a HBase setup
>> > >> available
>> > >> here.
>> > >> Maybe somebody else of the community who was involved with a previous
>> > >> version of the HBase connector could comment on your question.
>> > >>
>> > >> Best, Fabian
>> > >>
>> > >> 2014-11-02 9:57 GMT+01:00 Flavio Pompermaier <[hidden email]>:
>> > >>
>> > >> > As suggestes by Fabian I moved the discussion on this mailing list.
>> > >> >
>> > >> > I think that what is still to be discussed is how  to retrigger the
>> > >> build
>> > >> > on Travis (I don't have an account) and if the PR can be
>> integrated.
>> > >> >
>> > >> > Maybe what I can do is to move the HBase example in the test
>> package
>> > >> (right
>> > >> > now I left it in the main folder) so it will force Travis to
>> rebuild.
>> > >> > I'll do it within a couple of hours.
>> > >> >
>> > >> > Another thing I forgot to say is that the hbase extension is now
>> > >> compatible
>> > >> > with both hadoop 1 and 2.
>> > >> >
>> > >> > Best,
>> > >> > Flavio
>> > >>
>> > >
>> >
>>
>

Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
I was trying to modify the example to use hbaseDs.output(new HBaseOutputFormat()); but I can't see any HBaseOutputFormat class. Maybe we should use another class?


Re: HBase 0.98 addon for Flink 0.8

Fabian Hueske
I'm not familiar with the HBase connector code, but are you maybe looking
for the GenericTableOutputFormat?


Re: HBase 0.98 addon for Flink 0.8

Fabian Hueske
Ah, sorry. That's the one you removed ;-)


Re: HBase 0.98 addon for Flink 0.8

Fabian Hueske
There is no HBaseOutputFormat (or anything equivalent) as far as I can see. The only thing we had was the GenericTableOutputFormat, which was implemented against the deprecated Java Record API. We would need to adapt the GenericTableOutputFormat to the new API.
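As a rough sketch of the adaptation being discussed, the snippet below contrasts an untyped record-bag write path with a typed one. Everything here is a simplified stand-in invented for illustration: `UntypedRecord`, `TypedOutputFormat`, and `AdaptedTableOutputFormat` are hypothetical names, not the real deprecated Record API or the real GenericTableOutputFormat.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map.Entry;

class AdaptedTableOutputFormatSketch {
    // Old style: an untyped bag of fields, accessed by position
    // (a stand-in for the deprecated Record API).
    static class UntypedRecord {
        final Object[] fields;
        UntypedRecord(Object... fields) { this.fields = fields; }
    }

    // New style: a typed output format parameterized by its record type.
    interface TypedOutputFormat<T> {
        void writeRecord(T record);
    }

    // The adapted table format consumes typed (rowKey, value) entries directly
    // instead of casting fields out of an untyped record by index.
    static class AdaptedTableOutputFormat
            implements TypedOutputFormat<Entry<String, String>> {
        int writes = 0;
        public void writeRecord(Entry<String, String> record) {
            // A real implementation would turn this entry into a table mutation.
            writes++;
        }
    }

    public static void main(String[] args) {
        AdaptedTableOutputFormat format = new AdaptedTableOutputFormat();
        format.writeRecord(new SimpleEntry<>("row1", "value1"));
        System.out.println(format.writes);
    }
}
```

The migration work is essentially this shift: replacing positional field access on an untyped record with a typed record parameter, so the compiler checks what the sink receives.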


Re: HBase 0.98 addon for Flink 0.8

Stephan Ewen
Hi Flavio!

The link is broken, but it is also part of the outdated docs.

The current ones are the 0.7 docs under
http://flink.incubator.apache.org/docs/0.7-incubating/

Stephan



Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
In reply to this post by Fabian Hueske
That is one of the classes I removed, because it was using the deprecated
GenericDataSink API. I can restore it, but then it would be a good idea to
remove those warnings (also because, from what I understood, the Record APIs
are going to be removed).

On Mon, Nov 3, 2014 at 9:51 AM, Fabian Hueske <[hidden email]> wrote:

> I'm not familiar with the HBase connector code, but are you maybe looking
> for the GenericTableOutputFormat?
>
> 2014-11-03 9:44 GMT+01:00 Flavio Pompermaier <[hidden email]>:
>
> > I was trying to modify the example to set hbaseDs.output(new
> > HBaseOutputFormat()); but I can't see any HBaseOutputFormat class... maybe
> > we should use another class?
> >
> > On Mon, Nov 3, 2014 at 9:39 AM, Flavio Pompermaier <[hidden email]
> >
> > wrote:
> >
> > > Maybe that's something I could add to the HBase example and that could
> be
> > > better documented in the Wiki.
> > >
> > > Since we're talking about the wiki..I was looking at the Java API (
> > >
> >
> http://flink.incubator.apache.org/docs/0.6-incubating/java_api_guide.html)
> > > and the link to the KMeans example is not working (where it says For a
> > > complete example program, have a look at KMeans Algorithm).
> > >
> > > Best,
> > > Flavio
> > >
> > >
> > > On Mon, Nov 3, 2014 at 9:12 AM, Flavio Pompermaier <
> [hidden email]
> > >
> > > wrote:
> > >
> > >> Ah ok, perfect! That was the reason why I removed it :)
> > >>
> > >> On Mon, Nov 3, 2014 at 9:10 AM, Stephan Ewen <[hidden email]>
> wrote:
> > >>
> > >>> You do not really need a HBase data sink. You can call
> > >>> "DataSet.output(new
> > >>> HBaseOutputFormat())"
> > >>>
> > >>> Stephan
> > >>> Am 02.11.2014 23:05 schrieb "Flavio Pompermaier" <
> [hidden email]
> > >:
> > >>>
> > >>> > Just one last thing..I removed the HbaseDataSink because I think it
> > was
> > >>> > using the old APIs..can someone help me in updating that class?
> > >>> >
> > >>> > On Sun, Nov 2, 2014 at 10:55 AM, Flavio Pompermaier <
> > >>> [hidden email]>
> > >>> > wrote:
> > >>> >
> > >>> > > Indeed this time the build has been successful :)
> > >>> > >

Re: HBase 0.98 addon for Flink 0.8

Stephan Ewen
It is fine to remove it, in my opinion.

On Mon, Nov 3, 2014 at 10:11 AM, Flavio Pompermaier <[hidden email]>
wrote:

> That is one class I removed because it was using the deprecated API
> GenericDataSink..I can restore them but the it will be a good idea to
> remove those warning (also because from what I understood the Record APIs
> are going to be removed).
>

Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
The problem is that I also removed the GenericTableOutputFormat, because
there is an incompatibility between Hadoop 1 and Hadoop 2 for the classes
TaskAttemptContext and TaskAttemptContextImpl.
It would also be nice if the user didn't have to worry about passing the
pact.hbase.jtkey and pact.job.id parameters.
I think it is probably a good idea to drop Hadoop 1 compatibility, keep the
HBase addon enabled only for Hadoop 2 (as before), and decide how to manage
those two parameters.

On Mon, Nov 3, 2014 at 10:19 AM, Stephan Ewen <[hidden email]> wrote:

> It is fine to remove it, in my opinion.

Re: HBase 0.98 addon for Flink 0.8

Stephan Ewen
Hi!

The way of passing parameters through the configuration is very old (the
original HBase format dates back to that time). I would simply make the
HBase format take those parameters through the constructor.

Greetings,
Stephan
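
To make the suggestion concrete: a minimal, hypothetical sketch of what "taking the parameters through the constructor" could look like. The class and field names below are illustrative, not the actual connector API.

```java
// Hypothetical sketch: instead of reading values such as "pact.hbase.jtkey"
// and "pact.job.id" out of the Configuration at runtime, the output format
// receives everything it needs when it is constructed.
class HBaseOutputFormatSketch implements java.io.Serializable {

    private final String tableName;     // target HBase table (illustrative)
    private final String columnFamily;  // column family to write into (illustrative)

    HBaseOutputFormatSketch(String tableName, String columnFamily) {
        this.tableName = tableName;
        this.columnFamily = columnFamily;
    }

    String getTableName()    { return tableName; }
    String getColumnFamily() { return columnFamily; }
}
```

Since the format object is shipped to the workers, final fields set in the constructor travel with it, and no configuration keys need to be agreed on between user code and the format.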


On Mon, Nov 3, 2014 at 10:59 AM, Flavio Pompermaier <[hidden email]>
wrote:

> The problem is that I also removed the GenericTableOutputFormat because
> there is an incompatibility between hadoop1 and hadoop2 for class
> TaskAttemptContext and TaskAttemptContextImpl..
> then it would be nice if the user doesn't have to worry about passing
> pact.hbase.jtkey and pact.job.id parameters..
> I think it is probably a good idea to remove hadoop1 compatibility and keep
> enable HBase addon only for hadoop2 (as before) and decide how to mange
> those 2 parameters..
>

Re: HBase 0.98 addon for Flink 0.8

Fabian Hueske
Hi Flavio,

let me try to answer your last question from the user's list (to the best of
my HBase knowledge):
"I just wanted to know if and how region splitting is handled. Can you
explain to me in detail how Flink and HBase work together? What is not fully
clear to me is when computation is done by the region servers, when data
starts to flow to a Flink worker (which in my test job is only my PC), and
how to read the important logged info to understand whether my job is
performing well."

HBase partitions its tables into so-called "regions" of keys and stores the
regions distributed across the cluster using HDFS. I think an HBase region
can be thought of as an HDFS block. To make reading an HBase table efficient,
regions should be read locally, i.e., an InputFormat should primarily read
regions that are stored on the same machine it is running on. Flink's
InputSplits partition the HBase input by regions and add information about
the storage location of each region. During execution, input splits are
assigned to InputFormats that can do local reads.
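
As an illustration of the locality idea, a small self-contained sketch (simplified; the names below are made up for the example, while Flink's actual base class for location-aware splits is LocatableInputSplit):

```java
import java.util.Arrays;

// One split per HBase region: it carries the key range of the region and the
// hostnames of the machine(s) storing it, so the scheduler can prefer
// assigning it to a worker running on one of those hosts (a local read).
class RegionSplit {

    final byte[] startKey;
    final byte[] endKey;
    final String[] hostnames;

    RegionSplit(byte[] startKey, byte[] endKey, String[] hostnames) {
        this.startKey = startKey;
        this.endKey = endKey;
        this.hostnames = hostnames;
    }

    // True if the worker runs on a host that stores this region.
    boolean isLocalTo(String workerHost) {
        return Arrays.asList(hostnames).contains(workerHost);
    }
}
```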

Best, Fabian

2014-11-03 11:13 GMT+01:00 Stephan Ewen <[hidden email]>:

> Hi!
>
> The way of passing parameters through the configuration is very old (the
> original HBase format dated back to that time). I would simply make the
> HBase format take those parameters through the constructor.
>
> Greetings,
> Stephan
>

Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
Thanks for the detailed answer. So if I run a job from my machine, all the
scanned data of the table will have to be shipped to it... right?

Still regarding the GenericTableOutputFormat, it is not clear to me how to
proceed.
I saw in the hadoop-compatibility addon that it is possible to achieve such
compatibility using the HadoopUtils class, so the open method should become
something like:

@Override
public void open(int taskNumber, int numTasks) throws IOException {
    if (Integer.toString(taskNumber + 1).length() > 6) {
        throw new IOException("Task id too large.");
    }
    TaskAttemptID taskAttemptID = TaskAttemptID.forName("attempt__0000_r_"
            + String.format("%" + (6 - Integer.toString(taskNumber + 1).length()) + "s", " ").replace(" ", "0")
            + Integer.toString(taskNumber + 1)
            + "_0");

    // for hadoop 1.x
    this.configuration.set("mapred.task.id", taskAttemptID.toString());
    this.configuration.setInt("mapred.task.partition", taskNumber + 1);
    // for hadoop 2.2
    this.configuration.set("mapreduce.task.attempt.id", taskAttemptID.toString());
    this.configuration.setInt("mapreduce.task.partition", taskNumber + 1);

    try {
        this.context = HadoopUtils.instantiateTaskAttemptContext(this.configuration, taskAttemptID);
    } catch (Exception e) {
        throw new RuntimeException(e);
    }

    final HFileOutputFormat2 outFormat = new HFileOutputFormat2();
    try {
        this.writer = outFormat.getRecordWriter(this.context);
    } catch (InterruptedException iex) {
        throw new IOException("Opening the writer was interrupted.", iex);
    }
}
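
As a side note, the zero-padding dance above can be written more compactly with a fixed-width format specifier; a small self-contained sketch (the "attempt__0000_r_..._0" pattern is taken from the snippet above, and the helper class name is made up):

```java
// Builds the same "attempt__0000_r_<6-digit-task>_0" id as the snippet above,
// using %06d instead of padding by hand.
class TaskAttemptIds {
    static String taskAttemptId(int taskNumber) {
        int id = taskNumber + 1;
        if (String.valueOf(id).length() > 6) {
            throw new IllegalArgumentException("Task id too large.");
        }
        return "attempt__0000_r_" + String.format("%06d", id) + "_0";
    }
}
```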

But I'm not sure how to pass the JobConf to the class, whether to merge
config files, where HFileOutputFormat2 writes the data, and how to implement
the public void writeRecord(Record record) API.
Could I have a little chat off the mailing list with the implementor of this
extension?

On Mon, Nov 3, 2014 at 11:51 AM, Fabian Hueske <[hidden email]> wrote:

> Hi Flavio
>
> let me try to answer your last question on the user's list (to the best of
> my HBase knowledge).
> "I just wanted to known if and how regiom splitting is handled. Can you
> explain me in detail how Flink and HBase works?what is not fully clear to
> me is when computation is done by region servers and when data start flow
> to a Flink worker (that in ky test job is only my pc) and how ro undertsand
> better the important logged info to understand if my job is performing
> well"
>
> HBase partitions its tables into so called "regions" of keys and stores the
> regions distributed in the cluster using HDFS. I think an HBase region can
> be thought of as a HDFS block. To make reading an HBase table efficient,
> region reads should be locally done, i.e., an InputFormat should primarily
> read region that are stored on the same machine as the IF is running on.
> Flink's InputSplits partition the HBase input by regions and add
> information about the storage location of the region. During execution,
> input splits are assigned to InputFormats that can do local reads.
>
> Best, Fabian
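The region/locality scheme described above can be sketched in a few lines: each split carries the hosts that store its region, and split assignment prefers a split local to the requesting worker. RegionSplit and nextSplit are illustrative stand-ins, not Flink's actual InputSplit/split-assigner API:

```java
// Hypothetical sketch of locality-aware split assignment: prefer a split
// whose region is stored on the requesting worker's host, else read remotely.
// RegionSplit is a stand-in, NOT Flink's actual InputSplit class.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LocalityAssignmentSketch {

    /** One split per HBase region, carrying the hosts that store the region. */
    static class RegionSplit {
        final String startKey;
        final List<String> hosts;   // where the region's HDFS blocks live
        RegionSplit(String startKey, String... hosts) {
            this.startKey = startKey;
            this.hosts = Arrays.asList(hosts);
        }
    }

    /** Hand out a split local to workerHost if one is left, else any remaining split. */
    static RegionSplit nextSplit(List<RegionSplit> remaining, String workerHost) {
        for (RegionSplit s : remaining) {
            if (s.hosts.contains(workerHost)) {   // local read: preferred
                remaining.remove(s);
                return s;
            }
        }
        // no local split left: fall back to a remote read
        return remaining.isEmpty() ? null : remaining.remove(0);
    }

    public static void main(String[] args) {
        List<RegionSplit> splits = new ArrayList<>(List.of(
            new RegionSplit("a", "host-1"),
            new RegionSplit("m", "host-2"),
            new RegionSplit("t", "host-1", "host-2")));
        // host-2 asks first and gets the region stored on host-2
        System.out.println(nextSplit(splits, "host-2").startKey);  // prints: m
    }
}
```

This is why a job run from a single machine outside the cluster ends up pulling all scanned data over the network: no split is ever local to that machine.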

Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
I've just updated the code on my fork (synced with the current master and
applied the improvements suggested in comments on the related PR).
I still have to understand how to write results back to an HBase
Sink/OutputFormat...
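For reference, such a sink would typically follow the OutputFormat lifecycle: open a connection, buffer mutations in writeRecord, and flush on close. This is a minimal stand-in sketch (Put and BufferedHBaseSink are invented for illustration, not the real HBase client or Flink API):

```java
// Hypothetical sketch of an OutputFormat-style HBase sink lifecycle:
// open() connects, writeRecord() buffers, close() flushes the remainder.
// Put and BufferedHBaseSink are stand-ins, NOT the real HBase/Flink classes.
import java.util.ArrayList;
import java.util.List;

public class HBaseSinkSketch {

    /** Stand-in for an HBase Put mutation. */
    static class Put {
        final String row, value;
        Put(String row, String value) { this.row = row; this.value = value; }
    }

    static class BufferedHBaseSink {
        private final int flushSize;
        private final List<Put> buffer = new ArrayList<>();
        int flushes = 0;   // how often a batch was sent to the (imaginary) table

        BufferedHBaseSink(int flushSize) { this.flushSize = flushSize; }

        /** open(taskNumber, numTasks) would create the table connection here. */
        void open() { /* connect to table */ }

        void writeRecord(String row, String value) {
            buffer.add(new Put(row, value));
            if (buffer.size() >= flushSize) flush();
        }

        /** close() must flush whatever is still buffered. */
        void close() { if (!buffer.isEmpty()) flush(); }

        private void flush() {
            // real code would do table.put(buffer); here we only count the batch
            flushes++;
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        BufferedHBaseSink sink = new BufferedHBaseSink(2);
        sink.open();
        sink.writeRecord("r1", "a");
        sink.writeRecord("r2", "b");   // reaches flushSize, triggers a flush
        sink.writeRecord("r3", "c");
        sink.close();                  // flushes the remainder
        System.out.println(sink.flushes);  // prints: 2
    }
}
```

Batching like this matters because issuing one round trip per record would dominate the job's runtime.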


Re: HBase 0.98 addon for Flink 0.8

Flavio Pompermaier
I also fixed the build profile for Cloudera CDH 5.1.3. You can build it with
the command:
      mvn clean install -Dmaven.test.skip=true -Dhadoop.profile=2 -Pvendor-repos,cdh5.1.3

However, it would be good to generate the vendor-specific jar when
releasing (e.g. flink-addons:flink-hbase:0.8.0-hadoop2-cdh5.1.3-incubating).

Best,
Flavio
