[DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
Hi,

We recently added the "flink-contrib" module for user-contributed tools etc.

Over a recent weekend, I created a distributed TPC-H generator,
based on this library: https://github.com/airlift/tpch (which is from a
PrestoDB developer and available on Maven Central).

You can find my code here:
https://github.com/rmetzger/scratch/tree/distributed-tpch-generator

It contains two examples:
a) a full TPC-H data generator (as a Flink program):
https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGenerator.java

b) an example which generates two TPC-H tables on-the-fly to join them:
https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGeneratorExample.java
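
To illustrate the general idea behind a distributed generator, here is a minimal sketch, assuming each parallel worker is given a part number and the total part count and produces only its own slice of rows. It is not the code from the repository above and uses neither the airlift/tpch API nor Flink; the class name, the method names and the simplified row format are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: splitting row generation across parallel parts.
// None of these names come from airlift/tpch or Flink.
public class PartitionedGeneratorSketch {

    /** Returns the half-open row range [start, end) assigned to the given 1-based part. */
    static long[] rangeFor(long totalRows, int part, int partCount) {
        if (partCount < 1 || part < 1 || part > partCount) {
            throw new IllegalArgumentException("part must be in 1..partCount");
        }
        long base = totalRows / partCount;
        long remainder = totalRows % partCount;
        // The first `remainder` parts each take one extra row, so the ranges
        // cover every row exactly once without overlapping.
        long start = (part - 1) * base + Math.min(part - 1, remainder);
        long size = base + (part <= remainder ? 1 : 0);
        return new long[] {start, start + size};
    }

    /** Generates only the rows of one part; the row format is illustrative, not the real schema. */
    static List<String> generatePart(long totalRows, int part, int partCount) {
        long[] range = rangeFor(totalRows, part, partCount);
        List<String> rows = new ArrayList<>();
        for (long index = range[0]; index < range[1]; index++) {
            long key = index + 1; // keys start at 1
            rows.add(key + "|Customer#" + String.format("%09d", key));
        }
        return rows;
    }

    public static void main(String[] args) {
        long totalRows = 10;
        int partCount = 4;
        // In a real job each part would run on a different worker;
        // here they run one after another for illustration.
        for (int part = 1; part <= partCount; part++) {
            System.out.println("part " + part + ": " + generatePart(totalRows, part, partCount));
        }
    }
}
```

Because every part derives its range from the same three numbers, workers need no coordination, and the union of all parts equals what a single sequential run would produce.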


Before I spend time on integrating it into the "flink-contrib" package, I
was wondering if the community is willing to accept this contribution to Flink.


Best,
Robert

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Fabian Hueske-2
I think this is a great tool and would be a nice contribution.

I am however not sure about the licensing here. Even though the used
library appears to be AL2 licensed, I do not know if there are any
restrictions from the Transaction Processing Performance Council (TPC,
tpc.org). TPC-H is a benchmark published by the TPC and their rights might
be affected.

We should clarify that we are allowed to include this code under AL2.

Cheers, Fabian

2015-02-09 16:03 GMT+01:00 Robert Metzger <[hidden email]>:

> Hi,
>
> we recently added the "flink-contrib" module for user contributed tools
> etc.
>
> On one of the last weekends, I've created a distributed tpch generator,
> based on this library: https://github.com/airlift/tpch (which is from a
> PrestoDB developer and available on Maven central).
>
> You can find my code here:
> https://github.com/rmetzger/scratch/tree/distributed-tpch-generator
>
> It contains two examples:
> a) a full TPC data generator (as a flink program):
>
> https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGenerator.java
>
> b) an example which generates two TPC-H tables on-the-fly to join them:
>
> https://github.com/rmetzger/scratch/blob/distributed-tpch-generator/src/main/java/flink/generators/programs/TPCHGeneratorExample.java
>
>
> Before I spend time on integrating it into the "flink-contrib" package, I
> was wondering if the community is willing to accept this contribution to Flink.
>
>
> Best,
> Robert
>

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
Hi Fabian,

The legal TPC-H documents are hard to parse ;)
I think we don't have any issues with the tools or documents they provide,
because we are not modifying their generator tools.
Also, we are not publishing any performance numbers.
There is one remaining concern, and that's the TPC trademark. The source
files clearly refer to TPC, TPC-H and so on.

The license agreement (
http://www.tpc.org/tpcds/dsgen/tpc-ds%20license%20agreement.doc) contains:

> Use of Name. It is acknowledged that TPC claims ownership in all trademark
> and trade name rights in the names used by TPC in the Software and the
> Materials. User shall preserve any notices regarding such ownership. User
> may only use such trademarks and names owned by TPC in accordance with the
> trademark usage guidelines of the TPC (available on the TPC web site at
> www.tpc.org/trademarks).


However, the website is not really helpful: http://www.tpc.org/trademarks/

As one data point, the Apache Calcite (incubating) project also depends on
the mentioned airlift/tpch repository:
https://github.com/apache/incubator-calcite/blob/master/plus/pom.xml#L57
and
https://github.com/apache/incubator-calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java#L33
.

How about adding a line to the NOTICE files acknowledging that TPC is a
registered trademark of the Transaction Processing Performance Council?


Robert



On Mon, Feb 9, 2015 at 4:17 PM, Fabian Hueske <[hidden email]> wrote:

> I think this is a great tool and would be a nice contribution.
>
> I am however not sure about the licensing here. Even though the used
> library appears to be AL2 licensed, I do not know if there are any
> restrictions from the Transaction Processing Performance Council (TPC,
> tpc.org). TPC-H is a benchmark published by the TPC and their rights might
> be affected.
>
> We should clarify that we are allowed to include this code under AL2.
>
> Cheers, Fabian

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Ufuk Celebi-2
Nice, this is a great tool. :)

On 09 Feb 2015, at 17:05, Robert Metzger <[hidden email]> wrote:

> However, the website is not really helpful: http://www.tpc.org/trademarks/
>
> As one data point, the Apache Calcite (incubating) project also depends on
> the mentioned airlift/tpch repository:
> https://github.com/apache/incubator-calcite/blob/master/plus/pom.xml#L57
> and
> https://github.com/apache/incubator-calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java#L33
> .
>
> How about adding a line to the NOTICE files acknowledging that TPC is a
> registered trademark of the Transaction Processing Performance Council?

I find it reasonable to add it to the NOTICE files as an acknowledgement.

The trademark website says "For additional details please contact [hidden email]." If we want to be on the safe side, we could write an email and confirm.

Any further opinions on this?

– Ufuk

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Fabian Hueske-2
+1 for reaching out to the TPC.

It might also be that it is OK to add the code but not under the name TPC-H.

2015-02-11 13:55 GMT+01:00 Ufuk Celebi <[hidden email]>:

> Nice, this is a great tool. :)
>
> On 09 Feb 2015, at 17:05, Robert Metzger <[hidden email]> wrote:
>
> > However, the website is not really helpful:
> http://www.tpc.org/trademarks/
> >
> > As one data point, the Apache Calcite (incubating) project also depends
> on
> > the mentioned airlift/tpch repository:
> > https://github.com/apache/incubator-calcite/blob/master/plus/pom.xml#L57
> > and
> >
> https://github.com/apache/incubator-calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java#L33
> > .
> >
> > How about adding a line to the NOTICE files acknowledging that TPC is a
> > registered trademark of the Transaction Processing Performance Council?
>
> I find it reasonable to add it to the NOTICE files as an acknowledgement.
>
> The trademark website says "For additional details please contact
> [hidden email]." If we want to be on the safe side, we could write an
> email and confirm.
>
> Any further opinions on this?
>
> – Ufuk

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
Okay, thank you. I'll write a mail to tpc.org and ask which rules we have
to respect.

On Wed, Feb 11, 2015 at 2:16 PM, Fabian Hueske <[hidden email]> wrote:

> +1 for reaching out to the TPC.
>
> It might also be that it is OK to add the code but not under the name
> TPC-H.

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Stephan Ewen
I wrote to them some time ago (12+ months) asking whether we can
include TPC-H sample data for our programs. They replied that they were just
revising their license to allow that.

Should be possible now. Good idea to ping them again to make sure that it
is approved now and that it holds for code as well...

On Wed, Feb 11, 2015 at 2:22 PM, Robert Metzger <[hidden email]> wrote:

> Okay, thank you. I'll write a mail to tpc.org and ask which rules we have
> to respect.

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
I wrote to them twice but didn't receive an answer.
But given that Apache Calcite also uses airlift/tpch as a
dependency, I would like to add the TPC-H data generator to
"flink-contrib".
I would also add a note that TPC is a registered trademark and that our
generator is not the official generator and may not be used to generate
test data for performance measurement publications.


On Wed, Feb 11, 2015 at 3:02 PM, Stephan Ewen <[hidden email]> wrote:

> I wrote them some time ago (like 12+ months) about the question whether we
> can include TPCH sample data for our programs. They replied they were just
> revising their license to allow that.
>
> Should be possible now. Good idea to ping them again to make sure that it
> is approved now and that it holds for code as well...

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Henry Saputra
Robert,

Just curious: did you try emailing tpc.org to ask about fair
usage of example data?


- Henry

On Sat, Feb 28, 2015 at 12:07 PM, Robert Metzger <[hidden email]> wrote:

> I tried twice writing them but I didn't receive an answer.
> But given that Apache Calcite is also using airlift/tpch in its
> dependencies as well, I would like to add the TPC-H data generator to
> "flink-contrib".
> I would also add a note that TPC is a registered trademark and that our
> generator is not the official generator and may not be used to generate
> test data for performance measurement publications.

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
I've sent a message to [hidden email] and never got an answer. (On
http://www.tpc.org/trademarks/ they list [hidden email] as the right
address, but sending a message to admin@ redirects to admin-info@.)

My code doesn't contain any TPC data or code. It's a Java re-implementation
of the C data generator. The only thing it takes from TPC is the name "TPC". It
also tries to generate the same data as the official generator, but we
don't claim that it does.

On Mon, Mar 23, 2015 at 5:58 PM, Henry Saputra <[hidden email]>
wrote:

> Robert,
>
> Just curious if you did try to send email to tpc.org to ask about fair
> usage of example data?
>
>
> - Henry

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Henry Saputra
Yeah, I believe it should be ok since we do not actually package any
code bits from TPC-H.

I think giving a trademark acknowledgement to TPC-H in our NOTICE file should be good.

- Henry

On Mon, Mar 23, 2015 at 11:23 AM, Robert Metzger <[hidden email]> wrote:

> I've sent a message to [hidden email] and never got an answer. (on
> http://www.tpc.org/trademarks/ they list [hidden email] as the right
> address, but sending a message to admin@ redirects to admin-info@.)
>
> My code doesn't contain any TPC data or code. It's a Java re-implementation
> of the C data generator. The only thing it does is using the name "TPC". It
> also tries to generate the same data as the official generator, but we
> don't claim that.

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Robert Metzger
Great, thanks for taking the time to look at this!

If nobody objects in the next 48 hours, I'll open a PR for the TPC-H data
generator.

On Mon, Mar 23, 2015 at 7:34 PM, Henry Saputra <[hidden email]>
wrote:

> Yeah, I believe it should be ok since we do not actually package any
> code bits from TPC-H.
>
> I think giving trademark nudge to TPC-H in our NOTICE file should be good.
>
> - Henry

Re: [DISCUSS] Distributed TPC-H DataGenerator for flink-contrib

Fabian Hueske-2
+1
On Mar 23, 2015 7:39 PM, "Robert Metzger" <[hidden email]> wrote:

> Great, thanks for taking some time looking at this!
>
> If nobody objects in the next 48 hours, I'll open a PR for the TPC-H data
> generator.