ITCases in the Table API

ITCases in the Table API

Stephan Ewen
Hi!

I want to bring up the discussion again about writing unit tests rather
than many ITCases. I looked a bit through the Table API and it looks like
there are virtually no unit tests, but everything has an ITCase.

I would really encourage writing more unit tests.

The DataStream API is actually a good example: It does not have an ITCase
for every operator - operators are all unit tested with a test harness
(mock contexts and environments). There are a few end-to-end ITCases for
certain functionalities, like Checkpointing, State Backends, or Timestamp
Handling.
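
A minimal sketch of what a harness-based operator unit test can look like.
This is not Flink's actual test harness: `Collector`, `Operator`,
`MapOperator`, and `OperatorHarness` are hypothetical stand-ins invented for
illustration, assuming an operator that consumes one record at a time and
emits results through a collector. The point is only that the operator is
driven directly, without starting a cluster or running a job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class HarnessSketch {

    /** Receives records emitted by an operator (hypothetical stand-in). */
    interface Collector<T> {
        void collect(T record);
    }

    /** A one-input operator: consumes one record, emits zero or more. */
    interface Operator<IN, OUT> {
        void processElement(IN record, Collector<OUT> out);
    }

    /** Operator under test: applies a function to every record. */
    static final class MapOperator<IN, OUT> implements Operator<IN, OUT> {
        private final Function<IN, OUT> fn;

        MapOperator(Function<IN, OUT> fn) {
            this.fn = fn;
        }

        @Override
        public void processElement(IN record, Collector<OUT> out) {
            out.collect(fn.apply(record));
        }
    }

    /** Harness: feeds records to the operator and records what it emits. */
    static final class OperatorHarness<IN, OUT> {
        private final Operator<IN, OUT> operator;
        private final List<OUT> output = new ArrayList<>();

        OperatorHarness(Operator<IN, OUT> operator) {
            this.operator = operator;
        }

        void processElement(IN record) {
            // The collector handed to the operator just appends to a list.
            operator.processElement(record, output::add);
        }

        List<OUT> getOutput() {
            return output;
        }
    }

    public static void main(String[] args) {
        OperatorHarness<String, Integer> harness =
                new OperatorHarness<String, Integer>(
                        new MapOperator<String, Integer>(String::length));

        harness.processElement("table");
        harness.processElement("api");

        // Check the exact records emitted, in order.
        if (!harness.getOutput().equals(Arrays.asList(5, 3))) {
            throw new AssertionError("unexpected output: " + harness.getOutput());
        }
        System.out.println("emitted: " + harness.getOutput());
    }
}
```

Because the harness captures every emitted record, a test like this can
check the exact output of a single operator, which is harder to isolate in
an end-to-end ITCase.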

I think it would be great to adopt a similar model for the Table API. So
far, the Table API follows the DataSet API model, where every single
operator has one or more ITCases - that makes build times very long. Also,
in my experience, ITCases are actually less precise in what they test than
a good series of unit tests.

Given that the library is still being created, now is a good time to look
into this. Once it is established, the chances of the tests getting
reworked will be slim.

Greetings,
Stephan
Re: ITCases in the Table API

Vasiliki Kalavri
Hey Stephan,

thanks for bringing this up!
We discussed this situation with Fabian a while ago and I saw that he has
now updated FLINK-3656 regarding this.
If nobody picks this up sooner, I can help with reworking the tests next
week.

Cheers,
-V.

On 18 May 2016 at 10:23, Stephan Ewen wrote:

> [...]
Re: ITCases in the Table API

Stephan Ewen
Just to give some perspective on the time: the Table API project is already
at 3 minutes of testing time on my machine - the longest of all libraries.

Given that it probably contains only a fraction of the functionality it
will contain half a year from now, it is pretty clear that this pattern is
not sustainable.


On Wed, May 18, 2016 at 7:36 PM, Vasiliki Kalavri wrote:

> [...]