Object reuse documentation should be improved

Gábor Gévay
Hello,

I find the documentation about object reuse [1] very confusing. I
started a Google Doc [2] about clarifying/rewriting it.

First, it states four questions that I think should have answers
stated explicitly in the documentation, and then lists some concrete
problems (ambiguities) in the current text.

[1] https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#object-reuse-behavior
[2] https://docs.google.com/document/d/1cgkuttvmj4jUonG7E2RdFVjKlfQDm_hE6gvFcgAfzXg/edit?usp=sharing

Best,
Gabor

Re: Object reuse documentation should be improved

Aljoscha Krettek-2
Good write-up. You could extend the table of 1) a/b 2) a/b at the top with “chaining” (but you already know this, I guess). Chaining changes all of these, and I think it can be tricky to know whether operators are chained or not (for users, and even for us developers…).


> On 13 Dec 2015, at 19:24, Gábor Gévay <[hidden email]> wrote:
>
> Hello,
>
> I find the documentation about object reuse [1] very confusing. I
> started a Google Doc [2] about clarifying/rewriting it.
>
> First, it states four questions that I think should have answers
> stated explicitly in the documentation, and then lists some concrete
> problems (ambiguities) in the current text.
>
> [1] https://ci.apache.org/projects/flink/flink-docs-master/apis/programming_guide.html#object-reuse-behavior
> [2] https://docs.google.com/document/d/1cgkuttvmj4jUonG7E2RdFVjKlfQDm_hE6gvFcgAfzXg/edit?usp=sharing
>
> Best,
> Gabor

Re: Object reuse documentation should be improved

Gábor Gévay
I guess chaining happens so often that we should just write this doc
assuming that there is chaining, and not even describe the rules for
the non-chaining case. I mean, I would never risk writing a UDF that
only works when there is no chaining, and then constantly worry about
whether I have accidentally introduced chaining. Or what do you think?

Best,
Gábor




2015-12-14 11:33 GMT+01:00 Aljoscha Krettek <[hidden email]>:

> Good write-up. You could extend the table of 1) a/b 2) a/b at the top with “chaining” (but you already know this, I guess). Chaining changes all of these, and I think it can be tricky to know whether operators are chained or not (for users, and even for us developers…).

Re: Object reuse documentation should be improved

Márton Balassi
Thanks for writing this up, Gábor. As Aljoscha suggested, chaining changes
all of these and makes them very tricky to work with, which should be
clearly documented. That was the reason why, some time ago, the streaming
API always copied the output of a UDF by default: to avoid these ambiguous
cases. Now this copying is omitted for performance reasons.
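
To make the ambiguity concrete, here is a minimal plain-Java sketch of what
can go wrong when an upstream function hands the same mutable object directly
to a downstream one, and how copying the output avoids it. It does not use any
Flink classes: `Counter`, `emitWithReuse`, and `emitWithCopy` are hypothetical
names chosen for illustration, and the direct `Consumer` call only stands in
for a chained hand-over. It is not a description of Flink's actual runtime.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ObjectReuseSketch {

    /** Mutable record, standing in for a user-defined value type (hypothetical). */
    static final class Counter {
        int value;

        Counter(int value) {
            this.value = value;
        }

        Counter copy() {
            return new Counter(value);
        }
    }

    /** Emits 1..n through ONE reused instance, handed over by reference. */
    static void emitWithReuse(int n, Consumer<Counter> downstream) {
        Counter reused = new Counter(0);
        for (int i = 1; i <= n; i++) {
            reused.value = i;          // overwrite the same object each time
            downstream.accept(reused); // direct call, no copy in between
        }
    }

    /** Same as above, but every emitted record is copied before hand-over. */
    static void emitWithCopy(int n, Consumer<Counter> downstream) {
        Counter reused = new Counter(0);
        for (int i = 1; i <= n; i++) {
            reused.value = i;
            downstream.accept(reused.copy()); // downstream gets its own object
        }
    }

    /** Reads out the values currently held by the buffered records. */
    static List<Integer> values(List<Counter> buffered) {
        List<Integer> out = new ArrayList<>();
        for (Counter c : buffered) {
            out.add(c.value);
        }
        return out;
    }

    public static void main(String[] args) {
        // The downstream function buffers references to its inputs.
        List<Counter> unsafe = new ArrayList<>();
        emitWithReuse(3, unsafe::add);
        System.out.println(values(unsafe)); // prints [3, 3, 3]

        List<Counter> safe = new ArrayList<>();
        emitWithCopy(3, safe::add);
        System.out.println(values(safe));   // prints [1, 2, 3]
    }
}
```

In the first case the buffer holds three references to one object, so only the
last value survives. A downstream function that never keeps references to its
inputs would behave the same in both cases, which is why the presence or
absence of a copy step is so easy to overlook.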

On Mon, Dec 14, 2015 at 1:15 PM, Gábor Gévay <[hidden email]> wrote:

> I guess chaining happens so often that we should just write this doc
> assuming that there is chaining, and not even describe the rules for
> the non-chaining case. I mean, I would never risk writing a UDF that
> only works when there is no chaining, and then constantly worry about
> whether I have accidentally introduced chaining. Or what do you think?
>
> Best,
> Gábor

Re: Object reuse documentation should be improved

Aljoscha Krettek-2
If I’m not mistaken, copying is still performed in the streaming API by default.

> On 14 Dec 2015, at 13:20, Márton Balassi <[hidden email]> wrote:
>
> Thanks for writing this up, Gábor. As Aljoscha suggested, chaining changes
> all of these and makes them very tricky to work with, which should be
> clearly documented. That was the reason why, some time ago, the streaming
> API always copied the output of a UDF by default: to avoid these ambiguous
> cases. Now this copying is omitted for performance reasons.