Hi,

I am a little bit confused about the class hierarchy of DataStream. It has three subclasses: KeyedDataStream, SingleOutputStreamOperator, and SplitDataStream.

1) Why is it named "SingleOutputStreamOperator" (why OPERATOR?)

2) Is it correct that a SplitDataStream emits multiple logical output streams, while SingleOutputStreamOperator and KeyedDataStream each emit a single logical output stream?
   => If yes, why is KeyedDataStream not a subclass of SingleOutputStreamOperator?

3)
  a) Why does only SingleOutputStreamOperator have the methods name()/getName()?
  b) Why does only SingleOutputStreamOperator have the method setParallelism()?
  c) Should those methods be members of DataStream instead?

-Matthias

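For readers following along, the hierarchy described in the question can be sketched roughly as below. This is a minimal illustration, not Flink's actual source: the `*Sketch` class names, the `logicalOutputs()` method, and the `getParallelism()` getter are hypothetical and exist only to make the relationships visible. It also takes the reading in question 2 (a split stream exposes several logical outputs) as an assumption, since the thread does not confirm it.

```java
// Hypothetical, simplified stand-ins for the classes named in the question.
// They mirror only the relationships described there, not Flink's real code.
class DataStreamSketch {
    // Number of logical output streams (hypothetical method); one by default.
    int logicalOutputs() {
        return 1;
    }
}

// Direct subclass of the base class, so it has no name()/setParallelism().
class KeyedDataStreamSketch extends DataStreamSketch {
}

// The only class in the sketch that carries name()/getName() and setParallelism().
class SingleOutputStreamOperatorSketch extends DataStreamSketch {
    private String name = "unnamed";
    private int parallelism = 1;

    SingleOutputStreamOperatorSketch name(String name) {
        this.name = name;
        return this;
    }

    String getName() {
        return name;
    }

    SingleOutputStreamOperatorSketch setParallelism(int parallelism) {
        this.parallelism = parallelism;
        return this;
    }

    // Hypothetical getter, added only so the sketch can be inspected.
    int getParallelism() {
        return parallelism;
    }
}

// Assumed to expose several logical outputs, as question 2 suggests.
class SplitDataStreamSketch extends DataStreamSketch {
    private final int outputs;

    SplitDataStreamSketch(int outputs) {
        this.outputs = outputs;
    }

    @Override
    int logicalOutputs() {
        return outputs;
    }
}
```

In this sketch a keyed stream and a single-output operator are siblings, which is exactly why a keyed stream cannot be named or given a parallelism: those methods sit one level below the common base class.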
Yes, very good points. I think we will be fixing these when we do the API cleanups that we discussed on the wiki design docs. In fact, the work I'm doing on https://issues.apache.org/jira/browse/FLINK-2398 can be seen as preparation for making these changes possible/easier.

(In reply to Matthias J. Sax, Tue, 28 Jul 2015 at 21:56.)

My current work depends on a clean design of those classes; otherwise, my own code would get very messy. I would like to apply some changes in my own PR (not opened yet). Do you think this is feasible? I don't want to end up in a messy state. What kind of changes are you going to apply in FLINK-2398?

-Matthias

(In reply to Aljoscha Krettek, 07/28/2015 10:30 PM.)

Right now it's mostly under-the-hood changes, but you can look at the progress here: https://github.com/aljoscha/flink/tree/stream-api-rework

The commit is going to change, so if you do put your work on top of it you might have to rebase.

(In reply to Matthias J. Sax, Wed, 29 Jul 2015 at 07:26.)

What is the expected time frame for your work? I don't want to delay my work too long (if I base it on your branch, it could not be merged before yours).

Right now, you have not changed the class hierarchy. However, that is what I would need. Thus, it makes no sense to use your branch as a base right now. What are your plans for this?

-> One side comment: would it make sense to make DataStream abstract?

From my point of view, it makes the most sense that I apply the changes I need directly in my PR (based on master).

-Matthias

(In reply to Aljoscha Krettek, 07/29/2015 08:11 AM.)

Hi,

I would like to apply the following changes to the DataStream class hierarchy: https://github.com/mjsax/flink/tree/flink-2306-storm-namedStreams

Please give some feedback on whether those changes seem reasonable to you.

I need those changes to get a clean design for https://issues.apache.org/jira/browse/FLINK-2306

-Matthias

(In reply to Matthias J. Sax, 07/29/2015 12:07 PM.)

Hi Matthias,

I think Aljoscha is preparing a nice PR that completely reworks the DataStream classes and the information they actually contain. I don't think it's a good idea to mess things up before he gets a chance to open the PR.

Also, I don't see a well-supported reason for moving the setParallelism, setName, etc. methods to DataStream, as these are specific things that you can only set on operators. The KeyedDataStream, on the other hand, is not an operator.

Can we just wait a little bit for Aljoscha with this? If you really need his changes, you can fork his branch, and we can consider your changes after merging his.

Regards,
Gyula

(In reply to Matthias J. Sax, Fri, 31 Jul 2015 at 21:57.)

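One possible reading of Gyula's argument, sketched under stated assumptions: parallelism and name belong to an operator in the job graph, and keying a stream only changes how records are partitioned without adding an operator. Every name below (`OperatorNode`, `StreamView`, `KeyedView`, `OperatorView`, `keyBy`, `map`) is hypothetical and chosen for illustration; this is not how Flink implements its streams.

```java
// Hypothetical model: an operator in the job graph, which owns name and parallelism.
final class OperatorNode {
    final String name;
    int parallelism = 1;

    OperatorNode(String name) {
        this.name = name;
    }
}

// A stream is modelled as a view onto the operator that produces its records.
class StreamView {
    final OperatorNode producer;

    StreamView(OperatorNode producer) {
        this.producer = producer;
    }

    // Keying only re-partitions records: the keyed view shares the same producer.
    KeyedView keyBy(String keyField) {
        return new KeyedView(producer, keyField);
    }

    // A transformation adds a new operator, and that operator can be configured.
    OperatorView map(String operatorName) {
        return new OperatorView(new OperatorNode(operatorName));
    }
}

// Not an operator: it has a key, but nothing of its own to name or parallelize.
class KeyedView extends StreamView {
    final String keyField;

    KeyedView(OperatorNode producer, String keyField) {
        super(producer);
        this.keyField = keyField;
    }
}

// The result of a transformation: the only view that exposes setParallelism().
class OperatorView extends StreamView {
    OperatorView(OperatorNode node) {
        super(node);
    }

    OperatorView setParallelism(int parallelism) {
        producer.parallelism = parallelism;
        return this;
    }
}
```

Under this model, calling `setParallelism()` on a keyed view would have no operator of its own to act on, which is the objection to moving the method up to the base class. Whether that model matches the intended API is exactly what the thread leaves open.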
I agree with Gyula here.

Getting the API right is too important to "quick fix" it.

(In reply to Gyula Fóra, Fri, Jul 31, 2015 at 10:06 PM.)