Hi all,
What do you think about using this job-submission sprint to also address the problem discussed in https://issues.apache.org/jira/browse/FLINK-10862?

Best,
Flavio

On 18.03.20 14:45, Flavio Pompermaier wrote:
> what do you think if we exploit this job-submission sprint to address also
> the problem discussed in https://issues.apache.org/jira/browse/FLINK-10862?

That's a good idea! What should we do? It seems that most committers on the issue were in favour of deprecating/removing ProgramDescription.

I would personally like to see a way of describing a Flink job/pipeline (including its parameters and types) in order to enable better UIs. The important thing then is to keep things consistent and aligned with the new client developments, and to use this new dev sprint to fix such issues.

On Mon, Mar 30, 2020 at 11:38 AM Aljoscha Krettek wrote:
> That's a good idea! What should we do? It seems that most committers on
> the issue were in favour of deprecating/removing ProgramDescription.

I think no one is interested in pushing this personally right now. We would need a champion who is interested and pushes this forward.

Best,
Aljoscha

On Mon, Mar 30, 2020, at 12:45, Flavio Pompermaier wrote:
> I would personally like to see a way of describing a Flink job/pipeline
> (including its parameters and types) in order to enable better UIs, then
> the important thing is to make things consistent and aligned with the new
> client developments and exploit this new dev sprint to fix such issues.

OK, it's not a problem for me if the community is not interested in pushing this forward. When we develop a job, it is super useful for us to have the job describe itself somehow (what it does and which parameters it requires). If this is not in Flink I have to implement it somewhere else, but I can't believe this is not a common situation. However, I think I can live with it :D

Best,
Flavio

On Wed, Jul 15, 2020 at 12:01 PM Aljoscha Krettek wrote:
> I think no-one is interested to push this personally right now. We would
> need a champion that is interested and pushes this forward.

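A minimal sketch of what "a job describing itself" could look like if implemented outside Flink, as suggested above. Everything here is hypothetical and not part of any Flink API: the `SelfDescribingJob` interface, the `ParameterSpec` class, the `JobDescriber` helper, and the `job.description` / `job.param.*` key names are assumptions made only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical description of a single job parameter (not a Flink class). */
final class ParameterSpec {
    final String name;
    final Class<?> type;
    final boolean required;
    final String help;

    ParameterSpec(String name, Class<?> type, boolean required, String help) {
        this.name = name;
        this.type = type;
        this.required = required;
        this.help = help;
    }
}

/** Hypothetical contract a job could implement to describe itself. */
interface SelfDescribingJob {
    String description();

    List<ParameterSpec> parameters();
}

/** Example job that declares what it does and which parameters it expects. */
final class WordCountJob implements SelfDescribingJob {
    @Override
    public String description() {
        return "Counts word occurrences in a text file.";
    }

    @Override
    public List<ParameterSpec> parameters() {
        return Arrays.asList(
                new ParameterSpec("input", String.class, true, "Path of the text file to read"),
                new ParameterSpec("parallelism", Integer.class, false, "Operator parallelism"));
    }
}

/** Helpers a UI or submission client could use; names are illustrative only. */
final class JobDescriber {
    /** Returns the names of required parameters that are absent from the supplied arguments. */
    static List<String> missingRequired(SelfDescribingJob job, Map<String, String> supplied) {
        List<String> missing = new ArrayList<>();
        for (ParameterSpec spec : job.parameters()) {
            if (spec.required && !supplied.containsKey(spec.name)) {
                missing.add(spec.name);
            }
        }
        return missing;
    }

    /** Flattens the description into string key/value pairs, in declaration order. */
    static Map<String, String> toMetaData(SelfDescribingJob job) {
        Map<String, String> metaData = new LinkedHashMap<>();
        metaData.put("job.description", job.description());
        for (ParameterSpec spec : job.parameters()) {
            String requirement = spec.required ? " (required)" : " (optional)";
            metaData.put("job.param." + spec.name,
                    spec.type.getSimpleName() + requirement + ": " + spec.help);
        }
        return metaData;
    }
}
```

A flat string map is used on purpose, so that the result could be handed to whatever key/value metadata mechanism is available; whether that is the right shape for a UI is an open design question.
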
The more we strive towards a model where an application can submit multiple jobs, the more important it becomes to be able to attach metadata to a job/application in order to have any idea what is going on.

But I don't think PackagedProgram/ProgramDescription is the way to go; I'd rather envision something like a metadata object that is attached to the environment/execute calls. But we have to figure out how to do this in a way that also works for the SQL APIs.

What we have done internally is to encode such information in the GlobalJobParameters, which are then available in the WebUI. We have things like commit IDs encoded into the jar manifest; we extract them at submission time and put them into the parameters. My guess would be that such an approach can work sufficiently well for all DataSet/DataStream/Table API users.

On 15/07/2020 14:05, Flavio Pompermaier wrote:
> Ok, it's not a problem for me if the community is not interested in pushing
> this thing forward.
> When we develop a Job is super useful for us to have the job describing
> itself somehow (what it does and which parameters it requires).
> If this is not in Flink I have to implement it somewhere else but I can't
> think that this is not a common situation.
> However I think I can live with it :D
>
> Best,
> Flavio

Thanks Chesnay for the tip. I'll try to investigate the usage of GlobalJobParameters.

On Wed, Jul 15, 2020 at 2:51 PM Chesnay Schepler wrote:
> What we have done internally is to encode such information in the
> GlobalJobParameters which are then available in the WebUI. We have
> things like commit IDs encoded into the jar manifest, that we extract at
> submission time and put them into the parameters.
> My guess would be that such approach can work sufficiently for all
> dataset/datastream/table API users.

For completeness' sake, here's an example of what we're doing to add the job arguments and some manifest entries to the global job parameters (Manifests is a class from jcabi-manifests):

    public class MetaDataUtils {

        public static ExecutionConfig.GlobalJobParameters createMetaData(ParameterTool parameterTool) {
            Map<String, String> metaData = new HashMap<>(parameterTool.toMap());
            setFromManifest(metaData, "Commit-ID");
            setFromManifest(metaData, "Commit-Message");
            setFromManifest(metaData, "Commit-Time");
            return new MetaData(metaData);
        }

        private static void setFromManifest(Map<String, String> metaData, String property) {
            metaData.put(property, guard(() -> Manifests.read(property)));
        }

        private static String guard(Supplier<String> supplier) {
            try {
                return supplier.get();
            } catch (IllegalArgumentException iae) {
                return "unknown";
            }
        }

        private static class MetaData extends ExecutionConfig.GlobalJobParameters {
            private final Map<String, String> data;

            private MetaData(Map<String, String> data) {
                this.data = data;
            }

            @Override
            public Map<String, String> toMap() {
                return data;
            }
        }
    }

On 15/07/2020 15:01, Flavio Pompermaier wrote:
> Thanks Chesnay for the tip.
> I'll try to investigate the usage of GlobalJobParameters.

Let's try this again; the formatting went haywire for some reason...

    import java.util.HashMap;
    import java.util.Map;
    import java.util.function.Supplier;

    import com.jcabi.manifests.Manifests;
    import org.apache.flink.api.common.ExecutionConfig;
    import org.apache.flink.api.java.utils.ParameterTool;

    public class MetaDataUtils {

        // Combines the job arguments with selected jar manifest entries.
        public static ExecutionConfig.GlobalJobParameters createMetaData(ParameterTool parameterTool) {
            Map<String, String> metaData = new HashMap<>(parameterTool.toMap());
            setFromManifest(metaData, "Commit-ID");
            setFromManifest(metaData, "Commit-Message");
            setFromManifest(metaData, "Commit-Time");
            return new MetaData(metaData);
        }

        // Copies one manifest attribute into the map, or "unknown" if it is absent.
        private static void setFromManifest(Map<String, String> metaData, String property) {
            metaData.put(property, guard(() -> Manifests.read(property)));
        }

        // Manifests.read throws IllegalArgumentException for a missing attribute.
        private static String guard(Supplier<String> supplier) {
            try {
                return supplier.get();
            } catch (IllegalArgumentException iae) {
                return "unknown";
            }
        }

        // Exposes the collected entries through GlobalJobParameters#toMap().
        private static class MetaData extends ExecutionConfig.GlobalJobParameters {
            private final Map<String, String> data;

            private MetaData(Map<String, String> data) {
                this.data = data;
            }

            @Override
            public Map<String, String> toMap() {
                return data;
            }
        }
    }

On 15/07/2020 15:25, Chesnay Schepler wrote:
> For completeness sake, here's an example of what we're doing to add
> the job arguments and some manifest entries to the global job parameters:
> (Manifests is a class from jcabi-manifests)

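The manifest lookup with an "unknown" fallback can also be sketched with only the JDK, for readers who would rather not add jcabi-manifests. This is a swapped-in technique (`java.util.jar.Manifest` instead of `Manifests.read`), not what the code in this thread does; the `ManifestMetaData` class and its method names are hypothetical, and the attribute names are simply the ones used in the thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.jar.Manifest;

/** Hypothetical JDK-only helper; not part of Flink or jcabi-manifests. */
final class ManifestMetaData {
    static final String UNKNOWN = "unknown";

    /**
     * Reads the requested main attributes from a manifest stream.
     * Attributes missing from the manifest are mapped to "unknown".
     */
    static Map<String, String> read(InputStream manifestStream, String... attributes) {
        Manifest manifest;
        try {
            manifest = new Manifest(manifestStream);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Map<String, String> metaData = new LinkedHashMap<>();
        for (String attribute : attributes) {
            // getValue returns null when the attribute is not present.
            String value = manifest.getMainAttributes().getValue(attribute);
            metaData.put(attribute, value != null ? value : UNKNOWN);
        }
        return metaData;
    }

    /** Convenience overload for manifest text held in memory (useful in tests). */
    static Map<String, String> fromText(String manifestText, String... attributes) {
        byte[] bytes = manifestText.getBytes(StandardCharsets.UTF_8);
        return read(new ByteArrayInputStream(bytes), attributes);
    }
}
```

In a packaged job the stream would typically come from something like `SomeJobClass.class.getResourceAsStream("/META-INF/MANIFEST.MF")`; with several jars on the classpath that lookup may return a different jar's manifest, so treat it as an assumption to verify. The resulting map could then be wrapped in a `GlobalJobParameters` subclass like the `MetaData` class above and registered through `ExecutionConfig#setGlobalJobParameters`.
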