Class Cache


Class Cache

Mike Accola
Are classes cached somewhere in Flink?  I am running in a very basic,
local environment on Linux (start-local.sh).  I've somehow gotten my
environment into a strange state that I don't understand.  I feel like I
am overlooking something simple, but I've checked everything I can think
of.

My main flink application with a ProcessFunction is embedded in
mylib1.jar.  Within my ProcessFunction I use another class that is
embedded in mylib2.jar.

When I made changes to a function in mylib2.jar and rebuilt the jar, I
realized the changes weren't taking effect.  In fact, I then deleted
mylib2.jar entirely and my application still worked.  I can't figure out
where my application is picking up the function contained in mylib2.jar. I
have checked all temp directories, library paths, etc.  I have repeatedly
stopped/started my Flink environment just to be safe.

I tried adding -verbose:class to env.java.opts.  It output a lot of class
loading info to the stdout log, but there were no references to my class
in mylib2.jar.
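
One way to narrow this down, besides -verbose:class, is to ask the JVM
directly where a loaded class came from.  The following is only a minimal
sketch: the class name ClassOrigin and the helper locate() are made up for
illustration, and it assumes the JVM exposes a code source for the class
(JDK classes usually report none).  Calling locate() on the class from
mylib2.jar inside the ProcessFunction and logging the result should show
which jar or directory actually supplied it.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns the location a class was loaded from, or a note if none is exposed. */
    public static String locate(Class<?> clazz) {
        CodeSource source = clazz.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Classes loaded by the bootstrap loader typically have no code source
            return "(no code source; likely a JDK/bootstrap class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully-qualified class name; java.lang.String is only a stand-in
        String name = args.length > 0 ? args[0] : "java.lang.String";
        Class<?> clazz = Class.forName(
                name, false, Thread.currentThread().getContextClassLoader());
        System.out.println(name + " <- " + locate(clazz));
        System.out.println("loader: " + clazz.getClassLoader());
    }
}
```

The class loader printed on the last line can also help distinguish a class
picked up from the JVM's own classpath from one loaded through a separate
user-code class loader.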

Something has to be caching this code, whether it is in Flink or in the
JVM.  Any ideas what could be happening or how to debug this further?

Thanks



Re: Class Cache

Eron Wright-2
A Flink program is typically packaged as an 'uber-jar' containing its
dependencies.  The Flink quickstart project illustrates this (see the use
of the shading plugin in pom.xml).   Based on your description, the classes
of mylib2.jar were copied into mylib1.jar when the latter was built.  Try
rebuilding mylib1.jar to effect the change.
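
A quick way to test the uber-jar theory is to check which jars actually
contain the class in question.  This is a rough sketch using only the JDK's
java.util.jar API; the class name JarContains and its methods are made up
for illustration, and the class and jar names are supplied on the command
line.

```java
import java.io.IOException;
import java.util.jar.JarFile;

public class JarContains {

    /** Converts a fully-qualified class name to its entry path inside a jar. */
    public static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar at jarPath has an entry for the given class. */
    public static boolean contains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getJarEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java JarContains <class name> <jar> [<jar> ...]");
            return;
        }
        // Report, for each jar given, whether it holds the class
        for (int i = 1; i < args.length; i++) {
            System.out.println(args[i] + ": " + contains(args[i], args[0]));
        }
    }
}
```

If mylib1.jar reports true for a class that is supposed to live only in
mylib2.jar, the classes were bundled at build time.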

-Eron


Re: Class Cache

Mike Accola
No, I did not explicitly create an uber-jar.  The mylib1.jar is very
light. It only contains my main application class (including the
ProcessFunction).

I have been specifying the --classpath option on my flink run command to
pull in mylib2.jar.

Plus, I have been rebuilding mylib1.jar frequently just to be safe and it
hasn't made a difference.


Mike Accola
[hidden email]






Re: Class Cache

Stephan Ewen
Hi Mike!

Flink does in fact cache jar files in the "blob server". But these are
cached subject to the following conditions:

  - No caching across "sessions", meaning start/stop of the
cluster/jobmanager. If you run the per-job-yarn setup, the job does not
cache anything.

  - Files are cached under a content hash, meaning as soon as the contents
change, the artifact is not reused. So if you actually change the jar
file, no caching should happen.
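
To illustrate the second point, here is a toy sketch of a content-addressed
cache.  It is not Flink's blob server code: the class ContentHashCache, its
methods, and the choice of SHA-256 are all assumptions made for the example.
It only shows why a changed jar cannot be served from an entry stored under
the old content's key.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

public class ContentHashCache {

    // Maps the hex digest of the content to the content itself
    private final Map<String, byte[]> store = new HashMap<>();

    /** Derives the cache key from the bytes alone (SHA-256 is an assumption). */
    public static String key(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stores the content under its hash; identical content reuses the same entry. */
    public String put(byte[] content) {
        String k = key(content);
        store.putIfAbsent(k, content.clone());
        return k;
    }

    /** Number of distinct contents currently cached. */
    public int size() {
        return store.size();
    }
}
```

Uploading the same bytes twice yields one entry; uploading a rebuilt jar
with different bytes yields a new key, so the old entry is never returned
for it.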

I cannot really explain what you are observing and have never seen that
myself...

Stephan



Re: Class Cache

Ufuk Celebi-2
Hey Mike!

Thanks for the detailed information about your setup. I'm also puzzled
by this...

(1) Which version of Flink are you using? We recently merged some
changes to the JAR distribution components, which might cause this
(although I think that's unlikely).

(2) As a temporary workaround you could try putting mylib2.jar into
the /lib folder and not pulling it in via --classpath. After doing
this you would need to stop/start the cluster and resubmit the job.

– Ufuk



Re: Class Cache

Vishnu Viswanath
We have also noticed this behaviour when running on YARN; we had to restart
the session for the changes in the jar to be picked up.


Re: Class Cache

Mike Accola
In reply to this post by Ufuk Celebi-2
Thank you to those of you who replied.

I can't really explain why, but yesterday afternoon, out of the blue, this
problem disappeared.  Everything was working fine for a while, until some
different strange behavior came up this afternoon.  I don't believe any
code changes I made should affect this.

I suspect this is all related to some kind of caching somewhere somehow.

I was already putting mylib2.jar into Flink's lib directory.  I was doing
this because mylib2.jar has some native methods and I had better luck
loading these that way.

Now I am running into a problem where my application runs successfully the
first time.  I then turn around and run the same application a 2nd time (it
should get the exact same results), except this 2nd time I get a
ClassNotFoundException for one of the classes in mylib2.jar.  I've
temporarily taken references to the class that uses the native library out
of the code just to rule out that this is related to any native loading
problems.  If I do stop-local.sh and start-local.sh, I get the same result:
it works the first time I run, but I get the ClassNotFoundException the
2nd time.

I am running Flink 1.3.0 on Linux (RHEL 6.8). I am not using YARN.  Just
plain, out-of-the-box local mode.

Is it possible that the cache in this blob server is getting corrupted? Is
there a way to tell?  Is there a way to disable the blob server?

Any other ideas on things to look at?





