BroadcastStateITCase failure caused by insufficient number of network buffers

liyu
Hi All,

As the title says, this test case fails consistently in one of my testing
environments while verifying the 1.8.0 release. I'm starting a new thread
because it also fails in 1.7.2 and 1.6.4 on the same environment, so I
believe it is not a 1.8-specific issue. The case passes in my local Mac
environment (macOS 10.14.3), so the issue should be environment-related
(possibly due to a different default system configuration). Some more details:

* Kernel version: Linux 3.10.0-327
* LSB Version: :core-4.1-amd64:core-4.1-noarch
* Detailed error message:
[ERROR] testKeyedWithBroadcastTranslation(org.apache.flink.test.streaming.runtime.BroadcastStateITCase)  Time elapsed: 0.638 s  <<< ERROR!
org.apache.flink.runtime.client.JobExecutionException: Job execution failed.
    at org.apache.flink.test.streaming.runtime.BroadcastStateITCase.testKeyedWithBroadcastTranslation(BroadcastStateITCase.java:99)
Caused by: java.io.IOException: Insufficient number of network buffers: required 96, but only 63 available. The total number of network buffers is currently set to 7281 of 32768 bytes each. You can increase this number by setting the configuration keys 'taskmanager.network.memory.fraction', 'taskmanager.network.memory.min', and 'taskmanager.network.memory.max'.
* Command to run the test case:
mvn -Dtest=BroadcastStateITCase -DfailIfNoTests=false -Dcheckstyle.skip=true clean test -pl flink-tests -am
   - Note that for 1.6.2/1.7.2 the maven-surefire-plugin version in pom.xml
must be manually updated to 2.22.1, or the test case will be skipped. For
1.8 this is not needed, since FLINK-11144 includes the change.
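
For readers who want to put the figures from the error message into perspective, here is a minimal sketch of the arithmetic in plain Java. It uses only the numbers quoted in the error message (7281 buffers of 32768 bytes, 96 required, 63 available); the class name NetworkBufferMath and its methods are hypothetical helpers for illustration and are not part of Flink.

```java
// Sketch only: arithmetic on the figures quoted in the error message.
// NetworkBufferMath is a hypothetical helper, not a Flink class.
final class NetworkBufferMath {

    /** Total bytes held by a pool of equally sized buffers. */
    static long totalBytes(long bufferCount, long bufferSizeBytes) {
        return Math.multiplyExact(bufferCount, bufferSizeBytes);
    }

    /** Number of buffers missing for a request; zero if enough are available. */
    static long shortfall(long required, long available) {
        return Math.max(0L, required - available);
    }

    public static void main(String[] args) {
        long bufferSize = 32768L;                 // bytes per buffer, from the message
        long pool = totalBytes(7281L, bufferSize); // whole pool in bytes
        long missing = shortfall(96L, 63L);        // buffers the request was short of

        System.out.println("pool bytes: " + pool);
        System.out.println("pool MiB: " + pool / (1024.0 * 1024.0));
        System.out.println("missing buffers: " + missing);
        System.out.println("missing bytes: " + totalBytes(missing, bufferSize));
    }
}
```

The whole pool comes to roughly 227.5 MiB, while the failing request was short by 33 buffers, about 1 MiB. Only 63 of the 7281 buffers were still free when the request was made, which suggests that most of the pool was already claimed at that point rather than the pool itself being small.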

Best Regards,
Yu

Re: BroadcastStateITCase failure caused by insufficient number of network buffers

Stephan Ewen
Thanks for reporting this.

Here is the issue and the fix for it:
https://issues.apache.org/jira/browse/FLINK-12012



Re: BroadcastStateITCase failure caused by insufficient number of network buffers

liyu
Thanks for the fix and the information, boss!

Best Regards,
Yu

