Hi dev,

I noticed that all the Travis tests triggered by pull requests are failing with the same error:

"Cached flink dir /home/travis/flink_cache/xxxxx/flink does not exist. Exiting build."

Does anyone have a clue about what happened and how to fix this?

Best,
Kurt |
In reply to this post by Kurt Young (18/06/2019 08:34)
This is (hopefully) a short-lived hiccup in the Travis caching infrastructure.

There's nothing we can do to _fix_ it; if it persists, we'll have to rework our Travis setup again so that it does not rely on caching. |
In reply to this post by Chesnay Schepler (Tuesday, 18 June 2019, 3:18 PM)
If it is a Travis caching issue, we can file an Apache Infra ticket and ask them to clean the cache.

--
Best Regards
Jeff Zhang |
In reply to this post by Jeff Zhang (Tuesday, 18 June 2019, 3:21 PM)
It has been broken for more than 14 hours. I hope it recovers soon. |
In reply to this post by Jeff Zhang (18/06/2019 09:20)
The problem is not that bad data is in the cache (which is the only thing a cache cleanup solves); it is that the test stages don't download the correct cache.

Our compile stage uploads its output into the cache, and the subsequent test builds download it again.

Whether the upload from the compile phase is visible to the test phase is basically a matter of timing; it depends on the visibility guarantee that the backing infrastructure provides. So far it _usually_ worked, but these are naturally things that may change over time. |
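The timing issue described above can be illustrated with a minimal Python sketch. It is not the actual Flink Travis script; it only assumes that a test stage checks for the cached directory and gives up with the error message quoted in this thread. The function names, the retry count, and the delay are hypothetical.

```python
import os
import time


def wait_for_cached_dir(path, attempts=5, delay_seconds=1.0, sleep=time.sleep):
    """Return True once `path` exists as a directory, checking up to `attempts` times."""
    for attempt in range(attempts):
        if os.path.isdir(path):
            return True
        # No sleep after the final check; the caller fails the build at that point.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    return False


def check_cache_or_exit_message(path, **kwargs):
    """Return None if the cache is visible, else the error text seen in the thread."""
    if wait_for_cached_dir(path, **kwargs):
        return None
    return f"Cached flink dir {path} does not exist. Exiting build."
```

Polling like this only helps if the upload becomes visible within the wait window; it does nothing when the cache entry never arrives.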
In reply to this post by Chesnay Schepler (Tuesday, 18 June 2019, 3:53 PM)
I agree with the explanation from @Chesnay Schepler <[hidden email]>. This should be a problem with the Travis infrastructure, because we have not made any big changes to the Travis logic inside Flink recently.
At present, most of the failures happen after the compile stage has completed. The cache size is only 7.7M, which means that the JARs were not uploaded successfully.

So here is a question:
- Where can we check the cache storage to see whether there is a problem with the storage?

To try to find the reason for the CI issue, I am running the following tests:

- I deleted the other test phases locally, to test whether the cache is uploaded normally during the compilation phase. See https://travis-ci.org/sunjincheng121/flink/builds/547155029
- I increased the Travis cache timeout to 1200, to test whether the cache cannot be downloaded because of a timeout. (I think this test will have the same result.) See https://travis-ci.org/apache/flink/builds/547136163

I will report back here after testing.

Best,
Jincheng |
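The size check mentioned above (a 7.7M cache suggests the JARs are missing) could be automated roughly as follows. This is a sketch under assumptions: the helper names and the minimum-size threshold are hypothetical, and it inspects a local directory rather than Travis's cache storage, which the thread does not describe.

```python
import os


def directory_size_bytes(root):
    """Sum the sizes of all regular files below `root` (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def cache_looks_incomplete(size_bytes, min_expected_bytes):
    """Flag a cache that is smaller than the (assumed) minimum for a full build."""
    return size_bytes < min_expected_bytes
```

A compile stage could call `cache_looks_incomplete(directory_size_bytes(cache_dir), threshold)` before finishing and fail early, instead of letting every test stage fail later.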
In reply to this post by jincheng sun (Tuesday, 18 June 2019, 6:00 PM)
Test results:
- The test with only the compile stage succeeds (I deleted some old caches); the cache size is 1146.26M. See https://travis-ci.org/sunjincheng121/flink/caches
- The test with the timeout set to 1200 fails with the same error, but I think it may be a storage problem, so I deleted more old caches and restarted the CI. See https://travis-ci.org/apache/flink/builds/547136163

So now it looks as if the storage size of the cache is limited. If so, we can add some cleanup logic for old caches (I am not sure; some validation is needed).

Best,
Jincheng |
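If the cache storage really is size-limited, which is unconfirmed in this thread, the cleanup idea above might look like the following sketch. `CacheEntry`, `select_entries_to_evict`, and the byte budget are hypothetical names and values; the function only decides which entries to drop (oldest first) and deletes nothing itself.

```python
from typing import List, NamedTuple


class CacheEntry(NamedTuple):
    name: str
    modified_at: float  # seconds since the epoch
    size_bytes: int


def select_entries_to_evict(entries: List[CacheEntry], max_total_bytes: int) -> List[CacheEntry]:
    """Pick the oldest entries until the remaining total fits within the budget."""
    total = sum(entry.size_bytes for entry in entries)
    evicted = []
    for entry in sorted(entries, key=lambda e: e.modified_at):
        if total <= max_total_bytes:
            break
        evicted.append(entry)
        total -= entry.size_bytes
    return evicted
```

Keeping the selection separate from the deletion makes it easy to log or review what would be removed before anything is touched.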
In reply to this post by jincheng sun (18/06/2019 12:47)
The compile stage was always passing.

The timeout makes no difference; it only affects how long we wait for the download to complete.

We already had significantly more data in the cache a while ago (about twice as much), so I am skeptical that the amount of cached data is the problem. |
In reply to this post by Kurt Young (18/06/2019 08:34)
Recent builds are passing again. |
In reply to this post by Chesnay Schepler (Wednesday, June 19, 2019 16:59; subject: "Re: Something wrong with travis?")
Unfortunately, I ran into this problem again just now: https://api.travis-ci.org/v3/job/549534496/log.txt (build overview: https://travis-ci.org/apache/flink/builds/549534489). Non-committers, including me, have to close and reopen the PR or push another commit to re-trigger the PR check 🙁

Best
Yun Tang |
In reply to this post by Yun Tang (Monday, June 24, 2019 14:22)
I ran into this problem again at https://api.travis-ci.com/v3/job/220732163/log.txt . Is there any place where we could ask for help or contact Travis, or are there any clues we could use to figure this out?

Best
Yun Tang |
In reply to this post by Yun Tang (30/07/2019 08:46)
There is nothing to report; we already know what the problem is, but it cannot be fixed. |