[jira] [Created] (FLINK-17794) Tear down installed software in reverse order in Jepsen Tests

Shang Yuanchun (Jira)
Gary Yao created FLINK-17794:
--------------------------------

             Summary: Tear down installed software in reverse order in Jepsen Tests
                 Key: FLINK-17794
                 URL: https://issues.apache.org/jira/browse/FLINK-17794
             Project: Flink
          Issue Type: Bug
          Components: Tests
    Affects Versions: 1.10.1, 1.11.0
            Reporter: Gary Yao
            Assignee: Gary Yao
             Fix For: 1.11.0


Tear down installed software in reverse order of installation in the Jepsen tests. This mitigates an issue where Hadoop's NodeManager directories sometimes cannot be removed with {{rm -rf}}: Flink processes keep running and generating files after the YARN NodeManager has been shut down. {{rm -rf}} removes files recursively, but if files are created concurrently in the background, the command can still fail with a non-zero exit code, as in the following output:

{noformat}
sh -c \"cd /; rm -rf /opt/hadoop\"", :exit 1, :out "", :err "rm: cannot remove '/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1587567275082_0001/flink-io-3587fdbb-15be-4482-94f2-338bfe6b1acc/job_77be6dd9f1b2aa218348e8b8a2512660_op_StreamMap_5271c210329e73bd743f3227edfb3b71__27_30__uuid_02dbbf1e-d2d5-43e8-ab34-040345f96476/db': Directory not empty\nrm: cannot remove '/opt/hadoop/tmp/nm-local-dir/usercache/root/appcache/application_1587567275082_0001/flink-io-d14f2078-74ee-4b8b-aafe-4299577f214f/job_77be6dd9f1b2aa218348e8b8a2512660_op_StreamMap_7d23c6ceabda05a587f0217e44f21301__17_30__uuid_2de2b67d-0767-4e32-99f0-ddd291460947/db': Directory not empty
{noformat}
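
To illustrate the idea only: the Jepsen tests themselves are written in Clojure, and the following is a minimal Python sketch, not the actual change. It assumes each piece of software is described by a hypothetical (name, setup, teardown) tuple; the helper {{run_with_reverse_teardown}} and the component names "hadoop" and "flink" are made up for this example. The point is that whatever was installed last is torn down first, so processes running on top of Hadoop are stopped before Hadoop's directories are deleted.

```python
from contextlib import ExitStack


def run_with_reverse_teardown(components, body):
    """Set up components in order, run body, then tear down in reverse order.

    components is a list of (name, setup, teardown) tuples. This shape is an
    assumption made for the sketch, not the structure used by the Jepsen tests.
    """
    with ExitStack() as stack:
        for _name, setup, teardown in components:
            setup()
            # ExitStack runs registered callbacks in last-in, first-out order,
            # so the most recently installed component is torn down first.
            stack.callback(teardown)
        body()


# Record the order of operations instead of touching a real cluster.
events = []


def make_component(name):
    """Build a hypothetical component that only logs its setup and teardown."""
    return (
        name,
        lambda: events.append(f"setup {name}"),
        lambda: events.append(f"teardown {name}"),
    )


components = [make_component("hadoop"), make_component("flink")]
run_with_reverse_teardown(components, lambda: events.append("run test"))

for event in events:
    print(event)
```

In this sketch the teardown for "flink" runs before the teardown for "hadoop", mirroring the ordering proposed above. Registering the teardown only after a successful setup also means a failed installation does not trigger a teardown for software that was never installed.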



--
This message was sent by Atlassian Jira
(v8.3.4#803005)