F@H on Linux (Docker, Unraid) Failed to remove directory. WUs get stuck
Solved by Gorgon,
2 hours ago, Metallus97 said: So this keeps happening, and it is filling up my drive/Docker image pretty quickly. Any ideas on how to fix it or what to try? I can also see the WUs hanging in "Cleanup" in the client:
Log Dump:
FahCore_a7 -dir 01 -suffix 01 -version 704 -lifeline 30 -checkpoint 15 -np 21
21:16:32:WU01:FS00:Started FahCore on PID 69
21:16:32:WU01:FS00:Core PID:73
21:16:32:WU01:FS00:FahCore 0xa7 started
21:16:32:WU01:FS00:0xa7:*********************** Log Started 2020-03-09T21:16:32Z ***********************
21:16:32:WU01:FS00:0xa7:************************** Gromacs Folding@home Core ***************************
21:16:32:WU01:FS00:0xa7: Type: 0xa7
21:16:32:WU01:FS00:0xa7: Core: Gromacs
21:16:32:WU01:FS00:0xa7: Args: -dir 01 -suffix 01 -version 704 -lifeline 69 -checkpoint 15 -np 21
21:16:32:WU01:FS00:0xa7:************************************ CBang *************************************
21:16:32:WU01:FS00:0xa7: Date: Nov 5 2019
21:16:32:WU01:FS00:0xa7: Time: 06:06:57
21:16:32:WU01:FS00:0xa7: Revision: 46c96f1aa8419571d83f3e63f9c99a0d602f6da9
21:16:32:WU01:FS00:0xa7: Branch: master
21:16:32:WU01:FS00:0xa7: Compiler: GNU 8.3.0
21:16:32:WU01:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie -fPIC
21:16:32:WU01:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
21:16:32:WU01:FS00:0xa7: Bits: 64
21:16:32:WU01:FS00:0xa7: Mode: Release
21:16:32:WU01:FS00:0xa7:************************************ System ************************************
21:16:32:WU01:FS00:0xa7: CPU: AMD Ryzen 9 3900X 12-Core Processor
21:16:32:WU01:FS00:0xa7: CPU ID: AuthenticAMD Family 23 Model 113 Stepping 0
21:16:32:WU01:FS00:0xa7: CPUs: 24
21:16:32:WU01:FS00:0xa7: Memory: 31.40GiB
21:16:32:WU01:FS00:0xa7:Free Memory: 4.16GiB
21:16:32:WU01:FS00:0xa7: Threads: POSIX_THREADS
21:16:32:WU01:FS00:0xa7: OS Version: 4.19
21:16:32:WU01:FS00:0xa7:Has Battery: false
21:16:32:WU01:FS00:0xa7: On Battery: false
21:16:32:WU01:FS00:0xa7: UTC Offset: 0
21:16:32:WU01:FS00:0xa7: PID: 73
21:16:32:WU01:FS00:0xa7: CWD: /config/work
21:16:32:WU01:FS00:0xa7:******************************** Build - libFAH ********************************
21:16:32:WU01:FS00:0xa7: Version: 0.0.18
21:16:32:WU01:FS00:0xa7: Author: Joseph Coffland <joseph@cauldrondevelopment.com>
21:16:32:WU01:FS00:0xa7: Copyright: 2019 foldingathome.org
21:16:32:WU01:FS00:0xa7: Homepage: https://foldingathome.org/
21:16:32:WU01:FS00:0xa7: Date: Nov 5 2019
21:16:32:WU01:FS00:0xa7: Time: 06:13:26
21:16:32:WU01:FS00:0xa7: Revision: 490c9aa2957b725af319379424d5c5cb36efb656
21:16:32:WU01:FS00:0xa7: Branch: master
21:16:32:WU01:FS00:0xa7: Compiler: GNU 8.3.0
21:16:32:WU01:FS00:0xa7: Options: -std=c++11 -O3 -funroll-loops -fno-pie
21:16:32:WU01:FS00:0xa7: Platform: linux2 4.19.0-5-amd64
21:16:32:WU01:FS00:0xa7: Bits: 64
21:16:32:WU01:FS00:0xa7: Mode: Release
21:16:32:WU01:FS00:0xa7:************************************ Build *************************************
21:16:32:WU01:FS00:0xa7: SIMD: avx_256
21:16:32:WU01:FS00:0xa7:********************************************************************************
21:16:32:WU01:FS00:0xa7:Project: 14199 (Run 5, Clone 34, Gen 8)
21:16:32:WU01:FS00:0xa7:Unit: 0x000000089bf7a4d55e655fbe8bcb6d31
21:16:32:WU01:FS00:0xa7:Reading tar file core.xml
21:16:32:WU01:FS00:0xa7:Reading tar file frame8.tpr
21:16:32:WU01:FS00:0xa7:Digital signatures verified
21:16:32:WU01:FS00:0xa7:Calling: mdrun -s frame8.tpr -o frame8.trr -cpt 15 -nt 21
21:16:32:WU01:FS00:0xa7:Steps: first=4000000 total=500000
21:16:32:WU01:FS00:0xa7:Completed 1 out of 500000 steps (0%)
21:16:34:WU00:FS00:Upload complete
21:16:34:WU00:FS00:Server responded WORK_ACK (400)
21:16:34:WU00:FS00:Final credit estimate, 3796.00 points
21:16:34:WU00:FS00:Cleaning up
21:16:34:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: Directory not empty: "./work/00"
21:16:34:WU00:FS00:Cleaning up
21:16:34:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: Directory not empty: "./work/00"
21:16:43:WU01:FS00:0xa7:Completed 5000 out of 500000 steps (1%)
21:16:53:WU01:FS00:0xa7:Completed 10000 out of 500000 steps (2%)
21:17:02:WU01:FS00:0xa7:Completed 15000 out of 500000 steps (3%)
21:17:12:WU01:FS00:0xa7:Completed 20000 out of 500000 steps (4%)
21:17:21:WU01:FS00:0xa7:Completed 25000 out of 500000 steps (5%)
21:17:31:WU01:FS00:0xa7:Completed 30000 out of 500000 steps (6%)
21:17:34:WU00:FS00:Cleaning up
21:17:34:ERROR:WU00:FS00:Exception: Failed to remove directory './work/00': boost::filesystem::remove: Directory not empty: "./work/00"
21:17:41:WU01:FS00:0xa7:Completed 35000 out of 500000 steps (7%)
21:17:50:WU01:FS00:0xa7:Completed 40000 out of 500000 steps (8%)
21:18:00:WU01:FS00:0xa7:Completed 45000 out of 500000 steps (9%)
21:18:09:WU01:FS00:0xa7:Completed 50000 out of 500000 steps (10%)
21:18:19:WU01:FS00:0xa7:Completed 55000 out of 500000 steps (11%)
21:18:29:WU01:FS00:0xa7:Completed 60000 out of 500000 steps (12%)
21:18:39:WU01:FS00:0xa7:Completed 65000 out of 500000 steps (13%)
21:18:48:WU01:FS00:0xa7:Completed 70000 out of 500000 steps (14%)
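To see at a glance which work directories the client keeps failing to clean up, the errors can be pulled out of the log. This is only a rough sketch: the function name `stuck_dirs` is made up for illustration, it assumes the log is available with one entry per line, and the `log.txt` path in the usage comment is an assumption about where the container keeps its log.

```shell
# Hypothetical helper (not part of FAHClient): read a client log on stdin and
# print each directory named in a "Failed to remove directory" error, once.
# Assumes one log entry per line.
stuck_dirs() {
    sed -n "s/.*Failed to remove directory '\([^']*\)'.*/\1/p" | sort -u
}

# Usage (log path is an assumption, adjust to your setup):
#   stuck_dirs < /config/log.txt
```

Listing the contents of each reported directory afterwards (for example with `ls -la`) should show what is being left behind and who owns it.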
It might be a permissions error: the process running FAHClient may not have the rights to delete a directory and all its contents.
Here is what my /var/lib/fahclient/work folder contains, along with its permissions:
root@fold8:~# cd /var/lib/fahclient/work/
root@fold8:/var/lib/fahclient/work# ls -alh
total 72K
drwxrwxrwx 3 fahclient root 4.0K Mar 9 19:00 .
drwxrwxr-x 6 fahclient root 4.0K Feb 28 20:36 ..
drwxrwxrwx 3 fahclient root 4.0K Mar 9 19:17 00
-rw-r--r-- 1 fahclient root 40K Mar 9 19:17 client.db
-rw-r--r-- 1 fahclient root 17K Mar 9 19:17 client.db-journal
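If you want to test the permissions theory on your own setup, one quick check is to look for anything under the work folder that is not owned by the user the client runs as. A minimal sketch, assuming that user is `fahclient` as in the listing above (inside a Docker container it may well be a different user or a numeric uid); the function name `not_owned_by` is just made up for this example.

```shell
# Hypothetical helper: list every file and directory under DIR that is NOT
# owned by USER. USER may be a user name or a numeric uid.
# Usage: not_owned_by DIR USER
not_owned_by() {
    find "$1" ! -user "$2" -print
}

# Example (path and user are assumptions, taken from the listing above):
#   not_owned_by /var/lib/fahclient/work fahclient
```

No output means everything is owned by that user, in which case ownership is probably not what is blocking the cleanup.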