extreme long wu's
Author | Message |
---|---|
Joined: 23 Feb 15 Posts: 2 Credit: 2,107,545 RAC: 0 |
I've got a long-running result and will cancel it now, at a CPU time of more than 11 days. The last checkpoint was on 13.07.2017, and progress shows a fraction_done of 0.450050.
http://universeathome.pl/universe/workunit.php?wuid=10535753
wu_name: universe_bh2_160803_154_2_20000_1-999999_470000
result_name: universe_bh2_160803_154_2_20000_1-999999_470000_0
app_file: BHspin2_1_windows_intelx86.exe

error.dat shows:
error: bondi() accreted mass (6.458614) larger than envelope mass (4.354907) (2714882)
error: in Renv_con() unknown Ka type: 1, iidd_old: 2840284
error: in Menv_con() unknown Ka type: 1, iidd_old: 2840284

error.dat2 shows:
error: bondi() accreted mass (5.652698) larger than envelope mass (5.233360) (240724)
error: bondi() accreted mass (7.216964) larger than envelope mass (6.333716) (2569362)

error.dat3 shows:
error: bondi() accreted mass (5.652698) larger than envelope mass (5.233360) (240724)
error: bondi() accreted mass (7.216964) larger than envelope mass (6.333716) (2569362)

log.txt contains:
00:00:00 00:00:00 PROGRAM START: Thu Jul 13 02:29:34 2017
00:00:00 00:00:00 no checkpoint.dat file found
00:00:00 00:00:00 cleaning checkpoints
00:00:00 00:00:00 gw_cpfile: source file "data0.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "data1.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "data2.dat2" not present
00:00:00 00:00:00 gw_cpfile: source file "error.dat2" not present
00:00:00 00:00:00 reading checkpoint: istart: -1; pp: 0; n: -1
00:00:00 00:00:00 checkpoint read
00:00:00 00:00:00 default values set
00:00:00 00:00:00 Reading param.in file
00:00:00 00:00:00 PARAMIN: num_tested = 20000
00:00:00 00:00:00 PARAMIN: hub_val = 1000
00:00:00 00:00:00 PARAMIN: idum = -470000
00:00:00 00:00:00 PARAMIN: OUTPUT = 3
00:00:00 00:00:00 PARAMIN: Mmina = 5.0
00:00:00 00:00:00 PARAMIN: Mminb = 3.0
00:00:00 00:00:00 PARAMIN: golambda = 0.1
00:00:00 00:00:00 PARAMIN: Beta = 0.1
00:00:00 00:00:00 PARAMIN: Fa = 1.0
00:00:00 00:00:00 PARAMIN: Sigma3 = 0
00:00:00 00:00:00 PARAMIN: Sal = -2.7
00:00:00 00:00:00 PARAMIN: SS = 0
00:00:00 00:00:00 PARAMIN unknown parameter: name: SS; value: 0
00:00:00 00:00:00 PARAMIN: ZZ = 0.0001
00:00:00 00:00:00 param.in file read
00:00:00 00:00:00 idum: -470000; num_tested: 20000
00:05:24 00:05:24 making checkpoint: j: 1000; iidd: 282852
00:05:24 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:05:24 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:05:24 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:05:24 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:05:24 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:05:24 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:05:24 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:05:24 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:11:01 00:05:37 making checkpoint: j: 2000; iidd: 575529
00:11:01 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:11:01 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:11:01 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:11:01 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:11:01 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:11:01 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:11:01 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:11:01 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:16:28 00:05:27 making checkpoint: j: 3000; iidd: 869551
00:16:28 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:16:28 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:16:28 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:16:28 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:16:28 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:16:28 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:16:28 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:16:28 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:22:36 00:06:08 making checkpoint: j: 4000; iidd: 1164932
00:22:36 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:22:36 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:22:36 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:22:36 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:22:36 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:22:36 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:22:36 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:22:36 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:28:27 00:05:51 making checkpoint: j: 5000; iidd: 1449110
00:28:27 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:28:27 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:28:27 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:28:27 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:28:27 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:28:27 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:28:27 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:28:27 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:34:07 00:05:40 making checkpoint: j: 6000; iidd: 1740336
00:34:07 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:34:07 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:34:07 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:34:07 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:34:07 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:34:07 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:34:07 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:34:07 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:39:57 00:05:50 making checkpoint: j: 7000; iidd: 2037642
00:39:57 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:39:57 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:39:57 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:39:57 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:39:57 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:39:57 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:39:57 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:39:57 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:45:09 00:05:12 making checkpoint: j: 8000; iidd: 2308124
00:45:09 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:45:09 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:45:09 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:45:09 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:45:09 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:45:09 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:45:09 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:45:09 00:00:00 gw_cpfile: error.dat appended to error.dat3
00:50:43 00:05:34 making checkpoint: j: 9000; iidd: 2581356
00:50:43 00:00:00 gw_cpfile: data0.dat appended to data0.dat2
00:50:43 00:00:00 gw_cpfile: data1.dat appended to data1.dat2
00:50:43 00:00:00 gw_cpfile: data2.dat appended to data2.dat2
00:50:43 00:00:00 gw_cpfile: error.dat appended to error.dat2
00:50:43 00:00:00 gw_cpfile: data0.dat appended to data0.dat3
00:50:43 00:00:00 gw_cpfile: data1.dat appended to data1.dat3
00:50:43 00:00:00 gw_cpfile: data2.dat appended to data2.dat3
00:50:43 00:00:00 gw_cpfile: error.dat appended to error.dat3

Matthias |
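The log in the post above writes one `making checkpoint` line per 1000 systems, so it can be summarised to see where a task stopped checkpointing. The following is only a rough sketch: it assumes the first timestamp on each line is elapsed run time and that `j` counts towards `num_tested` (both inferred from the log, not confirmed by the project), and the function names are hypothetical.

```python
import re

# Matches lines such as:
# "00:05:24 00:05:24 making checkpoint: j: 1000; iidd: 282852"
CHECKPOINT_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2}) \d+:\d{2}:\d{2} "
    r"making checkpoint: j: (\d+); iidd: (\d+)"
)


def parse_checkpoints(log_text):
    """Return (elapsed_seconds, j, iidd) for every checkpoint line found."""
    checkpoints = []
    for h, m, s, j, iidd in CHECKPOINT_RE.findall(log_text):
        elapsed = int(h) * 3600 + int(m) * 60 + int(s)
        checkpoints.append((elapsed, int(j), int(iidd)))
    return checkpoints


def summarize(log_text, num_tested):
    """Summarise the last checkpoint, or return None if there is none."""
    checkpoints = parse_checkpoints(log_text)
    if not checkpoints:
        return None
    elapsed, j, _ = checkpoints[-1]
    return {
        "last_j": j,
        # Assumption: j / num_tested approximates fraction_done.
        "fraction_at_checkpoint": j / num_tested,
        "elapsed_at_checkpoint_s": elapsed,
        "mean_interval_s": elapsed / len(checkpoints),
    }
```

For the log in this post the last checkpoint is `j: 9000` at 00:50:43 with `num_tested = 20000`, which gives 9000 / 20000 = 0.45. That is close to the reported fraction_done of 0.450050, so the task appears to have stopped just after that checkpoint.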
Joined: 11 Mar 15 Posts: 37 Credit: 271,242,973 RAC: 0 |
Hey krzys, I've just found a bunch of these bad WUs across all my PCs, what a mess. I found out the hard way that even if I abort them they still don't die. They have to be manually killed from Task Manager. If you don't manually kill them after aborting them, BOINC thinks they've gone and assigns new work into the already loaded and running slot. Not good!! Will kill this WU now; this is the fifth in the last few hours :( The only way I can tell they're locked up and not just long running is by keeping an eye on the checkpointing. The WU below has been running for 15 hrs 23 mins and hasn't checkpointed for the last 13 hrs. At least now I know what to look for and how to treat them. Aggressively... As usual the stderr is empty, but if you're interested, I kept a copy of the slot directory before I aborted it; if you would like it, just ask.
http://universeathome.pl/universe/results.php?hostid=1679&offset=0&show_names=0&state=6&appid=

Contents of error.dat3:
error: bondi() accreted mass (6.024443) larger than envelope mass (5.618883) (60413)
error: bondi() accreted mass (7.907802) larger than envelope mass (7.402827) (144779)
error: bondi() accreted mass (5.415336) larger than envelope mass (2.590284) (146410)
error: bondi() accreted mass (9.456944) larger than envelope mass (9.022832) (258705)
error: bondi() accreted mass (9.386890) larger than envelope mass (5.976198) (356863)
error: bondi() accreted mass (11.491090) larger than envelope mass (7.780758) (361139)
error: bondi() accreted mass (6.318919) larger than envelope mass (5.818937) (438696)
error: bondi() accreted mass (5.645096) larger than envelope mass (5.213384) (445394)
error: bondi() accreted mass (5.773027) larger than envelope mass (5.230975) (671283)
error: bondi() accreted mass (12.410371) larger than envelope mass (8.284976) (693333)
error: bondi() accreted mass (8.904030) larger than envelope mass (6.786716) (702075)
error: bondi() accreted mass (6.480212) larger than envelope mass (6.192082) (750103)
error: bondi() accreted mass (5.496527) larger than envelope mass (4.505465) (818009)

EDIT: Just tried to kill another one, but this time it killed my PC instead (blue screened). |
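Watching for a checkpoint that stops updating, as described in the post above, could be automated roughly as follows. This is a sketch only: it assumes the task rewrites a `checkpoint.dat` file in its BOINC slot directory each time it checkpoints (the file name appears in the log earlier in this thread; where the slot directory lives is an assumption), and the function names and the one-hour default are hypothetical.

```python
import os
import time


def checkpoint_age_seconds(path, now=None):
    """Seconds since the file at `path` was last modified."""
    if now is None:
        now = time.time()
    return now - os.path.getmtime(path)


def looks_stalled(path, max_age_s=3600, now=None):
    """True if the checkpoint file has not been rewritten within max_age_s."""
    return checkpoint_age_seconds(path, now) > max_age_s
```

A call such as `looks_stalled("slots/0/checkpoint.dat")` would then flag a task like the one above, which had not checkpointed for 13 hours. It says nothing about whether the process actually exits after an abort; that still has to be checked separately.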
Joined: 26 Feb 15 Posts: 3 Credit: 56,424,411 RAC: 0 |
This is still an issue! :( I have to ask -- is this problem being worked on? The general response seems to be "deal with it": these WUs come in spurts and it is a cost of doing business. I can handle that... but it is concerning that such a long-standing problem has still not been successfully addressed and does not seem to be a priority. |
Joined: 11 Mar 15 Posts: 37 Credit: 271,242,973 RAC: 0 |
Another two never-ending BHspin tasks :( So far the wingman involved has had the same problem, although it wouldn't surprise me if they eventually validated. http://universeathome.pl/universe/result.php?resultid=24645815 http://universeathome.pl/universe/result.php?resultid=24645698 |
Joined: 11 Mar 15 Posts: 37 Credit: 271,242,973 RAC: 0 |
Another bad batch. These WUs should have been in the above post, but regardless, they all had to be manually aborted. Only my Raspberry Pis do not show this behaviour. This is just a sample; I have many more like 'em, "if" you're interested. http://universeathome.pl/universe/workunit.php?wuid=10841172 http://universeathome.pl/universe/workunit.php?wuid=10860295 |
Joined: 4 Feb 15 Posts: 48 Credit: 15,956,546 RAC: 0 |
I have a BH Spin WU that has already been running for 1 1/2 days at 2% with over 74 days still to go and counting. I suspect this is a faulty WU that will never finish and will go over the deadline anyway. I will probably abort it this afternoon. Conan |
Joined: 4 Feb 15 Posts: 48 Credit: 15,956,546 RAC: 0 |
I have just noticed that the percentage done has not moved for many, many hours, so I am aborting this WU. Time to complete has reached 80 days, with the percentage at 2.029% after 1 day 16 hours. Conan |
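The figures in the post above are consistent with a simple linear extrapolation of the remaining time. The sketch below shows that arithmetic; whether the BOINC client computes its estimate exactly this way is an assumption, and the function name is hypothetical.

```python
def linear_eta_days(elapsed_hours, fraction_done):
    """Remaining time in days, assuming progress continues at the average rate so far."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    total_hours = elapsed_hours / fraction_done
    return (total_hours - elapsed_hours) / 24


# 1 day 16 hours = 40 hours elapsed at 2.029% done
print(round(linear_eta_days(40, 0.02029), 1))
```

With 40 hours elapsed at 2.029%, this gives roughly 80.5 days remaining, in line with the 80 days reported. Under this model a fixed percentage with growing run time makes the estimate climb steadily, which is the pattern of a stuck task.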
Joined: 4 Feb 15 Posts: 846 Credit: 144,180,465 RAC: 0 |
If any WU calculates for longer than 6 hours, feel free to abort it. Or abort it if there is no percentage progress for over one hour. Krzysztof 'krzyszp' Piszczek Member of Radioactive@Home team My Patreon profile Universe@Home on YT |
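The two abort conditions in the post above can be written down as a small check. This is a minimal sketch, not a project tool: it assumes some script records (elapsed seconds, fraction done) samples for a task, and how those samples are collected is left open. The function and constant names are hypothetical.

```python
SIX_HOURS_S = 6 * 3600
ONE_HOUR_S = 3600


def should_abort(samples, max_runtime_s=SIX_HOURS_S, stall_window_s=ONE_HOUR_S):
    """Apply the two abort conditions to a task's progress history.

    samples: list of (elapsed_seconds, fraction_done), oldest first.
    Pass max_runtime_s=None to disable the runtime limit.
    """
    if not samples:
        return False
    latest_t, latest_f = samples[-1]
    # Condition 1: the task has run longer than the runtime limit.
    if max_runtime_s is not None and latest_t > max_runtime_s:
        return True
    # Condition 2: compare against the newest sample that is at least
    # one stall window old; no increase since then counts as a stall.
    for t, f in reversed(samples):
        if latest_t - t >= stall_window_s:
            return f >= latest_f
    return False
```

For example, `should_abort([(0, 0.0), (1800, 0.02), (5400, 0.02)])` is true because the fraction has not changed for a full hour, while the same history ending in `(5400, 0.05)` is not.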
Joined: 21 Feb 15 Posts: 53 Credit: 1,385,888 RAC: 0 |
That doesn't work for unattended machines. Please fix your problem already, so computer resources aren't continually wasted! You were provided details 6 months ago. |
Joined: 6 Mar 15 Posts: 28 Credit: 16,721,329 RAC: 0 |
"If any WU calculates for longer than 6 hours, feel free to abort it..." But if it keeps checkpointing and appearing to advance toward completion, is there any reason to abort? |
Joined: 21 Feb 15 Posts: 8 Credit: 364,694,894 RAC: 0 |
"But if it keeps checkpointing and appearing to advance toward completion, is there any reason to abort?" No. Everything seems normal now. |
Joined: 29 May 17 Posts: 1 Credit: 2,938,600 RAC: 0 |
"If any WU calculates for longer than 6 hours, feel free to abort it." Or turn off getting new tasks until this is fixed. |
Joined: 21 Feb 15 Posts: 53 Credit: 1,385,888 RAC: 0 |
"If any WU calculates for longer than 6 hours, feel free to abort it." EXACTLY. I have several PCs doing BOINC work, and I can't monitor the details of every task that they do. This problem is real, and it wastes resources, making a CPU thread completely useless as it spins its wheels on a task that won't complete... The devs here should put more effort into solving this problem, instead of not caring about wasted resources. Hell, for that reason alone I'd set "No New Tasks", but I'll also do it because the tasks here sometimes don't work and waste my CPU. I still hope for a fix, but in the meantime, you don't deserve my CPU if you're going to abuse it. "No New Tasks" for you. |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
I may have mentioned this before, but the long runners seem to correlate with running other work units. I have been running Universe/BHspin v2 mostly by itself for a couple of months, and saw no long runners. Recently, I added LHC/SixTrack to this machine, and picked up a couple of long runners today. http://universeathome.pl/universe/result.php?resultid=26150652 http://universeathome.pl/universe/result.php?resultid=26150707 That is not much proof, and may be hard to fix, but I mention it for what it is worth. |
Joined: 4 Feb 15 Posts: 24 Credit: 7,035,527 RAC: 0 |
AFAIR, I can confirm this. Things started to get messy on my machine when I ran LHC & Universe. Doesn't necessarily have to mean something, however... "I should bring one important point to the attention of the authors and that is, the world is not the United States..." |
Joined: 28 Feb 15 Posts: 253 Credit: 200,562,581 RAC: 0 |
I am going to try a little trick, and see how it works. Normally, LHC/SixTrack has either a lot of work or none at all. So instead of mixing it up with Universe, I have set Universe to 0 resource share. That way, when SixTrack has work, it will run by itself. And then, when SixTrack is out of work, Universe will run by itself. Maybe it will avoid some of the problems. |
Joined: 10 Sep 15 Posts: 12 Credit: 20,067,933 RAC: 0 |
I don't run SixTrack, but I have got 4 stuck WUs and 1 suspect one. They're all named universe_bh2_160803_181_*. http://universeathome.pl/universe/workunit.php?wuid=11051278 http://universeathome.pl/universe/workunit.php?wuid=11051513 http://universeathome.pl/universe/workunit.php?wuid=11051517 http://universeathome.pl/universe/workunit.php?wuid=11051573 http://universeathome.pl/universe/workunit.php?wuid=11051697 The error.dat files contain: error: function Lzahbf(M,Mc) should not be called for HM stars. What a waste! |
Joined: 10 Sep 15 Posts: 12 Credit: 20,067,933 RAC: 0 |
Delete this post. |
Joined: 28 Feb 15 Posts: 23 Credit: 42,229,680 RAC: 0 |
250 h for nothing... this WU stands at 0.1%. WU: http://universeathome.pl/universe/result.php?resultid=26112133 Work packet: http://universeathome.pl/universe/workunit.php?wuid=11376308 |
Joined: 2 Mar 15 Posts: 7 Credit: 4,296,304 RAC: 0 |
I regularly get endless tasks too, on a Win 10 PC which runs 24/7. After aborting the last WU at 14 hours, I found the process still running in Task Manager two days later!!! So the process wasn't cancelled, only removed from BOINC. I have wasted 80 hours. So I can't run any more WUs because of this bad behavior. |