Message boards : Number crunching : extreme long wu's

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 846
Credit: 144,180,465
RAC: 0
Message 1983 - Posted: 23 Feb 2017, 23:41:50 UTC - in response to Message 1980.  

Are the new apps released now, it has been a week?

Not yet.
I had an accident and the motherboard in my main machine died...
I just got a new one, but it will take me a few days to recover all my current jobs :(
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

adrianxw
Joined: 1 Oct 16
Posts: 32
Credit: 268,033
RAC: 0
Message 1984 - Posted: 24 Feb 2017, 5:57:02 UTC

Fair enough. I'll look in again next weekend.

Matthias Lehmkuhl
Joined: 23 Feb 15
Posts: 2
Credit: 2,107,545
RAC: 0
Message 1986 - Posted: 27 Feb 2017, 14:30:41 UTC

Today I canceled the result
http://universeathome.pl/universe/result.php?resultid=20240557
after over 10 days of calculation without it reaching 50% progress.
Matthias

adrianxw
Joined: 1 Oct 16
Posts: 32
Credit: 268,033
RAC: 0
Message 1989 - Posted: 3 Mar 2017, 10:40:14 UTC

How are you getting on, Chris? Are we ready to go on the new versions now?

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 846
Credit: 144,180,465
RAC: 0
Message 1990 - Posted: 3 Mar 2017, 20:31:21 UTC - in response to Message 1989.  

How are you getting on, Chris? Are we ready to go on the new versions now?

Early next week I think.
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

corris
Joined: 29 Aug 15
Posts: 3
Credit: 341,000
RAC: 0
Message 1992 - Posted: 6 Mar 2017, 23:14:21 UTC

Had some WUs in this category today, now aborted.

Examples:
universe_bh2_160803_85_3_20000_1-999999_480000_1
universe_bh2_160803_85_1_20000_1-999999_800000_0

Both ran for over 2.5 days with BOINC Manager reporting 3% complete.
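
To put those numbers in perspective, here is a rough sketch of the arithmetic. It assumes progress is roughly linear in time, which the percentage shown by the client may not be, and the function name `projected_total_days` is just made up for this example.

```python
def projected_total_days(elapsed_days: float, fraction_done: float) -> float:
    """Linearly extrapolate the total run time from the progress so far.

    Assumption: the reported fraction grows in proportion to elapsed time.
    """
    if not 0.0 < fraction_done <= 1.0:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_days / fraction_done


# 2.5 days elapsed with 3% reported as complete
total = projected_total_days(2.5, 0.03)
print(f"{total:.1f} days projected")  # 83.3 days projected
```

On that estimate a task like this would need close to three months to finish, if it finishes at all.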

Alessio Susi
Joined: 1 Apr 15
Posts: 49
Credit: 30,557,740
RAC: 0
Message 1993 - Posted: 7 Mar 2017, 12:54:09 UTC
Last modified: 7 Mar 2017, 12:55:06 UTC

http://universeathome.pl/universe/results.php?hostid=47318

The same with an Android TV Box. Run times went from 8-10 hours to 35-40 hours. At the moment, I prefer to use it for World Community Grid and POGS.
ASUS X570 E-Gaming
AMD Ryzen 9 3950X, 16 core / 32 thread 4.4 GHz
AMD Radeon Sapphire RX 480 4GB Nitro+
Nvidia GTX 1080 Ti Gaming X Trio
4x16 GB Corsair Vengeance RGB 3466 MHz

adrianxw
Joined: 1 Oct 16
Posts: 32
Credit: 268,033
RAC: 0
Message 1995 - Posted: 10 Mar 2017, 9:56:11 UTC

So, another week, seems like problems still arise, so I assume the fix has not been released. Update?

*** Off topic ***

Something I noticed when beginning to post this: I did not notice I was already logged in, went to the login page, and put my name and password in, but on trying to submit the form, I got a 404. Trivial.

NotRealName
Joined: 5 Feb 17
Posts: 6
Credit: 2,135,900
RAC: 0
Message 1998 - Posted: 15 Mar 2017, 2:52:05 UTC

Jacob Klein
Joined: 21 Feb 15
Posts: 53
Credit: 1,385,888
RAC: 0
Message 1999 - Posted: 15 Mar 2017, 3:04:53 UTC

It is utterly unacceptable that this project still continues to waste CPU cycles, with a known bad app or batch of tasks. Literally, unacceptable. Wasting crunching power!!

I may never turn "No New Tasks" off, because of this non-responsiveness! :(

krzyszp, can't you do something to stop the bleeding, even??

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 846
Credit: 144,180,465
RAC: 0
Message 2000 - Posted: 15 Mar 2017, 16:00:29 UTC - in response to Message 1999.  

We will upgrade the application shortly, once we sort out some compiling problems.
The very long units are still a mystery, as it does not happen very often and only on some computers (e.g. on one of my machines it never happens).
Anyway, the upgrade changes quite a big part of the application, so the problem should go away soon...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

adrianxw
Joined: 1 Oct 16
Posts: 32
Credit: 268,033
RAC: 0
Message 2001 - Posted: 15 Mar 2017, 16:14:52 UTC

I think you should suspend work unit production until this is fixed. There are people's machines wasting days of CPU time that could obviously be doing useful work. You are not winning friends by continuing the way you are.

Jacob Klein
Joined: 21 Feb 15
Posts: 53
Credit: 1,385,888
RAC: 0
Message 2002 - Posted: 15 Mar 2017, 16:36:25 UTC

Unattended machines could be wasting more than just days, on this issue!

Admin, please consider both of these:
- Stopping sending work for tasks that could end up in a never-ending state.
- Server-side-aborting tasks that could end up in a never-ending state.

That's what I'd do. Wasting CPU cycles is equivalent to stealing CPU cycles from other projects.

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2004 - Posted: 15 Mar 2017, 16:56:55 UTC - in response to Message 2000.  

If it is a short time until the new application is ready, and you need the results of the present application, then I would keep it going. It is a relatively small problem for me. If other people have machines that are more susceptible to it, then they can turn them off as they desire. But it makes no sense to prevent the completion of a scientific study for a few bad work units. All projects have them, and the crunchers can choose other projects anytime they want.

Jacob Klein
Joined: 21 Feb 15
Posts: 53
Credit: 1,385,888
RAC: 0
Message 2005 - Posted: 15 Mar 2017, 17:01:07 UTC

To be clear ...

Tasks that eventually error out are a pain to deal with, but a non-attended setup will handle them gracefully enough.
Tasks that run continuously without end are also a pain to deal with, and a non-attended setup will end up crunching them indefinitely, wasting electricity and resources indefinitely.

I speak loudly, because I think we're dealing with the 2nd case here.

It sounds like you are saying "Oh, it's okay to render machines and CPUs completely useless and have them waste energy, if we make progress overall."
... and that's a very very bad idea.
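
For anyone who wants to script a check on an unattended box, the difference between those two cases can be sketched roughly like this. Everything here (`TaskStatus`, `needs_attention`, `typical_hours`, `max_factor` and the example values) is made up for illustration; none of it is part of BOINC or the project's app, and the thresholds are assumptions you would have to tune yourself.

```python
from dataclasses import dataclass


@dataclass
class TaskStatus:
    name: str
    elapsed_hours: float
    fraction_done: float   # 0.0 to 1.0, as reported by the client
    failed: bool = False   # True once the task has errored out


def needs_attention(task: TaskStatus, typical_hours: float,
                    max_factor: float = 5.0) -> bool:
    """Flag a task that looks like it may never finish.

    Case 1 (task errored out): not flagged, the client moves on by itself.
    Case 2 (task keeps running): flagged once its elapsed or projected
    run time exceeds typical_hours * max_factor (hypothetical threshold).
    """
    limit = typical_hours * max_factor
    if task.failed or task.fraction_done >= 1.0:
        return False
    if task.elapsed_hours > limit:
        return True
    if task.fraction_done > 0.0:
        # Linear projection of the total run time from progress so far.
        return task.elapsed_hours / task.fraction_done > limit
    return False


stuck = TaskStatus("example_task", elapsed_hours=60.0, fraction_done=0.03)
print(needs_attention(stuck, typical_hours=10.0))  # True
```

A flagged task would still need a human, or a separate script, to decide whether to abort it; the sketch only makes the decision rule explicit.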

Jim1348
Joined: 28 Feb 15
Posts: 253
Credit: 200,562,581
RAC: 0
Message 2007 - Posted: 15 Mar 2017, 17:23:21 UTC - in response to Message 2005.  

If my machines were susceptible to that problem to the extent that I found it unacceptable, I would choose another project. You seem to be asking them to cancel the project, or some portion thereof, for a problem that affects some people more than others and they can't find the solution for.

Are you expecting them to find the bad work units in advance? If they could do that, they could fix them.

adrianxw
Joined: 1 Oct 16
Posts: 32
Credit: 268,033
RAC: 0
Message 2009 - Posted: 15 Mar 2017, 17:57:24 UTC
Last modified: 15 Mar 2017, 17:58:41 UTC

What he has said is that there is a fix, but for some reason he can't compile it. Sounds odd to me, but then, I've only been a software engineer for 30 years, what would I know. Nobody has said stop the project; what has been said, quite rightly, is that some of his crunchers are not happy with the situation he seems happy to live with. He fixes the problem, or loses crunchers, his choice.

Jacob Klein
Joined: 21 Feb 15
Posts: 53
Credit: 1,385,888
RAC: 0
Message 2010 - Posted: 15 Mar 2017, 18:27:53 UTC - in response to Message 2007.  
Last modified: 15 Mar 2017, 18:34:35 UTC

If my machines were susceptible to that problem to the extent that I found it unacceptable, I would choose another project. You seem to be asking them to cancel the project, or some portion thereof, for a problem that affects some people more than others and they can't find the solution for.

Are you expecting them to find the bad work units in advance? If they could do that, they could fix them.


I am attached to every possible project, about 60 of them. I routinely do work for about 15 of them. I'm also one of the main BOINC Alpha testers.

What I am asking for is not unreasonable. The request is: if a project has a situation where a task can get stuck in the worst possible state of running indefinitely (100% waste), the project should do everything in its power to stop the bleeding, including possibly taking the app offline or cancelling affected batches of tasks.

It has happened to other projects before, and they have responded correctly. I'm hoping for a correct response with this project. In the meantime, I'm lucky I don't have unattended setups, and I easily set No New Tasks on all 4 of my PCs.

Krzysztof Piszczek - wspieram ...
Project administrator
Project developer
Project tester
Joined: 4 Feb 15
Posts: 846
Credit: 144,180,465
RAC: 0
Message 2011 - Posted: 15 Mar 2017, 19:27:59 UTC - in response to Message 2010.  

including possibly taking the app offline or cancelling affected batches of tasks.

It's never a whole "batch" of tasks - it is always just a few tasks (in the worst batch it was 14 WUs) in a batch of around 40,000, and they mostly calculate correctly on the wingman's machine. This is why it is so difficult to find the cause.
Even when I manually ran those tasks on my machine I didn't get any answers, because they finished properly...
Krzysztof 'krzyszp' Piszczek

Member of Radioactive@Home team
My Patreon profile
Universe@Home on YT

Jacob Klein
Joined: 21 Feb 15
Posts: 53
Credit: 1,385,888
RAC: 0
Message 2012 - Posted: 15 Mar 2017, 19:55:25 UTC

If there is anything you'd like me to test or try, tell me what to do and I'll do it. I want it solved, and am willing to try things for you.
Copyright © 2024 Copernicus Astronomical Centre of the Polish Academy of Sciences
Project server and website managed by Krzysztof 'krzyszp' Piszczek