Fluctuating GPU usage (0%-100% every few seconds)

Quellwasser Member Posts: 43
Hi guys,
I'm mining on Windows 7 with 1x HD 7990 (GPU 1+2), 2x R9 290x (GPU 3+4), and 1x R9 390 (GPU 5). As seen in the picture, the GPU usage on all cards except the HD 7990 fluctuates between 0% and 100% every few seconds, with no difference between solo and pool mining. No special settings: I just followed a straightforward Windows tutorial and installed the newest Catalyst drivers (yes - noob ;) ). I had to underclock the HD 7990 and the two 290x cards because of heat, and the 390 is slightly overclocked. Temperatures are all around 80-85 °C.

Is this fluctuating GPU usage normal? Thanks for your help.




Comments

  • happytreefriends Member Posts: 537 ✭✭✭
    Is this an open-air rig? Those temps seem pretty high for underclocked speeds.
    The 7990 will run hot and I underclock mine too, but the 290s and 390s should handle stock or higher speeds and stay under 70C.
  • Quellwasser Member Posts: 43
    Unfortunately it's a closed rig in a small server case. I can't make it open either, since it has to fit in a specific place. Do you think the temperatures are the problem, or are those GPU usage spikes normal?
  • happytreefriends Member Posts: 537 ✭✭✭
    I get the 0-100% spikes too on my 390s; my other cards don't seem to do it. Not sure why, but it doesn't seem to affect my hash rates. My temps are 60-61C with a -50 mV undervolt, running at 1100/1600 MHz. This rig with the 390s is running in my office on the PC I'm using to type this post. :smile:
    The temperature in my office is about 74F.
  • Quellwasser Member Posts: 43
    Thanks for your answer :smile:
    Does anyone else know more about those spikes and whether or not they affect mining speed?
  • bluebox Member Posts: 181 ✭✭
    My GTX 970s drop their GPU/memory clock rates every time there's a new getwork package, if that's what you're seeing. Newer cards auto-throttle their clocks with demand; perhaps the 79xx doesn't, or maybe it just doesn't with the miner...
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @Quellwasser The OpenCL kernel that runs on the GPU does not run continuously. The kernels can only report results when they finish. So, each kernel runs X number of hashes, which is given by the "global work size", a parameter passed to the kernel when it's launched. The default value results in a kernel that runs for just over 10 ms. You can change the default value of 4096 using the '--cl-global-work' option of 'ethminer'. A value of 8192, for example, will give you a kernel run time of 20+ ms. There is OS overhead involved in re-launching the kernel for another run. During the time from the end of one kernel run until the beginning of the next, the GPU isn't working. The times I've given above are based on a hash rate of 25 MH/s. A slower hash rate will give a longer kernel run time for any given value of "global work size". Kernels *always* run to completion; they cannot be stopped (really), and they don't stop when they find a result, they continue to run to the end and then report the result.
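
The run times quoted above can be reproduced with a little arithmetic. The Python sketch below assumes that one kernel launch computes global-work x local-work hashes, with the default local work size of 64 that dlehenky spells out further down the thread, and it ignores the relaunch overhead. The helper name `kernel_run_ms` is made up for illustration and is not part of ethminer.

```python
def kernel_run_ms(global_work, hash_rate_hps, local_work=64):
    """Estimate how long one kernel launch runs, in milliseconds.

    Assumption: hashes per launch = global_work * local_work.
    """
    hashes_per_launch = global_work * local_work
    return 1000.0 * hashes_per_launch / hash_rate_hps


if __name__ == "__main__":
    # 25 MH/s is the hash rate used for the figures in the post above.
    for global_work in (4096, 8192):
        ms = kernel_run_ms(global_work, 25e6)
        print(f"--cl-global-work {global_work}: {ms:.1f} ms per kernel run")
```

At 25 MH/s this works out to roughly 10.5 ms for 4096 and 21.0 ms for 8192, in line with "just over 10 ms" and "20+ ms". Halving the hash rate doubles both figures.
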
  • Quellwasser Member Posts: 43
    edited February 2016
    Thank you bluebox and especially dlehenky :smile:
    I'm not sure I understood everything you said correctly, but would I even benefit from raising the kernel run time? Or wouldn't everyone benefit from it, since the GPU wouldn't have to re-launch that often? From what you said, those GPU spikes seem reasonable.
    Thank you again for your detailed answer :smile:
  • megahz2 Member Posts: 36
    you should really open your case, you are going to cook your 7990.
    85 is borderline toast.
    I have a [email protected] 42 to 45mh+-
    try this

    ethminer -G --opencl --cl-local-work 256 --cl-global-work 8192*128 -t 2

    should be smooth as silk..
  • Quellwasser Member Posts: 43
    edited February 2016
    Haha, I thought 85C was low for a 7990 and an OK temperature for GPUs in general, but it's more on the 80C side. I hope the miner will cool down when I put it in the server room, since the room is cooled to 17C.
    And I will try that command, thank you :).

    Edit: wow cool thanks, it really runs smoother and even faster :smile:
  • taz002dev 0x26ff3ed5c5025b5a3c288b9139d0396f9de5891f Member Posts: 47
    edited March 2016
    megahz2 said:

    you should really open your case, you are going to cook your 7990.
    85 is borderline toast.
    I have a [email protected] 42 to 45mh+-
    try this

    ethminer -G --opencl --cl-local-work 256 --cl-global-work 8192*128 -t 2

    should be smooth as silk..

    Hi.
    What settings do you recommend for a rig with 1x 7990 + 4x 290?
    I have usually used the command below, based on stilt's recommendations and adapted especially for the 290s I have. Is there a way to compute those settings?

    ethminer -G --opencl --cl-local-work 256 --cl-global-work 32768

    Thanks
  • happytreefriends Member Posts: 537 ✭✭✭
    Does anyone have an actual primer on command line arguments like the ones above, and what they actually mean?
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends Search my posts - I described them in detail a couple times.
  • happytreefriends Member Posts: 537 ✭✭✭
    I did. I just found examples that I already know, but no explanation of the actual use of every option in ethminer.exe. I know about the GPU selection, etc., but it's the more advanced options I'm interested in.

    Anyone have a list / info?
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends Which ones are you interested in?
  • happytreefriends Member Posts: 537 ✭✭✭
    pretty much any and all the ones pertaining to the GPU control. Like cl-local-work, cl-global-work, etc.. etc... all those.
    I played around a bit with them and found some sweet spots for my rigs, and some bad spots, but don't really know 100% how they affect mining/GPU function.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends Those, in particular, I have described in detail in prior posts. I worked with both the C++ code and the OpenCL code, and I know exactly what they do, and don't do. The way I see most people using them is totally off the mark, but there's no way you would know that without understanding the code, which really requires a thorough understanding of OpenCL and its nomenclature. There are also side effects of setting the values high, which I've also written about, at length.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends Here's a forum link that should take you to 2 posts I made regarding '--cl-global-work' and '--cl-local-work':

    https://forum.ethereum.org/discussion/comment/19289/#Comment_19289
  • happytreefriends Member Posts: 537 ✭✭✭
    TY. I missed that one.
    So when I see this : --cl-global-work 8192*128
    What does the *128 do?

    Also I've noticed if you raise up the #8192 to a high number, the hash rate becomes very erratic. And sometimes shows 0 hash rate mixed in with actual numbers.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends *128 multiplies by 128 :) Basically, you make the kernel runtime so long that it runs longer than the timer interval used to measure the hash rate, generally 500ms.
  • happytreefriends Member Posts: 537 ✭✭✭
    Hmm.. When that is set it seems to increase the reported hash rate though and be much more stable than using 16384 or 32768 as the number directly.
  • davethetrousers Member Posts: 46
    edited March 2016
    dlehenky said:

    There is OS overhead involved in re-launching the kernel for another run. During the time from the end of one kernel run until the beginning of the next, the GPU isn't working.

    Can we conclude from this that worksize is simply a tradeoff between GPU time being utilized more (larger worksize) and the miner being more responsive regarding things like noticing and reporting a found hash (smaller worksize)?
    megahz2 said:

    you should really open your case, you are going to cook your 7990.
    85 is borderline toast.

    You're overstating this by a lot. Those are absolutely acceptable temperatures, and hardware can and will get way hotter than that (think notebooks, blade servers etc). Even judging from the fact that thermal throttling doesn't take place until 94°C, everything below 95° is basically considered alright by AMD themselves.

    Only thing you really lose is some power/energy efficiency (resistance decreases), and maybe a bit on lifetime of the GPUs' thermal interface material.
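
The work size tradeoff raised at the top of this post can be pictured with a toy model: each launch does useful work for its run time, then the GPU idles for a fixed relaunch overhead. The 1 ms overhead below is a made-up placeholder, not a measured value, and `run_ms` and `utilization` are hypothetical helpers. Run times assume global-work x 64 hashes per launch at 25 MH/s, as described elsewhere in the thread.

```python
OVERHEAD_MS = 1.0  # hypothetical relaunch overhead, NOT a measured figure


def run_ms(global_work, hash_rate_hps=25e6, local_work=64):
    """Estimated run time of one kernel launch, in milliseconds."""
    return 1000.0 * global_work * local_work / hash_rate_hps


def utilization(global_work, overhead_ms=OVERHEAD_MS):
    """Fraction of wall-clock time the GPU spends hashing in this toy model."""
    busy = run_ms(global_work)
    return busy / (busy + overhead_ms)


if __name__ == "__main__":
    for gw in (4096, 8192, 16384, 32768):
        print(f"global work {gw:>6}: run {run_ms(gw):6.1f} ms, "
              f"utilization {utilization(gw):.1%}")
```

In this model a larger work size pushes utilization toward 100%, while the run time column is how long a started kernel keeps going before it can report anything back. How large the real overhead is would have to be measured on the rig in question.
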
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends I'm confused. You say at those settings you sometimes get 0 hashes, and a large variance in hash rates, right? How is that more "stable"? It's indicative of the kernel running longer than the granularity of the timer that's sampling the hash rate. Maybe I'm misunderstanding what you're saying?
  • megahz2 Member Posts: 36
    Sorry if I scared/frustrated you on temps and missed some of your posts...
    For some reason, if I hit 90° my system shuts down, and I tried the card on 2 separate rigs/motherboards. It's probably my card. Yes, server room :-)

    So I have been playing with my numbers for --cl-local-work xx.
    I tried 64, 128, and 256 and noticed that the higher the value, the higher the hash rate, but the more erratic it gets.

    @256
    I hit 88 MH/s 1/4 of the time
    I hit 64 MH/s 1/2 of the time
    I hit 43 MH/s 1/4 of the time

    @128
    I hit 66 MH/s 1/2 of the time
    I hit 54 MH/s 1/2 of the time

    @64
    I hit 54 MH/s 3/4 of the time
    I hit 43 MH/s 1/4 of the time

    128 or 256 is my favorite "for me"..

    Note: I think dividing/multiplying in --cl-global-work, as in 8192*128, ties in with what davethetrousers said about OS overhead and the tradeoff in how much GPU time is utilized.
    It makes a difference in how the work is crunched in memory and processed. I used to use the * in the old Bitcoin CUDA mining days.

    My system has 32 GB of ECC/chipkill RAM and 2 E5-2650 CPUs. I have been running --cl-local-work 128 --cl-global-work 9220*264 lately (odd numbers, I know).
    I get 54 to 66 MH/s on average. I am trying to find a steady 59.


    Sometimes getting 0 hashes is not good. I never get 0 hashes; 43 MH/s is as low as I've ever seen.
    Try lower settings.

    I hope you find good settings for your world!!
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @megahz2 This is getting more and more confusing. I *think* you're responding to my post above, which was directed at happytreefriends, not you. Above that post you'll see another post to HTF that has a link to my previous posts on '--cl-global-work' and '--cl-local-work'. I guess the most succinct way I can put it is: no one on this forum, other than Genoil, knows what they're doing when they set these values. Please read the posts (there are 2 in the same thread) I linked above, and then perhaps we can have an easier discussion about your settings. Sound good??
  • happytreefriends Member Posts: 537 ✭✭✭
    edited March 2016
    dlehenky said:

    @happytreefriends I'm confused. You say at those settings you sometimes get 0 hashes, and a large variance in hash rates, right? How is that more "stable"? It's indicative of the kernel running longer than the granularity of the timer that's sampling the hash rate. Maybe I'm misunderstanding what you're saying?

    I think you misunderstood my reply to you.

    1st : "Also I've noticed if you raise up the #8192 to a high number, the hash rate becomes very erratic. And sometimes shows 0 hash rate mixed in with actual numbers."

    2nd : I was replying to the *128 addition from before. "When that is set it seems to increase the reported hash rate though and be much more stable than using 16384 or 32768 as the number directly."

    When I raised the number to 16384 or higher, it sometimes showed '0' as the hash rate. When I use 8192*128 it does not, just as I wrote.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends My reply about 8192*128 was simply based on what that normally means, i.e. multiplication. I honestly don't know if that's how the argument parser interprets it. If it were actually interpreted as the product of 8192*128, the kernel would run non-stop for 2.5 seconds if your hash rate is 25 MH/s, and even longer if your hash rate is lower. That would cause all sorts of trouble. So, I'm just guessing that it either ignores the "*128", or ignores the whole thing and uses the default of 4096. Recall that once a kernel starts, you cannot stop it before it completes the hashes it was told to do. If you just set '--cl-global-work', the number of hashes will be the parameter you provide (8192 in your case) times 64, which is the local work size (aka work group size) default, or 524288 hashes per kernel run. At 25 MH/s, that's a little more than 20 ms per kernel run. If you don't set any options, the kernel runs just under 11 ms - that's the default. I think you can see how a 2.5 second kernel run would be disastrous. You can test this by setting '--cl-global-work' to 1048576, rather than 8192*128, in a benchmark and noting the difference in the results.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends OK, so I just ran the test. The command line parameter parser is ignoring the '*128'. The benchmark gives the same result with '8192' and '8192*128', and the kernel run time was the same at 20.2 ms for my 26 mh/s hash rate. With the parameter set explicitly to '1048576' the kernel ran for 2.54 seconds, and of course, the trial numbers were junk - trial 1-3 were 0, 4 was 44739242, and 5 was half of 4.
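
Why a very long kernel produces zeros and spikes can be seen in a simplified model of the sampling timer mentioned earlier in the thread. The Python sketch below is not ethminer's actual measurement code (the benchmark trials quoted above evidently use different timing). It only assumes that hashes are credited when a kernel finishes and that a timer reads the counter every 500 ms. The function name `sampled_rates` and the rounded kernel times of 20 ms and 2540 ms are illustrative choices.

```python
def sampled_rates(kernel_ms, hashes_per_kernel, interval_ms=500, samples=12):
    """Hash rates (H/s) seen by a timer polling a completed-hash counter.

    Hashes only count once a kernel has run to completion.
    """
    rates = []
    already_counted = 0
    for i in range(1, samples + 1):
        now_ms = i * interval_ms
        finished_kernels = now_ms // kernel_ms  # kernels completed so far
        total = finished_kernels * hashes_per_kernel
        rates.append((total - already_counted) / (interval_ms / 1000.0))
        already_counted = total
    return rates


if __name__ == "__main__":
    short = sampled_rates(kernel_ms=20, hashes_per_kernel=8192 * 64)
    long_ = sampled_rates(kernel_ms=2540, hashes_per_kernel=1048576 * 64)
    print("8192:   ", [f"{r / 1e6:.1f}" for r in short])
    print("1048576:", [f"{r / 1e6:.1f}" for r in long_])
```

With the short kernel every sample lands on the same rate, because many whole kernel runs fit into each 500 ms window. With the 2.54 second kernel most samples read 0 and an occasional sample credits a whole run at once, so the displayed rate swings between 0 and a value far above the card's real hash rate.
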
  • happytreefriends Member Posts: 537 ✭✭✭
    OK, that makes sense. I wasn't sure you could pass a function on the command line like that, but when I compared it against --cl-global-work values higher than 12000, it was more stable. When I set it to 12000 or higher, the displayed hash rate is much higher, but it goes up and down at random and sometimes shows '0'. I think 8192 to 11000 is the best range. I will have to do some testing.

    TY as usual.
  • dlehenky Member Posts: 2,249 ✭✭✭✭
    @happytreefriends Are you doing anything with '--cl-local-work'?