PCI_E1 Slot Causes GPU to overheat?

Rigboy786 Member Posts: 45
PCI_E1 is registered as GPU4 on smOS, and whichever GPU I place in there seems to get hotter than the others and eventually stops hashing. After a few minutes of getting 0.00 MH/s on that GPU, the system reboots and works as normal... then it stops hashing again and reboots. Sometimes it works for 10 minutes, sometimes 10 hours.

I've opened all my GPUs and reapplied the thermal paste, all ports are clear and I'm using the latest 008s risers. My room temp is 10°C right now but it's still happening. Any ideas?

Comments

  • cidmo Member Posts: 446 ✭✭✭
    load up a win2go flash stick or another OS and see if it happens with a different OS
    if it doesnt, then its prolly smOS settings or something

    there is a bit of info u didnt give just in case:
    swapped risers around as well as the GPU? everything cable, card, and 1x pcie
    possibly swapped pcie cables and molex from psu too?
  • Rigboy786 Member Posts: 45
    Thanks Cidmo for responding.
    I swapped risers, 1x PCIe and cables. I didn't swap molex from the PSU because one GPU is powered by one PSU and the other is powered by another PSU, so I ruled out a power supply issue.

    One thing I've seen online is everyone saying you must allocate 16 GB of virtual memory for mining. In Ubuntu that would be 16 GB of 'swap', and I've just checked: my swap allocation is 0 GB. Do you think it could be this?
  • peshetoman Member Posts: 78
    yes, it could be this. allocate at least 20gb of virtual memory
  • cidmo Member Posts: 446 ✭✭✭
    edited February 2018
    u do need some swap for linux but its not important to have 16GB for mining
    swap in linux is almost completely different than page file
    this is not the problem for heat tho
    all swap will do is possibly give u more hashrate if for some reason ur running out of physical memory
    i would assume at this point its smOS
    as there is not much a slot can do by itself to overheat a gpu on a riser
    is there a way to change drivers on smOS?
    like maybe even for that specific slot?
  • Rigboy786 Member Posts: 45
    I was way off... the problem is not tied to one slot. I've swapped slots, cards, risers and cables, and it still occurs. Which GPU/slot gets affected is actually random.

    It's not heat related either. I had the room temp at 10°C and left the cards off overnight, switched on in the morning, and it happened within 10 minutes of starting to hash, even though temps were under 78°C and these cards can run a lot hotter.

    I've read online that it's due to a 'memory leak' issue, but I still haven't found a way of resolving it :(
  • peshetoman Member Posts: 78
    set the virtual memory at 20gb and max to 40gb. see if that helps you. I have had the same problem with some of my cards. since the memory increase i havent seen it. Also set -r in claymore and create reboot.bat that restarts your pc. Set the claymore bat in the startup
  • cidmo Member Posts: 446 ✭✭✭
    edited February 2018
    which miner does it run?
    if claymore check the logs for 511C temp error
    otherwise ur mem OC might be too high
    try to dial that back a lil and see if u gain stability

    force rebooting ur rig with -r is just a bandaid and doesnt solve any problems
    and could potentially allow a bad problem to become worse
    all 10 of my rigs run for 300+ hours before i manually restart them
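
For anyone wanting to confirm the swap figure discussed in the thread on a Linux rig, here is a minimal sketch that reads `/proc/meminfo` and reports how much swap is configured and in use. The function names `parse_meminfo` and `swap_summary` are made up for illustration, and the script assumes a standard Linux `/proc` filesystem; whether smOS exposes a shell to run it is not something the thread confirms.

```python
from pathlib import Path


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values


def swap_summary(text: str) -> str:
    """Return a one-line description of configured and used swap."""
    info = parse_meminfo(text)
    total_kb = info.get("SwapTotal", 0)
    free_kb = info.get("SwapFree", 0)
    if total_kb == 0:
        return "No swap configured"
    used_kb = total_kb - free_kb
    kb_per_gib = 1024 ** 2
    return (
        f"Swap: {total_kb / kb_per_gib:.1f} GiB total, "
        f"{used_kb / kb_per_gib:.1f} GiB used"
    )


if __name__ == "__main__":
    meminfo = Path("/proc/meminfo")
    if meminfo.exists():
        print(swap_summary(meminfo.read_text()))
    else:
        print("/proc/meminfo not found (not a Linux system?)")
```

If the used figure stays near zero while the rig is hashing, the machine is not leaning on swap, which would fit the point made above that swap size is unlikely to explain a GPU dropping to 0.00 MH/s.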