First, I just want to say this forum is by far the most informative and helpful I've come across. I feel bad that I've recently opened a few discussions asking for help without having had the time to give any help back to others, so from today I will try to be more active as well. I do, however, try to keep all my discussions updated with the latest resolution in case someone else with the same issue lands on them.
Before I dive into the problem, I want to make this thread as in-depth as possible, because from searching day and night on the net I've seen that a lot of people are having a similar issue and struggling to find a solution. Hopefully this will give me the best chance of finding a solution and helping others.
MSI Z170A Tomahawk Motherboard (Purchased New)
Intel Pentium G4400 Skylake LGA 1151 Processor (Purchased New)
4x 4GB (16GB) DDR4 DIMM RAM 2300MHz (Purchased Used)
3x R9 290 & 3x R9 290x GPUs (Purchased Used)
2x 1000W Corsair PSUs (2kW) (Purchased 1 New + 1 Used)
Currently running on Windows 10 Pro 64-Bit
Powered Risers (Latest Ver 008S)
Windows Update > Disabled
AMD Overdrive > Disabled
Windows Defender > Apps & Files > Disabled
Virtual Memory > Increased to 20,000MB (20GB)
Battery & Performance > High Performance (Never Sleep, Never Hibernate, Never Screen Off, PCIE Never Off)
Updated to latest MSI BIOS from MSI website > Yes (Successful, no errors)
PCIE Speed > Auto
DMI Link Speed > Auto
4G Decoding / Multi GPU Mining > Enabled
Primary Boot Display > PEG
VT-d > Disabled
HD Audio Controller > Disabled
Serial Port > Disabled
Parallel Port > Disabled
Power Management > Power On after AC Power Loss > Enabled
Boot Sequence > USB then SSD
I started with Simplemining.net (smOS) as my preferred OS. I could get 3 GPUs detected and mining perfectly, but after a short time (10min-5hrs) one GPU would stop hashing. It can be any GPU, on any slot, at any time. I then changed all my risers to the new Ver 008S risers and managed to get 5 GPUs detected, but the problem was still there: one random GPU would hang (any GPU, any slot, any time).
When a GPU stops hashing, the others continue for a while, but after a minute or two the miner reloads and the GPU starts hashing again. Then it will hang again, or another might hang. Sometimes 2 can hang and the third might drop from 26Mh/s to something like 15... 8... 1Mh/s, and then the whole system reboots. Sometimes it forces a reboot straight after the first GPU hangs as well. It's very random.
I have since tried HiveOS and Perfectmining.io, but all three give the same issue. I then assumed it was an Ubuntu issue, as some said changing to Windows resolved it for them. I also took the time to open all my GPUs and clean them: I cleaned the board, the fan, intercooler and heatsink, and I applied new thermal paste.
Yesterday I installed Windows 10 Pro, updated it and then switched off further updates. I installed the Adrenalin drivers from AMD (Latest AMDGPU Pro) and began hashing on all 5 GPUs using Claymore V10.2, but the same thing started to happen. I then downloaded Claymore V11.1 and, again, the same thing happened.
If I noticed that GPU2, for example, hung 3x in a row, I would shut down, disconnect GPU2 and then reboot, but then it would be another GPU that hangs. Eventually I'd be left with 1 GPU connected and even that one can still hang randomly.
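To make the "is it one card or all of them?" check a bit less of a gut feeling, here is a rough Python sketch of how the hangs could be tallied. Everything in it is an assumption for illustration: the event list is hypothetical and typed in by hand (it is not parsed from any miner log), and the 60% threshold and the function names are arbitrary choices of mine.

```python
from collections import Counter

def hang_share(events):
    """Return {gpu_index: fraction of all hangs} for a list of GPU indices that hung."""
    counts = Counter(events)
    total = sum(counts.values())
    return {gpu: n / total for gpu, n in sorted(counts.items())}

def single_gpu_suspect(events, threshold=0.6):
    """Return the GPU index behind at least `threshold` of the hangs, else None.

    The threshold is an arbitrary assumption, not a recognised rule.
    """
    if not events:
        return None
    gpu, n = Counter(events).most_common(1)[0]
    return gpu if n / len(events) >= threshold else None

# Hypothetical example data: the index of the GPU that hung, one entry per hang.
example_events = [2, 2, 2, 0, 4, 1, 3, 0]

print(hang_share(example_events))
print(single_gpu_suspect(example_events))
```

If no single GPU stands out, that fits what I'm seeing: the hangs follow the system rather than one particular card.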
This morning I uninstalled the latest AMD drivers using DDU and then installed the 15.12 drivers, as recommended on a lot of sites for older cards. I noticed that fan control conflicts between the miner, MSI Afterburner and AMD Overdrive, so I disabled AMD Overdrive and added -tt 1 to the miner config. This has resolved the fan control issues.
I tried again with Claymore V11.1 but it hung pretty much straight away with just 1 GPU connected. So I'm currently testing with Claymore V10.0, and with 1 GPU connected it has so far been running for about an hour (dual mining ETH + DCR). ETH = 42 shares so far and DCR = 40 shares so far, with 1 rejection on DCR and 0 on ETH. I'm thinking of letting this run overnight before considering connecting a 2nd GPU. It's currently 3:30pm here, so that would be a decent 15hr test I think, which would be the longest I've managed to get it to run lol.
Some people suggest clocking back or reflashing the GPU BIOS, but all my GPUs are showing STOCK OC on MSI. I did not OC any of them and I haven't messed with flashing any GPU BIOS. I also don't believe the previous owners would have flashed them, because from what I've read online there is no point doing that with the R9s.
I've been having this problem for about 5-6 weeks now and I don't know what else I can try. I checked my PSUs, and with 5 cards connected they are drawing 550W + 740W from the wall. That is well under the 1000W capacity of each, and in any case the problem happens with even 1-2 GPUs connected, so I don't think it's a PSU issue.
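For anyone who wants to check the PSU numbers, this is the arithmetic as a small Python sketch. The 550W and 740W wall readings and the 1000W rating are from my setup. The 90% efficiency figure is only an assumption (I have not measured it) and the function name is my own.

```python
def psu_load(wall_watts, rated_watts, efficiency=0.90):
    """Estimate the DC load on a PSU as a fraction of its rating.

    wall_watts is the draw measured at the wall; efficiency is an ASSUMED
    conversion efficiency, so treat the result as a rough estimate only.
    """
    dc_watts = wall_watts * efficiency
    return dc_watts / rated_watts

RATED_WATTS = 1000
readings = {"PSU 1": 550, "PSU 2": 740}  # wall draw in watts with 5 cards connected

for name, wall in readings.items():
    upper_bound = wall / RATED_WATTS          # as if every wall watt reached the cards
    estimate = psu_load(wall, RATED_WATTS)    # using the assumed 90% efficiency
    print(f"{name}: {wall} W at the wall, at most {upper_bound:.0%} of rating, "
          f"roughly {estimate:.0%} at the assumed efficiency")
```

Even taking the wall figure as the full load, that is 55% and 74% of each PSU's rating.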
Can someone please help me and guide me through how to find the problem and rectify it?