As some of you may know, I've been doing a bit of research into the effect of increasing DAG file size on hashrate. The reason is the dramatic drop in hashrate on the GTX750Ti on Windows as the DAG file size increased. So far, I have concluded that other modern Nvidia cards are relatively safe until the DAG size hits 2GB, but I don't have any data on AMD cards yet.
Today, I've ported my CUDA DAG simulator to OpenCL, but as I don't own any AMD cards, I still don't know what the results are. Attached is a (64-bit) Windows binary that shows the approximate ethminer hashrate for a given DAG size. I would be very grateful if some of you could run some tests for me.
[Big update to this post following a major update to the test program]
By default, the program generates a pseudo DAG file with a size matching your GPU RAM (or a maximum of 4GB) and tests bandwidth/hashrate in incremental steps, starting at 128MB and going up to the maximum possible:
dagSimCL.exe
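To make the default sweep concrete, here is a minimal Python sketch of the DAG sizes such a run would visit. The 128MB step is an assumption read off the results posted further down in this thread, and the function name is hypothetical; this is not code from dagSimCL, and the real program may stop earlier if an allocation fails.

```python
def sweep_sizes_mb(gpu_ram_mb, step_mb=128, cap_mb=4096):
    """Return the DAG sizes (in MB) a default sweep would visit.

    Assumptions (not taken from the dagSimCL source): the step equals
    the 128MB starting size, and the upper limit is the smaller of the
    GPU RAM and the 4GB cap mentioned in the post.
    """
    limit = min(gpu_ram_mb, cap_mb)
    return list(range(step_mb, limit + 1, step_mb))


sizes = sweep_sizes_mb(2048)
print(sizes[0], sizes[-1], len(sizes))
```

For a 2GB card this gives 16 test points, from 128MB to 2048MB.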
The results will be written to a tab-delimited csv file called results.csv.
Using a spreadsheet program like Excel, you can easily convert this to a graph like this one for my GTX780:
x-axis is DAG size in MB, y-axis is hashrate in MH/s. Because the test uses a simplified dagger loop without the SHA3/Keccak stages, you may see slightly better results than with the real ethminer.
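The two measured quantities are tied together by the Ethash access pattern: each hash performs 64 DAG lookups of 128 bytes, so 8192 bytes of DAG traffic per hash. The sketch below converts a bandwidth figure into a hashrate on that basis. Treating the reported bandwidth as binary gigabytes (2^30 bytes) is an assumption inferred from the numbers posted in this thread, not something confirmed from the program's source.

```python
ACCESSES_PER_HASH = 64   # Ethash DAG lookups per hash
BYTES_PER_ACCESS = 128   # Ethash mix width in bytes
BYTES_PER_HASH = ACCESSES_PER_HASH * BYTES_PER_ACCESS  # 8192


def hashrate_mhs(bandwidth_gib_per_s):
    """Convert DAG read bandwidth into an approximate hashrate in MH/s.

    Assumption: the bandwidth is given in units of 2**30 bytes per second.
    """
    bytes_per_s = bandwidth_gib_per_s * 2**30
    return bytes_per_s / BYTES_PER_HASH / 1e6


# The 128MB row posted later in this thread reports 166.728 and 21.8533.
print(round(hashrate_mhs(166.728), 2))
```

With that assumption, the 128MB row from the results in this thread (166.728 bandwidth, 21.8533 MH/s) is reproduced to within rounding.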
If you have less system RAM than your GPU RAM, you can set the maximum using the 1st command line parameter:
dagSimCL.exe 2048
This will test DAG sizes up to 2048MB (= 2GB).
I think most AMD cards will be fine, but you never know. If you have more than one GPU in your system, you can use the 2nd command line param to select a different card:
dagSimCL.exe 4096 1
This will measure on the second card (the first card is 0). Because the command line switches are not very clever, you will have to set the first one to your GPU RAM or higher, as shown here.
If you have problems with multiple OpenCL platforms installed, you may add a third command line argument to select the right platform:
dagSimCL.exe 4096 0 1
This will measure on the first card on the second platform.
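Because the parameters are purely positional, a later one can only be given if all earlier ones are given too. A small Python sketch of that rule, for anyone scripting test runs: the helper name build_command is hypothetical, and it only covers the parameter order described in this post (maximum DAG size, card index, platform index).

```python
def build_command(max_dag_mb=None, device=None, platform=None,
                  exe="dagSimCL.exe"):
    """Build the argument list for a test run.

    The parameters are positional, so an earlier one must be supplied
    whenever a later one is used.
    """
    values = [max_dag_mb, device, platform]
    # Index of the last parameter the caller actually supplied.
    last = max((i for i, v in enumerate(values) if v is not None),
               default=-1)
    args = [exe]
    for i in range(last + 1):
        if values[i] is None:
            raise ValueError(
                "parameter %d must be set when a later one is used" % (i + 1))
        args.append(str(values[i]))
    return args


print(build_command(4096, 0, 1))
```

The resulting list can be passed to subprocess.run on a machine that has the binary.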
Source code and most recent binaries here:
https://github.com/Genoil/dagSimCL
I haven't tested on Linux, and there is no Linux makefile available yet.
Most recent win-64 binaries also in the attached ZIP.
Please post your results with the results file attached. Thank you!
Comments
http://gathering.tweakers.net/forum/list_message/45343419 (in Dutch, but the graphs speak for themselves)
BTW the hashrates displayed there are far above normal ethash, at least for the Radeons. That's likely because of the missing SHA3/Keccak stages, which put enormous register pressure on the kernel (== fewer wavefronts in flight).
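To illustrate the register pressure argument: on AMD's GCN architecture a SIMD can keep at most 10 wavefronts resident, and they share a register file of 256 vector registers (VGPRs) per lane. The rough Python sketch below shows how the number of wavefronts in flight falls as a kernel uses more registers. The register counts in the example are hypothetical, not measured from ethminer or dagSimCL, and the model ignores allocation granularity and other limits such as LDS and scalar registers.

```python
GCN_VGPRS_PER_LANE = 256      # vector registers available per SIMD lane
GCN_MAX_WAVES_PER_SIMD = 10   # hardware limit on resident wavefronts


def waves_per_simd(vgprs_per_work_item):
    """Rough upper bound on wavefronts in flight per GCN SIMD."""
    if not 1 <= vgprs_per_work_item <= GCN_VGPRS_PER_LANE:
        raise ValueError("VGPR count must be between 1 and 256")
    return min(GCN_MAX_WAVES_PER_SIMD,
               GCN_VGPRS_PER_LANE // vgprs_per_work_item)


# Hypothetical register counts, for illustration only.
for vgprs in (24, 64, 128):
    print(vgprs, waves_per_simd(vgprs))
```

Fewer resident wavefronts means less latency hiding for the DAG reads, which is one plausible reason a kernel without the Keccak stages reports higher numbers than real mining.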
set GPU_MAX_HEAP_SIZE=100
added. Is that in the bat file you used? Strangely enough, I got some results from people with 79x0 devices on Win10 that got to 2048MB without issues. Still not enough, but much better than 1280.
7950 Boost on Windows 7 here. I'd like to help, but I get a crash (see here). Am I doing something wrong? Edit: it happens every time at 1408MB. And is the idea that we run "set GPU_MAX_ALLOC_PERCENT=100" in an administrator cmd prompt before we start dagSimCL.exe?
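A variable set with "set" in a cmd prompt only applies to programs started from that same prompt. A minimal Python sketch of the equivalent, passing both variables named in this thread to a child process: the helper name amd_opencl_env is hypothetical, and whether these settings actually avoid the crash is not established here.

```python
import os


def amd_opencl_env(base=None, heap_percent=100, alloc_percent=100):
    """Return a copy of the environment with the two AMD OpenCL
    variables mentioned in this thread set to the given percentages."""
    env = dict(os.environ if base is None else base)
    env["GPU_MAX_HEAP_SIZE"] = str(heap_percent)
    env["GPU_MAX_ALLOC_PERCENT"] = str(alloc_percent)
    return env


env = amd_opencl_env(base={})
print(env)

# On a machine that has the binary, the run could then be started with:
#   import subprocess
#   subprocess.run(["dagSimCL.exe"], env=amd_opencl_env())
```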
I almost have a "chunked" version of this test working on GTX780, but I'm having some difficulties allocating up to the full 3GB.
I just finished a version of the test that splits up the allocation in 256MB chunks.
-- edit-- temporarily removed the 256MB chunks version. It contains a bug.
-- edit2--
Attached binary has support for chunks with different sizes. The chunk size has become the first cmd line param. I've found 256MB to be the best for GTX780 (although CUDA OpenCL prefers no chunking at all).
Yes, I'm talking to this guy as well. Currently trying to get a version up that allocates GPU RAM in chunks.
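For readers wondering what chunking means in practice, here is a minimal Python sketch of the bookkeeping: splitting one DAG into fixed-size buffers and mapping a byte offset to a chunk. This only illustrates the idea and is not the author's implementation; the function names are hypothetical.

```python
MB = 1024 * 1024


def chunk_sizes(total_bytes, chunk_bytes):
    """Split a DAG of total_bytes into full chunks plus one remainder."""
    full, rest = divmod(total_bytes, chunk_bytes)
    return [chunk_bytes] * full + ([rest] if rest else [])


def locate(offset, chunk_bytes):
    """Map a byte offset in the whole DAG to (chunk index, offset in chunk)."""
    return divmod(offset, chunk_bytes)


# A 1152MB DAG in 256MB chunks: four full chunks and one of 128MB.
print([size // MB for size in chunk_sizes(1152 * MB, 256 * MB)])
print(locate(300 * MB, 256 * MB))
```

Each lookup then needs an extra step to pick the right buffer, which may be part of why chunking helps on some drivers and costs performance on others.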
-
Much appreciated.
Also, I posted results here:
https://bitcointalk.org/index.php?topic=1268355.msg13110655
because codetags.
There is no drop until 2688MB.
DAG size (MB)   Bandwidth (GB/s)   Hashrate (MH/s)
128             166.728            21.8533
256             165.537            21.6973
384             110.105            14.4317
512             164.8              21.6006
640             229.946            30.1394
768             120.987            15.858
896             362.78             47.5503
1024            138.054            18.0951
1152            232.697            30.5001
1280            139.576            18.2945
1408            166.527            21.827
1536            138.565            18.1621
1664            154.867            20.2988
1792            127.143            16.6649
1920            146.892            19.2535
2048            141.562            18.5548
2176            151.261            19.826
2304            162.762            21.3336
2432            192.039            25.1709
2560            184.887            24.2335
2688            29.7203            3.8955
2816            21.8517            2.86414
2944            9.08344            1.19059
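Since the readings are noisy, it can help to locate the collapse programmatically instead of by eye. The Python sketch below reads tab-delimited rows and reports the first DAG size whose hashrate falls below half the median of all earlier rows. The three-column layout is assumed from the table above, and both the 0.5 threshold and the function names are arbitrary choices made for illustration.

```python
import csv
import io
from statistics import median


def read_results(handle):
    """Parse tab-delimited rows of (DAG size MB, bandwidth, hashrate).

    Header lines and malformed rows are skipped.
    """
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        try:
            size, bandwidth, hashrate = (
                float(f.replace(",", "")) for f in fields[:3])
        except ValueError:
            continue  # header, empty line or non-numeric row
        rows.append((int(size), bandwidth, hashrate))
    return rows


def first_collapse(rows, fraction=0.5):
    """Return the first DAG size whose hashrate is below `fraction`
    of the median hashrate of all earlier rows, or None."""
    for i in range(1, len(rows)):
        baseline = median(r[2] for r in rows[:i])
        if rows[i][2] < fraction * baseline:
            return rows[i][0]
    return None


# A few rows copied from the table above.
sample = "\n".join([
    "DAG size (MB)\tBandwidth (GB/s)\tHashrate (MH/s)",
    "2432\t192.039\t25.1709",
    "2560\t184.887\t24.2335",
    "2688\t29.7203\t3.8955",
    "2816\t21.8517\t2.86414",
])
print(first_collapse(read_results(io.StringIO(sample))))  # 2688
```

To run it on a real file, pass open("results.csv", newline="") to read_results.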
The hash rate does not drop monotonically.
When I say drop, I mean it starts at 49MH/s and drops to 46-47MH/s.
256MB chunk
=> Can run up to a 2944MB DAG size, but the results are unstable and not trustworthy; the hashrate seems to drop from a 2688MB DAG size onward.
================================================
512MB chunk
=> Can run up to a 2816MB DAG size; the results look good from 128 to 512MB DAG and are unstable for the rest.
================================================
768MB chunk
=> Can run up to a 2560MB DAG size; the results look good from 128 to 768MB DAG and are unstable for the rest.
================================================
1024MB chunk
=> Can run up to 1152MB, the results look fine except the last one.
================================================
After some tests, I see that DAG size is a problem. I tried the ETH benchmark (ethminer -G -M) and got 19.5MH/s; the DAG size in this test is 1024MB. But in real mining (1200MB DAG size) the speed drops to 17.4MH/s. Do you know a solution for this issue?
@davidrentao yes, same issue as with the GTX750Ti. TLB thrashing.
https://gist.github.com/anonymous/83237aa42d4292f13ceb
It adds a makefile for Unix systems and fixes two includes. This way I was able to compile on Linux. I have AMD cards, an R9 390X (8GB) and an R9 270X (2GB), available for testing. If you are interested, I will let you know.