CUDA miner


Comments

  • skunkskunk Member Posts: 13
    edited August 2015
    works fine on a 6x750ti rig too...
    cuda: 52Mhs, opencl: 46Mhs
  • cryptletcryptlet Member Posts: 29
    edited August 2015
    @Genoil Like @skunk says, it builds after removing -Werror. Built on ubuntu with cuda 7.0.
    Results for my 2x 750Ti.

    (Benchmark with 1x 750Ti )
    ./ethminer -M -U
    min/mean/max: 8825514/8877943/8912896 H/s
    inner mean: 2970965 H/s

    (Benchmark with 1x 750Ti )
    ./ethminer -M -U --cuda-turbo
    min/mean/max: 8825514/8860466/8912896 H/s
    inner mean: 2970965 H/s

    (Benchmark with 2x 750Ti )
    ./ethminer -M -G
    min/mean/max: 16340309/16410213/16427690 H/s
    inner mean: 5475896 H/s

    In actual mining my 2x 750Ti get about 17.7+ MH/s total.

    Edit: Just noticed this ethminer build is using 100% of 1 core on my quad cpu.
  • skunkskunk Member Posts: 13
    cryptlet said:


    Edit: Just noticed this ethminer build is using 100% of 1 core on my quad cpu.

    it'll stop using cpu time once finished with the dag files...
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    Oh this is interesting.

    I optimized the keccak hash in the opencl kernel with some CUDA PTX assembly (borrowed from sp_ 's ccminer) and now the opencl kernel is as fast as the CUDA kernel.

    opencl: ethminer.exe -M -G --cl-global-work 16384
    cuda: ethminer.exe -M -U --cuda-grid-size 8192 --cuda-block-size 128 [--cuda-turbo]

    Compute 3.5 or higher card required.

    Source @ https://github.com/Genoil/cpp-ethereum/tree/cuda-opencl-ptx

    console output should show: Using Inline PTX

    also interested in any performance impact on AMD cards. The fallback function may be slower than what was originally in there.
  • skunkskunk Member Posts: 13
    just one thing:
    [CUDA]:Found suitable CUDA device [GeForce GTX 960] with 2147287040 bytes of GPU memory
    miner 19:38:17|ethminer Getting work package...
    ✘ 19:38:27|ethminer Failed to submit hashrate.
    ✘ 19:38:27|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 28 -> Operation timed out
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:38:39|ethminer Getting work package...
    ✘ 19:38:48|ethminer Failed to submit hashrate.
    ✘ 19:38:48|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 56
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:00|ethminer Getting work package...
    ✘ 19:39:03|ethminer Failed to submit hashrate.
    ✘ 19:39:03|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:05|ethminer Getting work package...
    ✘ 19:39:05|ethminer Failed to submit hashrate.
    ✘ 19:39:05|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:08|ethminer Getting work package...
    ✘ 19:39:08|ethminer Failed to submit hashrate.
    ✘ 19:39:08|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:10|ethminer Getting work package...
    ✘ 19:39:10|ethminer Failed to submit hashrate.
    ✘ 19:39:10|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:12|ethminer Getting work package...
    ✘ 19:39:12|ethminer Failed to submit hashrate.
    ✘ 19:39:12|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:15|ethminer Getting work package...
    ✘ 19:39:15|ethminer Failed to submit hashrate.
    ✘ 19:39:15|ethminer Dynamic exception type: N7jsonrpc16JsonRpcExceptionE
    std::exception::what: Exception -32003 : Client connector error: libcurl error: 7 -> Could not connect to http://eth1.nanopool.org:8888/0x873517202d47c9dd023ee89fa2e072dcc76ea2c4
    JSON-RPC problem. Probably couldn't connect. Retrying in 1...
    miner 19:39:17|ethminer Getting work package...
    miner 19:39:17|ethminer Grabbing DAG for #f2e59013…
    miner 19:39:18|ethminer Got work package:
    ℹ 19:39:18|ethminer Loading full DAG of seedhash: #582b0644…
    miner 19:39:18|ethminer Header-hash: daf66715f3b8793d514a9ed1c76f68f681a6f8e19d2202626c707360000a986b
    miner 19:39:18|ethminer Seedhash: f2e59013a0a379837166b59f871b20a8a0d101d1c355ea85d35329360e69c000
    miner 19:39:18|ethminer Target: 00000000dbe6fecebdedd5beb573440e5a884d1b2fbf06fcce912adcb8d8422e
    ℹ 19:39:18|cudaminer0 workLoop 0 #00000000… #f2e59013…
    ℹ 19:39:18|cudaminer0 Initialising miner...
    miner 19:39:18|ethminer Mining on PoWhash #daf66715… : 0 H/s = 0 hashes / 0.5 s
    ℹ 19:39:18|ethminer Full DAG loaded
    Using device: GeForce GTX 960(5.2)
    nothing serious, as from there the miner starts to hash normally...
    my guess is that the pool just doesn't understand the miner's new hashrate submission feature, but maybe a --legacy or a --dont-submit-hashrate switch wouldn't hurt...
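
For anyone decoding a log like the one in this post: the numbers after `libcurl error:` are libcurl's CURLcode values (7 is CURLE_COULDNT_CONNECT, 28 is CURLE_OPERATION_TIMEDOUT, 56 is CURLE_RECV_ERROR). Below is a small Python sketch that tallies them from pasted log text. `count_curl_errors` is a hypothetical helper, not part of ethminer, and the sample string is abridged from the log.

```python
import re
from collections import Counter

# CURLcode names for the three codes that appear in the log.
CURL_CODES = {
    7: "CURLE_COULDNT_CONNECT",
    28: "CURLE_OPERATION_TIMEDOUT",
    56: "CURLE_RECV_ERROR",
}

def count_curl_errors(log_text):
    """Count how often each libcurl error code occurs in the log text."""
    codes = re.findall(r"libcurl error: (\d+)", log_text)
    return Counter(int(code) for code in codes)

# Abridged sample: one line per distinct error seen in the log.
sample = (
    "Client connector error: libcurl error: 28 -> Operation timed out\n"
    "Client connector error: libcurl error: 56\n"
    "Client connector error: libcurl error: 7 -> Could not connect\n"
)

for code, hits in sorted(count_curl_errors(sample).items()):
    print(f"{code} ({CURL_CODES.get(code, 'unknown')}): {hits}")
```

All three codes point at the network connection to the pool rather than at the GPU side of the miner.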
  • BojchaBojcha Member Posts: 7
    edited August 2015
    That is because of DDoS protection on the pool. It should make just a few retries at start tho.
    --
  • OzdiggerOzdigger Member Posts: 25
    Genoil said:

    Oh this is interesting.

    I optimized the keccak hash in the opencl kernel with some CUDA PTX assembly (borrowed from sp_ 's ccminer) and now the opencl kernel is as fast as the CUDA kernel.

    opencl: ethminer.exe -M -G --cl-global-work 16384
    cuda: ethminer.exe -M -U --cuda-grid-size 8192 --cuda-block-size 128 [--cuda-turbo]

    Compute 3.5 or higher card required.

    Source @ https://github.com/Genoil/cpp-ethereum/tree/cuda-opencl-ptx

    console output should show: Using Inline PTX

    also interested in any performance impact on AMD cards. The fallback function may be slower than what was originally in there.

    I tried it on my 3 x 280x, dropped 10mh.
  • ssstandssstand Member Posts: 30
    edited August 2015
    In the end I switched to Linux. I couldn't get the miner to build, but it compiled
    easily on Linux..

    i would like to share the test:

    OS: Ubuntu 15.04

    Benchmarking on platform: { "platform": "CUDA 7.0", "device": "GeForce GTX 750 Ti", "version": "Compute 5.0" }
    Preparing DAG...

    Using device: GeForce GTX 750 Ti(5.0)
    Trial 1... 8563370
    Trial 2... 8475989
    Trial 3... 8475989
    Trial 4... 8388608
    Trial 5... 8388608
    min/mean/max: 8388608/8458512/8563370 H/s
    inner mean: 8505116 H/s

    The benchmark compares fairly well to real mining, where I got a similar result (8.5 MH/s)


    Starting to mine with the 750 Ti now; I will donate to the developer when I accumulate some coins.
    Thank you so much for the effort!
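
As a cross-check of the summary line in this benchmark, the five trial values can be reduced by hand. This is a minimal Python sketch, not ethminer's own code: `summarize` is a hypothetical helper, and integer division is an assumption chosen because it reproduces the printed mean. The printed inner mean (8505116) is not the average of the middle three values after sorting; it happens to equal the integer mean of the first three trials as listed, which the sketch notes as an observation only.

```python
def summarize(trials):
    """Return (min, mean, max) with an integer mean, like the summary line."""
    return min(trials), sum(trials) // len(trials), max(trials)

# Trial values copied from the benchmark output in this post (H/s).
trials = [8563370, 8475989, 8475989, 8388608, 8388608]

lo, mean, hi = summarize(trials)
print(f"min/mean/max: {lo}/{mean}/{hi} H/s")

# Observation only: the first three trials average to the printed inner mean.
first_three_mean = sum(trials[:3]) // 3
print(f"mean of first three trials: {first_three_mean} H/s")
```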

  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @cryptlet thanks! @ssstand good to see it's working well for you. meanwhile i've filed a bug with NVidia about the Win8/750ti issue, hopefully they'll find the cause..
  • antho281antho281 0x76a5e9033548cd4c92a2891a2e13ec019c9b7384Member Posts: 4
    Gigabyte GTX 960 G1 Gaming 2GB :

    min/mean/max: 10660522/10660522/10660522 H/s
    inner mean: 10660522 H/s

    Gigabyte GTX 750 Ti :
    min/mean/max: 8643581/8643581/8643581 H/s
    inner mean: 8643581 H/s

    Since I can't launch the new version of your ethminer without an error (Error Cuda Mining : Invalid Argument or OpenCL error), I'm using the first version with SP_

    That's great work Genoil!

  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @antho281 which binary is that? And what launch params did you use?
  • antho281antho281 0x76a5e9033548cd4c92a2891a2e13ec019c9b7384Member Posts: 4
    I use cudaminer-sp binary
    and this is my command line for the moment : ./ethminer -U -F http://blabla --gpu-devices 0 1 2 3 4 5
    Looking for params atm haha, but it looks like --gpu-devices and -t have the same effect..
    When I use --gpu-devices 0 or --gpu-devices 1, the same GPU launches
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @antho281 yes that is broken in sp binary. What binary exits with "Error Cuda Mining : Invalid Argument or OpenCL error"?
  • antho281antho281 0x76a5e9033548cd4c92a2891a2e13ec019c9b7384Member Posts: 4
    edited August 2015
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    edited August 2015
    @antho281 with what parameters did you launch it?

    Nevertheless this is more of an experiment. I've got some good speedup of opencl on NVidia, but the CUDA kernel stays on top:

    GTX780, using my new mining simulation mode: (-S flag)

    opencl: ethminer.exe -M -G --cl-global-work 16384
    cuda: ethminer.exe -M -U --cuda-grid-size 8192 --cuda-block-size 128 [--cuda-turbo]

    opencl default: 14.8MH
    opencl optimized keccak for nvidia: 16.6MH
    opencl optimized keccak with ptx assembly: 16.9MH
    cuda default: 17.3MH
    cuda turbo: 18MH
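
To put the GTX780 numbers above in proportion, here is a small Python sketch that expresses each result as a gain over the default OpenCL kernel. The figures are copied from the list above; `gain_percent` is just an illustrative helper.

```python
# Hashrates in MH/s, copied from the GTX780 results above.
results = {
    "opencl default": 14.8,
    "opencl optimized keccak for nvidia": 16.6,
    "opencl optimized keccak with ptx assembly": 16.9,
    "cuda default": 17.3,
    "cuda turbo": 18.0,
}

def gain_percent(rate, baseline):
    """Relative gain of `rate` over `baseline`, in percent."""
    return (rate / baseline - 1.0) * 100.0

baseline = results["opencl default"]
for name, rate in results.items():
    print(f"{name}: {rate} MH/s ({gain_percent(rate, baseline):+.1f}%)")
```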




  • antho281antho281 0x76a5e9033548cd4c92a2891a2e13ec019c9b7384Member Posts: 4
    edited August 2015
    @Genoil I copied that line : cuda: ./ethminer -M -U --cuda-grid-size 8192 --cuda-block-size 128
    And then CUDA failed, benchmark hashrate = 0

    Note : I'm using Ubuntu 14.04
  • sot173sot173 Member Posts: 10
    Are there any binaries for the cuda enabled ethminer that work on OSX 10.9.5?
    Or perhaps someone would be so kind as to compile one?

    Thank you in advance.

  • greggreg Member Posts: 14
    edited August 2015
    @Genoil I need your help here, to understand how ethminer parameters work and correlate with each other.

    In "vanilla" ethminer we have such parameters as --cl-local-work (def 64) and --cl-global-work (def 4096). Some people use local 128 and global 8192 for performance boost - and that works.
    I thought that global work had something to do with the GPU's shader (processor) count (Radeon 7970/280X have 2048 shaders, so x4 gives 8192), so I tweaked that parameter for the 7950 (1792 shaders x5 = 8960 global work) and the R9 270 (1280 shaders; tried global work 5120, 8960 and 10240). Results were roughly the same as with 8192, but I haven't tested much. So, the first question is: does global work have something to do with the GPU's shader count, or must it just be a power of 2 and that's all?

    Now to CUDAs. Using some version of your CUDA miner in OpenCL mode (miner downloaded from cryptomining blog) and adding --gpu-batch-size 20 to launch bat file definitely increases performance on Radeons, roughly to the same level as with local 128 + global 8192 in vanilla ethminer, maybe a little (very little) above that. But in that version of miner we have no --cl-local-work / --cl-global-work parameters!
    So the second question is: how gpu-batch-size correlates with local/global worksizes, and is there a possibility to use all these parameters together, or they are mutually exclusive and have basically the same effect?

    And for the latest versions:
    "cuda-frontier" from 26 Aug. - speed isn't displayed in the command line, only found shares. --verbosity didn't help, and I don't know another way to enable showing the speed in the console (real speed, not benchmarking)
    "inline-ptx" - gave it a try, got a drop from 96 MH to 80 MH on Radeons, so smth is not right there. Btw, on start it shows something about "keccak round1 declared but not referenced".
    So, I returned to the version with gpu-batch-size.
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @greg global work has nothing to do with shader count. In vanilla ethminer and CUDA-frontier, --cl-global-work is a multiplier for --cl-local-work. The result is the total amount of runs of the ethash algorithm in one batch. After one batch, any possible solutions are verified and then the next batch is launched. Higher batch sizes give higher hash rate because there's less kernel launch overhead, but at some point it doesn't gain anything and can only become a problem. You may have a solution but are late in submitting it because your kernel is still running.

    In my fork I created the --gpu-batch-size flag because at that point in time, vanilla ethminer didn't have the flag yet. Instead of a multiplier of local work size, it is the actual batch size (as a power of two, to make it both easy and most efficient). Simply stated, GPUs like powers of 2.

    I'm not sure yet about the performance drop on AMD with the new cudaminer-frontier and ptx branches. As these are based on 0.9.40 instead of 0.9.23, I think it has nothing to do with my work, the drop was already in there.

    If I can get my hands on a cheap R7 or R9 card, I could dive deeper into the cause. For now my focus stays with NVidia.
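
To make the arithmetic concrete, here is a minimal Python sketch of the two ways of specifying the batch size described in this post. The function names are hypothetical, and the sketch only mirrors the description above (multiplier times local work size, versus a power-of-two exponent); it is not code from ethminer.

```python
def batch_from_work_sizes(local_work, global_work_multiplier):
    """Vanilla style: --cl-global-work acts as a multiplier for --cl-local-work."""
    return local_work * global_work_multiplier

def batch_from_exponent(gpu_batch_size):
    """Fork style: --gpu-batch-size n is read as an exponent, so the batch is 2**n."""
    return 2 ** gpu_batch_size

# Defaults quoted earlier in the thread: local 64, global 4096.
print(batch_from_work_sizes(64, 4096))

# The commonly used tuning: local 128, global 8192.
print(batch_from_work_sizes(128, 8192))

# The fork's --gpu-batch-size 20.
print(batch_from_exponent(20))
```

Under this reading, local 128 with global 8192 and --gpu-batch-size 20 describe the same batch of 2^20 hashes.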
  • greggreg Member Posts: 14
    @Genoil thanks a lot for clarifying! So, batch-size 20 is exactly the same as local 128 * global 8192, and there is no connection with shaders... when there will be some parameter to tweak for shader count - that will be a big day for many miners!
    Genoil said:

    I'm not sure yet about the performance drop on AMD with the new cudaminer-frontier and ptx branches. As these are based on 0.9.40 instead of 0.9.23, I think it has nothing to do with my work, the drop was already in there.

    No-no, ethminer 0.9.40 is completely OK in terms of speed, and cudaminer-frontier is OK too, I suppose. It just doesn't show speed in console during mining - for the purpose of clean output, I think. There are only results (found solution, submitted and accepted, not accepted and so on), and no lines with speed. It's just me - I like to see numbers running and don't know how to enable them in that version of cudaminer :) there must be some simple parameter, like --log or so, but there is nothing in --help about that.
    As for the drop, it happens only in "inline-ptx" version with (in your words) "CUDA's PTX assembly borrowed from ccminer". Either this assembly is slower by itself or it requires some change in launch parameters to work properly.
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @greg on AMD the PTX assembly is not loaded, so it shouldn't make a difference. That's why I thought 0.9.40 was already slower than 0.9.23.
  • greggreg Member Posts: 14
    Genoil said:

    @greg on AMD the PTX assembly is not loaded, so it shouldn't make a difference. That's why I thought 0.9.40 was already slower than 0.9.23.

    Genoil said:

    Oh this is interesting.

    I optimized the keccak hash in the opencl kernel with some CUDA PTX assembly (borrowed from sp_ 's ccminer) and now the opencl kernel is as fast as the CUDA kernel.
    ...
    also interested in any performance impact on AMD cards. The fallback function may be slower than what was originally in there.

    So why ask about performance impact on AMD cards, if it's not loaded? :)
    Anyway, this is thread about nVidias, so we are going a little offtopic here.
  • GenoilGenoil 0xeb9310b185455f863f526dab3d245809f6854b4dMember Posts: 769 ✭✭✭
    @greg Well because the code had to be reconstructed a little bit in order to make the PTX mod (which is dynamically applied) possible.
  • trotoltrotol Member Posts: 102
    could you, please, implement eth_submithashrate function?
  • cryptletcryptlet Member Posts: 29
    @trotol The ethminer which reports hashrate to pool is on page 11 somewhere in the middle of page. See link below.

    https://forum.ethereum.org/discussion/comment/13618/#Comment_13618
  • trotoltrotol Member Posts: 102
    cryptlet said:

    @trotol The ethminer which reports hashrate to pool is on page 11 somewhere in the middle of page. See link below.

    https://forum.ethereum.org/discussion/comment/13618/#Comment_13618

    Thx a lot, but this pack doesn't print the current speed in the console window :-(
    anyway, the pool can calculate my hashing speed now.
  • RichieRichie United KingdomMember Posts: 10
    Dear Genoil,

    I am stuck with your solution's configurations on ethminer-cuda/ethereumpool-cuda.bat slightly.

    I am running on Dual-SLI Nvidia GeForce GT 755M laptop.
    My current configuration is like this: ethminer -U -F http://ethereumpool.co/[email protected]...
    I have set miner=8 because my one GPU generates around 3.7Mhash/s

    But the problem is that with these settings I can launch only 1 GPU, but not 2 GPUs.
    Help me please to configure this device properly, much appreciated.

    Best regards,
    Richie
  • PhantomPhantom Member Posts: 46
    Hey guys, use this to enable OC and fan controls for GTX 750 ti in Ubuntu with newest driver.

    sudo nvidia-xconfig -a --cool-bits=28 --allow-empty-initial-configuration (https://bitcointalk.org/index.php?topic=826901.msg12279696#msg12279696)
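
The value 28 is a bit mask. As a rough sketch, assuming the bit meanings given in NVIDIA's Linux driver README for the Coolbits option, it decodes as shown below. The `COOLBITS` table and `decode_coolbits` are illustrative names, not part of any NVIDIA tool, and the descriptions are paraphrased; check the README for your driver version.

```python
# Bit meanings paraphrased from the NVIDIA Linux driver README (assumption:
# they apply unchanged to the driver version in use).
COOLBITS = {
    1: "clock frequency page (older GPUs)",
    2: "SLI with mismatched video memory",
    4: "manual fan speed control",
    8: "clock offset (overclocking) controls",
    16: "overvoltage control",
}

def decode_coolbits(value):
    """List the features whose bit is set in a Coolbits value."""
    return [desc for bit, desc in COOLBITS.items() if value & bit]

for feature in decode_coolbits(28):
    print(feature)
```

So 28 = 4 + 8 + 16, which matches the stated goal of enabling both fan control and overclocking.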
  • PhantomPhantom Member Posts: 46
    @Richie : use flag -t n (n = how many GPUs you have)

    Sample: ethminer -F http://xxxxxx -U -t 2