Wolf's Ethereum Miner

Wolf0Wolf0 Member Posts: 329 ✭✭✭
edited June 2016 in Mining
Proof:

Ethereum Stock
Wolf's Ethereum Miner

For those of you at work, it's roughly a 9% increase in performance: 28 MH/s (stock) vs 31 MH/s (Wolf).
This was also done with the stock VBIOS on the cards - not modified.

Features:

* LDS Usage Eliminated:
Originally, Ethash made heavy use of LDS - removal of it ensures a latency decrease and, in theory, a reduction in power consumption.
* Compact:
Fits in the 32 KiB code cache - currently at 21,752 bytes.
* VGPR Count Reduction:
More waves can be kept in flight than with the original kernel. Originally, Genoil's optimized miner was only able to achieve three waves due to using around 80 VGPRs (256 total). Wolf's Ethereum Miner uses exactly 64.
* Cycle Reduction:
While not as important in some algorithms (GCN masks the latency with other wavefronts), it provides a modest increase in performance.
* L1 Cache Bypassed on DAG Reads:
Since accesses are already scattered, there is no sense in caching them. The original Ethash kernel was not able to set the bit in the load instruction that controls this.
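
The VGPR point in the feature list can be checked with a quick calculation. This is only a sketch: it assumes the wave count is limited by VGPR usage alone (256 registers available, as quoted above) and that GCN caps a SIMD at 10 waves; that cap and the function name are assumptions, not something stated in this post.

```python
# Sketch: wavefronts per SIMD as limited by VGPR usage alone.
VGPR_BUDGET = 256        # the "256 total" figure from the post
MAX_WAVES_PER_SIMD = 10  # assumed GCN hardware cap, not stated in the post


def waves_per_simd(vgprs_per_wave: int) -> int:
    """Return how many wavefronts fit in the VGPR budget (hypothetical helper)."""
    if vgprs_per_wave <= 0:
        raise ValueError("vgprs_per_wave must be positive")
    return min(MAX_WAVES_PER_SIMD, VGPR_BUDGET // vgprs_per_wave)


print(waves_per_simd(80))  # kernel using around 80 VGPRs -> 3
print(waves_per_simd(64))  # kernel using exactly 64 VGPRs -> 4
```

Under these assumptions, dropping from about 80 to 64 registers is what frees room for a fourth wave.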

Currently, only available for Hawaii - others can be done upon buyer's request.

Price: Available upon request - price is dependent on many factors. Interested farm owners can contact OhGodAGirl or Wolf0

Note: NVIDIA cannot be supported (not to the same degree). Their instruction set architecture is not documented. For the correct price, minor optimizations can be attempted, but buyers must understand that the tools are limited to begin with.

Special Thanks: Thanks to Genoil, who made a better ethminer to start with, improving the host code that I based mine on. I modified his to load my custom GPU binary (entirely new code for the GPU, only loaded by the ethminer.)
Post edited by Heliox on

Comments

  • dolleminerdolleminer Member Posts: 50
    Hello @Wolf0 Where I can download the miner?
    I will test it right away
  • michidamichida Member Posts: 31
    there is no download without paying for the miner i guess - i don't know the price structure but would like to know the price for a 160 MH/s rig ...
    can you make a comparison with claymore miner?
  • TermieTermie Member Posts: 130
    @Wolf0 maybe you should provide a 30 or 60 min demo-version.
    If people can test it and see that it has all advantages above, then more people would rather buy it IMHO.
  • ExeverExever Member Posts: 3
    or just non-stop youtube video, with claymore miner vs wolf miner
  • HelioxHeliox Member, Moderator Posts: 634 mod
    bitcanuck said:

    What happened to the comment thread? I'm pretty sure my comments didn't warrant deletion...
    Can @Heliox or another mod explain what happened?

    I seriously have no idea..

    I know there were a lot of comments on this thread before, no idea who removed them..

    I will ask..

    Greetings
  • densetsdensets Member Posts: 69
    so you use GPL code but don't provide source?
  • dlehenkydlehenky Member Posts: 2,249 ✭✭✭✭
    @densets His effort is entirely in the GPU kernel, which he has rewritten in assembler - it's not even OpenCL. The rest of it, as he stated, is Genoil's miner, which is open source already. If you wanted to, you can just use his kernel with whatever miner software you care to use, if you know how to build the miner from source code. Since his kernel in no way uses any of the open source kernel code, it does not need to be open source. I dare say, even if you had the source, you wouldn't know how to assemble it into a useable binary. As I mentioned, it is not OpenCL - or CUDA, for that matter.
  • madsquirrelmadsquirrel Colorado, USAMember Posts: 38
    Has anyone tried this? I'm curious what the results are.
  • michidamichida Member Posts: 31
    no prices, no demo ... no test, no results :smiley:
  • dlehenkydlehenky Member Posts: 2,249 ✭✭✭✭
    @bitcanuck I think some of the reason for the smallish gains is due to the DAG global memory accesses. I know Wolf managed to get the 4th wavefront on the SHA3, which was something no one had been able to accomplish in OpenCL, due to the compiler's uncontrollable vreg allocation scheme. Even with that substantial improvement in the SHA3s, which bookend the DAG accesses/mix function in the inner loop of the kernel, the performance boost was modest. It looks like a case of hurry up and wait (on the DAG accesses).
  • Wolf0Wolf0 Member Posts: 329 ✭✭✭
    edited June 2016
    dlehenky said:

    @bitcanuck I think some of the reason for the smallish gains is due to the DAG global memory accesses. I know Wolf managed to get the 4th wavefront on the SHA3, which was something no one had been able to accomplish in OpenCL, due to the compiler's uncontrollable vreg allocation scheme. Even with that substantial improvement in the SHA3s, which bookend the DAG accesses/mix function in the inner loop of the kernel, the performance boost was modest. It looks like a case of hurry up and wait (on the DAG accesses).

    This is right - it's about latency. I did shave cycles like crazy in the inner loop, but the long ass lookup latency hurts a lot. I'll need to set up another testing env and do more testing, that's also true. This post was more to gauge interest.

    EDIT: I also have SGMiner kind of working - high rejects on getwork, so I'll need to port Stratum.
  • Wolf0Wolf0 Member Posts: 329 ✭✭✭
    G416G said:

    bitcanuck said:

    What happened to the comment thread? I'm pretty sure my comments didn't warrant deletion...
    Can @Heliox or another mod explain what happened?

    It was me..

    I dared question the wolfman on his testing and business plan which he said wasn't done and I ripped him on his sexual orientation (dog)

    Shortly after I got a bullshit email from the ragman but he doesn't say anything just beats around the bush with small talk.

    The wolfman was also manipulating the post with a supporting account or business partner that chimes in for him when he needs help. I think they just pulled the whole thing instead of picking out specific posts.
    Says the guy with... oh shit, 23 abuse reports.
  • mootwomootwo Member Posts: 46
  • heandog69heandog69 Member Posts: 283 ✭✭
    wolf how much for this miner ? and is it easy to configure ?
  • ImAMiner?ImAMiner? Member Posts: 208 ✭✭
    Isn't that what the NSFW is for...
  • ImAMiner?ImAMiner? Member Posts: 208 ✭✭
    Yeah I've seen that crazy stuff for over 2 years. I am subscribed to this thread but it was erased before I could say furry. @Wolf0 can you throw in a pic of jenna jameson once in a while. :smile:
  • mootwomootwo Member Posts: 46
    G416G said:

    Seems like you're unhappy about something I said and coming back to this thread to whine about it.

    Prolly cause you came in here and shit all over this thread. Just cause your shitposts got deleted doesn't mean they didn't happen and can't be responded to.

  • mootwomootwo Member Posts: 46
    Lol you're not even a good troll. U shud just mind ur bizness m8t. Lol.
  • Wolf0Wolf0 Member Posts: 329 ✭✭✭
    edited July 2016
    I have raw numbers from power tests on all three miners - my kernels, Genoil's, and Claymore's:

    Freya (the test rig) - 4 cards - 280X, 290X, 290X, and Fury (unlocked). Idle draw 155W after a few min.

    Clocks for all tests (set with aticonfig): 1100/1250

    Wolf's miner: ./ethminer/ethminer -G -F http://eth-us.suprnova.cc:3001/Slut.1/50 --cl-local-work 256 --cl-global-work 8192 --opencl-devices 1 --farm-recheck 350 -- low reading 29.7, high reading 34.5. 380 - 385W.

    Genoil's miner: ./ethminer/ethminer -G -F http://eth-us.suprnova.cc:3001/Slut.1/50 --cl-local-work 256 --cl-global-work 8192 --opencl-devices 1 --farm-recheck 350 -- low reading 24.8, high reading 29.6. 375 - 380W.

    Claymore's miner: LD_PRELOAD=libcurl.so.3 ./ethdcrminer64 -di 1 -epool eu1.nanopool.org:9999 -ewal 0xd69af2a796a737a103f12d2f0bcc563a13900e6f -epsw x -eworker Freya -- 30.68MH/s - 30.7MH/s. 390 - 392W. Effective hashrate because 1% fee: 30.393MH/s

    I should really try with powertune -50 - it seems to do nothing to hashrate...

    Anyways, on those numbers, vs Claymore's, mine does: 2.303% more hash (this accounts for the 1% fee he takes), AND uses 2.564% less power.

    Numbers used: 30.4MH/s for Claymore (effective cause 1% fee), 31.1MH/s for mine (may be slightly higher, but I guessed low), 390W for Claymore, and 380W for mine.

    EDIT: I should do the power tests per card, not per rig, as the savings would be over multiple cards. So, doing load draw minus idle draw and taking the percent...

    Claymore: 390 - 155 = 235W
    Wolf: 380 - 155 = 225W
    Percent saved: ((235 - 225) / 235) * 100 = 4.255%
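
The percentages in this post can be reproduced with a short script. It is a sketch that uses only the figures quoted above (30.7 MH/s raw and a 1% fee for Claymore, 30.4 vs 31.1 MH/s, 390 W vs 380 W, 155 W idle); the helper names are made up for illustration.

```python
# Sketch reproducing the arithmetic in the post above.
IDLE_W = 155.0  # rig idle draw quoted in the post


def pct_more(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old` (hypothetical helper)."""
    return (new - old) / old * 100.0


def pct_less(new: float, old: float) -> float:
    """Percentage by which `new` falls below `old` (hypothetical helper)."""
    return (old - new) / old * 100.0


# Claymore's raw reading with the 1% fee taken off.
claymore_effective = 30.7 * (1 - 0.01)

# Rig-level comparison, using the rounded figures from the post.
hash_gain = pct_more(31.1, 30.4)
rig_power_saving = pct_less(380.0, 390.0)

# Same power comparison with the idle draw subtracted first.
net_power_saving = pct_less(380.0 - IDLE_W, 390.0 - IDLE_W)

print(round(claymore_effective, 3))  # 30.393
print(round(hash_gain, 3))           # 2.303
print(round(rig_power_saving, 3))    # 2.564
print(round(net_power_saving, 3))    # 4.255
```

Subtracting the idle draw matters because the 155 W baseline is spent whether or not the card is hashing, so it dilutes the saving when left in.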
    Post edited by Wolf0 on
  • Marvell9Marvell9 Member Posts: 593 ✭✭✭
    380 watts for 3 cards huh ?
  • ImAMiner?ImAMiner? Member Posts: 208 ✭✭
    Hey @Wolf0 did you try dual mining? Eth hashrate is higher if you dual mine, but significantly more power used.
  • Wolf0Wolf0 Member Posts: 329 ✭✭✭
    edited July 2016
    Marvell9 said:

    380 watts for 3 cards huh ?

    No, no, only one was mining, four in the rig.
    ImAMiner? said:

    Hey @Wolf0 did you try dual mining? Eth hashrate is higher if you dual mine, but significantly more power used.

    No, that's odd, and interesting.
  • ImAMiner?ImAMiner? Member Posts: 208 ✭✭
    Wolf0 said:

    Marvell9 said:

    380 watts for 3 cards huh ?

    No, no, only one was mining, four in the rig.
    ImAMiner? said:

    Hey @Wolf0 did you try dual mining? Eth hashrate is higher if you dual mine, but significantly more power used.

    No, that's odd, and interesting.
    Yep, claymore posted his thoughts on this somewhere in his btc thread.
  • dolleminerdolleminer Member Posts: 50
    @Wolf0 do you also integrate dual mining? With Decred I farm the fee double back.
  • mootwomootwo Member Posts: 46
    @bitcanuck so there's a power consumption increase from dual mining, and then you underclock the core to bring it back down? Is power consumption still higher than mining ETH only? And this is only for dual mining correct? No use if I'm only mining ETH? I'm currently running R9 380 at 980/1500.
  • mootwomootwo Member Posts: 46
    @bitcanuck wow thanks. I'm gonna give that a try.
  • ImAMiner?ImAMiner? Member Posts: 208 ✭✭
    bitcanuck said:

    Wolf0 said:


    ImAMiner? said:

    Hey @Wolf0 did you try dual mining? Eth hashrate is higher if you dual mine, but significantly more power used.

    No, that's odd, and interesting.
    That's only true sometimes. Depends on the card, and only if you don't have your clocks optimized. On GPUs that are limited by memory speed, there is a slight (
    That's true, I remember your discussion with @Heliox regarding your 380s. There is a speed boost for hawaii cards.
  • Marvell9Marvell9 Member Posts: 593 ✭✭✭
    Marvell9 said:

    380 watts for 3 cards huh ?

    That's pretty high wattage though for not dual mining. I thought 290 and 290X users claim they pull around 180 watts with undervolt.
  • dlehenkydlehenky Member Posts: 2,249 ✭✭✭✭
    bitcanuck said:

    Genoil's miner: ./ethminer/ethminer -G -F http://eth-us.suprnova.cc:3001/Slut.1/50 --cl-local-work 256 --cl-global-work 8192 --opencl-devices 1 --farm-recheck 350 -- low reading 24.8, high reading 29.6. 375 - 380W.

    I get slightly better (about 0.5%) performance with 256/16384 vs 256/8192
    Have you compared your stale shares with those 2 settings? I would expect the 16384 global to produce a noticeably higher stale rate.
  • dlehenkydlehenky Member Posts: 2,249 ✭✭✭✭
    @bitcanuck I use 256/8192 for a kernel runtime of ~70 ms; my stale rate on ethpool/ethermine is around 0.75% average.
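
The link between work size and kernel runtime can be estimated roughly. This sketch assumes that each kernel launch computes local work x global work hashes (that is, that --cl-global-work acts as a multiplier on --cl-local-work) and that the card runs at about 30 MH/s; neither assumption is confirmed in this thread, and the function name is made up.

```python
# Sketch: estimated time per kernel launch for a given work size.
def kernel_runtime_ms(local_work: int, global_work: int, hashrate_mhs: float) -> float:
    """Estimate milliseconds per launch (hypothetical helper).

    Assumes one launch computes local_work * global_work hashes.
    """
    hashes_per_launch = local_work * global_work
    return hashes_per_launch / (hashrate_mhs * 1e6) * 1e3


ASSUMED_HASHRATE_MHS = 30.0  # assumption, roughly the rates discussed in the thread

for global_work in (8192, 16384):
    runtime = kernel_runtime_ms(256, global_work, ASSUMED_HASHRATE_MHS)
    print(f"256/{global_work}: {runtime:.1f} ms per launch")
```

Under those assumptions 256/8192 comes out near the ~70 ms mentioned above, and 256/16384 doubles it. A longer launch means the GPU spends more time finishing old work after a new block arrives, which is why a higher stale rate would be expected.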