Hey,
As far as I can see, the R9 280X is still one of the best mining cards.
But when I compare its specs with, for example, an R9 Nano, the Nano should theoretically outperform the 280X by a clear margin. Instead it also does only around 25 MH/s, just like the 280X?
R9 280X specs:
http://www.techpowerup.com/gpudb/2398/radeon-r9-280x.html
R9 Nano specs:
http://www.techpowerup.com/gpudb/2735/radeon-r9-nano.html
My question is: why? What are the key factors for fast ether mining? Cache bandwidth? Clock rate? GFLOPS?
Thanks
regards
Alex
Comments
The reasons for the poor performance aren't well understood, but perhaps have something to do with the size of the DAG itself. @Genoil's work on the CUDA miner revealed some very disturbing behaviour on Nvidia GPUs, especially under Windows: as the DAG size grows beyond 1 GB, hashrate drops to nothing. He suggests memory paging is hammering the bandwidth.
I've noticed degrading performance on my HD 7950s after each new epoch as well, so maybe there is similar behaviour with them too.
The Dagger/Hashimoto algo is 'memory hard', so performance is bound by memory bandwidth. GPU cores can be underclocked with little effect on hashrate, yielding better efficiency.
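To put rough numbers on that 'memory hard' claim, here's a back-of-the-envelope sketch in Python. It assumes ethash's usual parameters of 64 DAG accesses of 128 bytes per hash; the bandwidth figures are spec-sheet numbers, not measurements.

```python
# Bandwidth-bound ceiling for ethash: each hash touches 64 x 128 bytes of DAG.
BYTES_PER_HASH = 64 * 128  # 8192 bytes read from the DAG per hash

def max_hashrate_mhs(bandwidth_gb_s):
    """Theoretical hashrate ceiling (MH/s) if memory bandwidth were the only limit."""
    return bandwidth_gb_s * 1e9 / BYTES_PER_HASH / 1e6

print(max_hashrate_mhs(288))  # R9 280X, 288 GB/s -> ~35 MH/s ceiling (~25 MH/s seen in practice)
print(max_hashrate_mhs(512))  # R9 Nano, 512 GB/s -> ~62 MH/s ceiling, yet real hashrate is similar
```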
Overall, GPU specs have not been a particularly good predictor of mining performance. I put together this spreadsheet during Olympic to try and get my head around it, but really just had to wait until benchmarks came in. In the end, second-hand R9s and HD 79xx cards came out as the better GPUs for mining.
While the algo was designed to scale with memory bandwidth, in fact it seems tuned to run nicely on GCN 1.0 cards and doesn't scale that well on modern GPUs.
But... you never know who'll cook up some alternative implementation...
Brand new: there is a new Radeon card -> http://www.techpowerup.com/gpudb/2758/radeon-r9-380x.html
The stats are the same as the old R9 270X -> it could be interesting in terms of ROI calculation because of the low price. Has anyone already got their hands on one of these?
Using this logic, one would assume that AMD's HBM cards and the upcoming Nvidia Pascal would blow ethash through the roof with their 4096-bit wide bus. But due to the much lower memory clocks, the effective bandwidth is not that much higher than on the top-of-the-line GDDR5 cards. Again, fine for gaming, but ethash simply works better with a narrower bus accessed at higher speeds. I'm still trying to understand exactly how this works, but it's pretty difficult to figure out exactly what's going on under the hood.
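A quick sketch of that effective-bandwidth point (using commonly quoted spec figures, my own assumption, not taken from the post above): bandwidth is bus width times per-pin data rate, so the huge HBM bus is largely offset by its much lower clock.

```python
# Effective bandwidth = bus width (bits) x data rate per pin (Gbps) / 8.
def bandwidth_gb_s(bus_width_bits, gbps_per_pin):
    return bus_width_bits * gbps_per_pin / 8

print(bandwidth_gb_s(384, 6.0))   # R9 280X: 384-bit GDDR5 @ 6 Gbps -> 288 GB/s
print(bandwidth_gb_s(4096, 1.0))  # R9 Nano: 4096-bit HBM @ 1 Gbps  -> 512 GB/s, wider but far slower-clocked
```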
But I've done extensive research on the dagger algorithm and its supposed scaling with GPU bandwidth. Unfortunately this is not fully the case. While bandwidth does play a big part, the GPU's TLB (translation lookaside buffer) has a huge impact on the achieved hashrate. It's basically a table that holds copies of recently performed translations from virtual to physical memory address ranges (pages). The problem with dagger and its huge slab of pseudorandomly accessed GPU RAM is that the TLB fills up quite quickly and then has to redo many of those translations. With the growing DAG size, this has already led to the GTX 750 Ti becoming useless to mine ETH on. For other modern Nvidia cards, the DAG has to grow to 2 GB before hashrate plummets.
AMD cards seem to have 'better' TLBs (and generally wider memory buses, which also helps), so they seem to be less affected. I still have to write a test case in OpenCL to see to what extent they will suffer from growing DAG size. But I've read here on the forum that with each new epoch the hashrate is already slowly dropping on AMD cards too. Hopefully for you guys it's a linear decrease, not hyperbolic like on Nvidia.
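To give a feel for why the TLB fills up, here's a small sketch. GPU page sizes and TLB capacities aren't published, so the 64 KiB page size and the entry count below are pure assumptions for illustration.

```python
# How many pages a growing DAG spans, assuming a 64 KiB GPU page size.
PAGE_SIZE = 64 * 1024   # assumed page size
TLB_ENTRIES = 2048      # assumed TLB capacity, order-of-magnitude guess

for dag_gib in (1, 2, 3, 4):
    pages = dag_gib * 2**30 // PAGE_SIZE
    print(f"{dag_gib} GiB DAG -> {pages} pages, "
          f"~{pages // TLB_ENTRIES}x an assumed {TLB_ENTRIES}-entry TLB")
```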
Another 'problem' holding performance back is that the algo only requires 128 sequential bytes of GPU RAM per iteration of the dagger loop, while the high bandwidth of modern GPUs can only be achieved when loading larger sequential (coalesced) chunks. On Nvidia Kepler/Maxwell this is 512 bytes (a warp of 32 threads × 16 bytes), on AMD GCN it is 256 bytes. I suspect that because on AMD it is only double the 128 bytes, it can utilize the available bandwidth more efficiently, but that is somewhat speculative. Another theory I'm exploring concerns the number of available memory channels/banks: Nvidia Kepler and Maxwell and GCN 1.2 have 8, while GCN 1.0 has 12 and GCN 1.1 has 16. With pseudorandom access, this would generally lead to more bank conflicts on cards with fewer memory controllers. Perhaps some optimization is possible in that area. But I don't own any AMD cards.
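Using the chunk sizes quoted above, the useful fraction of each coalesced transaction works out like this (a sketch, nothing more):

```python
# Fraction of a full coalesced memory transaction that ethash actually needs.
ETHASH_READ = 128  # sequential bytes per dagger-loop iteration

chunk_sizes = {
    "Nvidia Kepler/Maxwell": 512,  # 32 threads x 16 bytes
    "AMD GCN": 256,
}

for arch, chunk in chunk_sizes.items():
    print(f"{arch}: {ETHASH_READ / chunk:.0%} of each {chunk}-byte transaction is useful")
```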
If the DAG grows so large that even 4 GB cards have issues, then they need to modify the algorithm. While they're at it, they could add a few more flags to make it easier to mine with certain cards.
Hopefully they can manage to prevent ASICs from coming in and ruining it for the rest of us.
Nvidia is already sunk. AMD cards look like they might have winners and losers, yet to be seen. All in all, Ethereum mining has turned out to be something of a leaky boat race... no real finish line, just whoever stays afloat the longest...
I'll see if I can free up some time to port my "dagger simulator" from CUDA to OpenCL to quickly assess how much AMD cards are affected by increasing DAG size.
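This isn't Genoil's actual simulator, just a hypothetical host-side sketch of the idea: issue pseudorandom 128-byte DAG reads and count how many distinct (assumed 64 KiB) pages one batch touches as the DAG grows.

```python
import random

PAGE_SIZE = 64 * 1024     # assumed GPU page size
READS_PER_HASH = 64       # ethash DAG accesses per hash
HASHES_PER_BATCH = 4096   # arbitrary batch size for this sketch

def pages_touched(dag_bytes):
    """Distinct pages hit by one batch of pseudorandom 128-byte DAG reads."""
    reads = READS_PER_HASH * HASHES_PER_BATCH
    offsets = (random.randrange(0, dag_bytes - 128) for _ in range(reads))
    return len({off // PAGE_SIZE for off in offsets})

for dag_gib in (1, 2, 4):
    print(f"{dag_gib} GiB DAG: ~{pages_touched(dag_gib * 2**30)} distinct pages per batch")
```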
min/mean/max: 19835562/19957896/20010325 H/s
inner mean: 13311089 H/s
However, the thing I'm curious about is the PoS client and validator betting: there doesn't seem to be a clear explanation of how one can become a validator in a pool and make bets against the protocol to earn ether.
If anyone can enlighten me that would be helpful. Appreciate it!
Cheers
Shaun
I had a half-hearted attempt the other day to find some stats on 480s/580s, but I haven't really been updating the chart for quite a while, as I've moved more into Solidity development, which is taking up all my time.
I'd encourage anyone who still finds the chart useful (and there are always a few people online whenever I go to it) to simply save a copy and update it according to their own situation.