Help with crowdfunding contract

cybertreiber Vienna, Austria Member Posts: 29 ✭✭
Hi guys!

As a nice demo, we want to showcase a contract at the upcoming Vienna meetup. The main problem is that the HLL-based code doesn't exhibit the expected behavior. Maybe someone with great HLL and EVM skills can help out?
First, the contract is initialized with data[0..2]: the name of the project, the time to expire in hours, and the funding limit. Subsequently, crowd funders can chip in, up to a maximum of 900 participants. If a 901st funder sends a transaction, the funds (minus an arbitrary 1% fee) are returned immediately. If the expiry date is reached, all contributions are reversed. However, if the funding limit is reached, the fundee, i.e. the one who initialized the contract, receives the funds.
It's a very nice showcase since you could directly go from there to incorporate a DAO with certain rights/obligations of the funders and fundees instead of just handing over the cash to the fundee... The power of Ethereum is strong in this one!

State and memory doc

states:
0 … initialize (tx.sender = fundee) [name, deadline (+hours), limit]
1 … chip in + check of limit and time (tx.sender = funder)
2 … payout or payback

storage locations:
1000 - state
1001 - fundee
1002 - project name
1003 - deadline (absolute)
1004 - limit
1100 - number of funders
>= 1101 + k = addresses of funders
>= 2000 - non protected = funders' investment
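
To make the layout concrete, here is a minimal sketch that models contract storage as a Python dict and performs the state 0 initialization. It is not the HLL source from the repository; the function name `initialize`, the constant names, and the use of seconds for timestamps are assumptions made for illustration.

```python
# Hypothetical Python model of the storage layout described above.
# Slot numbers come from the post; all names are illustrative.
STATE, FUNDEE, NAME, DEADLINE, LIMIT = 1000, 1001, 1002, 1003, 1004


def initialize(storage, sender, data, now):
    """State 0: store the campaign parameters once, then switch to state 1."""
    if storage.get(STATE, 0) != 0:
        return False                         # already initialized, do nothing
    name, hours, limit = data                # data[0..2] as in the post
    storage[FUNDEE] = sender
    storage[NAME] = name
    storage[DEADLINE] = now + hours * 3600   # relative hours -> absolute time (assumed seconds)
    storage[LIMIT] = limit
    storage[STATE] = 1
    return True
```

If a run like this leaves slots 1000..1004 untouched, the first thing to check is whether the state 0 branch is being entered at all.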

Code is here: https://github.com/dafcok/ethereum/blob/master/crowd_fund.txt

Bugs

Now to the problems and current debugging insights.

1) With Aleth 3.0, the contract goes live. Currently there is one at the address f22a7267b96e44fa1d049508fc740b235adcf43c. However, initializing it with a transaction of "testfund" 10 1200finney increases the contract balance but fails to write anything to storage[1000..1004] as intended.

2) Using http://k1n0k0.github.io/ethereum-simulator/ gives the same result: neither the state in [1000] nor anything else changes. I suspect it's an embarrassingly easy bug or an incorrect use of transactions.

3) Since the latest HLL compiler, the code no longer compiles; it complains with ValueError: '=' is not in list on each line with mktx(...). The previously compiled EVM code is here: https://github.com/dafcok/ethereum/blob/master/crowd_fund_compiled.txt. I'm unclear on what invalidated the code. Maybe the latest EVM change?

Comments

  • aatkin Member Posts: 75 ✭✭
    edited April 2014
    @Alex_GuangTou Hi! I'm working on the same type of contract and will put it up on GitHub on the weekend. I'm also giving a 5-10 min talk on a variant called dominant assurance contracts. Maybe we can work together. Feel free to message me if you're interested. Nice work BTW!

    A few notes on your github:
    =====================
    -Comments will be supported in HLL using //
    -I think a 2-space indent helps on some versions of the compiler
    -The compiler was modified recently. MKTX will go away in POCv4 to be replaced by CALL
    -I don't know why contract.storage[1000-1004] isn't modified after first run. It should be. I would have put a "stop" after that first if state == 0 block. The first initialize run should run only once when the contract is placed on the blockchain.
    - On state == 1 we chip in when block.timestamp <= campaign time limit. We cannot chip in after the time limit. Do not change the state to 2. That should happen only when the campaign time limit is exceeded. Otherwise you'll only have 1 funder.
    -Maybe take out fundthecontractingplatform for now to simplify.
    -In my contract I have state == 2 being payout which is when block.timestamp > campaign time limit AND contract.balance >= funding target (your limit in [1004]) + processing fees. I see you do this at the end.
    -In my contract state == 3 for payback when block.timestamp > campaign time limit AND contract.balance < limit + processing fees
    -The processing fee (current contract burn rate) gets simplified when GAS comes around.
    -There's no need to limit to 999 funders. You can use a while loop to read the non-zero funding addresses for the payback.
    -For your state == 3 where the contract is over, maybe we should automatically refund to the sender. This is better than suiciding because then a latecomer will get something back instead of burning by sending to a non-existent address. On the other hand, if the address wasn't there, maybe he wouldn't send in the first place.
    -For the chip-in I see you do:
    contract.storage[tx.sender] = contract.storage[tx.sender] + tx.value
    I guess this is key-value storing the address. Not sure how you would iterate through the list.
    -I'm not really sure using RLP how contract.storage[] works with strings. I thought address take 20 bytes so contract.storage[2000] = myaddress would take up [2000-2020], no? I know you can easily do key value stuff like addressx = contract.storage[tx.sender] because the array size is 2^256 (huge) and it takes less space. Great for lookups, but hard to iterate through a sparse list of addresses, no?
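
The state rules in the notes above (chip in while the deadline has not passed, payout if the target plus fees was reached, payback otherwise) can be sketched as a single decision function. This is an illustrative Python model, not code from either contract; `campaign_state` and its parameter names are assumptions.

```python
def campaign_state(now, deadline, balance, limit, fees=0):
    """Return 1 (funding open), 2 (payout) or 3 (payback).

    Mirrors the rules described in the notes: contributions are accepted
    only while now <= deadline; afterwards the outcome depends on whether
    the balance covers the funding target plus processing fees.
    """
    if now <= deadline:
        return 1          # still collecting, never switch state early
    if balance >= limit + fees:
        return 2          # target reached: pay the fundee
    return 3              # target missed: refund the funders
```

Keeping the decision in one place makes it easier to see that a single contribution can never move the campaign out of state 1 before the deadline.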

  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    Thanks for your thorough walk-through. Let's keep it here for discussion as it will be very valuable to others as well! What I don't comment on generally means ack + gratitude for pointing that out.

    -Comments will be supported in HLL using //
    really waiting for that, _DOCS.txt for now.

    -I don't know why contract.storage[1000-1004] isn't modified after first run.
    Yeah, the main culprit. Left in the dark about this til now.

    - On state == 1 we chip in when block.timestamp <= campaign time limit. We cannot chip in after the time limit. Do not change the state to 2.
    Nice spot!

    -Maybe take out fundthecontractingplatform for now to simplify.
    As you guessed, a placeholder while no comments available. Eventually the fee can also be configured optional during the initialization and shall be non existent if the project entirely runs P2P (i.e. a self marketing crowd fundee).

    -There's no need to limit to 999 funders. You can use a while loop to read the non-zero funding addresses for the payback.
    -For the chip-in I see you do:
    contract.storage[tx.sender] = contract.storage[tx.sender] + tx.value
    I guess this is key-value storing the address. Not sure how you would iterate through the list.
    -I'm not really sure using RLP how contract.storage[] works with strings. I thought address take 20 bytes so contract.storage[2000] = myaddress would take up [2000-2020], no? I know you can easily do key value stuff like addressx = contract.storage[tx.sender] because the array size is 2^256 (huge) and it takes less space. Great for lookups, but hard to iterate through a sparse list of addresses, no?

    My understanding is/was that an ASCII string gets converted and truncated into a binary blob of 256 bits. You can observe that if you type "foo bar" into the data field of the reference client. So using a string as a key would not spill over the array, however you are limited to 16 bytes.
    Furthermore, I think of the k,v storage as a dictionary. So theoretically it would be easy to iterate over the keys > 1004. Only thing is, I don't know how. This is also why I opted to maintain a separate space [1101-2000] where funders are listed linearly. Then you can iterate easily, unfortunately limited to a certain amount of funders.

    I'd be glad if someone confirms the assumption about the storage data structure and/or points to a more elegant solution. Meanwhile I'll cut the rough edges you mentioned.
    Some fundamentals are still impeding real debugging on my side, though. Can't wait to see your take and collaborate.
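
The two-part layout discussed here (amounts keyed by address, plus a linear list of addresses for iteration) can be sketched as follows. This is a Python model with storage as a dict, assuming the slot numbers from the first post; `chip_in` and `refund_all` are hypothetical names, and addresses are assumed to be integers well above 2000.

```python
N_FUNDERS = 1100      # slot holding the number of funders
FUNDER_BASE = 1101    # funder k is stored at FUNDER_BASE + k
MAX_FUNDERS = 900     # slots 1101..2000


def chip_in(storage, sender, value):
    """Add value to the sender's total; list new senders linearly."""
    if value <= 0:
        return False
    if storage.get(sender, 0) == 0:              # first contribution
        count = storage.get(N_FUNDERS, 0)
        if count >= MAX_FUNDERS:
            return False                         # linear list is full
        storage[FUNDER_BASE + count] = sender
        storage[N_FUNDERS] = count + 1
    storage[sender] = storage.get(sender, 0) + value
    return True


def refund_all(storage):
    """Walk the linear list and collect (address, amount) refunds."""
    refunds = []
    for k in range(storage.get(N_FUNDERS, 0)):
        addr = storage[FUNDER_BASE + k]
        refunds.append((addr, storage.get(addr, 0)))
        storage[addr] = 0                        # mark as paid back
    return refunds
```

The lookup by address stays a single storage read, while the payback loop only touches the slots that were actually filled.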
  • aatkin Member Posts: 75 ✭✭
    edited April 2014
    Thanks for taking a look. Your contract is certainly more elegant and efficient than mine. I really simplified my blocks so I could easily count OPs and costs for each of the 4 main execution branches through the code. Then I can specify a GAS cost and refund to the sender for the cheaper branches if they run them. The full refund branch is variable cost because as the number of funders increases, so does the refund CALL (formerly mktx) count. Another consideration is that the min donation needs to be > the cost of a CALL. Otherwise we won't have enough ether for a refund if the campaign fails.

    I'd like to get some idea on the string storage in HLL as well. On reddit it's said that contract.storage[1000] can hold 32 bytes. So contract.storage[1000] = "gnomecoin" is ok. You mention 16 bytes. I thought I read 32 bits. For iterating over a dictionary, hmm, maybe we're supposed to reinvent the wheel down and dirty and create random access file using a hash. I was just going to use it as an array and iterate over it while contract.storage[i] > 0 where i starts at 2000 or something.

    The other problem I'm having (and @vitalik is working on) is getting the Python compiler to run on Win64. I'm a total Python newb, so that doesn't help. Too bad multisig.info:3000 is down. The old etherchain.org CLL compiler is up. I can't compile contract.storage[1000] = "aatkin" on it. Basically I'm desk checking like my father did at IBM in the early '60s when you only had 1 compile per day allowed. Lol.
  • chris613 Member Posts: 93 ✭✭
    Here is my understanding of some of the points you guys are discussing based on my study of the most recent github code for cpp-ethereum.

    There is no need to refund gas from within your contract. After executing a transaction (including running any target contract) any excess gas is refunded to the sender. It would be nice if contracts could tell you what value to use somehow (if it's computable, which is not always the case in a turing complete language).

    I can say for certain that each storage location is 256 bits, or 32 bytes, of storage. The string handling in the alethzero LLL at least (not sure about other HLLs) is very simple. When you enter a string of any length, "abcd...", it compiles to a PUSH32 instruction, which packs the next 32 bytes of data (your string) into a single stack element. If your string has fewer than 32 bytes it is padded with 0s; if it has more, it is truncated. A compiled-down version of contract.storage[1000] = "aatkin" would look like this:

    PUSH32 0x61 0x61 0x74 0x6b 0x69 0x6e 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0 0x0
    PUSH2 0x03 0xe8
    SSTORE

    HLLs might want to handle longer strings more gracefully by compiling down to multiple iterations of that, like: PUSH32 PUSH2 PUSH32 PUSH2 PUSH32 PUSH2 PUSH32 PUSH2 PUSH32 PUSH2 PUSH32 PUSH2 SSTORE SSTORE SSTORE SSTORE SSTORE SSTORE. That gets you 192 bytes of string storage and could be arbitrarily extended. The HLL would also need to implement a length indicator or handling for null-termination, since there is nothing in the EVM language that does that.
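
The padding, truncation and multi-word scheme described above can be sketched in Python. This only models the byte layout, not any real compiler; `pack_word` and `pack_string` are hypothetical helpers, and storing the length in a leading word is one assumed way to provide the length indicator mentioned above.

```python
WORD = 32  # bytes per storage location (256 bits)


def pack_word(text):
    """Pack a string into one 32-byte word: truncate, then pad with zeros."""
    raw = text.encode("ascii")[:WORD]
    return raw.ljust(WORD, b"\x00")


def pack_string(text):
    """Split a longer string into 32-byte words, preceded by a length word."""
    raw = text.encode("ascii")
    words = [len(raw).to_bytes(WORD, "big")]     # assumed length prefix
    for i in range(0, len(raw), WORD):
        words.append(raw[i:i + WORD].ljust(WORD, b"\x00"))
    return words
```

For "aatkin", `pack_word` yields the bytes 0x61 0x61 0x74 0x6b 0x69 0x6e followed by 26 zero bytes, matching the PUSH32 operand shown earlier.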

    As for the issue of iterating over the storage dictionary, @Alex_GuangTou showed the common idiom of contract.storage[tx.sender]. In this case the EVM does not provide a way of iterating the initialized keys in the storage, so you do need to maintain your own linearly-stored lookup table as he described. Interestingly, the underlying data structure for contract storage in the cpp-ethereum code is capable of iterating over its keys, so it's possible this functionality could be exposed later on. It would be a bit awkward
  • aatkin Member Posts: 75 ✭✭
    edited April 2014
    Awesome! Many thanks @chris613!
  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    edited April 2014
    @aatk yes I actually meant 32 bytes (256 / 8). Will edit the post accordingly.

    @chris613 Thx for going in depth and verifying the data structure notwithstanding the convention of string conversions. Now, I would find it quite a boon to iterate over all keys. Given this possibility, certain implementations will be less data redundant and hence lead to more efficient code. In Python or the STL you'd do that quite freely afaik.

    On the whole notion of data layout, I stumbled upon a rather non-intuitive technique. In the 'egalitarian' DAO example on ethereum.org (discussed here), new code proposals are stored with a pseudo-random offset (k = sha3(32,tx.data[1])) for each proposal. The votes are subsequently stored with yet another offset by the voter's public address.

    contract.storage[k + i] = tx.data[i]
    contract.storage[k + tx.sender] = 1

    As far as I can see, you are bound to run into collision issues. Let's say two proposals generate keys not far apart, or conversely k+tx.sender yields a value which is near to another k+tx.sender. By pure coincidence, this could lead to overwriting of previous proposals, voters' permissions and/or votes.
    I find it hard to accept that such contracts are built solely on the hope that no collisions occur. At the least, you'd want some bounds checking of public addresses. That would, however, be much more cumbersome than reserving a linear space, and the rejection of addresses would be more opaque.
    Maybe I'm off and some crypto voodoo makes compounded indexing in this way non-colliding. Then I'd really be interested in why. Would be great if someone can elaborate.
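
The collision concern can be made concrete with a small Python sketch. It assumes storage keys wrap modulo 2^256 and uses `hashlib.sha3_256` purely as a stand-in for the contract's sha3 (the two hash functions are not the same); `proposal_base`, `vote_slot` and `ranges_overlap` are hypothetical names.

```python
import hashlib

MOD = 2 ** 256  # storage keys are 256-bit, so offsets wrap around


def proposal_base(proposal):
    """Pseudo-random base offset k for a proposal (stand-in hash)."""
    return int.from_bytes(hashlib.sha3_256(proposal).digest(), "big")


def vote_slot(k, sender):
    """Slot used for a vote: base offset plus the voter's address."""
    return (k + sender) % MOD


def ranges_overlap(k1, k2, length):
    """True if [k1, k1+length) and [k2, k2+length) share a slot."""
    return (k2 - k1) % MOD < length or (k1 - k2) % MOD < length
```

Two proposal bodies collide whenever their bases are closer than the body length, and two votes collide whenever k1 + sender1 equals k2 + sender2. Nothing in the scheme rules this out; it only relies on such pairs being astronomically rare for hash-derived bases.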

    + bump to Bug 1) and 3) mentioned in my initial post and +1 to bring multisig:3000 online. Maybe Mac's or homebrew's Python causes 3).

    edit: can't edit post from 3rd of April anymore. so please bear with it... 32 bytes per key/value.
  • chris613 Member Posts: 93 ✭✭
    Any HLL is free to develop a high-level datatype that implements an iterable hash map. One will not necessarily be more efficient than the other, because they each have to implement that logic on top of the same EVM opcodes. For efficiency, though, it might be nice to expose some more of the underlying features of the std::map that the storage is actually based on. Some storage iterator opcodes such as SIBEGIN, SIEND, SINEXT, SIPREV would reduce the overhead of HLLs implementing their own schemes on top of what is already a std::map type object under the hood.
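
What such iterator opcodes would offer can be sketched with an ordered key index in Python. This is only a model of the idea: SIBEGIN and SINEXT are the proposed names from the comment above, not existing opcodes, and the class `OrderedStorage` is hypothetical.

```python
import bisect


class OrderedStorage:
    """Key-value storage that also keeps its keys in sorted order."""

    def __init__(self):
        self._data = {}
        self._keys = []                      # sorted list of initialized keys

    def store(self, key, value):
        if key not in self._data:
            bisect.insort(self._keys, key)   # keep the index sorted
        self._data[key] = value

    def load(self, key):
        return self._data.get(key, 0)        # unset slots read as 0

    def first(self):
        """Smallest initialized key (what a SIBEGIN might return)."""
        return self._keys[0] if self._keys else None

    def next(self, key):
        """Smallest initialized key greater than key (a possible SINEXT)."""
        i = bisect.bisect_right(self._keys, key)
        return self._keys[i] if i < len(self._keys) else None
```

With this, a contract could walk a sparse set of funder addresses directly instead of maintaining a separate linear list.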

    @Alex_GuangTou, I think your comments about collisions are spot-on. Some will argue that "k + tx.sender values which are near to another" is just so crypto-voodoo-unlikely that it's worth the convenience, whereas I see sloppy code of this sort granting leverage to some classes of future theoretical attacks where otherwise there would be no problem.
  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    @chris613 If this is spot on I'm wondering about figures of the probability of collision for this convenience. Until now, I wouldn't dare to do it that way because why to jump through all the hoops to have a deterministic crypto-law implementation and then ask for random or even intentional attack vectors? Otoh, full determinism is going to cost dearly and surely is going to bloat a lot of implementations.
    What would really be neat is a k,v dictionary where the v is a variable length, recursive RLE. Then, arbitrary length names, votes, balances and what not could be stored effectively for each address. However, I don't know if that could be implemented without problems of fast deserialization/lookup and contract storage maximum size checks. Any thoughts?
  • chris613 Member Posts: 93 ✭✭
    For probabilities, I usually screw them up since I don't work in a field that requires me to be good at it, but my guess at the chance of any collision at all (assume they would all be problematic) is essentially (Nk + Nv + Lcode)/2^256 (where Nk is the number of unique "proposals", Nv is the number of votes, and Lcode is the length of the code since it is also in storage starting at 0).

    For an arbitrarily huge scale, let:
    Nk = 1,000,000 (proposals)
    Nv = 100,000,000,000 (1B voters vote on 100 proposals each)
    Lcode = 10,000 (huge by current contract standard)

    So by my calculation, that's roughly an 8.6e-67 chance of random collision (about 1 in 1.2e66). My problem is not so much with this raw probability; I'm comfortable that it's suitably minuscule. What I'm concerned about is future attacks on SHA3 that might make portions of that attack space easier to target, or otherwise leverage this sloppiness through carefully constructed input. Perhaps I'm just paranoid, and then there is that part of me that would prefer the bloat to get it 'truly correct' just so I never have to screw up another probability calculation. This is the same feeling you share with "why to jump through all the hoops to have a deterministic crypto-law implementation and then ask for random or even intentional attack vectors?"
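
The estimate can be checked with exact arithmetic. The sketch below simply evaluates the formula as stated, (Nk + Nv + Lcode)/2^256, and does not attempt a more careful birthday-style analysis; `collision_estimate` is a hypothetical helper name.

```python
from fractions import Fraction


def collision_estimate(n_proposals, n_votes, code_length, key_bits=256):
    """Occupied slots divided by the size of the key space."""
    occupied = n_proposals + n_votes + code_length
    return Fraction(occupied, 2 ** key_bits)


# Figures from the comment: 1M proposals, 1B voters x 100 votes, 10k code slots.
p = collision_estimate(10 ** 6, 10 ** 11, 10 ** 4)
print(f"{float(p):.2e}")   # on the order of 1e-66
```

The vote count dominates the numerator, so the result is essentially 10^11 / 2^256.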

    "What would really be neat is a k,v dictionary where the v is a variable
    length, recursive RLE". Yes! That would certainly be cool.
  • chris613 Member Posts: 93 ✭✭
    I'm reading Gav's "yellowpaper" (http://gavwood.com/Paper.pdf) and it says that program code (Lcode) is not in storage:

    "Rather than storing program code
    in memory or storage, it is stored separately in a virtual
    ROM. It cannot be read directly; instead it exists only as
    a model for determining the next instruction to execute"

    I swear I found it there before, and it certainly looked like it from my read of the code this afternoon, but I'm either wrong, or it's changing soon.
  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    @chris613 In POC4, contracts now list the Body Code separately to storage addresses. I.e. contracts store data from @0x0 on. So your memory served you well :)
  • chris613 Member Posts: 93 ✭✭
    (off-topic) I guess this new model puts an end to my self-replicating contracts.
  • aatkin Member Posts: 75 ✭✭
    (off-topic). So then contracts are now immutable? There's no way I can code a contract for which a transaction can modify its code body? :(
  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    Although code appears to be immutable now, I somehow doubt it. Otoh, send() and msg() seem to be asynchronous now, so it would make sense.
    Regarding the crowd-funding code, I migrated to POC4 standards and terminology. And with the newest compiler, compilation to byte code was successful:
    https://github.com/dafcok/ethereum/blob/master/crowd_fund_POC4.txt
    https://github.com/dafcok/ethereum/blob/master/crowd_fund_compiled.txt

    Migrating included incorporating the idea of separating msg from tx and the use of elif instead of else if.
    Unfortunately, pasting the EVM code into Aleth doesn't translate to an equivalent data blob, merely a two-opcode section under Init and three opcodes under Body. Why wouldn't it understand EVM (anymore)?
  • Jasper Eindhoven, the Netherlands Member Posts: 514 ✭✭✭
    Suggestions: rename to .cll (it may also be named .hll; well, neither is very established). Specifically say which file is the latest.

    Put crowd_fund_DOCS.txt in readme.md.

    Test with cll-sim? Might be a good example for there too. For that, I actually prefer to use stop // Comment and then use the test to check that the correct exit was taken. (I'd kind of prefer being able to check logic paths midway as well, but that isn't currently possible.)
  • chris613 Member Posts: 93 ✭✭
    edited April 2014
    @Alex_GuangTou The aleth client doesn't accept raw EVM opcodes like that, you need to feed it LLL. This used to look like this:
    (seq (mstore8 0 287) (lt 2000 caller...

    But it appears the language has changed somewhat in POC4: https://github.com/ethereum/cpp-ethereum/wiki/LLL-Examples-for-PoC-4
  • aatkin Member Posts: 75 ✭✭
    Ok, here's a dumb question. If you can't feed it OP codes and you can only feed it LLL in PoC4 then what's the point of HLL? Is HLL totally non-functional in PoC4?
  • Jasper Eindhoven, the Netherlands Member Posts: 514 ✭✭✭
    Maybe we need a HLL(/CLL)→LLL compiler then.
  • aatkin Member Posts: 75 ✭✭
    Yes a transcompiler was discussed. It seems LLL takes priority these days. I would like to be able to paste in EVM though. Time to learn LISP for me! :)

    I don't think @vitalik is abandoning HLL or he wouldn't have updated the compiler.

    BTW we could also try writing the crowdfunding contract (and variants) using Etherscripter.
  • aatkin Member Posts: 75 ✭✭
    Aha, Serpent is out, when in doubt, read the blog. Awesome! Can't wait to try it along with the crowdfunding contract. Just remember not to use floats.
  • cybertreiber Vienna, Austria Member Posts: 29 ✭✭
    Gosh, I wanted to escape Lisp for good. Surely one eventually can, after waiting some more. For now, Serpent can already compile down to bytecode. However, unlike compiler.py, it doesn't compile this code and complains with KeyError: '1'
    Having these two compile options is confusing.
    @Jasper thx. will clean up and follow some conventions when there is time.
  • Jasper Eindhoven, the Netherlands Member Posts: 514 ✭✭✭
    Lisp is good. Well, you can have a full Lisp and not write everything in s-expressions, as Julia and another one show.
  • aatkin Member Posts: 75 ✭✭
    Thanks to @Jam10o I have a basic kickstarter contract here: http://www.mintchalk.com/c/68f3e

    It hasn't yet been tested on the public blockchain.
    Next up: a dominant assurance contract and then kickstarter + equity then DAC + equity

    Equity meaning funders are awarded shares in the form of a tradable subcurrency proportional to their funding amount.