relational vs distributed database

jskyjsky Member Posts: 8
When I was looking into distributed databases, this article came up:
It basically concluded that if your data is relational, it is a lot better to choose a traditional relational database over a document store.
Is this problem the same with Ethereum and blockchains in general?
Wouldn't this affect the ease of programming a social network and other apps on Ethereum?
Would it be easy enough in Ethereum to create "queries" on a directed graph?
I understand the benefits of decentralisation and distribution, but I'm trying to weigh them against the potential cost in implementation.
I am new to Ethereum, so if I have overlooked something please let me know. Thanks.


  • StephanTualStephanTual London, EnglandMember, Moderator Posts: 1,282 mod
    edited July 2014
    Moving you to general discussion because this is a good question and is on-topic :)
  • jskyjsky Member Posts: 8
    edited July 2014
    The article I quoted was a bit old and didn't return to graph DBs:

    "But what are the alternatives? Some folks say graph databases are more natural, but I’m not going to cover those here, since graph databases are too niche to be put into production. Other folks say that document databases are perfect for social data, and those are mainstream enough to actually be used. So let’s look at why people think social data fits more naturally in MongoDB than in PostgreSQL."
    The article later concludes that document stores are not as useful as relational databases.

    After watching a webinar on Neo4j, I'm confident that graph DBs are more useful here and can be scaled. Companies such as Foursquare have done it. (Another potential graph DB is but I haven't looked further at this one yet.)
    Neo4j still seems to be the leader.

    So a blockchain's main selling points are the energy and security distribution and transparency, and in the case of Ethereum, standard programmability of distributed apps and central/liquid Ether.

    So, rephrasing my question: if I want the full graph traversal abilities of a graph DB like Neo4j, but also the full network distribution/decentralisation of a blockchain, how do I do this with Ethereum?

    After googling, there seems to be a way to do cluster distribution with Neo4j:
    but I would like my project to be on Ethereum (as I like the idea of currency, open source/verifiable code, etc.).

    Maybe someone could point me to how this would be done on Ethereum?
    I'm still new here and in the middle of a uni exam, so I haven't had time to look at the docs yet.
  • StephanTualStephanTual London, EnglandMember, Moderator Posts: 1,282 mod
    Hey @jsky. The two concepts (blockchain and graph DB) are totally different. In fact, the blockchain is also completely different from a distributed/sharded relational database. Computation is not distributed amongst nodes on a blockchain: ALL nodes process ALL transactions/computation locally. It's decentralized, but it's not distributed computing, although it can help incentivize it (more on that below).
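
    To make the "ALL nodes process ALL transactions" point concrete, here is a minimal toy sketch. It is not Ethereum code: the `Node` class, the transfer format and the account names are all made up for illustration. It only shows that every node replays the same transactions against its own full copy of the state, so adding nodes adds redundancy rather than throughput.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical full node holding its own complete copy of the state."""
    balances: dict = field(default_factory=dict)
    processed: int = 0

    def apply(self, tx):
        sender, receiver, amount = tx
        self.processed += 1  # every node does this work itself
        if self.balances.get(sender, 0) >= amount:  # invalid transfers are ignored
            self.balances[sender] -= amount
            self.balances[receiver] = self.balances.get(receiver, 0) + amount


def run_network(num_nodes, genesis, transactions):
    # Each node starts from its own copy of the same initial state.
    nodes = [Node(balances=dict(genesis)) for _ in range(num_nodes)]
    for tx in transactions:
        for node in nodes:  # no work is split between nodes
            node.apply(tx)
    return nodes


genesis = {"alice": 10, "bob": 0}
txs = [("alice", "bob", 4), ("bob", "carol", 1), ("carol", "alice", 5)]
nodes = run_network(3, genesis, txs)

total_work = sum(node.processed for node in nodes)
print(nodes[0].balances, total_work)
```

    In this toy run all three nodes end with the same balances, and the total work is the number of nodes times the number of transactions. In a sharded database, by contrast, each transaction would typically be handled by only some of the machines.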

    This is true at least for the time being, because of course scalability (or lack thereof) comes into play, and will need addressing. Here's a handy description of such problems by Vitalik: Also, again quoting Vitalik: "There have been a few ideas developed in this regard mostly in relation to Bitcoin, such as merge-mined hypercube chains, Peter Todd's tree chains idea and strategies based on advanced cryptography such as SCIP/zk-SNARK, but there is still a lot of research that remains to be done. A successful scalability solution would need to handle moving coins across different parts of the state that are stored by different entities, not sacrifice (too much) mining security, and make sure the protocol keeps working even if some data becomes unavailable. This is of course something we intend to research further."

    I suppose you could draw a parallel between key/value stores and Ethereum contracts, as Ethereum contracts tend to be structured as such. But Cassandra or LevelDB this is not, and you should adjust your expectations on speed accordingly. I think a more promising solution is the use of Ethereum for the incentivization of nodes I was referring to earlier: you could imagine a scenario whereby one hour of computing time gets rewarded by the issuance of 1 metacoin, which is then redeemable for goods and services, or traded at an exchange. The two processes (Ethereum and the distributed computing platform) would run in parallel, and assuming Ethereum is running in a light node configuration, almost 100% of the CPU time could be dedicated to the distributed process (something like SETI@home or Folding@home).
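
    On the key/value parallel: a rough sketch of what a directed-graph "query" looks like when all you have is a key/value store. This is plain Python, not contract code, and `KeyValueStore`, `add_edge` and `reachable_within` are hypothetical names used only for illustration. The point is that there is no query language: a traversal is an explicit loop with one storage lookup per visited node.

```python
from collections import deque


class KeyValueStore:
    """Hypothetical stand-in for a contract's key/value storage."""

    def __init__(self):
        self._data = {}
        self.reads = 0  # counts lookups, to make the traversal cost visible

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        self.reads += 1
        return self._data.get(key, default)


def add_edge(store, src, dst):
    # Outgoing edges are kept as an adjacency list under the key ("out", src).
    edges = store.get(("out", src), [])
    if dst not in edges:
        store.put(("out", src), edges + [dst])


def reachable_within(store, start, max_hops):
    """Breadth-first search: nodes reachable from start in at most max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for nxt in store.get(("out", node), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    seen.discard(start)
    return seen


store = KeyValueStore()
for src, dst in [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("alice", "erin")]:
    add_edge(store, src, dst)

store.reads = 0  # only count the lookups made by the query itself
two_hops = reachable_within(store, "alice", 2)
print(sorted(two_hops), store.reads)
```

    Here the two-hop query from "alice" needs one lookup for each node it expands. A graph database would do this kind of traversal natively; on a key/value layout each hop is something you write and pay for yourself.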

    There has been a lot of research done on totally distributed database layers, which you'll find interesting. They involve atomic clocks for synchronization; have a look at Spanner, for example: