Solidity compiler's innards

Could anyone give a brief description of how the solc compiler works? I can't find much documentation on the subject... describes the language more than the compiler.

As far as parsers/compilers are concerned, I'm mostly familiar with how ANTLR processes DSLs. I was curious to know how similar (or not) this compiler is. Does it use C++ libraries (for tokenizing, for instance), or is it an entirely custom effort? And if so, what was the motivation? Why not use something like ANTLR, or a C++ equivalent?

If someone could point me to the most relevant files in the solidity repo, that would be awesome!



    So you are asking specifically about the parsing part of the compiler, and not the type-checking and code-generation part?

    It is a simple recursive descent parser which originated from V8's JavaScript parser. The reason not to use a parser generator was that a hand-written parser is much easier to modify, and the parsing process is much easier to debug. I think parser generators are great, but they do not work well with an evolving grammar. Also, JavaScript's / Solidity's grammar is really well suited for recursive descent, as there are almost no ambiguities or left recursions.
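
To illustrate what "recursive descent" means here, the following is a minimal sketch, not code from the Solidity repository. It assumes a hypothetical toy arithmetic grammar and a made-up `ToyParser` class, with one member function per grammar rule. A rule that would be left-recursive on paper (`expr := expr '+' term`) is written as a loop instead, which is why a grammar with few left recursions maps so directly onto this style. For brevity the sketch evaluates the expression directly; a real compiler front end would build AST nodes instead.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical toy grammar (NOT Solidity's grammar):
//   expr   := term   (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
class ToyParser {
public:
    explicit ToyParser(std::string source) : m_source(std::move(source)) {}

    // Parses the whole input and returns the evaluated value.
    long parse() {
        long value = parseExpr();
        skipSpaces();
        if (m_pos != m_source.size())
            throw std::runtime_error("unexpected trailing input");
        return value;
    }

private:
    // One function per grammar rule; the loop replaces left recursion
    // and makes '+' and '-' left-associative.
    long parseExpr() {
        long value = parseTerm();
        for (;;) {
            if (accept('+')) value += parseTerm();
            else if (accept('-')) value -= parseTerm();
            else return value;
        }
    }

    long parseTerm() {
        long value = parseFactor();
        for (;;) {
            if (accept('*')) {
                value *= parseFactor();
            } else if (accept('/')) {
                long divisor = parseFactor();
                if (divisor == 0) throw std::runtime_error("division by zero");
                value /= divisor;
            } else {
                return value;
            }
        }
    }

    long parseFactor() {
        if (accept('(')) {
            long value = parseExpr();  // recursion handles nesting
            if (!accept(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        skipSpaces();
        if (m_pos >= m_source.size() || !isDigitAt(m_pos))
            throw std::runtime_error("expected number");
        long value = 0;
        while (m_pos < m_source.size() && isDigitAt(m_pos))
            value = value * 10 + (m_source[m_pos++] - '0');
        return value;
    }

    // Consumes `c` if it is the next non-space character.
    bool accept(char c) {
        skipSpaces();
        assert(m_pos <= m_source.size());
        if (m_pos < m_source.size() && m_source[m_pos] == c) {
            ++m_pos;
            return true;
        }
        return false;
    }

    void skipSpaces() {
        while (m_pos < m_source.size() &&
               std::isspace(static_cast<unsigned char>(m_source[m_pos])))
            ++m_pos;
    }

    bool isDigitAt(std::size_t i) const {
        return std::isdigit(static_cast<unsigned char>(m_source[i])) != 0;
    }

    std::string m_source;
    std::size_t m_pos = 0;
};
```

Calling `ToyParser("(1 + 2) * 3").parse()` walks `parseExpr`, `parseTerm` and `parseFactor` in the same order the grammar rules nest. Because each rule is an ordinary function, changing the grammar means editing one function, and a debugger breakpoint shows exactly which rule is active.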

    Thanks, this is very helpful! Is there any place in the code you recommend I look at for the parser?

    I guess the parsing is a good place for me to start (I want to get a high-level understanding of the compiler in general).

    I agree with your statement about evolving grammar, so this makes sense at this point.

    I don't really see that big a resemblance between Solidity and JS though, starting with the type system... If anything, it feels more like C++ meets Java, but maybe it's just me!

    The Scanner is in Scanner.h / Scanner.cpp, the Parser is in Parser.h / Parser.cpp, and the output data structure of the parser is in AST.h / AST.cpp.
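
As a rough picture of how these pieces divide the work: a scanner turns characters into tokens, the parser consumes those tokens, and the AST is what the parser produces. The sketch below shows only the first stage. It is an illustration under assumptions, not the contents of Scanner.h: `TokenKind`, `Token` and `scanAll` are hypothetical names, and the token categories are far coarser than what a real compiler would use.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for illustration only.
enum class TokenKind { Identifier, Number, Punctuation };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `source` into tokens, skipping whitespace.
// Identifiers: [A-Za-z_][A-Za-z0-9_]*, numbers: [0-9]+,
// anything else becomes a one-character punctuation token.
std::vector<Token> scanAll(const std::string& source) {
    auto isIdentStart = [](unsigned char c) { return std::isalpha(c) || c == '_'; };
    auto isIdentPart  = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    auto isDigit      = [](unsigned char c) { return std::isdigit(c) != 0; };

    std::vector<Token> tokens;
    std::size_t pos = 0;
    while (pos < source.size()) {
        unsigned char c = static_cast<unsigned char>(source[pos]);
        if (std::isspace(c)) {
            ++pos;
            continue;
        }
        std::size_t start = pos;
        TokenKind kind;
        if (isIdentStart(c)) {
            kind = TokenKind::Identifier;
            while (pos < source.size() && isIdentPart(source[pos])) ++pos;
        } else if (isDigit(c)) {
            kind = TokenKind::Number;
            while (pos < source.size() && isDigit(source[pos])) ++pos;
        } else {
            kind = TokenKind::Punctuation;
            ++pos;
        }
        assert(pos > start);  // every iteration consumes at least one character
        tokens.push_back({kind, source.substr(start, pos - start)});
    }
    return tokens;
}
```

For an input such as `uint x = 42;`, `scanAll` yields the tokens `uint`, `x`, `=`, `42` and `;`. A parser written on top of this would look at token kinds rather than raw characters, which keeps the grammar logic free of whitespace and character-class handling.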

    Awesome, thanks :)
