Creating a Compiler

Nettly_

Hello, I am creating a programming language called Syl, and I have no experience writing a compiler. I want to make a compiler for it so I can learn how to make one for B later on. Syl is my test dummy (for now), and I was wondering what good resources I need to create a compiler. Maybe an interpreter, or something that compiles to C; I am unsure which is best. I do know Bison and Yacc are good, but I don't really know how to use them and I cannot find any good resources myself. I mostly code in C++ right now, so if it is possible to do this in C++, please tell me.

Any help and resources would be appreciated. Also maybe some examples of doing it with existing languages. TYIA.

Building a compiler requires targeting an ISA of some sort, like x86, ARM, etc. And it's not just a matter of knowing assembly and/or machine code, it's a matter of understanding the system that those ISAs run on, including things like how memory access works.

 

I would suggest as a first run to target an 8-bit home computer like the Commodore 64 or the Atari 8-bit line, using an emulator to verify it works. Those are relatively easy to understand.
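To give a feel for how small the code generation step can be on a target like this, here is a minimal sketch (in Python, only for brevity) that turns a sum of 8-bit constants into 6502 assembly text. The function name `emit_sum` and the default result address `$0400` (the start of the default C64 screen memory) are assumptions picked for illustration, and the exact assembler syntax depends on which assembler you feed it to.

```python
def emit_sum(constants, result_addr=0x0400):
    """Return 6502 assembly lines that add 8-bit constants and store the sum.

    Hypothetical helper for illustration; the sum silently wraps at 8 bits.
    """
    if not constants:
        raise ValueError("need at least one constant")
    for value in constants:
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{value} does not fit in 8 bits")
    lines = [f"LDA #${constants[0]:02X}"]      # load the first operand into A
    for value in constants[1:]:
        lines.append("CLC")                    # clear carry so ADC is a plain add
        lines.append(f"ADC #${value:02X}")     # add the next operand to A
    lines.append(f"STA ${result_addr:04X}")    # store A at the result address
    return lines


print("\n".join(emit_sum([2, 3])))
```

A real compiler would of course have to handle variables, 16-bit values and overflow, but the shape (walk the input, append instructions) stays the same.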

2 hours ago, Mira Yurizaki said:

I would suggest as a first run to target an 8-bit home computer like the Commodore 64 or the Atari 8-bit line, using an emulator to verify it works. Those are relatively easy to understand.

Another option I'd like to suggest would be to write a transpiler, i.e. a sort of compiler that converts one programming language into another, e.g. this "Syl" to Python. Writing a transpiler first would be a good exercise in figuring out how to parse files efficiently, how to handle code blocks and all that stuff, so it would, IMHO, be a pretty good way of getting started and getting some practice in.
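As a rough idea of what handling code blocks looks like, here is a minimal line-based sketch. Syl's real syntax isn't shown anywhere in this thread, so the input format here is purely an assumption: blocks open with a trailing `{`, close with a `}` on its own line, and statements end with `;`. The function name `transpile` is made up as well.

```python
def transpile(source: str) -> str:
    """Translate a brace-delimited toy syntax (assumed, not real Syl) to Python."""
    out = []
    depth = 0  # current block nesting level
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            if depth == 0:
                raise SyntaxError("unmatched '}'")
            if out[-1].endswith(":"):
                # Python needs a body, so fill empty blocks with 'pass'
                out.append("    " * depth + "pass")
            depth -= 1
            continue
        if line.endswith("{"):
            # block header: drop the brace, add Python's colon, indent the body
            out.append("    " * depth + line[:-1].rstrip() + ":")
            depth += 1
        else:
            # ordinary statement: drop the trailing semicolon
            out.append("    " * depth + line.rstrip(";"))
    if depth != 0:
        raise SyntaxError("missing '}'")
    return "\n".join(out) + "\n"


print(transpile("x = 3;\nif x > 1 {\n    print(x);\n}\n"))
```

This only rewrites block structure and passes expressions through untouched, which works only as long as the two languages agree on expression syntax. Anything beyond that needs a proper parser.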

Hand, n. A singular instrument worn at the end of the human arm and commonly thrust into somebody’s pocket.

3 minutes ago, WereCatf said:

Another option I'd like to suggest would be to write a transpiler, ie. a sort of a compiler that converts one programming-language into another, like e.g. this "Syl" to Python. Writing a transpiler first would be a good exercise in figuring out how to parse files efficiently, 

I was thinking of doing that. Since Syl shares so much with Python, I thought transpiling into Python would be no problem. I will write that down as an idea.

Some classic resources for compiler programming are Compilers: Principles, Techniques, and Tools (ISBN: 0-201-10088-6) and Principles of Compiler Design (ISBN: 0-201-00022-9). There is also a high-level overview of how compilers work on the OSDev wiki (a magnificent resource). There used to be a sister site to OSDev called CompilerDev, but it seems dead now. You might also like this article on implementing conditionals for a C compiler, which has code for different instruction set architectures.

 

You can also look at the code and documentation for other compilers, like the Tiny C Compiler. GCC and Clang might be too complex for a beginner to read through, but challenging yourself can't hurt. Make sure you understand common algorithms and data structures and, more importantly, how to implement them in the language you are writing the compiler in (i.e., C++). You'll also need to understand the platform you're targeting (x86, x86-64, ARM...). Compiler construction is hard and a massive endeavor to undertake. OSDev considers it something of "medium" difficulty, but that is a community of people dedicated to programming entire operating systems practically alone. It's not impossible though, and you'll be able to create a compiler so long as you understand the theory and can survive the struggle of realising you know a lot less than you thought. Good luck!
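As one example of the kind of algorithm and data structure meant here, this is a minimal sketch of a tokenizer plus a recursive-descent parser that builds a tree for arithmetic expressions. It is written in Python only to keep it short (the same structure carries over to C++), and the grammar, the tuple-based tree and all names (`tokenize`, `Parser`) are assumptions for illustration, not anything taken from Syl.

```python
def tokenize(text):
    """Split text into ('num', int) and ('op', char) tokens."""
    tokens, i = [], 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():
            i += 1
        elif "0" <= ch <= "9":
            j = i
            while j < len(text) and "0" <= text[j] <= "9":
                j += 1
            tokens.append(("num", int(text[i:j])))
            i = j
        elif ch in "+-*/()":
            tokens.append(("op", ch))
            i += 1
        else:
            raise SyntaxError(f"unexpected character {ch!r} at column {i + 1}")
    return tokens


class Parser:
    """One method per grammar rule; each returns a nested-tuple syntax tree."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("eof", None)

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self):
        node = self.expr()
        if self.peek()[0] != "eof":
            raise SyntaxError(f"unexpected token {self.peek()[1]!r}")
        return node

    def expr(self):  # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in (("op", "+"), ("op", "-")):
            op = self.advance()[1]
            node = (op, node, self.term())
        return node

    def term(self):  # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in (("op", "*"), ("op", "/")):
            op = self.advance()[1]
            node = (op, node, self.factor())
        return node

    def factor(self):  # factor := NUMBER | '(' expr ')'
        kind, value = self.advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = self.expr()
            if self.advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")


print(Parser(tokenize("1 + 2 * 3")).parse())
```

Operator precedence falls out of the rule nesting: `term` binds tighter than `expr`, so the multiplication ends up deeper in the tree. Tools like Bison generate this kind of parser from a grammar file instead of having you write it by hand.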

On 8/9/2019 at 5:41 AM, Mira Yurizaki said:

Building a compiler requires targeting an ISA of some sort, like x86, ARM, etc. And it's not just a matter of knowing assembly and/or machine code, it's a matter of understanding the system that those ISAs run on, including things like how memory access works.

 

I would suggest as a first run to target an 8-bit home computer like the Commodore 64 or the Atari 8-bit line, using an emulator to verify it works. Those are relatively easy to understand.

 I'm going to say that this is bad advice! In part, it depends on what you want to do with your language. But also, it makes for an unusable language. Nobody wants to use a 6502 in 2019.

 

With modern compiler building tools, targeting something like LLVM would be good for procedural languages, as it provides optimisation, register allocation, and every backend you could want. You would have to learn SSA form, and how to use phi instructions, but it's actually easier than most assembly languages.
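To show what SSA form and a phi instruction look like, here is a small sketch that just builds LLVM IR as text for a function returning the larger of two integers. The Python wrapper `emit_max_ir` is a made-up helper; the IR itself is written by hand from the LLVM language reference, so it is worth running through an LLVM tool before relying on it.

```python
def emit_max_ir(name="max"):
    """Return LLVM IR text for a function returning the larger of two i32 values."""
    lines = [
        f"define i32 @{name}(i32 %a, i32 %b) {{",
        "entry:",
        "  %cmp = icmp sgt i32 %a, %b",              # signed a > b
        "  br i1 %cmp, label %then, label %else",
        "then:",
        "  br label %merge",
        "else:",
        "  br label %merge",
        "merge:",
        # phi picks a value based on which block control came from
        "  %result = phi i32 [ %a, %then ], [ %b, %else ]",
        "  ret i32 %result",
        "}",
    ]
    return "\n".join(lines) + "\n"


print(emit_max_ir())
```

In SSA every name such as `%result` is assigned exactly once, so a value that differs between two branches has to be merged with `phi` where the branches meet. A real frontend would normally use LLVM's builder API rather than string concatenation.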

 

But if the language is very similar to Python, you may want to build an interpreter on top of Python, rather than a compiler. Then, you can use the Futamura projections to collapse the tower of interpreters into a compiler. If you're willing to dive deep into the tools, the PyPy project provides tools for building a JIT compiler this way, with the RPython toolchain, though RPython is a restricted subset of Python. This is kind of an arcane way of doing it, and won't teach you too much about traditional compilation techniques, but it is also fascinating and fun.
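A toy illustration of the idea behind the first Futamura projection: specializing an interpreter to one fixed program gives you something that behaves like a compiled version of that program. The tuple-based mini language and the names `interpret` and `specialize` are assumptions for this sketch, and the specialization is done by hand here; this is not how the RPython toolchain works internally.

```python
from functools import partial


def interpret(program, env):
    """Tree-walking interpreter: dispatches on the node kind at every call."""
    kind = program[0]
    if kind == "num":
        return program[1]
    if kind == "var":
        return env[program[1]]
    if kind == "add":
        return interpret(program[1], env) + interpret(program[2], env)
    if kind == "mul":
        return interpret(program[1], env) * interpret(program[2], env)
    raise ValueError(f"unknown node kind {kind!r}")


def specialize(program):
    """Hand-written specializer: resolve the dispatch once, return a closure."""
    kind = program[0]
    if kind == "num":
        value = program[1]
        return lambda env: value
    if kind == "var":
        name = program[1]
        return lambda env: env[name]
    if kind in ("add", "mul"):
        left = specialize(program[1])
        right = specialize(program[2])
        if kind == "add":
            return lambda env: left(env) + right(env)
        return lambda env: left(env) * right(env)
    raise ValueError(f"unknown node kind {kind!r}")


program = ("add", ("var", "x"), ("mul", ("num", 2), ("var", "y")))

trivial = partial(interpret, program)   # fixes the program, removes no work
compiled = specialize(program)          # dispatch on node kinds already done

print(trivial({"x": 1, "y": 3}), compiled({"x": 1, "y": 3}))
```

`partial` fixes the program argument but still walks the tree on every call, while `specialize` does the tree walk once up front. A real partial evaluator would derive something like `specialize` from `interpret` automatically.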

On 8/10/2019 at 5:04 AM, WatermelonLesson said:

Some classic resources for compiler programming are the Compilers: Principles, Techniques, and Tools (ISBN: 0-201-10088-6) and Principles of Compiler Design (ISBN: 0-201-00022-9).

+1 on the dragon book(s). Perhaps a little dated, but still good.

(Maybe) the most amazing architecture for learning compiler writing is the Motorola 68k. Get a simulator for that.

Write in C.

Make it do syntax checking as well. Highlight the code where there is a syntax error.
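A minimal sketch of what that could look like, using bracket matching as the only check and a caret line to point at the offending column. The function names `check_brackets` and `format_error` and the message wording are made up for illustration, and the check ignores brackets inside strings or comments.

```python
def check_brackets(source):
    """Return None if brackets balance, else (line_no, col_no, message)."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []  # open brackets seen so far, with their positions
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ch in "([{":
                stack.append((ch, line_no, col_no))
            elif ch in pairs:
                if not stack or stack[-1][0] != pairs[ch]:
                    return (line_no, col_no, f"unexpected '{ch}'")
                stack.pop()
    if stack:
        ch, line_no, col_no = stack[-1]
        return (line_no, col_no, f"'{ch}' was never closed")
    return None


def format_error(source, error):
    """Render the error with the source line and a caret under the column."""
    line_no, col_no, message = error
    line = source.splitlines()[line_no - 1]
    caret = " " * (col_no - 1) + "^"
    return f"line {line_no}, column {col_no}: {message}\n{line}\n{caret}"


source = "print((1 + 2)\n"
error = check_brackets(source)
if error is not None:
    print(format_error(source, error))
```

The important part is that every token carries its line and column from the start, so any later stage (parser, type checker, editor plugin) can report a position.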

Sudo make me a sandwich 

6 hours ago, Fourthdwarf said:

 I'm going to say that this is bad advice! In part, it depends on what you want to do with your language. But also, it makes for an unusable language. Nobody wants to use a 6502 in 2019.

Judging by OP's post, they want to create a compiler to get some experience of how to create a compiler. Considering most people think of a compiler as something that turns high-level programming languages into an executable, this means knowing how systems work in general, rather than just knowing the ISA of a particular CPU. So I would argue it'd be a much better experience to target something that's "obsolete", because it's simple and easier to understand. That builds the fundamentals one can then use to make a compiler for something else.

 

Also, there are still quite a lot of people who are tinkering with 8-bit computers or building applications for them.

4 hours ago, wasab said:

Make it do syntax checking as well. Highlight the code where there is syntax error.

Once I figure out a few last things (like proper colon checking, which broke in my current edition), yeah, I will add VSCode support for basic syntax, then make the real C-based compiler.

 

4 hours ago, Dat Guy said:

(Maybe) the most amazing architecture to learn compiler writing is a Motorola 68k. Get a simulator for that.

May I ask why?

 

10 hours ago, Fourthdwarf said:

 I'm going to say that this is bad advice! In part, it depends on what you want to do with your language. But also, it makes for an unusable language. Nobody wants to use a 6502 in 2019.

 

With modern compiler building tools, targeting something like LLVM would be good for procedural languages, as it provides optimisation, register allocation, and every backend you could want. You would have to learn SSA form, and how to use phi instructions, but it's actually easier than most assembly languages.

 

But if the language is very similar to python, you may want to build an interpreter on top of python, rather than a compiler. Then, you can use the Futamura projections to collapse the tower of interpreters into a compiler. If you're willing to dive deep into the tools, the PyPy project provides tools for building a JIT compiler this way, with the RPython toolchain, though it is a limited form of python. This is kind of an arcane way of doing it, and won't teach you too much about traditional compilation techniques, but it is also fascinating and fun.

+1 on the dragon book(s). Perhaps a little dated, but still good.

The language is very similar to Python, but I do want it to work well with C (speed is key).

nvm, misunderstood the posts. 

On 8/12/2019 at 7:52 PM, SafyreLyons-5LT said:

The language is very similar to python but I do want to have better use in C (speed is key)

So, of the options I mentioned, PyPy (the Python implementation written in RPython) is generally faster than CPython (the reference implementation of Python), but LLVM is the backend used by the Clang C compiler, so it may be the better option.
