Go in Go

Gopherfest

26 May 2015

Rob Pike

Google

Go in Go

As of the 1.5 release of Go, the entire system is now written in Go.
(And a little assembler.)

C is gone.

Side note: gccgo is still going strong.
This talk is about the original compiler, gc.

2

Why was it in C?

Bootstrapping.

(Also Go was not intended primarily as a compiler implementation language.)

3

Why move the compiler to Go?

Not for validation; we have more pragmatic motives:

Already seeing benefits, and it's early yet.

Design document: /s/go13compiler

4

Why move the runtime to Go?

We had our own C compiler just to compile the runtime.
We needed a C compiler that followed Go's ABI and conventions, such as segmented stacks.

Switching it to Go means we can get rid of the C compiler.
That's more important than converting the compiler to Go.

(All the reasons for moving the compiler apply to the runtime as well.)

Now only one language in the runtime; easier integration, stack management, etc.

As always, simplicity is the overriding consideration.

5

History

Why do we have our own tool chain at all?
Our own ABI?
Our own file formats?

History, familiarity, and ease of moving forward. And speed.

Many of Go's big changes would be much harder with GCC or LLVM.

6

Big changes

All made easier by owning the tools and/or moving to Go:

The last three are all but impossible in C:

(Gccgo will have segmented stacks and imprecise (stack) collection for a while yet.)

7

Goroutine stacks

These were each huge steps, made quickly (led by khr@).

8

Converting the runtime

Mostly done by hand with machine assistance.

Challenge to implement the runtime in a safe language.
Some use of unsafe to deal with pointers as raw bits in the GC, for instance.
But less than you might think.

The translator (next sections) helped for some of the translation.

9

Converting the compiler

Why translate it, not write it from scratch? Correctness, testing.

Steps:

10

Translator

First output was C line-by-line translated to (bad!) Go.
Tool to do this written by rsc@ (talked about at GopherCon 2014).
Custom written for this job, not a general C-to-Go translator.

Steps:

The Yacc grammar was translated by sam-powered hands.

11

Translator configuration

Aided by hand-written rewrite rules, such as:

Also diff-like rewrites for things such as using the standard library:

diff {
-    g.Rpo = obj.Calloc(g.Num*sizeof(g.Rpo[0]), 1).([]*Flow)
-    idom = obj.Calloc(g.Num*sizeof(idom[0]), 1).([]int32)
-    if g.Rpo == nil || idom == nil {
-        Fatal("out of memory")
-    }
+    g.Rpo = make([]*Flow, g.Num)
+    idom = make([]int32, g.Num)
}
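
The rewrite above is safe because make returns zeroed memory and does not return nil on allocation failure, so the explicit out-of-memory check has nothing left to test. A minimal sketch, assuming a stand-in Flow type and a hypothetical helper newTables:

```go
package main

import "fmt"

// Flow stands in for the compiler's flow-graph node; its fields are omitted.
type Flow struct{}

// newTables allocates the two tables from the rewrite rule.
// make zero-fills both slices: nil pointers and zero int32 values.
// (Hypothetical helper; the real code assigns to g.Rpo and idom directly.)
func newTables(n int) (rpo []*Flow, idom []int32) {
	rpo = make([]*Flow, n)
	idom = make([]int32, n)
	return rpo, idom
}

func main() {
	rpo, idom := newTables(4)
	fmt.Println(len(rpo), rpo[0] == nil, len(idom), idom[0]) // prints "4 true 4 0"
}
```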

12

Another example

This one due to semantic difference between the languages.

diff {
-    if nreg == 64 {
-        mask = ^0 // can't rely on C to shift by 64
-    } else {
-        mask = (1 << uint(nreg)) - 1
-    }
+    mask = (1 << uint(nreg)) - 1
}
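
The special case can go because Go defines shifts by counts at or beyond the operand width: for an unsigned value the result is 0, and subtracting 1 then wraps to all ones. A minimal sketch, assuming the mask is a uint64 (the slide does not show its type) and using a hypothetical function name regMask:

```go
package main

import "fmt"

// regMask returns a mask with the low nreg bits set.
// For nreg == 64 the shift yields 0 and 0-1 wraps to all ones,
// so no special case is needed. (Assumes a uint64 mask.)
func regMask(nreg int) uint64 {
	return (uint64(1) << uint(nreg)) - 1
}

func main() {
	fmt.Printf("%#x\n", regMask(3))  // prints "0x7"
	fmt.Printf("%#x\n", regMask(64)) // prints "0xffffffffffffffff"
}
```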

13

Grind

Once in Go, new tool grind deployed (by rsc@):

Changes guided by profiling and other analysis:

14

Performance problems

Output from translator was poor Go, and ran about 10X slower.
Most of that slowdown has been recovered.

Problems with C to Go:

C compiler didn't free much memory, but Go has a GC.
Adds CPU and memory overhead.

15

Performance fixes

Profile! (Never done before!)

Use tools like grind, gofmt -r and eg for much of this.

Removing interface argument from a debugging print library got 15% overall!
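
The slide does not show the actual change, so the following is only a sketch of the general idea. The helper names debugAny and debugInt are hypothetical. Passing a value through an interface parameter can force it to be boxed and sent through reflection-based formatting, while a concrete parameter avoids both.

```go
package main

import (
	"fmt"
	"strconv"
)

// debugAny appends a formatted value to buf. The interface{} parameter
// means callers may allocate to box v, and fmt inspects it at run time.
// (Hypothetical helper, not the compiler's print library.)
func debugAny(buf []byte, v interface{}) []byte {
	return append(buf, fmt.Sprint(v)...)
}

// debugInt does the same job for ints with a concrete parameter type,
// so no interface conversion is needed at the call site.
func debugInt(buf []byte, v int) []byte {
	return strconv.AppendInt(buf, int64(v), 10)
}

func main() {
	fmt.Println(string(debugAny(nil, 42)), string(debugInt(nil, 42)))
}
```

Both helpers produce the same text; any speed difference depends on the call pattern and should be confirmed by profiling.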

More remains to be done.

16

Technical benefits

Other benefits of the conversion:

Garbage collection means no more worry about introducing a dangling pointer.

Chance to clean up the back ends.

Unified 386 and amd64 architectures throughout the tool chain.

New architectures are easier to add.

Unified the tools: now one compiler, one assembler, one linker.

17

Compiler

GOOS=YYY GOARCH=XXX go tool compile

One compiler; no more 6g, 8g etc.

About 50K lines of portable code.
Even the registerizer is portable now; architectures well characterized.
Non-portable: peepholing and details such as registers bound to instructions.
Typically around 10% of the size of the portable code.

18

Assembler

GOOS=YYY GOARCH=XXX go tool asm

New assembler, all in Go, written from scratch by r@.
Clean, idiomatic Go code.

Fewer than 4000 lines, <10% machine-dependent.

Almost completely compatible with previous yacc and C assemblers.

How is this possible?

19

Linker

GOOS=YYY GOARCH=XXX go tool link

Mostly hand- and machine-translated from C code.

New library, internal/obj, formerly part of the original linker, captures details about machines and writes object files.

27000 lines summed across 4 architectures, mostly tables (plus some ugliness).

Example benefit: one print routine to print any instruction for any architecture.
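
One way to picture a single print routine is a shared instruction type plus a per-architecture opcode-name table. The sketch below is a hypothetical illustration of that table-driven shape. The types Arch and Prog and the function Format are invented here and are not the internal/obj API.

```go
package main

import "fmt"

// Arch holds the machine-specific data: here, only opcode names.
// (Hypothetical type for illustration.)
type Arch struct {
	Name    string
	Opnames map[int]string
}

// Prog is an architecture-independent instruction record.
// (Hypothetical type for illustration.)
type Prog struct {
	As       int    // opcode number, looked up in Arch.Opnames
	From, To string // operands, already rendered as text
}

// Format prints any instruction for any architecture: the routine is
// shared, and only the table lookup differs per machine.
func Format(a *Arch, p Prog) string {
	name, ok := a.Opnames[p.As]
	if !ok {
		name = fmt.Sprintf("OP(%d)", p.As)
	}
	return fmt.Sprintf("%s %s, %s", name, p.From, p.To)
}

func main() {
	amd64 := &Arch{Name: "amd64", Opnames: map[int]string{1: "MOVQ"}}
	fmt.Println(Format(amd64, Prog{As: 1, From: "AX", To: "BX"})) // prints "MOVQ AX, BX"
}
```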

20

Bootstrap

With no C compiler, bootstrapping requires a Go compiler.

Therefore need to build or download a working Go installation to build 1.5 from source.

We use Go 1.4+ as the base to build the 1.5+ tool chain. (Newer is OK too.)

Details: /s/go15bootstrap

21

Future

Much work still to do, but 1.5 is mostly set.

Future work:

Better escape analysis.
New compiler back end using SSA (much easier in Go than C).
Will allow much more optimization.

Generate machine descriptions from PDFs (or maybe XML).
Will have a purely machine-generated instruction definition:
"Read in PDF, write out an assembler configuration".
Already deployed for the disassemblers.

22

Conclusions

Getting rid of C was a huge advance for the project.
Code is cleaner, testable, profilable, easier to work on.

New unified tool chain reduces code size, increases maintainability.

Flexible tool chain, portability still paramount.

23

Thank you

Rob Pike

Google
