Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: bad pointer due to map iteration #9384

Closed
rsc opened this issue Dec 18, 2014 · 16 comments
Closed

runtime: bad pointer due to map iteration #9384

rsc opened this issue Dec 18, 2014 · 16 comments

Comments

@rsc
Copy link
Contributor

rsc commented Dec 18, 2014

An internal Google program is executing code like the below and getting occasional runtime crashes during garbage collection:

func computeIncomingDependencies() (map[string]map[string]struct{}, error) {
    gopath := os.Getenv("GOPATH")
    if gopath == "" {
        return nil, fmt.Errorf("GOPATH is not set")
    }
    dirs := strings.Split(gopath, ":")
    allDirs := map[string]struct{}{}
    for _, dir := range dirs {
        if err := collectDirs(filepath.Join(dir, "src"), "", allDirs); err != nil {
            return nil, err
        }
    }
    allDeps := map[string]map[string]struct{}{}
    for dir, _ := range allDirs {
        allDeps[dir] = map[string]struct{}{} <<< GC during makemap on this line <<<
    }
    for dir, _ := range allDirs {
        mode := build.ImportMode(0)
        pkg, err := build.Import(dir, "", mode)
        if err != nil {
            fmt.Errorf("Import(%v, %v) failed: %v", dir, mode, err)
        }
        imports := pkg.Imports
        if includeTestsFlag {
            imports = append(imports, pkg.TestImports...)
        }
        for _, dep := range imports {
            if deps, ok := allDeps[dep]; ok {
                deps[dir] = struct{}{}
            }
        }
    }
    return allDeps, nil
}

A garbage collection happens on the marked line. During the scan of the stack frame corresponding to this function, the garbage collector finds an invalid heap pointer and crashes the program. The invalid heap pointer is at 0x1f8(SP). The map iterator for the loop being executed start at 0x1f0(SP), making this the second word in the iterator, it.value.

The error I am looking at says:

runtime: garbage collector found invalid heap pointer *(0xc20805ec20+0x198)=0xc2080f7000 span=0xc2080ee000-0xc2080f7000-0xc2080f8000 state=0
fatal error: invalid heap pointer

The actual stack frame is sp=0xc20805ebc0, giving the extra 0x60+0x198 = 0x1f8.

This is the generated code for the creation of allDeps and then that loop:

:118    0x45ebc1    e81a7bfaff          CALL runtime.makemap(SB)
:118    0x45ebc6    488b5c2410          MOVQ 0x10(SP), BX
:118    0x45ebcb    48899c2480000000        MOVQ BX, 0x80(SP)
:119    0x45ebd3    488b4c2478          MOVQ 0x78(SP), CX
:119    0x45ebd8    488dbc24f0010000        LEAQ 0x1f0(SP), DI
:119    0x45ebe0    31c0                XORL AX, AX
:119    0x45ebe2    e865acfdff          CALL 0x43984c
:119    0x45ebe7    488d1dd2fc1000          LEAQ 0x10fcd2(IP), BX
:119    0x45ebee    48891c24            MOVQ BX, 0(SP)
:119    0x45ebf2    48894c2408          MOVQ CX, 0x8(SP)
:119    0x45ebf7    488d9c24f0010000        LEAQ 0x1f0(SP), BX
:119    0x45ebff    48895c2410          MOVQ BX, 0x10(SP)
:119    0x45ec04    e8c790faff          CALL runtime.mapiterinit(SB)
:119    0x45ec09    488b9c24f0010000        MOVQ 0x1f0(SP), BX
:119    0x45ec11    31ed                XORL BP, BP
:119    0x45ec13    4839eb              CMPQ BP, BX
:119    0x45ec16    0f84b4000000            JE 0x45ecd0
:119    0x45ec1c    488b9c24f0010000        MOVQ 0x1f0(SP), BX
:119    0x45ec24    4883fb00            CMPQ $0x0, BX
:119    0x45ec28    0f8431060000            JE 0x45f25f
:119    0x45ec2e    488b0b              MOVQ 0(BX), CX
:119    0x45ec31    488b6b08            MOVQ 0x8(BX), BP
:120    0x45ec35    48898c24b8000000        MOVQ CX, 0xb8(SP)
:120    0x45ec3d    48898c2408010000        MOVQ CX, 0x108(SP)
:120    0x45ec45    4889ac24c0000000        MOVQ BP, 0xc0(SP)
:120    0x45ec4d    4889ac2410010000        MOVQ BP, 0x110(SP)
:120    0x45ec55    488d1d64fc1000          LEAQ 0x10fc64(IP), BX
:120    0x45ec5c    48891c24            MOVQ BX, 0(SP)
:120    0x45ec60    48c744240800000000      MOVQ $0x0, 0x8(SP)
:120    0x45ec69    e8727afaff          CALL runtime.makemap(SB) <<< GC here <<<
:120    0x45ec6e    488b5c2410          MOVQ 0x10(SP), BX
:120    0x45ec73    48895c2470          MOVQ BX, 0x70(SP)
:120    0x45ec78    488d1dc1fa1000          LEAQ 0x10fac1(IP), BX
:120    0x45ec7f    48891c24            MOVQ BX, 0(SP)
:120    0x45ec83    488b9c2480000000        MOVQ 0x80(SP), BX
:120    0x45ec8b    48895c2408          MOVQ BX, 0x8(SP)
:120    0x45ec90    488d9c2408010000        LEAQ 0x108(SP), BX
:120    0x45ec98    48895c2410          MOVQ BX, 0x10(SP)
:120    0x45ec9d    488d5c2470          LEAQ 0x70(SP), BX
:120    0x45eca2    48895c2418          MOVQ BX, 0x18(SP)
:120    0x45eca7    e88486faff          CALL runtime.mapassign1(SB)
:119    0x45ecac    488d9c24f0010000        LEAQ 0x1f0(SP), BX
:119    0x45ecb4    48891c24            MOVQ BX, 0(SP)
:119    0x45ecb8    e87392faff          CALL runtime.mapiternext(SB)
:119    0x45ecbd    488b9c24f0010000        MOVQ 0x1f0(SP), BX
:119    0x45ecc5    31ed                XORL BP, BP
:119    0x45ecc7    4839eb              CMPQ BP, BX
:119    0x45ecca    0f854cffffff            JNE 0x45ec1c

The bad pointer is 0xc2080f7000 and the span summary is 0xc2080ee000-0xc2080f7000-0xc2080f8000, meaning that s.limit == 0xc2080f7000. The conclusion seems to be that mapiternext (or mapiterinit, which calls mapiternext) can leave it.value pointing at the end of the underlying map data array.

I don't see how that can happen by reading mapiternext, but it seems to be possible somehow. Should probably fix for Go 1.4.1.

@rsc rsc added this to the Go1.4.1 milestone Dec 18, 2014
@rsc
Copy link
Contributor Author

rsc commented Dec 18, 2014

Ah, I just noticed that the map values are empty structs. Easy to believe a pointer to one might actually be pointing beyond the end of the underlying array.

@randall77
Copy link
Contributor

Yes, that looks like the problem. A pointer to &bucket.value[5] might be at the end of the bucket if the values are 0-sized.

This seems to be a generic problem with taking the address of zero-sized things. I can manufacture a pointer to the end of an object like this:

package main

type T struct {
a, b, c, d int
z [8]struct{}
}

var t *T

func main() {
t = new(T)

println(&t.a)
println(&t.b)
println(&t.c)
println(&t.d)
println(&t.z[6])

}

prints
0x2080e8000
0x2080e8008
0x2080e8010
0x2080e8018
0x2080e8020

That last pointer points to the next object.

We could solve this one in the compiler, replacing &(zero-sized thing) with a pointer to the zero object. But it's an ugly pile of special-case code in the hashmap implementation :(

@randall77
Copy link
Contributor

A) Fix the compiler to convert &x where x is of a zero-sized type to a pointer to the zero object. Make sure to preserve the side-effects of x, if any.
B) Check every place we use add(unsafe.Pointer, ...) in the runtime. Any of those could manufacture an end-of-object pointer. Also in reflect.
C) Tell users not to do #B?

@randall77
Copy link
Contributor

Another option is to add a FlagZeroTail or some such to each type that ends in a zero-sized object. In malloc.go, do "if flags & FlagZeroTail { size++ }"

@RLH
Copy link
Contributor

RLH commented Dec 19, 2014

This basically is what Austin and I were just discussing. An extra word of
allocation is a small price for this rare corner case.

On Thursday, December 18, 2014, Keith Randall notifications@github.com
wrote:

Another option is to add a FlagZeroTail or some such to each type that
ends in a zero-sized object. In malloc.go, do "if flags & FlagZeroTail {
size++ }"


Reply to this email directly or view it on GitHub
#9384 (comment).

@minux
Copy link
Member

minux commented Dec 19, 2014

It seems that solution will make struct{} type less useful as it
will now require at least 1 byte of memory.

@rsc
Copy link
Contributor Author

rsc commented Dec 19, 2014

Let's start by fixing the map code.

@randall77
Copy link
Contributor

If we do the size++ trick we don't need to fix the map code.

struct{} will sometimes use a byte of memory, but often won't. For instance, make(chan struct{}, 100) will only need sizeof(chan header) + 1 bytes, not sizeof(chan header) + 100 bytes. Same thing for maps, the padding only needs to go at the end of the bucket array, not between every bucket.

@RLH
Copy link
Contributor

RLH commented Dec 19, 2014

What is the use case for a struct{} type? Are they common or critical to
some important idiom?

On Thu, Dec 18, 2014 at 8:10 PM, Minux Ma notifications@github.com wrote:

It seems that solution will make struct{} type less useful as it
will now require at least 1 byte of memory.


Reply to this email directly or view it on GitHub
#9384 (comment).

@sjamesr
Copy link

sjamesr commented Dec 19, 2014

We use empty structs as values in a map in order to implement a set type, e.g. map[string]struct{}.

@titanous
Copy link
Member

It's also common to use a buffered chan struct{} as a token bucket, and an unbuffered channel of the same type as a signal.

@minux
Copy link
Member

minux commented Dec 19, 2014

Also map[keyType]struct{} is a popular way to implement a set of keyTypes
without wasting storage.

If we just add one byte to each channel/map with zero-sized types, then I'm
all good.

@rsc
Copy link
Contributor Author

rsc commented Dec 19, 2014

The issue here is that map code is broken for value = struct{}. In general things do work fine with struct{}. For example new(struct{}) is fine. Map code is the exception here, maps with struct{} values are very common, and we need a simple, targeted fix for this specific problem for Go 1.4.1.

It is true that if you have a non-zero-size struct type that has a final zero-sized field, taking the address of that field can give you a pointer beyond the allocation. That's really a separate issue and is a much less common case. Please discuss this more general problem on #9401 and leave this issue for the map problem.

Thanks.

@josharian
Copy link
Contributor

@randall77 as a first pass, could we just do the size++ trick directly in the map code allocations?

(I went to go try this out myself, but I have been unable to write a test that reproduces the original bug. Not sure what I'm missing.)

@rsc
Copy link
Contributor Author

rsc commented Dec 19, 2014

If the value size is zero, I believe we can leave it.value == nil and the bug is fixed. There's no need to change the allocation behavior for this fix (and doing so is risky enough that I wouldn't want it in 1.4.1).

@randall77
Copy link
Contributor

Proposed fix at https://go-review.googlesource.com/#/c/1869/

rsc pushed a commit that referenced this issue Jan 14, 2015
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes #9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
jxson pushed a commit to vanadium-archive/go.devtools that referenced this issue Mar 5, 2015
Due to a bug in Go1.4 (see golang/go#9384),
using an empty struct as the value in a map can cause a crash. This CL
works around the issue by using the boolean "true" as the map value.

The workaround will not be required after Go1.4.1.

Change-Id: Id8d51f4b78226780d1513d1978dfdd5071bcaab5
@golang golang locked and limited conversation to collaborators Jun 25, 2016
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 25, 2018
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes golang#9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jun 26, 2018
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes golang#9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 9, 2018
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes golang#9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 20, 2018
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes golang#9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
wheatman pushed a commit to wheatman/go-akaros that referenced this issue Jul 30, 2018
… of bucket

Pointers to zero-sized values may end up pointing to the next
object in memory, and possibly off the end of a span.  This
can cause memory leaks and/or confuse the garbage collector.

By putting the overflow pointer at the end of the bucket, we
make sure that pointers to any zero-sized keys or values don't
accidentally point to the next object in memory.

fixes golang#9384

Change-Id: I5d434df176984cb0210b4d0195dd106d6eb28f73
Reviewed-on: https://go-review.googlesource.com/1869
Reviewed-by: Russ Cox <rsc@golang.org>
(cherry picked from commit fbc56cf)
Reviewed-on: https://go-review.googlesource.com/2801
Reviewed-by: Andrew Gerrand <adg@golang.org>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

8 participants