docker stats live container resource metrics #9984

Merged
merged 9 commits into from Jan 21, 2015

crosbymichael
Contributor

This PR lets you receive live metrics for your containers. Use the docker stats <containers...> CLI command to get a live, top-like interface. It displays only a few of the available metrics.

docker stats insurgency1 insurgency2 insurgency3 minecraft-family redis

CONTAINER           CPU %               MEM USAGE/LIMIT     MEM %               NET I/O
insurgency1         3.62%               244.4 MB/2.099 GB   11.64%              0 B/0 B
insurgency2         4.65%               135.6 MB/2.099 GB   6.46%               0 B/0 B
insurgency3         3.65%               79.18 MB/2.099 GB   3.77%               0 B/0 B
minecraft-family    14.13%              408.6 MB/2.099 GB   19.47%              0 B/0 B
redis               0.17%               6.558 MB/67.11 MB   9.77%               648 B/648 B

Anyone who needs more detail can subscribe to the container's stats stream, which carries additional data such as blkio information.

GET /v1/containers/redis/stats

{
   "read" : "2015-01-08T22:57:31.547920715Z",
   "network" : {
      "rx_dropped" : 0,
      "rx_bytes" : 648,
      "rx_errors" : 0,
      "tx_packets" : 8,
      "tx_dropped" : 0,
      "rx_packets" : 8,
      "tx_errors" : 0,
      "tx_bytes" : 648
   },
   "memory_stats" : {
      "stats" : {
         "total_pgmajfault" : 0,
         "cache" : 0,
         "mapped_file" : 0,
         "total_inactive_file" : 0,
         "pgpgout" : 414,
         "rss" : 6537216,
         "total_mapped_file" : 0,
         "writeback" : 0,
         "unevictable" : 0,
         "pgpgin" : 477,
         "total_unevictable" : 0,
         "pgmajfault" : 0,
         "total_rss" : 6537216,
         "total_rss_huge" : 6291456,
         "total_writeback" : 0,
         "total_inactive_anon" : 0,
         "rss_huge" : 6291456,
         "hierarchical_memory_limit" : 67108864,
         "total_pgfault" : 964,
         "total_active_file" : 0,
         "active_anon" : 6537216,
         "total_active_anon" : 6537216,
         "total_pgpgout" : 414,
         "total_cache" : 0,
         "inactive_anon" : 0,
         "active_file" : 0,
         "pgfault" : 964,
         "inactive_file" : 0,
         "total_pgpgin" : 477
      },
      "max_usage" : 6651904,
      "usage" : 6537216,
      "failcnt" : 0,
      "limit" : 67108864
   },
   "blkio_stats" : {},
   "cpu_stats" : {
      "cpu_usage" : {
         "percpu_usage" : [
            16970827,
            1839451,
            7107380,
            10571290
         ],
         "usage_in_usermode" : 10000000,
         "total_usage" : 36488948,
         "usage_in_kernelmode" : 20000000
      },
      "system_cpu_usage" : 20091722000000000,
      "throttling_data" : {}
   }
}

@@ -97,6 +97,7 @@ func init() {
{"save", "Save an image to a tar archive"},
{"search", "Search for an image on the Docker Hub"},
{"start", "Start a stopped container"},
{"stats", "Receive container stasts"},
Contributor

typo here

@bobrik
Contributor

bobrik commented Jan 8, 2015

What about showing stats for all running containers if <containers...> is not specified? You can run docker stats $(docker ps -q), but showing stats for everything that is running looks like a sane default.

@vieux
Contributor

vieux commented Jan 8, 2015

Can you go up the exact number of lines and rewrite on top of them, instead of clearing the screen?

(that's how we do it for pull)

@crosbymichael
Contributor Author

@bobrik I guess it would depend on how many containers you have. You can always do

docker stats `docker ps -q`

@vieux
Contributor

vieux commented Jan 8, 2015

exited containers aren't handled properly

@crosbymichael
Contributor Author

@vieux I think exited containers should get zeroed out; then, when they start again, you start getting metrics. What do you think of that functionality?

func (s *statsCollector) start() {
	go func() {
		for _ = range time.Tick(s.interval) {
			log.Debugf("starting collection of container stats")
Contributor

this debug is printed way too often, I feel like the -D is useless with it.

cpuPercent = calcuateCpuPercent(previousCpu, previousSystem, v)
}
start = false
d := data[name]
Contributor

I'm not familiar with map concurrency guarantees, but is that a safe thing to do without locking?

Contributor Author

no ;)

ThrottledTime uint64 `json:"throttled_time,omitempty"`
}

// All CPU stats are aggregate since container inception.
Contributor

s/aggregate/aggregated

@vieux
Contributor

vieux commented Jan 8, 2015

@crosbymichael I mean the cli is hanging and waiting for

@@ -49,6 +49,9 @@ func (daemon *Daemon) ContainerRm(job *engine.Job) engine.Status {
}

if container != nil {
// stop collection of stats for the container reguardless
Contributor

s/reguardless/regardless/

@vieux
Contributor

vieux commented Jan 8, 2015

I got a panic when trying to delete a container:

DEBU[0010] Calling DELETE /containers/{name:.*}
INFO[0010] DELETE /v1.16/containers/b0808d936b19?force=1
INFO[0010] +job rm(b0808d936b19)
INFO[0010] -job rm(b0808d936b19)
2015/01/08 23:28:30 http: panic serving @: runtime error: invalid memory address or nil pointer dereference
goroutine 42 [running]:
net/http.func·011()
    /usr/local/go/src/net/http/server.go:1130 +0xbb
github.com/docker/docker/daemon.(*statsCollector).stopCollection(0xc20822b780, 0xc208035ba0)
    /go/src/github.com/docker/docker/daemon/stats_collector.go:66 +0x141
github.com/docker/docker/daemon.(*Daemon).ContainerRm(0xc2080dbad0, 0xc208031100, 0x2)
    /go/src/github.com/docker/docker/daemon/delete.go:54 +0x7a2
github.com/docker/docker/daemon.*Daemon.ContainerRm·fm(0xc208031100, 0x7ffbfc292270)
    /go/src/github.com/docker/docker/daemon/daemon.go:118 +0x31
github.com/docker/docker/engine.(*Job).Run(0xc208031100, 0x0, 0x0)
    /go/src/github.com/docker/docker/engine/job.go:83 +0x936
github.com/docker/docker/api/server.deleteContainers(0xc2080db4a0, 0xc2081e9f89, 0x4, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10, 0xc208144570, 0x0, 0x0)
    /go/src/github.com/docker/docker/api/server/server.go:768 +0x39e
github.com/docker/docker/api/server.func·002(0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /go/src/github.com/docker/docker/api/server/server.go:1243 +0x940
net/http.HandlerFunc.ServeHTTP(0xc2080563c0, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /usr/local/go/src/net/http/server.go:1265 +0x41
github.com/gorilla/mux.(*Router).ServeHTTP(0xc208095220, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /go/src/github.com/docker/docker/vendor/src/github.com/gorilla/mux/mux.go:98 +0x2b9
net/http.serverHandler.ServeHTTP(0xc208054fc0, 0x7ffbfc29a990, 0xc2082030e0, 0xc208033e10)
    /usr/local/go/src/net/http/server.go:1703 +0x19a
net/http.(*conn).serve(0xc208202e60)
    /usr/local/go/src/net/http/server.go:1204 +0xb57
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:1751 +0x35e

@crosbymichael
Contributor Author

Fixed the panic

@crosbymichael
Contributor Author

@vieux fixed the issue where you request stats for a non-running container: it now goes ahead and adds it, and once the container is started you see the stats start flowing in.

@bfirsh bfirsh added the UX label Jan 9, 2015

func calcuateCpuPercent(previousCpu, previousSystem uint64, v *stats.Stats) float64 {
	cpuPercent := 0.0
	cpuDelta := float64(v.CpuStats.CpuUsage.TotalUsage) - float64(previousCpu)
Member

Check previousCpu < TotalUsage, otherwise negative numbers appear on container restart.
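
A sketch of the guard suggested above, written as a standalone function rather than as a patch to calcuateCpuPercent. It assumes the percentage is the container's CPU delta divided by the system CPU delta between two samples, scaled by the number of CPUs; that formula, the cpuSample type and the cpuPercent name are assumptions for illustration, since the diff only shows the first lines of the real function.

```go
package main

import "fmt"

// cpuSample holds the counters used below. Field meanings follow the
// total_usage and system_cpu_usage keys of the stats JSON; the type itself is hypothetical.
type cpuSample struct {
	TotalUsage  uint64 // cumulative container CPU time
	SystemUsage uint64 // cumulative host CPU time
	NumCPUs     int    // e.g. the length of percpu_usage
}

// cpuPercent returns the CPU usage between the previous and current sample.
// It returns 0 when a counter did not advance or went backwards, which can
// happen when the container restarts and its counters reset.
func cpuPercent(prevCPU, prevSystem uint64, cur cpuSample) float64 {
	if cur.TotalUsage < prevCPU || cur.SystemUsage <= prevSystem {
		return 0
	}
	cpuDelta := float64(cur.TotalUsage - prevCPU)
	systemDelta := float64(cur.SystemUsage - prevSystem)
	// Scaling by the CPU count is an assumption about how the value is normalised.
	return cpuDelta / systemDelta * float64(cur.NumCPUs) * 100.0
}

func main() {
	// Illustrative counters only, not measurements.
	running := cpuSample{TotalUsage: 6000, SystemUsage: 20000, NumCPUs: 1}
	fmt.Printf("running:   %.1f%%\n", cpuPercent(1000, 10000, running))

	// After a restart the container counter is lower than the previous reading.
	restarted := cpuSample{TotalUsage: 100, SystemUsage: 30000, NumCPUs: 1}
	fmt.Printf("restarted: %.1f%%\n", cpuPercent(6000, 20000, restarted))
}
```

Comparing the unsigned counters before subtracting also avoids uint64 wraparound, which would otherwise produce a huge positive value instead of a negative one.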

@tobegit3hub

@crosbymichael It's like what @LK4D4 said. We may want to get the one-time status of containers.

@ekuric
Contributor

ekuric commented Jan 29, 2015

I noticed that if I run
$ docker stats $(docker ps -q)
and then start a new container in a second console, the view is not automatically refreshed to include containers started after the command was launched.
Tested with the binaries attached earlier by @jfrazelle.

@cpuguy83
Member

@ekuric This is expected: $(docker ps -q) is only run once, so only the IDs it returned at that moment are passed to docker stats.

@ekuric
Contributor

ekuric commented Jan 29, 2015

@cpuguy83 ah, yes, thank you, sorry for the noise

@gbelur

gbelur commented Feb 11, 2015

Thanks! This is a great addition. Are there plans to let users control the frequency with which they can extract stats data from the api endpoint? 1 second may be too aggressive when there are a lot of containers running on a host.

@ghost

ghost commented Feb 11, 2015

The CPU usage % seems to be stuck at 0% for me. Being able to monitor the growing size of a running container and its disk I/O would also be very useful.

@cicoub13

Thanks for this new feature. Works like a charm

@crosbymichael
Contributor Author

@usertaken if you want disk I/O for monitoring, hit the API. The CLI is very dumbed down for a quick and simple view; if you are using this for monitoring, use the API.

@tehmaspc
Contributor

very nice stuff!

@alicek106

I'm also waiting for an option like --count, or something that just captures the stats once. It would be very useful when getting remote parameters.

@lars2893

+1 to the count option. I'd love to create a New Relic plugin for this data, but there is a lot of extra work I have to do if I don't have control over the polling interval (plus, consuming a continuous stream is a much more involved process).

@tobegit3hub

For me, I would like to get the current stats of the containers without having the whole stream returned. It would be great to support this feature 😃

@alicek106

I agree :)

@SamSaffron

+1 for a simple way of just getting current stats as opposed to subscribing and disconnecting

@SamSaffron

@SvenDowideit API docs need a bit more info on what the cpu numbers mean and how you go about converting them to % cpu

@bobrik
Contributor

bobrik commented Feb 14, 2015

+1 for just getting current numbers without subscription. @crosbymichael I can create a separate issue.

@kenzodeluxe

+1 :)

@noisy

noisy commented Feb 16, 2015

👍

@sprin

sprin commented Feb 25, 2015

Stats are nice. Thanks.

Here's a one-liner that does what you would expect docker stats to do:

docker stats `docker ps | tail -n+2 | awk '{ print $NF }'`

However, the flicker is distinctly unpleasant.

@jovandeginste

I love and appreciate the concept and current possibilities, however I have some thoughts:

I don't understand why it is repeating (or why it doesn't have the count flag). I could e.g. use 'watch' and its features to show me a constant update if I want that.

I also find it hard to imagine writing a simple script to parse the JSON data if it has to be running forever. I'd rather query once a minute and send those stats to graphite (just an example), but I don't see how I could accomplish this in a simple way (does anyone have any thoughts?)

I would also appreciate the presence of an 'all' option, especially for the json-part (where you would have to run several queries to get all container data)

@km4rcus

km4rcus commented Mar 20, 2015

+1 for a count option.

@thaJeztah
Member

Please stop adding new feature requests to a closed/merged PR. If you have an enhancement/feature request, open a new issue.

For those looking for a "count" option: there is a PR that implements a --no-stream flag, which returns the stats only once. See "Allow pulling stats once and disconnecting", #10766.
