Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve timestamp resolution #55

Closed
brendangregg opened this issue May 23, 2013 · 7 comments
Closed

improve timestamp resolution #55

brendangregg opened this issue May 23, 2013 · 7 comments

Comments

@brendangregg
Copy link

The timestamp variable returns time with a ~10 ms granularity (based on xtime cache). For performance analysis it needs much better resolution.

This looks like it is due to dtrace_gethrtime() in driver/dtrace_linux.c, which looks like a work in progress.

As a workaround, I've been making the following change:

diff --git a/driver/dtrace_linux.c b/driver/dtrace_linux.c
index ac13788..8bb2ed3 100644
--- a/driver/dtrace_linux.c
+++ b/driver/dtrace_linux.c
@@ -387,6 +387,9 @@ dtrace_gethrtime()
 {
        struct timespec ts;

+       getnstimeofday(&ts);
+       return (hrtime_t) ts.tv_sec * 1000 * 1000 * 1000 + ts.tv_nsec;
+
        /*
        void (*ktime_get_ts)() = get_proc_addr("ktime_get_ts");
        if (ktime_get_ts == NULL) return 0;

WARNING: getnstimeofday() grabs seqlock. I don't know if this is safe to do in DTrace context - I'd assume it isn't until known otherwise. Generally, grabbing locks in DTrace context is unsafe (blocking while interrupts disabled), however, the characteristics of seqlock sound similar to the behavior of the Solaris dtrace_gethrtime(), so this might be ok.

As some example output with the above change, showing the resolution is satisfactory:

root@ubuntu:~/linux# dtrace -n 'syscall:::entry { self->ts = timestamp; }
    syscall:::return /self->ts/ { @["ns"] = quantize(timestamp - self->ts); self->ts = 0; }'
dtrace: description 'syscall:::entry ' matched 1318 probes
^C

  ns                                                
           value  ------------- Distribution ------------- count    
             256 |                                         0        
             512 |@                                        250      
            1024 |@@@@@@@@@@@@@@@@@@                       2982     
            2048 |@@                                       424      
            4096 |@                                        106      
            8192 |@@@@@@@@@@@@@@@@@                        2815     
           16384 |@                                        127      
           32768 |                                         3        
           65536 |                                         6        
          131072 |                                         17       
          262144 |                                         0        
          524288 |                                         0        
[...]
@dtrace4linux
Copy link
Owner

Hi Brendan,

what kernel are you doing this on. Its not so much as this function is a
work-in-progress, but the timer mechanisms keep changing from kernel to
kernel. Looking at my 3.8 kernel which shows xtimer_cache is missing from
the kernel, then devolves to a zero response which is pretty horrid.

You are right that getnstimeofday() grabs a lock, so it requires unwrapping
the code to get a lock (and suffer the odd blip) to do this. I will look at
this, especially as you have given me the test case to prove the badness
and see what I can do.

thanks

On 23 May 2013 01:40, Brendan Gregg notifications@github.com wrote:

The timestamp variable returns time with a ~10 ms granularity (based on
xtime cache). For performance analysis it needs much better resolution.

This looks like it is due to dtrace_gethrtime() in driver/dtrace_linux.c,
which looks like a work in progress.

As a workaround, I've been making the following change:

diff --git a/driver/dtrace_linux.c b/driver/dtrace_linux.cindex ac13788..8bb2ed3 100644--- a/driver/dtrace_linux.c+++ b/driver/dtrace_linux.c@@ -387,6 +387,9 @@ dtrace_gethrtime()
{
struct timespec ts;

  •   getnstimeofday(&ts);+       return (hrtime_t) ts.tv_sec \* 1000 \* 1000 \* 1000 + ts.tv_nsec;+
    /*
    void (*ktime_get_ts)() = get_proc_addr("ktime_get_ts");
    if (ktime_get_ts == NULL) return 0;
    

WARNING: getnstimeofday() grabs seqlock. I don't know if this is safe to
do in DTrace context - I'd assume it isn't until known otherwise.
Generally, grabbing locks in DTrace context is unsafe (blocking while
interrupts disabled), however, the characteristics of seqlock sound similar
to the behavior of the Solaris dtrace_gethrtime(), so this might be ok.

As some example output with the above change, showing the resolution is
satisfactory:

root@ubuntu:~/linux# dtrace -n 'syscall:::entry { self->ts = timestamp; }
syscall:::return /self->ts/ { @["ns"] = quantize(timestamp - self->ts); }'
dtrace: description 'syscall:::entry ' matched 1318 probes
^C

ns
value ------------- Distribution ------------- count
256 | 0
512 |@ 250
1024 |@@@@@@@@@@@@@@@@@@ 2982
2048 |@@ 424
4096 |@ 106
8192 |@@@@@@@@@@@@@@@@@ 2815
16384 |@ 127
32768 | 3
65536 | 6
131072 | 17
262144 | 0
524288 | 0
[...]


Reply to this email directly or view it on GitHubhttps://github.com//issues/55
.

@dtrace4linux
Copy link
Owner

I've pushed out a new release which - not using your change - but comes in
at a different mechanism, to get what I was trying to achieve. Can you try
this out and see if it works (this is lockless).

The issue is that on the later kernels, we run without access to the
timestamp data (defaults to the 'return 0' case). With this change, in 3.4
and later kernels (maybe earlier too), it uses a different API which
exposes what I wanted. (They moved the xtime data structure inside a
timekeeping struct).

On 25 May 2013 16:37, Paul Fox paul.d.fox@gmail.com wrote:

Hi Brendan,

what kernel are you doing this on. Its not so much as this function is a
work-in-progress, but the timer mechanisms keep changing from kernel to
kernel. Looking at my 3.8 kernel which shows xtimer_cache is missing from
the kernel, then devolves to a zero response which is pretty horrid.

You are right that getnstimeofday() grabs a lock, so it requires
unwrapping the code to get a lock (and suffer the odd blip) to do this. I
will look at this, especially as you have given me the test case to prove
the badness and see what I can do.

thanks

On 23 May 2013 01:40, Brendan Gregg notifications@github.com wrote:

The timestamp variable returns time with a ~10 ms granularity (based on
xtime cache). For performance analysis it needs much better resolution.

This looks like it is due to dtrace_gethrtime() in driver/dtrace_linux.c,
which looks like a work in progress.

As a workaround, I've been making the following change:

diff --git a/driver/dtrace_linux.c b/driver/dtrace_linux.cindex ac13788..8bb2ed3 100644--- a/driver/dtrace_linux.c+++ b/driver/dtrace_linux.c@@ -387,6 +387,9 @@ dtrace_gethrtime()
{
struct timespec ts;

  •   getnstimeofday(&ts);+       return (hrtime_t) ts.tv_sec \* 1000 \* 1000 \* 1000 + ts.tv_nsec;+
    /*
    void (*ktime_get_ts)() = get_proc_addr("ktime_get_ts");
    if (ktime_get_ts == NULL) return 0;
    

WARNING: getnstimeofday() grabs seqlock. I don't know if this is safe to
do in DTrace context - I'd assume it isn't until known otherwise.
Generally, grabbing locks in DTrace context is unsafe (blocking while
interrupts disabled), however, the characteristics of seqlock sound similar
to the behavior of the Solaris dtrace_gethrtime(), so this might be ok.

As some example output with the above change, showing the resolution is
satisfactory:

root@ubuntu:~/linux# dtrace -n 'syscall:::entry { self->ts = timestamp; }
syscall:::return /self->ts/ { @["ns"] = quantize(timestamp - self->ts); }'
dtrace: description 'syscall:::entry ' matched 1318 probes
^C

ns
value ------------- Distribution ------------- count
256 | 0
512 |@ 250
1024 |@@@@@@@@@@@@@@@@@@ 2982
2048 |@@ 424
4096 |@ 106
8192 |@@@@@@@@@@@@@@@@@ 2815
16384 |@ 127
32768 | 3
65536 | 6
131072 | 17
262144 | 0
524288 | 0
[...]


Reply to this email directly or view it on GitHubhttps://github.com//issues/55
.

@brendangregg
Copy link
Author

Thanks; I just built it on 3.5, and this looks the same (no sub-millisecond resolution):

root@ubuntu:~/linux# dtrace -n 'syscall:::entry { self->ts = timestamp; } syscall:::return /self->ts/ { @["ns"] = quantize(timestamp - self->ts); self->ts = 0; }'
dtrace: description 'syscall:::entry ' matched 1318 probes
^C

  ns                                                
           value  ------------- Distribution ------------- count    
              -1 |                                         0        
               0 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  8347     
               1 |                                         0        
               2 |                                         0        
               4 |                                         0        
               8 |                                         0        
              16 |                                         0        
              32 |                                         0        
              64 |                                         0        
             128 |                                         0        
             256 |                                         0        
             512 |                                         0        
            1024 |                                         0        
            2048 |                                         0        
            4096 |                                         0        
            8192 |                                         0        
           16384 |                                         0        
           32768 |                                         0        
           65536 |                                         0        
          131072 |                                         0        
          262144 |                                         0        
          524288 |                                         0        
         1048576 |                                         0        
         2097152 |                                         13       
         4194304 |                                         0        
         8388608 |                                         1        
        16777216 |                                         3        
        33554432 |                                         15       
        67108864 |                                         103      
       134217728 |                                         15       
       268435456 |                                         4        
       536870912 |                                         11       
      1073741824 |                                         2        
      2147483648 |                                         0        

root@ubuntu:~/linux# uname -a
Linux ubuntu 3.5.0-23-generic #35~precise1-Ubuntu SMP Fri Jan 25 17:13:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

@brendangregg
Copy link
Author

This is also in a VM (could make a difference in how TSC is made available).

@dtrace4linux
Copy link
Owner

I'm experimenting with different mechanisms to get to the TSC clock. If you
examine /proc/dtrace/trace, which timestamps messages using the
dtrace_gethrtime() - you can see theres no precision at all - looks to be
10ms or worse. I have just tried something that goes to the TSC - but that
seems to be causing my VM running Xen to force restart - so my experiment
is very dodgy.

Let me try a few things and get back to you.

thanks

On 28 May 2013 17:37, Brendan Gregg notifications@github.com wrote:

This is also in a VM (could make a difference in how TSC is made
available).


Reply to this email directly or view it on GitHubhttps://github.com//issues/55#issuecomment-18562994
.

@dtrace4linux
Copy link
Owner

Ok - new release with the "fix". It works on my 3.4.6 kernel by going for
native_sched_clock() as the source of the nsec clock. It seems to work, e.g.

ns
value ------------- Distribution ------------- count
32 | 0
64 | 1
128 | 0
256 | 0
512 | 0
1024 |@@@@@@ 82
2048 |@@ 20
4096 |@@@@@@@@@@@@@@@@@@@@ 266
8192 |@@@ 34
16384 |@@ 28
32768 |@@@@@ 65
65536 |@ 9
131072 |@ 11
262144 | 4
524288 | 0
1048576 | 0
2097152 | 0
4194304 | 0
8388608 | 0
16777216 | 1
33554432 | 0
67108864 | 1
134217728 | 1
268435456 | 0
536870912 | 2
1073741824 | 0

On 28 May 2013 22:54, Paul Fox paul.d.fox@gmail.com wrote:

I'm experimenting with different mechanisms to get to the TSC clock. If
you examine /proc/dtrace/trace, which timestamps messages using the
dtrace_gethrtime() - you can see theres no precision at all - looks to be
10ms or worse. I have just tried something that goes to the TSC - but that
seems to be causing my VM running Xen to force restart - so my experiment
is very dodgy.

Let me try a few things and get back to you.

thanks

On 28 May 2013 17:37, Brendan Gregg notifications@github.com wrote:

This is also in a VM (could make a difference in how TSC is made
available).


Reply to this email directly or view it on GitHubhttps://github.com//issues/55#issuecomment-18562994
.

@brendangregg
Copy link
Author

Excellent, thanks - works for me!

root@ubuntu:~/linux# dtrace -n 'syscall:::entry { self->ts = timestamp; } syscall:::return /self->ts/ { @["ns"] = quantize(timestamp - self->ts); self->ts = 0; }'
dtrace: description 'syscall:::entry ' matched 1318 probes
^C

  ns                                                
           value  ------------- Distribution ------------- count    
             256 |                                         0        
             512 |@                                        95       
            1024 |@@@@@@@@@@@@@@@@                         1281     
            2048 |@@@                                      217      
            4096 |@                                        61       
            8192 |@@@@@@@@@@@@@@@@                         1263     
           16384 |@                                        90       
           32768 |                                         9        
           65536 |                                         8        
          131072 |                                         17       
          262144 |                                         0        
          524288 |                                         0        
         1048576 |                                         0        
         2097152 |                                         0        
         4194304 |                                         0        
         8388608 |                                         3        
        16777216 |                                         1        
        33554432 |                                         7        
        67108864 |@                                        48       
       134217728 |                                         10       
       268435456 |                                         4        
       536870912 |                                         6        
      1073741824 |                                         0        

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants