scheduler: using inefficient math library causes significant slowdown #18126
We ran a few benchmark and profiling tests on the scheduler for a 1000-node cluster.

We found that the scheduler component is the bottleneck for starting pods, and that it spends most of its time in an external pkg (https://github.com/kubernetes/kubernetes/blob/master/pkg/api/resource/quantity.go#L27) that handles math operations.

Profiling result for scheduling 1000 pods over 1000 nodes:
https://storage.googleapis.com/profiling/org.svg

Taking a closer look at the library, I found that it does too many unnecessary allocations, which causes GC pressure.

By mocking up an efficient version of the rounding function (one that does not handle overflow), we reduced the latency of scheduling 1000 pods by nearly 50% (23s vs. 53s). As we schedule more pods, the effect becomes more significant, since every scheduling pass generates more objects.

New profiling result for scheduling 1000 pods over 1000 nodes:
https://storage.googleapis.com/profiling/fix.svg

Any suggestions on how we should improve this?

/cc @hongchaodeng @lavalamp @philips
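For anyone trying to reproduce the allocation numbers discussed below: a Go benchmark along these lines (the benchmark name and quantity are illustrative, and it uses today's apimachinery import path) reports both ns/op and allocs/op when run with `go test -bench=MilliValue -benchmem`:

```go
package resource_test

import (
	"testing"

	"k8s.io/apimachinery/pkg/api/resource"
)

// Measures a single Quantity conversion, the hot path the profile above
// points at. -benchmem (or b.ReportAllocs) surfaces the per-op allocations.
func BenchmarkMilliValue(b *testing.B) {
	q := resource.MustParse("500m")
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		_ = q.MilliValue()
	}
}
```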
Also /cc @davidopp @wojtek-t @gmarek @timothysc
Probably just doing the rounding once per pod (instead of per pod-node combo, as I'm sure it does now) would be sufficient--caching it or computing it in something like a constructor. Alternatively, writing a special-purpose Value/MilliValue function could probably solve this. @timstclair is making some changes to quantity right now.
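A rough sketch of that once-per-pod caching idea, using current k8s.io/api types for illustration (the cachedRequest struct and newCachedRequest helper are invented names here, not actual scheduler code):

```go
package scheduler

import (
	v1 "k8s.io/api/core/v1"
)

// cachedRequest holds a pod's total resource request as plain integers, so
// the Quantity rounding math runs once per pod instead of once per
// pod-node combination.
type cachedRequest struct {
	milliCPU int64
	memory   int64
}

func newCachedRequest(pod *v1.Pod) cachedRequest {
	var r cachedRequest
	for i := range pod.Spec.Containers {
		req := pod.Spec.Containers[i].Resources.Requests
		cpu := req[v1.ResourceCPU] // copy out: map values are not addressable
		mem := req[v1.ResourceMemory]
		r.milliCPU += cpu.MilliValue() // the expensive rounding happens here, once
		r.memory += mem.Value()
	}
	return r
}
```

The per-node feasibility check can then compare plain int64s and never touch Quantity math again.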
That is what I did for the mock-up; it doesn't seem very hard. Or we could fix the upstream library if we want. @timstclair, any ideas?
I think once #17548 goes in, we should just optimize.
@timstclair Great! Let me know when it is merged. I am happy to work on the optimization.
@xiang90 - this sounds very promising. Thanks a lot!
@xiang90 - great job!
👍
The original scale function takes around 800ns/op with more than 10 allocations. It significantly slows down the scheduler and other components that rely heavily on the resource pkg. For more information see kubernetes#18126. This pull request optimizes the scale function, taking two approaches:
1. when the value is small, use only normal math ops;
2. when the value is large, use math/big with a buffer pool.

The final result is:

BenchmarkScaledValueSmall-4   20000000    66.9 ns/op    0 B/op    0 allocs/op
BenchmarkScaledValueLarge-4    2000000   711 ns/op     48 B/op    1 allocs/op

I also ran the scheduler benchmark again; it doubles the scheduler's throughput for the 1000-node case.
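A minimal sketch of that fast-path/slow-path split, assuming a power-of-ten scale-up; the function and pool names are illustrative, not the code merged in #18170:

```go
package main

import (
	"fmt"
	"math"
	"math/big"
	"sync"
)

var ten = big.NewInt(10)

// intPool recycles big.Int buffers so the slow path does at most one
// allocation (the pool's first Get), matching the 1 allocs/op above.
var intPool = sync.Pool{
	New: func() interface{} { return new(big.Int) },
}

// scaleUp returns value * 10^shift, clamped to the int64 range.
func scaleUp(value int64, shift uint) int64 {
	factor := int64(1)
	for i := uint(0); i < shift; i++ {
		factor *= 10 // assumes shift is small enough that factor fits in int64
	}
	// Fast path: plain integer math when the product cannot overflow.
	if value == 0 || (value < math.MaxInt64/factor && value > math.MinInt64/factor) {
		return value * factor
	}
	// Slow path: arbitrary-precision math with a pooled buffer.
	b := intPool.Get().(*big.Int)
	defer intPool.Put(b)
	b.SetInt64(value)
	for i := uint(0); i < shift; i++ {
		b.Mul(b, ten)
	}
	if !b.IsInt64() {
		if b.Sign() > 0 {
			return math.MaxInt64
		}
		return math.MinInt64
	}
	return b.Int64()
}

func main() {
	fmt.Println(scaleUp(1500, 3))          // fast path: 1500000
	fmt.Println(scaleUp(math.MaxInt64, 3)) // slow path: clamped to MaxInt64
}
```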
ref #18418
/cc @HardySimpson
Closed by #18170