
PyParallel

By Jake Edge
April 14, 2015
Python Language Summit

PyParallel is an alternative version of Python that aims to remove the global interpreter lock (GIL) in order to provide better performance through parallel processing. Trent Nelson prefaced his talk by saying that he hadn't made much progress on PyParallel since he presented it at PyCon 2013. He did give a few talks in the interim that were well received, however. He resumed work on the code in December 2014, with a focus on making it stable while running the TechEmpower Frameworks Benchmark, which "bombs the server" with lots of clients making simple requests that the server responds to with JSON or plaintext. The benchmark has lots of problems, he said, but it is better than nothing.
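To make the workload concrete, here is a minimal sketch of a stateless server of the kind that such a benchmark exercises. It uses only the standard library of stock CPython, not PyParallel's own API, and the `/json` and `/plaintext` paths, the payloads, and the names `build_response`, `StatelessHandler`, and `serve` are assumptions made for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def build_response(path):
    """Return (status, content_type, body) for a path; no state is kept between calls."""
    if path == "/json":
        # Assumed payload, modeled on a simple "hello world" JSON test.
        body = json.dumps({"message": "Hello, World!"}).encode("utf-8")
        return 200, "application/json", body
    if path == "/plaintext":
        return 200, "text/plain", b"Hello, World!"
    return 404, "text/plain", b"not found"


class StatelessHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Every request is answered from its path alone.
        status, content_type, body = build_response(self.path)
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence per-request logging so it does not dominate under load.
        pass


def serve(host="127.0.0.1", port=8080):
    """Serve until interrupted, handling each connection in its own thread."""
    ThreadingHTTPServer((host, port), StatelessHandler).serve_forever()
```

Calling `serve()` starts the server. On stock CPython the handler threads still take turns holding the GIL, which is the limitation PyParallel is trying to remove.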

Because it focuses on that benchmark, PyParallel performs really well when running it, Nelson said. So it is really good at stateless HTTP and maintains low latency even under a high load. It will saturate all CPUs available, with 98% of that in user space and just 2% in the Windows kernel.

The latency is low, and it also has low variance. On a normal run of the benchmark with clients attempting to make 50,000 requests per second, PyParallel shows a fairly flat graph, with relatively few outliers. Nelson displayed graphs comparing it with other servers: Tornado and Node.js on the same hardware showed a lot more variance in latency (as well as higher latency than PyParallel overall). Node.js performed better than Tornado, but had some outliers that were seven times the size of the mean (Tornado and PyParallel had worst-case latencies less than three times their mean). Both the Tornado and Node.js benchmarks were run on Linux, since they are targeted at that operating system, while PyParallel was run on Windows because that is the operating system it targets.
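The comparison above rests on two numbers per server: the mean latency and how far the worst case strays from it. A small sketch of that calculation follows; the function name `latency_summary` is hypothetical, and the sample values in the usage note are invented for illustration rather than taken from Nelson's graphs.

```python
import statistics


def latency_summary(latencies_ms):
    """Summarize a list of request latencies given in milliseconds."""
    mean = statistics.fmean(latencies_ms)
    worst = max(latencies_ms)
    return {
        "mean": mean,
        # Population standard deviation: how spread out the samples are.
        "stdev": statistics.pstdev(latencies_ms),
        "worst": worst,
        # Worst case expressed as a multiple of the mean.
        "worst_over_mean": worst / mean,
    }
```

For the made-up samples `[1.0, 1.0, 1.0, 5.0]`, the mean is 2.0 ms and the worst case is 5.0 ms, so `worst_over_mean` is 2.5. A server with outliers at seven times the mean would report 7.0 here.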

Nelson is working on another test that is more complicated than the simple, stateless HTTP benchmark. It is an instantaneous search feature for 50GB of Wikipedia article data, but it is not working yet.

PyParallel is running on Python 3.3.5. He plans to use the customizable memory allocators that are provided by Python 3.4 and would like to see that API extended so that the reference-count-management operations could also be customized.

Effectively, PyParallel tests to see if an operation is happening in parallel and, if so, performs a thread-safe version. In the common case where the operations have naturally been serialized, it takes the faster normal path. Minimizing the overhead of that test is one of the best ways to increase performance.
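The idea of a cheap test guarding a fast path can be sketched at the Python level. This is only an analogy under stated assumptions: PyParallel's real check lives inside the interpreter, and the `Counter` class, its `parallel_region` method, and `run_demo` are hypothetical names invented here. The flag is set before any worker thread starts and cleared only after all of them have finished, which is what makes the unlocked path safe.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager


class Counter:
    def __init__(self):
        self.value = 0
        self.fast_hits = 0   # updates that took the unlocked path
        self.safe_hits = 0   # updates that took the locked path
        self._parallel = False
        self._lock = threading.Lock()

    @contextmanager
    def parallel_region(self):
        """Mark the span during which worker threads may call add()."""
        self._parallel = True
        try:
            yield
        finally:
            self._parallel = False

    def add(self, n=1):
        # The test itself is a single attribute read, so it stays cheap.
        if not self._parallel:
            self.value += n          # fast path: only one thread is running
            self.fast_hits += 1
            return
        with self._lock:             # thread-safe path
            self.value += n
            self.safe_hits += 1


def run_demo(workers=4, per_worker=1000):
    counter = Counter()
    for _ in range(10):              # serialized work takes the fast path
        counter.add()

    def work():
        for _ in range(per_worker):
            counter.add()

    with counter.parallel_region():
        # The pool joins its threads before the region is exited.
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for _ in range(workers):
                pool.submit(work)
    return counter
```

With the defaults, `run_demo()` returns a counter whose `value` is 4010, with 10 fast-path updates and 4000 locked ones. The cheaper the test in `add()`, the less the serialized case pays for the existence of the parallel one.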

In the process of his work, he broke generators and exceptions, at least temporarily. He purposely disabled importing and trace functions. He also "destroyed the PyObject structure" by adding a bunch of pointers to it. Most of those pointers are not needed, so he plans to clean it all up.

People can get the code at download.pyparallel.org. "At the very least, it is very very fast", he said. He has also hacked up the CPython code to such an extent that it makes a good development testbed for others.


Index entries for this article
Conference: Python Language Summit/2015



Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds