Python 3.5 is on its way
It has been nearly a year and a half since the last major Python release, which was 3.4 in March 2014, so it is about time for Python 3.5. We looked at some of the new features in 3.4 at the time of its first release candidate, so the announcement of the penultimate beta release for 3.5 seems like a good time to see what will be coming in the new release. Some of the bigger features are new keywords to support coroutines, an operator for matrix multiplication, and support for Python type checking, but there are certainly more. Python 3.5 is currently scheduled for a final release in September.
As usual, the "What's New In Python 3.5" document is the place to start for information about the release. At this point, it is still in draft form, with significant updates planned over the next couple of months. But there is lots to digest in what's there already.
Type hints
One of the more discussed features for 3.5 only showed up in the first beta back in May: type hints (also known as PEP 484). They provide a way to optionally annotate the types of Python functions, arguments, and variables in a way that can be used by various tools. Type hints have been championed by Python benevolent dictator for life (BDFL) Guido van Rossum and came about rather late in the 3.5 development cycle, so there was always a bit of a question whether they would make the feature freeze that came with the first beta release.
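For a concrete picture of what such annotations look like, here is a minimal sketch using the typing module that PEP 484 adds. The function name and its data are hypothetical, chosen only for illustration. The annotations are not enforced when the program runs; an external checker (mypy is one such tool) is assumed to be what actually consumes them.

```python
from typing import Dict, List


def word_lengths(words: List[str]) -> Dict[str, int]:
    """Map each word to its length (hypothetical example function)."""
    return {word: len(word) for word in words}


# The interpreter stores the annotations on the function object,
# but it does not check them when the function is called.
lengths = word_lengths(["spam", "eggs"])
print(lengths)
print(word_lengths.__annotations__)
```

Because the hints are plain annotations, code without them keeps working unchanged, which is what makes the feature optional.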
But, just prior to that deadline, BDFL-Delegate Mark Shannon accepted the PEP on May 22, with a nod to some of the opposition to the feature that has arisen:
Shannon continued by noting that other languages have an operator-overloading feature that often gets used in ugly ways, but that Python's is generally used sensibly. That's because of Python's culture, which promotes the idea that "readability matters". He concluded: "Python is your language, please use type-hints responsibly :)".
Matrix multiplication
A new binary operator will be added to Python to support matrix multiplication (specified by PEP 465). "@" is a new infix operator that shares its precedence with the standard "*" operator for regular multiplication. The "@=" operator will perform the matrix multiplication and assignment, as the other, similar operators do.
The @ operator will be implemented with two "dunder" (double underscore) methods on objects: __matmul__() and __rmatmul__() (the latter is for right-side matrix multiplication when the left-side object does not support @). Unsurprisingly, @= is handled with the __imatmul__() method. That is all the new feature defines, since Python does not have a matrix type, nor does the language define what the @ operator actually does. That means no standard library or builtin types are being changed to support matrix multiplication, though a number of different scientific and numeric Python projects have reached agreement on the intended use and semantics of the operator.
Currently, developers of libraries like NumPy have to make a decision about how to implement Python's multiplication operator (*) for matrices. All of the other binary operators (e.g. +, -, /) can only reasonably be defined element-wise (i.e. applying the operation to corresponding elements in each operand), but multiplication is special in that both the element-wise and the specific 2D matrix varieties are useful. So, NumPy and others have resorted to using a method for 2D matrix multiplication, which detracts from readability.
The only other use of @ in Python is at the start of a decorator, so there will be no confusion in parsing the two different ways of using it. Soon, users of NumPy will be able to multiply matrices in a straightforward way using statements like:
    z = a @ b
    a @= b
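Since the language only defines the hooks, any class can opt in. The following is a minimal sketch of the three dunder methods on a toy 2x2 matrix; the Matrix2 class is hypothetical and is not how NumPy implements the operator. It requires Python 3.5 or later.

```python
class Matrix2:
    """Toy 2x2 matrix (hypothetical class, for illustration only)."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def __eq__(self, other):
        return isinstance(other, Matrix2) and self.rows == other.rows

    def __matmul__(self, other):
        # Handles `self @ other`.
        if not isinstance(other, Matrix2):
            return NotImplemented
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2(a * e + b * g, a * f + b * h,
                       c * e + d * g, c * f + d * h)

    def __rmatmul__(self, other):
        # Handles `other @ self` when `other` does not support @,
        # here a plain nested tuple such as ((1, 2), (3, 4)).
        try:
            (a, b), (c, d) = other
        except (TypeError, ValueError):
            return NotImplemented
        return Matrix2(a, b, c, d) @ self

    def __imatmul__(self, other):
        # Handles `self @= other` by updating the object in place.
        product = self.__matmul__(other)
        if product is NotImplemented:
            return NotImplemented
        self.rows = product.rows
        return self


a = Matrix2(1, 2, 3, 4)
swap = Matrix2(0, 1, 1, 0)          # swaps the two columns of the left operand
z = a @ swap                        # calls a.__matmul__(swap)
w = ((1, 2), (3, 4)) @ swap         # tuple lacks @, so swap.__rmatmul__ runs
a @= swap                           # calls a.__imatmul__(swap)
print(z.rows, w.rows, a.rows)
```

Returning NotImplemented rather than raising lets Python try the other operand's method before giving up with a TypeError.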
async and await
Another fast-moving feature adds new keywords for coroutine support to the language. PEP 492 went from being proposed in early April to being accepted and having its code merged in early May. The idea is to make it easier to work with coroutines in Python, as the new async and await keywords don't really add new functionality that couldn't be accomplished with existing language constructs. But they do provide readability improvements, which is one of the main drivers of the addition.
Essentially, functions, for loops, and with statements can be declared as asynchronous (i.e. may suspend their execution) by adding the async keyword:
    async def foo(...):
        ...

    async for data in cursor:
        ...

    async with lock:
        ...
        await foo()
        ...
The await expression that can be seen in the async with example is similar to the yield from expression that was added in Python 3.3 (described in PEP 380). It suspends execution of the enclosing coroutine until the awaitable foo() completes and returns its result.
While async and await will eventually become full-fledged keywords in Python, that won't happen until Python 3.7 in roughly three years. The idea is to allow programmers time to switch away from using those two statement names as variables, which is not allowed for Python keywords. As it turns out, await can only appear in async constructs, so the parser will keep track of its context and treat those strings as variables outside of those constructs. It is a clever trick that allows the language to "softly deprecate" the keywords.
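To see the three constructs working together, here is a small runnable sketch driven by the asyncio event loop. The Countdown class and the double() and main() coroutines are hypothetical names invented for this example; only asyncio itself comes from the standard library.

```python
import asyncio


class Countdown:
    """Asynchronous iterator producing n-1 down to 0 (hypothetical class)."""

    def __init__(self, n):
        self.n = n

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self.n <= 0:
            raise StopAsyncIteration
        self.n -= 1
        await asyncio.sleep(0)  # yield control to the event loop
        return self.n


async def double(x):
    await asyncio.sleep(0)
    return 2 * x


async def main():
    lock = asyncio.Lock()
    collected = []
    async for value in Countdown(3):      # asynchronous iteration
        async with lock:                  # asynchronous context manager
            collected.append(await double(value))
    return collected


loop = asyncio.new_event_loop()
try:
    results = loop.run_until_complete(main())
finally:
    loop.close()
print(results)
```

Each await marks a point where the coroutine may be suspended, which is the readability gain the PEP is after: the suspension points are visible in the source.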
Zip applications
Over the last few years there has been a lot of discussion on python-dev, python-ideas, and elsewhere about how to easily distribute Python programs; the zipapp module (specified in PEP 441) is a step along that path. Partly, the PEP authors simply want to promote a feature that was added back in the Python 2.6 days. It provides the ability to execute a ZIP format file that contains a collection of Python files, including a __main__.py, which is where the execution starts.
Besides just publicizing a "relatively unknown" language feature, the zipapp module is being added to provide some tools to help create and maintain "Python Zip Applications", which is the formal name for these application bundles. The bundles will have a .pyz extension for console applications, while windowed applications (i.e. those that won't require a console on Windows) will end in .pyzw. The Windows installer will associate those extensions with the Python launcher. It is hoped that .pyz will be connected with Python on Unix-like systems (by way of the MIME-info database), but that is not directly under the control of the Python developers.
The appropriate interpreter (i.e. Python 2 or Python 3) can be associated with the file by prepending a "shebang" line to the ZIP format (which ZIP archivers will simply skip). For example (from the PEP):
    #!/usr/bin/env python3
    # Python application packed with zipapp module
    (binary contents of archive)
In addition, the zipapp module can be directly accessed from the command line. It provides two helper functions (create_archive() and get_interpreter()) but will likely be used as follows:
    $ python -m zipapp dirname

That creates a dirname.pyz file that contains the files in the directory (which must include a __main__.py). There are also command-line options that govern the archive name, the interpreter to use, or the main function to call from a __main__.py that zipapp will generate.
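The two helper functions can also be called from Python code. The sketch below builds a throwaway application in a temporary directory, packs it, reads back the shebang interpreter, and runs the archive. The directory name, file contents, and interpreter string are made up for the example; it assumes the running interpreter can be launched via sys.executable.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # A hypothetical one-file application; __main__.py is the entry point.
    app_dir = Path(tmp) / "hello"
    app_dir.mkdir()
    (app_dir / "__main__.py").write_text('print("hello from a zip application")\n')

    # Pack the directory and prepend a shebang line naming the interpreter.
    archive = Path(tmp) / "hello.pyz"
    zipapp.create_archive(str(app_dir), target=str(archive),
                          interpreter="/usr/bin/env python3")

    # Read the interpreter back out of the archive's shebang line.
    shebang_interpreter = zipapp.get_interpreter(str(archive))

    # Run the archive by handing it to the current interpreter.
    output = subprocess.check_output([sys.executable, str(archive)]).decode().strip()

print(shebang_interpreter)
print(output)
```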
Formatting bytes
The distinction between bytes and strings in Python 3 is one of the defining characteristics of the language. That was done to remove some of the ambiguities and "magic" from handling Unicode that are present in Python 2. "Bytes" are simply immutable sequences of integers, each of which is in the range 0–255, while "bytearrays" are their mutable counterpart. There is no interpretation placed on the byte types; strings, on the other hand, always contain Unicode.
But there are a number of "wire protocols" that combine binary and textual data (typically ASCII), so when writing code to deal with those, it would be convenient to be able to use the Python string formatting operations to do so. Today, though, trying to interpolate an integer value into a bytes type does not work:
    >>> b'%d' % 5
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for %: 'bytes' and 'int'
The adoption of PEP 461 will change that. All of the numeric formatting codes, when applied to bytes, will be treated as if they were:
    ('%x' % val).encode('ascii')

where x is any of the dozen or so numeric formatting codes. In addition, %c will insert a single byte (either from an integer in the 0–255 range or from a byte type of length 1) and %b can be used to interpolate a series of bytes. Neither of those will accept string types since converting a string to bytes requires an encoding, which Python 3 will not try to guess.
In addition, bytes and bytearrays will be getting a hex() method. That will allow turning bytes into hexadecimal strings:
    >>> bytes([92, 83, 80, 255]).hex()
    '5c5350ff'
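Put together, the rules above behave as in this short sketch, which requires Python 3.5 or later. The header text and payload are invented for the example and do not come from any particular wire protocol.

```python
# Numeric codes are formatted as ASCII text.
length = 42
header = b"Content-Length: %d\r\n" % length
print(header)

# %c takes an integer in range(256) or a bytes object of length 1;
# %b interpolates a bytes-like object as-is.
letter = b"%c" % 65
framed = b"%c%b%c" % (b"<", b"payload", 62)
print(letter, framed)

# A str is rejected because no encoding is assumed.
try:
    b"%b" % "text"
    rejected = False
except TypeError:
    rejected = True
print(rejected)

# hex() is available on bytearray as well as bytes.
digits = bytearray(b"\x00\xff").hex()
print(digits)
```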
The grab bag
There are, of course, lots more features coming in Python 3.5 (way more than we can cover here), but there are some additional ones that caught our eye and that we will look at briefly. For example, the ordered dictionary in collections.OrderedDict has been reimplemented in C, which provides a 4–100x performance increase. PEP 485, which has been accepted for 3.5, proposes adding a function to test approximate equality to both the math and cmath modules: math.isclose() and cmath.isclose().
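A brief sketch of the approximate-equality test follows; in the standard library the functions are spelled math.isclose() and cmath.isclose(). By default the comparison is purely relative (rel_tol defaults to 1e-09 and abs_tol to 0.0), which is why comparisons against zero need an explicit abs_tol. The particular values are chosen only for illustration.

```python
import cmath
import math

total = 0.1 + 0.2

exact = (total == 0.3)                  # binary floating point makes this False
close = math.isclose(total, 0.3)        # relative tolerance absorbs the error

# Nothing is "relatively close" to zero, so an absolute tolerance is needed.
near_zero_default = math.isclose(1e-10, 0.0)
near_zero_abs = math.isclose(1e-10, 0.0, abs_tol=1e-9)

# The cmath version compares complex numbers by the magnitude of the difference.
complex_close = cmath.isclose(1 + 1j, 1 + 1.0000000001j)

print(exact, close, near_zero_default, near_zero_abs, complex_close)
```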
The "system call interrupted" error code on POSIX systems (EINTR, which gets turned into the Python InterruptedError exception) is often unexpected by applications when they are making various I/O calls, so those exceptions may not be handled correctly. It would make more sense to centralize the handling of EINTR. So, in the future, the low-level system call wrappers will automatically retry operations that fail with EINTR (which is a feature that is described in PEP 475).
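For context, careful code written for earlier releases typically wrapped each call in a retry loop along the lines of the sketch below. The read_retrying() helper is a hypothetical name, not something from the PEP or the standard library. With PEP 475 the wrapper for os.read() is expected to perform the equivalent retry itself, so a loop like this becomes redundant (though harmless).

```python
import os


def read_retrying(fd, size):
    """Hypothetical helper: retry os.read() when a signal interrupts it."""
    while True:
        try:
            return os.read(fd, size)
        except InterruptedError:
            # EINTR: a signal handler ran during the call; just try again.
            continue


# Exercise the helper on a pipe so the example is self-contained.
read_end, write_end = os.pipe()
os.write(write_end, b"ping")
data = read_retrying(read_end, 4)
os.close(read_end)
os.close(write_end)
print(data)
```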
The process of initializing extension modules (which are implemented as shared libraries), built-in modules, and imported modules is subtly different in ways that can lead to import loops and other problems. PEP 489 was proposed to regularize this process so that all of the different "import-like" mechanisms behave in the same way; it will be part of Python 3.5.
Eliminating .pyo files is the subject of PEP 488 and it will be implemented for the upcoming release. Those files were meant to hold optimized versions of the Python bytecode (unoptimized bytecode is usually placed into .pyc files), but the extension was used for multiple different optimization levels, which meant manually removing all .pyo files to ensure they were all at the same level. In the future, these __pycache__ file names will all end in .pyc, but will indicate both the optimization level and interpreter used in the name, for example: foo.cpython-35.opt-2.pyc.
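The new naming scheme can be inspected with importlib.util.cache_from_source(), which takes an optimization argument as of Python 3.5. This is a small sketch; foo.py is a placeholder name and does not need to exist, and the interpreter tag in the result depends on which Python runs the code.

```python
import importlib.util

# An empty string means "no optimization", so no opt- tag appears in the name.
plain = importlib.util.cache_from_source("foo.py", optimization="")

# Optimization level 2 (as with -OO) is recorded in the file name itself.
opt2 = importlib.util.cache_from_source("foo.py", optimization=2)

print(plain)
print(opt2)
```

Because the level is part of the name, files compiled at different optimization levels can sit side by side in __pycache__ without being confused for one another.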
Lastly, at least for our look, Python 3.5 will add an os.scandir() function to provide a "better and faster directory iterator". The existing os.walk() does unnecessary work that results in roughly 2x the number of system calls required. os.scandir() will return an iterator that produces directory entries (os.DirEntry objects that carry the file name along with file-type information) as needed, rather than as one big list of names, and os.walk() will be implemented using the new function. That will result in performance increases of 2–3x on POSIX systems and 8–9x for Windows.
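As a rough illustration of the interface, the sketch below sums the sizes of the regular files directly inside a directory. The total_file_size() helper and the temporary files are hypothetical; the entries that os.scandir() yields are os.DirEntry objects, whose is_file() method can often answer from information the directory listing already supplied.

```python
import os
import tempfile


def total_file_size(path):
    """Hypothetical helper: sum the sizes of regular files directly in path."""
    total = 0
    for entry in os.scandir(path):
        # is_file() can frequently avoid an extra stat() system call.
        if entry.is_file():
            total += entry.stat().st_size
    return total


with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "a.txt"), "wb") as f:
        f.write(b"12345")
    os.mkdir(os.path.join(tmp, "subdir"))

    size = total_file_size(tmp)
    names = sorted(entry.name for entry in os.scandir(tmp))

print(size, names)
```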
As can be seen, there is lots of good stuff coming in Python 3.5. Over the next few months, testing of the betas and release candidates should hopefully shake out all but the most obscure bugs, leading to a solid 3.5 release in mid-September.
Python 3.5 is on its way
Posted Jul 15, 2015 23:33 UTC (Wed) by hazmat (subscriber, #668) [Link]
Python 3.5 is on its way
Posted Jul 16, 2015 6:40 UTC (Thu) by edomaur (subscriber, #14520) [Link]
And Go, well, also has some really bad semantic sugar (":=" ????) and does seem to walk a path similar to Python's, only ten years later (except for the compiler).
Python 3.5 is on its way
Posted Jul 16, 2015 7:09 UTC (Thu) by k8to (guest, #15413) [Link]
There's obviously a potential upside as well, though my experiences with tools that try to reason about python have been pretty poor overall. Typehinting isn't nearly enough, and I suspect the practical gains aren't that big.
Python 3.5 is on its way
Posted Jul 16, 2015 8:43 UTC (Thu) by edomaur (subscriber, #14520) [Link]
Python 3.5 is on its way
Posted Jul 16, 2015 12:20 UTC (Thu) by smitty_one_each (subscriber, #28989) [Link]
Python 3.5 is on its way
Posted Jul 22, 2015 15:23 UTC (Wed) by k8to (guest, #15413) [Link]
Python 3.5 is on its way
Posted Jul 23, 2015 14:50 UTC (Thu) by mister_m (guest, #72633) [Link]
Python 3.5 is on its way
Posted Jul 16, 2015 16:34 UTC (Thu) by iabervon (subscriber, #722) [Link]
I think it really comes down to whether the type annotation system is able to express what the code cares about, rather than particular concrete types. If people use Mapping[str, int] rather than just dict, it'll probably actually be helpful.
For that matter, it would be super-helpful if objects like Mapping were able to do some basic type checks, so that easy-to-write code could raise an informative TypeError instead of an uninformative AttributeError, and people wouldn't write the code that's easy and informative but overly limiting.
Python 3.5 is on its way
Posted Jul 17, 2015 22:18 UTC (Fri) by mathstuf (subscriber, #69389) [Link]
Python 3.5 is on its way
Posted Jul 18, 2015 4:33 UTC (Sat) by iabervon (subscriber, #722) [Link]
Python 3.5 is on its way
Posted Jul 16, 2015 17:29 UTC (Thu) by dashesy (guest, #74652) [Link]
Python 3.5 is on its way
Posted Jul 16, 2015 7:06 UTC (Thu) by kleptog (subscriber, #1183) [Link]
> Applications relying on the fact that system calls are interrupted with InterruptedError will hang. The authors of this PEP don't think that such applications exist, since they would be exposed to other issues such as race conditions (there is an opportunity for deadlock if the signal comes before the system call). Besides, such code would be non-portable.
The case this will probably break is the use of alarm() to escape out of a blocking read(), so I think it's a bit of a stretch to say such applications don't exist. I've written such programs in the past, though this kind of thing is generally too simple for production. I just wonder how you could emulate the old behaviour. The PEP suggests signal.set_wakeup_fd() but you can't use that in a library because the main program might be using it. Of course, you can't use alarm() in a library anyway for the same reason.
AFAICT there is no simple way to emulate the old behaviour; each such site will require manual fixing, assuming you can even find all the places this change will break.
Python 3.5 is on its way
Posted Jul 16, 2015 7:54 UTC (Thu) by pbonzini (subscriber, #60935) [Link]
Indeed, isn't it racy?
Python 3.5 is on its way
Posted Jul 16, 2015 23:02 UTC (Thu) by wahern (subscriber, #37304) [Link]
Of course, these things are horrible hacks. But they were very common once upon a time, and perhaps still useful in situations where you're stuck with a FILE handle instead of a socket descriptor.
It's not that you can't poll on the FILE handle (you can through fileno, assuming it's accessible), but that you don't control the buffering layer. Even if poll returns read readiness _and_ you only wanted to read a single byte, read readiness isn't any kind of a guarantee that there'll be something to actually read. Especially on Linux, which will wake up a sleeping process on an incoming packet before checking the CRC. You would think CRC errors are rare, but, for example, in high-traffic UDP environments (think DNS server), naive applications that didn't think to put their UDP socket in non-blocking mode can easily stall the process.
Python 3.5 is on its way
Posted Jul 17, 2015 7:16 UTC (Fri) by kleptog (subscriber, #1183) [Link]
But like you say, you can only do that on network sockets; select() on a real file always returns true, even if the actual read() is going to block for 5 seconds due to other stuff going on. Which means you really need to use the whole clusterf*k that is async I/O. Or threads (which is probably what people do in practice).
I've wondered if it'd be possible to make disk I/O look more like network I/O. (pseudocode)
    sendmsg(fd, "", 0, { id=1, op=READ, pos=CURRENT_POS, len=1000 })
    select(... wait for fd to be readable ...)
    recvmsg(fd, &databuffer, 1000, &msghdr)
    if(msghdr.id == 1)
        process(databuffer)
Then it would integrate far more naturally in many coding frameworks and also AIUI match much more closely with how disks actually work these days.
Python 3.5 is on its way
Posted Jul 20, 2015 2:53 UTC (Mon) by zev (subscriber, #88455) [Link]
Python 3.5 is on its way
Posted Jul 21, 2015 0:29 UTC (Tue) by zlynx (guest, #2285) [Link]
Python 3.5 is on its way
Posted Jul 21, 2015 2:35 UTC (Tue) by zev (subscriber, #88455) [Link]
That said, alarm(2) is documented as being async-signal-safe, so presumably your SIGALRM handler could re-arm itself just as well, no? (It just wouldn't happen automatically.)
Python 3.5 is on its way
Posted Jul 28, 2015 9:53 UTC (Tue) by vstinner (subscriber, #42675) [Link]
The following code hangs on Python 3.5, whereas it fails with OSError(4, "Interrupted system call") on Python 2.7 and Python 3.4:
import os, signal
def noop(signum, frame): pass
signal.signal(signal.SIGALRM, noop)
signal.alarm(1)
os.read(0, 100)
The following code displays "interrupted" on all Python versions:
import os, signal
def noop(signum, frame): raise KeyboardInterrupt # <~~~ HERE
signal.signal(signal.SIGALRM, noop)
signal.alarm(1)
try: os.read(0, 100)
except KeyboardInterrupt: print("interrupted")
The difference is that the Python signal handler of the second example raises an exception.
So to be clear: it is still possible to interrupt blocking syscalls (like syscalls on I/O), as it was before. But your Python signal handler must raise an exception.
The code is not magic. When the alarm is triggered, the C signal handler is called, which sets a flag "SIGALRM was triggered". Then read() fails with EINTR. Python calls the Python signal handler. If the Python signal handler does not raise an exception, read() is retried. Otherwise, os.read() fails with the exception raised by the Python signal handler.
C code of the _Py_read() function used by os.read(): https://hg.python.org/cpython/file/e01289b08ca8/Python/fi...
Python 3.5 is on its way
Posted Jul 16, 2015 17:38 UTC (Thu) by dashesy (guest, #74652) [Link]
Python 3.5 is on its way
Posted Jul 17, 2015 8:12 UTC (Fri) by tdalman (guest, #41971) [Link]
Which wrong connotations do you expect from using '@'?
Python 3.5 is on its way
Posted Jul 17, 2015 16:13 UTC (Fri) by dashesy (guest, #74652) [Link]
Python 3.5 is on its way
Posted Jul 18, 2015 3:21 UTC (Sat) by njs (subscriber, #40338) [Link]
FWIW I've found that the @ notation grew on me pretty quickly, and that it's *really nice* to have an operator that unambiguously means matrix multiplication -- I've already found myself reaching for it in talks and such.
Python 3.5 is on its way
Posted Jul 17, 2015 19:05 UTC (Fri) by zuki (subscriber, #41808) [Link]
>>> x = 1
>>> x . numerator
1
Python 3.5 is on its way
Posted Jul 17, 2015 19:13 UTC (Fri) by zuki (subscriber, #41808) [Link]
> APL apparently used +.× , which by combining a multi-character token, confusing attribute-access-like . syntax, and a unicode character, ranks somewhere below U+2603 SNOWMAN on our candidate list.
Python 3.5 is on its way
Posted Jul 24, 2015 15:00 UTC (Fri) by kata198 (guest, #103726) [Link]
>>> datetime.\
... datetime.\
... now()
datetime.datetime(2015, 7, 24, 10, 59, 46, 7099)
>>> datetime . datetime .now()
datetime.datetime(2015, 7, 24, 11, 0, 12, 950078)
Python 3.5 is on its way
Posted Jul 18, 2015 15:30 UTC (Sat) by tack (guest, #12542) [Link]
As far as I remember, the new async/await stuff isn't just syntax. It allows for asynchronous context managers and async iterators, which I understand aren't possible with tulip.
Also, just wanted to give a tip of the hat to zipped applications. Ever since I discovered them not long ago (I think it was when poking at the youtube-dl app), I've used them extensively. Not just useful for bundling code, but also data. For example, an embedded webserver and all the static files (javascript, images, etc.) can be bundled inside a single downloadable file. (I use this for personal projects as well as work projects where it's common for me to include supporting Java jars to be spawned by Python.)
Python 3.5 is on its way
Posted Jul 21, 2015 22:05 UTC (Tue) by sjj (subscriber, #2020) [Link]
Type hints
Posted Jul 23, 2015 9:25 UTC (Thu) by swilmet (subscriber, #98424) [Link]
But writing the type explicitly is a way to self-document the code. Knowing the type of a variable is important to understand the code, to know what the variable represents and which methods can be called on it. It is even more important for function parameters, because for those it takes more time to look at a possible assignment (one has to go through the function calls to find where the parameter is assigned).
For the author of the software, this may not be a problem, because she/he knows the code well and the variable name is sufficient to know which class it represents. But it isn't true (1) when working again on the code after a pause of, say, one year, (2) for someone discovering the code, or (3) for a really big codebase.
Python 3.5 is on its way
Posted Jul 24, 2015 1:29 UTC (Fri) by sstewartgallus (guest, #99898) [Link]
> os.close , close() methods and os.dup2() are a special case: they will ignore EINTR instead of retrying. The reason is complex but involves behaviour under Linux and the fact that the file descriptor may really be closed even if EINTR is returned.
I knew close was a clusterfuck but I didn't know that the shit invaded dup2 as well. This is disconcerting.
Python 3.5 is on its way
Posted Jul 24, 2015 2:17 UTC (Fri) by mathstuf (subscriber, #69389) [Link]
Python 3.5 is on its way
Posted Jul 24, 2015 14:58 UTC (Fri) by kata198 (guest, #103726) [Link]
Python 3.5 is on its way
Posted Jul 24, 2015 15:35 UTC (Fri) by mathstuf (subscriber, #69389) [Link]
Python 3.5 is on its way
Posted Jul 26, 2015 19:04 UTC (Sun) by nix (subscriber, #2304) [Link]
Python 3.5 is on its way
Posted Jul 26, 2015 21:35 UTC (Sun) by mathstuf (subscriber, #69389) [Link]
Python 3.5 is on its way
Posted Jul 25, 2015 7:50 UTC (Sat) by tuna (guest, #44480) [Link]
Python 3.5 is on its way
Posted Jul 31, 2015 13:00 UTC (Fri) by ceplm (subscriber, #41334) [Link]
Python 3.5 is on its way
Posted Jul 31, 2015 19:45 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]
Python 3.5 is on its way
Posted Sep 14, 2015 7:35 UTC (Mon) by MKesper (subscriber, #38539) [Link]
I absolutely see 2.x as a mess in that area. Could the new PEP 461 be of help to you?
Python 3.5 is on its way
Posted Sep 15, 2015 8:07 UTC (Tue) by Otus (subscriber, #67685) [Link]
Python 3 is catching up to 2 in my book with these reintroduced features.
Python 3.n+1 on RHEL by way of Software Collections
Posted Sep 13, 2015 18:44 UTC (Sun) by scottt (guest, #5028) [Link]
I think more people who use Python on RHEL should give Software Collections a try. It certainly helped on the systems I maintain.
Python 3.5 is on its way
Posted Sep 14, 2015 10:55 UTC (Mon) by pboddie (subscriber, #50784) [Link]
It's also interesting to note that various compiler-related performance enhancements are likely to be supported in Python 2.x because Intel apparently have an interest in maintaining support for it: another area in which Python 2.x was neglected because "it isn't the future" until someone finally synchronised with reality.