Idiomatic usage of Python assignment expressions (PEP 572)
Most PEPs (Python Enhancement Proposal) tend do go under the radar of most developers, but PEP 572 has caused a lot of controversy in the Python community, with some developers expressing an intense dislike for the new syntax (and that is sugar coated).
Personally, I don't think it deserves the vitriol it received, and I'm looking forward to its addition in Python 3.8.
So what are assignment expressions? Currently in Python assignments have to be statements, so you can do foo = bar()
, but you can't do that assignment from within an if
or while
statement (for example). Which is why the following is a syntax error:
if foo = bar():
print(f'foo is {foo}')
else:
print('foo was non zero')
That's often been considered a plus point for Python, because a common error in languages that support this syntax is confusing an assignment =
for a comparison ==
, which too often leads to code that runs but produces unexpected results.
New Assignment Operator
The PEP introduces a new operator :=
which assigns and returns a value. Note that it doesn't replace =
as the assignment operator, it is an entirely new operator that has a different use case.
Let's take a look at a use case for assignment expressions. We have two file-like objects, src
and dst
, and we want to copy the data from one to the other a chunk at a time. You might write a loop such as the following to do the copy:
chunk = src.read(CHUNK_SIZE)
while chunk:
dst.write(chunk)
chunk = src.read(CHUNK_SIZE)
This code reads and writes data a chunk at a time until an empty string is read which indicates that the end of the file has been reached.
I think the above code is readable enough, but there is something awkward about having two identical calls to read
. We could avoid that double read with a loop and a break such as this:
while True:
chunk = src.read(CHUNK_SIZE)
if not chunk:
break
dst.write(chunk)
This works fine, but it's a little clumsy and I don't think it expresses the intent of the code particularly well; five lines seems excessive to accomplish something that feels trivial.
Yet another solution would be to use the lesser known second parameter to iter
, which calls a callable until a sentinel value is found. We could implement the read / write loop as follows:
for chunk in iter(lambda: src.read(CHUNK_SIZE) or None, None):
dst.write(chunk)
This is pretty much how PyFilesystem copies data from one file to another.
I think the iter
version expresses intent quite well, but I wouldn't say it is particularly readable. Even at only two lines, If I was scanning that code, I would have to pause to figure those lines out.
Writing this loop with the assignment operator is also two lines:
while chunk:=src.read(CHUNK_SIZE):
dst.write(chunk)
The assignment to chunk
happens within the while loop expression, which allows us to read the data and also decide wether to exit the loop.
I think this version expresses intent the best. The first line is a nice idiom for read chunks until empty, which I think developers could easily learn to use and recognise.
I don't see this syntax causing much confusion because in nearly every situation, a regular assignment is best. It's true that you could create some difficult to understand code with this feature (especially in list / generator expressions), but that's true of most language constructs. Developers will always be faced with balancing expressiveness and brevity for clarity.
Other examples
The following tweet has links to other places where assignment expressions could be used to simplify common constructs:
FYI @VictorStinner opened some pull requests to CPython for __demonstration purposes__ how PEP 572 assignment expression can be used to make CPython library code more readable/shorter: https://t.co/Gc5IwdqwD1 https://t.co/IxkNMJvo9G https://t.co/eNPKiRgeI6 pic.twitter.com/v6noX5iV6i
— Squeaky (@squeaky_pl) July 5, 2018
Conclusion
I hope that devs will give this syntax another chance. Some developers on the Reddit thread have suggested that they would ban assignment expressions in their code. Hopefully by the time Python3.8 has become more mainstream and assignment expression idioms are more common, they will reconsider.
Interesting idea. I hope it will make it into 3.8. Does it also help to improve the performance? Have you made some benchmarks?
I haven't got benchmarks, but I would expect it to be a little faster. Although I don't think speed was a primary concern here.
Thank you for your detailed explanation! P.S. It seems to me that in the example with
while True:
functiondst.write
should takechunk
instead ofCHUNK_SIZE
.You're absolutely correct. Fixed...
In your example, under New Assignment Operator, should the last line be:
chunk = src.read(CHUNK_SIZE)
instead of
chunk = read(CHUNK_SIZE)
Good spot. Fixed.