
Propose Review of Single Element Tuple Handling in Python – Ideas


120% agree. I doubt anyone would argue too strongly that the way of spelling a tuple which we have is the best, especially for beginners.

But adding a new alternative doesn’t necessarily improve the situation.




The syntax is weird because there’s no real corresponding mathematical notion of a 1-tuple to draw from. We had to make it up.

In math, there are ordered pairs, ordered triples, etc., that arise from the Cartesian product of 2, 3, etc. sets. But what’s the product of 1 set (or of zero sets, for that matter: what’s an empty tuple)?

Python drew on this idea to define a single tuple type that is essentially more of an immutable list than a representation of mathematical tuples. In some sense, there is no need for either 0-tuples or 1-tuples: the 0-tuple is just None in disguise, and a 1-tuple is just the first element dressed up for dinner. They only really exist so that we can have a single tuple type instead of multiple 2-tuple, 3-tuple, 4-tuple, etc types.

Practically speaking, there are situations where a 1-tuple can be semantically different from its sole element, but that goes hand-in-hand with other design decisions in the type system. (Haskell, for example, does quite well without a single tuple type, and has no real notion of 1-tuples. The lone value of the unit type is effectively the empty tuple, as suggested by the notation () for both the type and its value, but other higher-order types pretty much fill any imaginable need for a 1-tuple.)




I completely agree.

It’s unfortunate that a precious punctuation character, the comma, has been used for an unusual, albeit cool and interesting construct, the one-item tuple.

Overall I love Python’s syntax and nearly all the decisions and additions over the years. I even like the walrus!

We have error hints now that often correctly suggest “perhaps you forgot a comma?” – maybe there is a way to detect common mistakes with an extra comma?

I’m showing my pedantic streak today, but I’m going to insist on not thinking of it this way.

1, 2 is a tuple in Python (as an expression). It’s great! It’s symmetric with an unpacking assignment (statement):

x = a, b
a, b = 1, 2

1, 2 is a tuple because there’s a comma. Parentheses are optional. This is nice.

So 1, being a single-element tuple is just a “predictable” degenerate case of the existing comma-based tuple syntax.
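A small sketch of that point, in case it helps: the comma is what builds the tuple, and the one-element case follows the same rule. The variable names are only for illustration.

```python
# The comma builds the tuple; the parentheses are optional in these positions.
pair = 1, 2
single = 1,          # one-element tuple: the "degenerate" case
also_single = (1,)   # same value, with optional parentheses
not_a_tuple = (1)    # parentheses alone do nothing: this is the int 1

# Unpacking mirrors packing, including the one-element form.
a, b = pair
(only,) = single

print(type(single).__name__, type(not_a_tuple).__name__)
```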

I do agree that the result is unintuitive for beginners and an occasional “gotcha” even for experienced developers. But why it’s there, and how it’s actually consistent with the rest of the language, is not as simple as this phrasing suggests.

We have error hints now that often correctly suggest “perhaps you forgot a comma?” – maybe there is a way to detect common mistakes with an extra comma?

Linting and autofixing isn’t part of the language. That’s the domain of tools – CLIs, IDEs, etc. So no discussion of Python changes is needed.

It’s actually a pretty accessible space. You can write a linter.
Try writing something using the ast package from the stdlib or libcst. (ast is pretty great and easy to use, I recommend actually doing this. libcst is harder to use but also great.)
Then, test your linter: clone some projects – maybe even cpython – and run the linter on them. Are the results good?

Probably you’ll just be flagging tons of valid and correct usage though.
If you can find a rule which is consistently providing value and which linters should implement, but which mainstream tools (e.g., black, ruff, flake8, flake8-bugbear) don’t have today, you can go and advocate for it.
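To give a feel for how small such an experiment can be, here is a rough sketch using the stdlib ast module. The function name find_bare_single_tuples and the rule itself (flag an assignment whose right-hand side is a one-element tuple written without parentheses) are my own assumptions, not an existing rule in any linter. It relies on ast.get_source_segment and on tuple nodes including their surrounding parentheses in their source position, which I believe holds on Python 3.8+.

```python
import ast


def find_bare_single_tuples(source: str) -> list[tuple[int, str]]:
    """Return (line, snippet) for assignments like `b = 2,` (hypothetical rule)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.Assign, ast.AnnAssign)):
            continue
        value = node.value
        if isinstance(value, ast.Tuple) and len(value.elts) == 1:
            # Heuristic: an explicit "(x,)" is probably intentional,
            # a bare trailing comma is more likely a slip.
            segment = ast.get_source_segment(source, value) or ""
            if not segment.startswith("("):
                hits.append((value.lineno, segment))
    return sorted(hits)


SAMPLE = "class Foo:\n    a = 1\n    b = 2,\n    c = (3,)\n"

for line, snippet in find_bare_single_tuples(SAMPLE):
    print(f"line {line}: single-element tuple without parentheses: {snippet!r}")
```

The startswith check is crude (it would miss something like `x = (1),`), and running it over real projects is the only way to find out how noisy the rule is.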




It’s elegant, but commas do not only create tuples.
For example, tuple(1,) is an error: inside a call, the comma is just a trailing argument separator, so this is the same as tuple(1).
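A quick check of that case, as a sketch: the constructor needs an iterable, so the one-element tuple has to be passed as a single argument.

```python
# Inside a call, the trailing comma separates arguments; it does not build a tuple.
try:
    tuple(1,)            # identical to tuple(1)
except TypeError as exc:
    error = exc          # an int is not iterable

from_tuple = tuple((1,))  # inner parentheses make a 1-tuple argument
from_list = tuple([1])    # any iterable works

print(repr(error), from_tuple, from_list)
```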

This is part of the language, and very nice. Are there people who do not like it? Probably.

$ python -c 'def foo(a=1 b=2)'
  File "<string>", line 1
    def foo(a=1 b=2)
              ^^^
SyntaxError: invalid syntax. Perhaps you forgot a comma?

I don’t think the error message is part of the language. It’s part of one implementation of the language, but not required to be compliant with the language specification.




What you’re seeing there is two somewhat distinct pieces of behaviour.

  1. It is an error to have def foo(a=1 b=2):
  2. Having determined that the code is definitely wrong, what MIGHT you have intended?

The second one is closely related to linting, but linting is done on valid code. There’s very little provision in the language for valid code to produce messages like this (SyntaxWarning is used extremely sparingly, and only for constructs that are almost certainly wrong, often ones that are going to become outright errors in the future).




For sure! Earlier in the thread I was suggesting that it would be nice if a common category of extra-comma errors could be detected, as with the missing comma case.

Ah. To be clear here, I did not mean “error” in the sense of “bug”, but in the much narrower sense of the SyntaxError. It is impossible to parse the token sequence def foo(a=1 b=2) into a valid Python syntax tree. In contrast, python = 1, from your original concern is NOT a syntactic error; it may very well be a bug, but it is perfectly meaningful. Thus it is valid code – code that has a real and reasonable meaning, albeit perhaps not the one you wanted.
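One way to see the difference from inside Python, as a sketch (the helper name is_syntactically_valid is made up for this example): compile() rejects the first snippet outright, while the second compiles and runs and simply produces a tuple.

```python
def is_syntactically_valid(source: str) -> bool:
    """True if the source parses; says nothing about whether it is a bug."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


print(is_syntactically_valid("def foo(a=1 b=2): pass"))  # cannot be parsed
print(is_syntactically_valid("python = 1,"))             # parses fine

namespace = {}
exec("python = 1,", namespace)
print(type(namespace["python"]).__name__)  # a tuple, maybe not what was meant
```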




Good point. It’s not possible to detect common problems on the spot, since it’s valid code. There are often exceptions very closely related to the problem, but then that probably is entering the domain of linters.
If it were possible to augment an exception message for something like this without much risk of being incorrect, that would be great – but I see it’s not as simple. I suppose it’s already too late, when the exception happens:

>>> class Foo:
...   a=1
...   b=2,
...   c=3
... 
>>> x = Foo()
>>> n = x.a + x.b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'int' and 'tuple'

Precious? There are a number of places where commas are used without defining a tuple, one-item or otherwise. Parentheses are used to disambiguate these. 1, vs (1,) is more like 3+5 vs (3+5): only necessary in certain contexts, but harmless in others.

  • [1,2]: a list with two elements
  • [(1,2)]: a list with one element
  • print(1,2): a function call with two arguments
  • print((1,2)): a function call with one argument.
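The four cases above can be checked directly. This is only a sketch; count_args is a throwaway helper standing in for print so that the number of arguments is visible.

```python
def count_args(*args):
    """Report how many positional arguments the call received."""
    return len(args)


two_items = [1, 2]      # comma separates list elements
one_item = [(1, 2)]     # parentheses make the pair a single element

two_args = count_args(1, 2)    # comma separates arguments
one_arg = count_args((1, 2))   # parentheses make the pair a single argument

print(len(two_items), len(one_item), two_args, one_arg)
```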

Rather, I think self-imposed grammar restrictions made it impossible to find a suitable set of explicit delimiters for tuples, relegating them to the parentheses-optional form that we do use. (The only pair of matched delimiters available without going to Unicode is <...>, and using them raises issues, for example, of identifying < as the opening of a tuple or as a comparison operator.)


