Python's lambda is broken!
- 09 April 2009
- General
I quite like Python for teaching. And people praise it for the lambda
construct which is a bit like $\lambda$-abstraction in functional languages. However, it is broken!
To see how
lambda
is broken, try generating a list of functions $[f_0, ..., f_9]$ where $f_i(n) = i + n$. First attempt:
>>> fs = [(lambda n: i + n) for i in range(10)]
>>> fs[3](4)
13
Wait a minute, fs[3](4)
ought to be 3 + 4 = 7
! It looks like all 10 functions share the same “last” value of i
, which is 9
. Indeed:
>>> [f(4) for f in fs]
[13, 13, 13, 13, 13, 13, 13, 13, 13, 13]
This is certainly unexpected. Let us try to get around the problem by not using lambda
:
>>> fs = []
>>> for i in range(10):
... def f(n): return i+n
... fs.append(f)
...
>>> [f(4) for f in fs]
[13, 13, 13, 13, 13, 13, 13, 13, 13, 13]
Still not working, so the reason is deeper, probably in the way environments are handled. Maybe like this:
>>> fs = []
>>> for i in range(10):
... def f(n, i=i): return i+n
... fs.append(f)
...
>>> [f(4) for f in fs]
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
Victory! But try explaining to students what is going on.
Just to be sure, Haskell does the right thing, of course!
Prelude> let fs = [(\n -> i + n) | i <- [0..9]]
Prelude> [f(4) | f <- fs]
[4,5,6,7,8,9,10,11,12,13]
What were the implementors of Python thinking?!
andrejbauer@mathstodon.xyz
, I will
gladly respond.
You are also welcome to contact me directly.
Comments
This is not just a Pythonic phenomenon. IIRC, both Ruby and C# (and doubtless more languages besides) all make this choice for closures over mutable variables. It definitely confuses you when you see it for the first time, but the alternative behaviour you want has the potential to be even more confusing in some situations IMHO.
Note also that:
for (int i = 0; i i' + x); }
-- Now I want the 'incorrect' behaviour: for (int[] i = { 0 }; i[0] i[0] + x) }
It's swings and roundabouts as usual :-)
I'm not particularly familiar with Python, but Lisp folks sometimes run into the same problem using lambda with loop. It's not a problem with lambda, but rather a question whether loop variables are assigned to on each iteration or whether a new binding is established. (See http://groups.google.com/group/comp.lang.lisp/msg/3838212e836ff0ca for a recent manifestation.) The difference is essentially:
Establish a binding i; For each integer in the range do: set i to the integer and evaluate the body of the iteration construct.
vs.
For each integer in the range: establish a new binding from i to the integer and evaluate the body the iteration construct.
I think the same issue is described here: http://mail.python.org/pipermail/python-list/2003-September/224902.html . It seems that Python's lambda isn't broken, but that there's not a convenient way to establish local bindings.
Indeed, consider:
>>> def makeAdder(i): return (lambda n: i + n) ...
>>> fs = [makeAdder(i) for i in range(10)]
>>> [f(4) for f in fs]
[4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
Isn't this the same thing many languages do, and a "feature" of the implementation idea known as "closures"?
For example, in JavaScript:
var fs = {}; for(var i=0; i-5; i++) { fs[i]= function(n) { return n+i; } } for(var j in fs) { console.log("fs[" + j +"] = ",fsj); }
(used i-5 just in case Wordpress chokes on the less-than sign) gives fs0 = 110 fs1 = 110 fs2 = 110 fs3 = 110 fs4 = 110
Even worse,
var fs = {}; for(var i=0; i-5; i++) { fs[i]= function(n) { return n+i; } }
for(var i in fs) { console.log("fs" + i +" = ",fsi); }
gives
fs0 = 1000 fs1 = 1001 fs2 = 1002 fs3 = 1003 fs4 = 1004
but that can be blamed on JavaScript.
You could call it either a bug in the languages, or a failure on our part to understand the implications of late-binding and the dynamic goodness.
(Quite a pain to have to register before posting comments: have you considered OpenID?)
Oops, those 110s should be 105s. (I had a loop running until 10 originally, then I changed it in the comment...)
Actually, the implementors of Python are no crazy (or maybe they are, but not for that reason). The fact is, Python is not primarily a functional language and thus variables are by default mutable object. When in your examples the closure is created, i is a reference in a way, not a value. Take the following Python example : >>> i = 1 >>> def f(): return i >>> f() 1 >>> i = 2 >>> f() 2 I think that everybody expect this behavior and nobody would want the second call to f() to return 1. So yes, I can get that in the example you show the comportement may seems surprising and yes I'm sure it is not the simpliest thing to explain to student but lambda in Python are not broken. They are just lambda in a non functional language.
Note that you can "emulate" the same comportement in Haskell by using references (but it will involve a monad and not be as clean as the 2 lines you give).
The problem is not the lambda expression but the fact that list/ comprehensions leak variables. AFAIK this is fixed in Python 3.0.
This works:
fs = [(lambda n, i=i: i + n) for i in range(10)]
It's sad, and it's a known issue. Read, for example this thread:
http://mail.python.org/pipermail/python-list/2002-April/thread.html#142190
The messages are dated 2002... :-(
Javascript is the same way (and Perl iirc). The brokenness is more to do with the for loop than with lambda: i points to the same location at each iteration, and the location is mutated. That makes perfect sense if you're coming from C, but interacts badly with lambda as you point out.
Looks like (as you say) the problem is not with lambda, but with the environment - the 'i' in the lambda, as with the named function, is bound to the global 'i':
>>> fs = [(lambda n: i + n) for i in range(10)] >>> fs3 13 >>> i=38 >>> fs3 42
The same 'fix' as in your def works for lambda as well: >>> fs = [(lambda n, i=i: i + n) for i in range(10)] >>> fs3 7
Lua, one of my favorite little languages, also gets it right: > f={} > for i = 0, 9 do f[i] = function(n) return i + n end end > print(f3) 7 > i=38 > print(f3) 7
At least there are many other ways to solve this problem. Your last example makes absolute sense as to how it should be done.
Before making such a sensational claim, you should do your homework. What's broken is your understanding of Python.
Python looks up names at run-time. Your lambdas functions reference the non-local name "i" (i.e., the name "i" is not defined in the local scope), so Python has to look outside the local scope. It finds that the current value of "i" is 9. In order to retain the value of "i" at lambda definition time, you need to declare local names:
>>> fs = [(lambda n, j=i: j + n) for i in range(10)] >>> fs3 7
Hint: default values are calculated at definition time, not run time. So "j=i" effectively curries the value of "i" at definition time. I used "j" inside the lambda to make it easier to understand; you should avoid using the same name in multiple scopes.
And BTW, requiring a login for an individual blog is super-arrogant. No wonder you aren't getting more comments. See the reddit page for this article: http://www.reddit.com/r/Python/comments/8baq0/pythons_lambda_is_broken/
This is the very same lambda as is in Perl, Ruby, C#, (probably lisp and scheme, too). The brokenness here is more about list comprehensions/iterators using a single mutable variable instead of a new binding. In python it is particularly hard to get it right, because variables are def-scoped, rather than block scoped. "Obvious" optimizations for imperative programmers = hell for functional programmers.
Lambdas closing over mutable variables get hard to think about really fast.
Not broken, though, admittedly, when I first ran into this I threw up in my mouth a little. C# actually handles lambdas the way you'd expect but I believe Actionscript 3 has the same issue as Python here. A much simpler looking change would be to do this:
fs = [(lambda n, i = i: i + n) for i in range(10)]
or for clarity
fs = [(lambda n, k = i: k + n) for i in range(10)]
which is also the same as doing this:
def statefulLambda(i): return lambda n: i + n
fs = [statefulLambda(i) for i in range(10)]
As luqui said, you basically need to capture the state of the i variable in a block. Since Python doesn't do that for you, it means you need to initialize a new variable to store that data in with each function you create. At least that's how I think about it.
If you're teaching imperative languages to students with FP background, you definitely want to give the example to them as a challenge ;)
Actually, a better way of doing it is to use lambdas the way Church made them:
Shouldn't that Haskell list comprehension read [(\i -> i + n) | n <- [0..9]], instead?
This has nothing to do with lambdas, but with the scoping rules of Python. Not being privy to the inner workings of the implementors' minds, I can only guess at 'what they were thinking', but it's probably something along the lines of 'simple to implement and easy to explain'.
Each call to a function/lambda creates a scope. Any function/lambda defined within that scope has access to that scope (albeit strictly read-only in Python 2.x) but all functions/lambdas defined within that call share the same outer scope.
Anyway, the fix is quite simple (and you basically found that solution yourself already):
fs = [(lambda n, addend=i: addend + n) for i in range(10)]
The trick here is that default values of parameters are evaluated just once when the function/lambda is created. This, too, is sometimes unknown to less experienced Python programmers. To quote a co-worker completely out of context: "It's really intuitive once it's explained to you!"
First a reply to ludwig: yes, thank you, that's what I get for changing the source without rechecking if it still works. I fixed it.
A reply to goodger: this is not a popularity contest. My motivation is not to get a maximum number of replies. I prefer fewer well though out replies than a lot of fly-by comments. If registration is too much burden for you, then don't register. Anyhow, thank you for your comment.
I thank everyone else for their explanations. I still wonder how I should explain this to my students (beginners, some of them have not seen any other programming language in their lives).
After seeing some of your posts, I discovered a much bigger horror in Python:
Oh my,
i
is still defined after the list comprehension. This is certainly unexpected. Same thing for loops:While I could accept the explanation about how closures capture the location and not the value, this really is a design error. The loop variable ough to be local to the loop. But I suppose I get what I deserve for using a dynamic language.
Reply to Andrej: no, it's not because of Python is dynamic. For example in Lua, a dynamic tiny language, but actually a pearl, this is what happens:
~ $ lua Lua 5.1.3 Copyright (C) 1994-2008 Lua.org, PUC-Rio > print(foo) nil > for foo = 1,3 do print(foo) end 1 2 3 > print(foo) nil
As you can see, the first time you try to print the foo variable you obtain a nil value, because in Lua any name is binded to nil. In the for loop the 'foo' name is binded to the correct integer value. After the loop, foo returns to it's non-value, nil. And our souls can rest :-)
How about this variation
>>> def def_f(i): return lambda n: i + n >>> fs = [def_f(i) for i in range(10)] >>> fs3 7
http://www.python.org/doc/2.5.2/ref/naming.html
4.1 Naming and binding
Names refer to objects. Names are introduced by name binding operations. Each occurrence of a name in the program text refers to the binding of that name established in the innermost function block containing the use.
A block is a piece of Python program text that is executed as a unit. The following are blocks: a module, a function body, and a class definition. A script file [...] is a code block. A script command [...] is a code block. The file read by the built-in function execfile() is a code block. The string argument passed to the built-in function eval() and to the exec statement is a code block. The expression read and evaluated by the built-in function input() is a code block.
List comprehension control variable doesn't leak anymore in py3k, but loop variable has been keep as it in 2.0. There is a discussion about this topic http://markmail.org/thread/cooz3lwiwllsnmug
I am trying to respond in an honest constructive manner here and having a bit of a time phrasing things politely. I will just try to be succinct, please accept that my goal is not to offend.
Please stop your sensationalism. "Python's Lambda is broken!" and in your comment #20, "this really is a design error." are really just indications of your lack of programming knowledge. It seems like you are familiar with math and think that programing languages should behave by the same rules. This is not the case.
The examples you have given show Python operating exactly as designed, as you would know if you understood the language. If you are just trying to use python as a functional calculator then this is probably not what you want, but keep in mind that Python handles problems from a much larger domain. Python also has clearly defined scoping rules, which you can look up if you are interested in actually understanding the behavior that "you" think is incorrect.
Dear DriveByComment: you did manage to respond in an offending manner. Before you accuse me of not knowing about programming languages, you should perhaps look around to see who I am and what I do. I do know a bit about the theory of programming languages, you know.
But leaving personal attacks aside, I do not understand your comment. I claim the features that lead to the behavior of Python, as demonstrated above, are design errors. You interpreted that as my not understanding Python (its scoping rules). I repeat: the decisions in the design of Python which lead to the above behavior are bad, i.e., the way Python actually behaves (not the way I think it behaves) is wrong.
I see two problems. The one that is easy to fix is the craziness with variables escaping the scope of a list comprehension or a loop. As jaux (comment 24) pointed out Python 3.0 fixes list comprehensions, and probably leaves the loops broken for backward compatibility.
The second problem causes lambda to behave unexpectedly, but trouble really lies with Python's idea of how scoping and environments work, so there is no easy fix. For example, in a sane programming language this should cause an error of the kind "I don't know what
i
is":But Python happily accepts the definition and emits an error message when
g
is applied. This is going to confuse the programmer, because the error message points to the wrong place. I understand what the benefits are (such as allowing mutually recursive definitions without forward declarations), but they do not outweigh the trouble that the design decision causes.Hi Andrej, In your response #26 can you expand on why you think the acceptance of the definition of g to be "crazy"?
I don't see it at first glance. as long as i is defined when g is called, their is no problem. Remember that Python is a dynamic language and it could be quite correct to change the global context in which g is run and supply a definition for i; or trap the resultant exception as part of a programs functionality.
I find that if I have a problem with Python, then a calm post to comp.lang.python usually allows me to discuss the issue with the community and find out more about the issue and its history. I've also seen other peoples issues discussed, enough times to know that language design is hard. Everyone knows the mechanics it seems, but the devil is in the detail.
Dear Paddy, I have expanded on some of my views in this post.
@Andrej (20)
This isn't a design error. It's quite intentional actually. It's just a matter of Python having different scoping rules and treating all statements as executable. Guido has a pretty good overview of how this came about.[1]
Think about it this way. In a language like C, a statement like this:
is a directive to the compiler. Where as in python, this statement:
is an executable statement that's run at runtime. I would argue that the case that you mention is one of those "just because you can doesn't mean you should" situations. However, this does allow for some more natural syntax. Consider the following:
This is more natural than the alternative even if only by a little bit.
[1] http://python-history.blogspot.com/2009/03/how-everything-became-executable.html
Reply to Jason (29): Design choices are (almost) always intentional. I really am saying that the intentional design choice is mistaken. But I understand that it fits in with the rest of Python.
It seems like people think I hate Python. I teach python programming (by my choice). I also do web programming with Django, and it's much better than other things I've tried (perl, ocaml, java).
Hi Andrej,
You state: "It seems like people think I hate Python."
Your title: "Python’s lambda is broken!" is emotive.
It is not an issue i think.. it is just variable scoping... try this, it should work
fs= map(lambda i:lambda n: i + n, range(10))
isn't map deprecated? It's dangerous out there, use this:
>>> fs = [(lambda i: lambda n: i + n)(i) for i in range(10)] >>> fs3 7
C# does the same thing. A damn shame.
En Scheme la semántica básica de las capturas es la misma que en Python y JavaScript
(define (ruma) (let ((k 10)(fs '())) (define (r) (if (= k 1) fs ((lambda () (set! k (- k 1 )) (set! fs (cons (lambda () write k) fs)) (r) )) ) ) (r) ) )
Todas las funciones de la lista devuelta por esta función nos dan el valor 1, el último valor que toma k. Esto implica que las clausuras de la lista capturan la referencia a k, no su valor.
[...] closure-by-reference can produce side effects that surprise developers, such as: def surprise(): funs = [lambda: x ** 2 for x in range(6)] assert [...]
I don't see the big problem here. Obviously, it's a bit confusing, but as others have stated, this is often the way closures are implemented in modern multi-paradigm languages. In my opinion, this behaviour should be expected in this language with so much mutable state. Anyway, the problem here is not that the lambdas capture values, but how the for loop works.
However, to repeat earlier posters, there are other solutions.
>>> a = lambda n: lambda nn: n + nn >>> a(3)(5) 8
Starting with this approach is imho more flexible, clearer, reads better and possibly more memory efficient. If one then should need/want a list of a specific lambdas of this form, one could do so with classic list comprehension as suggested earlier. Stuff like "i=i" etc. are hacks that can easily be avoided :)
If a lambda function can be converted to work as a non-lambda function with little effort, why add the unnecessary complexity.
It's pretty obvious, that from a practical standpoint, most people can't inherently grasp the idiosyncrasies of lambda expressions. Otherwise, you wouldn't have so many comments on this thread explaining what it tries to accomplish.
I think, GvR has a pretty good understanding that his language is used extensively in production environments (not just academic); and, adding features for features sake is a mistake.
If you want a dynamically-typed, light-weight, functional programming language that supports useless constructs like lambda expressions why don't you write your own. Wait, there are already 11 different dialects of Lisp.
Don't forget, some of us actually use python to produce code to make money.
@Evan: your argument can be applied to other constructs as well. In Python we do not really need dictionaries because they can be replaced by association lists. I can tell you from personal experience that dictionaries are far more confusing for students than lists. This line of reasoning does not work very well. But yes, a language needs a balanced set of features.
The question is whether lambda belongs to such a set. There are many reasons why the answer is positive. For example, lambda abstraction is one part of a fundamental mathematical concept, namely that of a function, the other part being application (of a function to an argument). It seems clear to me that such a primitive concept can be useful and probably should be part of a language.
If we designed languages according to what "production environment workers" can comprehend, we'd still be hunting with sticks and stones.
[...] http://www.ymeme.com/python-vs-lisp.html http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ [...]
[...] [2] http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ [...]
[...] http://docs.python.org/reference/expressions.html#lambda [2] http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ [3] http://diveintopython.org/power_of_introspection/lambda_functions.html [4] [...]
I know this discussion is long since over, but I recall talking about the very same issue recently on somebody else's blog and, as far as I can see, all the explanations here are incorrect. (All the ones there were too; it's tricky.) The issue is not scoping or broken lambda-expressions; rather, it depends on the fact that Python implements list comprehensions via mutation.
For my reference and yours, here is your first example:
>>> fs = [(lambda n: i + n) for i in range(10)] >>> fs[3](4) 13
When the lambda-expression is evaluated, the variable
i
is properly captured in a closure. In Python, a variable is, as you know, a reference to a mutable cell. However, Python list comprehension is evaluated by updatingi
for each element in the domain, not rebinding it. This means that you are creating 10 closures with 10 copies of references to the same mutable cell. By the timefs
is bound, the comprehension has been evaluated, so the value in the cell is 10, so each dereference will return 10.In contrast, in Ocaml the
for-do
construct is implemented by rebinding the index for each evaluation of the body. However, if you were to implement list comprehensions using camlp4 in the same way as Python, by mutation, you might encounter the same problem in Ocaml, depending on whether you you capture the cell or its contents.This code, which is the natural and erroneous solution:
[ (fun n -> !i + n) for i in 1 to 10]
would behave like the Python code, because the cell is captured and only dereferenced after the list is built.
This code:
[ let i' = !i in (fun n -> i' + n) for i in 1 to 10 ]
would behave as you expected, because
i
is dereferenced while the list is being built.I don't know if Python provides a way to get the same effect (I am a Haskeller); it displays the usual imperative confusion between variables and cells. But I think I've demonstrated that, at least in this respect, Python's lambda does not behave any differently than you would expect an ML language's to behave.
Andrej, was just checking back and noticed someone had already given my explanation about location vs. value. Then I noticed your remark about the comprehension leaving
i
in the global scope. (I agree this is laughable.) I wondered if my explanation was wrong and Python was just referencing the leftover globali
. Actually, it seems both explanations are right, since you can alter the value of the closure by assigning to the leftover global!>>> i=2 >>> i 2 >>> fs = [(lambda n: i + n) for i in range(10)] >>> fs[3](0) 9 >>> i 9 >>> i = 0 >>> fs[3](4) 4
Well this is my first try with Ruby:
fns = (0..9).collect {|i| lambda {|n| i + n}}
p fns[3].call(4)
indeed it prints 7 :-)
Following on from what Svat said, in introductory javascript books, there is always a warning about assigning event listeners in loops with the value for the loop exit condition being assigned, instead of the 1, 2, 3, etc expected. The solution given is to use a closure and the
(function(n){})(n);
construct. [sourcecode gutter="false"]</pre> var fs = {}; for(var i=0; i<5; i++) { fs[i]= (function(i){ return function(n) { return n+i; } })(i); }for(var j in fs) { console.log("fs[" + j +"] =" ,fsj); }
I am reasonably new to functional programming, but the supposed "issue" with python seems quite reasonable when viewed through javascript eyes.
I realize this conversation is long over, but for the sake of the people who may travel the same muddy path of the Python lambda and land on the same page, a little bit of clarification is necessary.
@Frank List comprehensions "leaking" variables has to do with their implementation in Python 2.X, namely, using Python's for loops, which do not make their own local scope and thus "leak" variables by design. This has been fixed in Python 3.X.
The problem with lambdas in list comprehensions has to do with references and mutation-based loops.
strange... tested 4 years later in python 3.3.2 still broken..
Whether it is wrong, I won't discuss (Andrej is smart, and he says it is, he probably sees something I don't). But the thing is, it cannot be "fixed" (made to work like Andrej expects it to) without completely changing the rules for variables lookup. And it probably won't be Python anymore, it would be too different (for example, I have a strong hunch it would have to require declarations for variables). Because you'd have to close around everything accessible at the moment of definition of function, and you don't really know what that could be (it can call some other function, that sees something else, and whether that function is called might be undecidable).
Besides, I see mcmanus still didn't get a reply. I'll try once more.
i=1 f=lambda:i i=2 f()
What do you expect?
from random import randrange f=lambda:randrange(1e5) f()!=f()
And now?
@Aaron: yes, I think someone else pointed out a similar solution. But really, that doesn't clarify anything. It's a workaround.
[…] is similar to the topic discussed here: http://math.andrej.com/2009/04/09/pythons-lambda-is-broken/ . In some programming languages, when you loop over a variable, the body of the loop construct is […]
[…] You just have to think a little bit different. I’m sure soon you will love it. But be careful if you deal only with python. Because the lambda is not a real closure, it is “broken” somehow: pythons lambda is broken […]
[…] You just have to think a little bit different. I’m sure soon you will love it. But be careful if you deal only with python. Because the lambda is not a real closure, it is “broken” somehow: pythons lambda is broken […]
[…] For a completely random example, from the article “Python’s lambda is broken!”: […]
[…] Alex is right, it's not a bug. But it's a "gotcha" that traps many lambda enthusiasts. For the "bug" side of the argument from a haskel/functional type, see Andrej Bauer's post: math.andrej.com/2009/04/09/pythons-lambda-is-broken […]
[…] might find that Andrej Bauer got similar lambda problems. What’s interesting on that blog is the comments, where […]