Naming values with local bindings
Algorithm designers regularly find it useful to name the values their algorithms process. We consider why and how to name new values that are only available within a procedure.
Introduction
When writing programs and algorithms, it is useful to name values we compute along the way. For example, in an algorithm that, given a list of numbers, sorts that list of numbers, it may be useful to name the sorted list along the way. When we associate a name with a value, we say that we bind that name to the value.
So far we've seen three ways in which names are bound to values in Scheme.
- The names of built-in procedures, such as
+andquotient, are predefined. There are also some predefined values, such aspi. When the Scheme interpreter starts up, these names are already bound to the procedures they denote. - The programmer can introduce a new binding by means of a definition. A definition may introduce a new equivalent for an old name, or it may give a name to a newly computed value.
- When a programmer-defined procedure is called, the parameters of the procedure are bound to the values of the corresponding arguments in the procedure call. Unlike the other two kinds of bindings, parameter bindings are local -- they apply only within the body of the procedure. Scheme discards these bindings when it leaves the procedure and returns to the point at which the procedure was called.
As you develop more and longer procedures, you will find that there are many times you want to create local names for values that are not parameters. We will consider such names in this reading.
Redundant work
You will find that there are many times when you are designing algorithms that you end up telling the computer to do the same thing again and again and again. For example, here's a bit of code that determines the ratio of vowels to consonants and let's see how well it works.
;;; (tally f l) -> number?
;;; f : function?, a predicate over elements of l
;;; l : list?
;;; Returns the number of occurrences that f returns
;;; #t for each element of the list.
(define tally
(lambda (f l)
(if (null? l)
0
(+ (if (f (car l))
1
0)
(tally f (cdr l))))))
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(lambda (ch)
(or (char=? (char-downcase ch) #\a)
(char=? (char-downcase ch) #\e)
(char=? (char-downcase ch) #\i)
(char=? (char-downcase ch) #\o)
(char=? (char-downcase ch) #\u))))
;;; (consonant? ch) -> boolean?
;;; ch : char
;;; Determine if ch is a consonant
(define consonant?
(lambda (ch)
(and (char-alphabetic? ch) (not (vowel? ch)))))
;;; (v2c-ratio str) -> rational?
;;; str : string
;;; Determine the ratio of vowels to consonants in str
(define v2c-ratio
(lambda (str)
(/ (tally vowel? (string->list str))
(tally consonant? (string->list str)))))
(v2c-ratio "Hello")
(v2c-ratio "Aaargh!")
(v2c-ratio "aeiouxy")
; Whoops!
(v2c-ratio "a")
'Eh. It's good enough for now.
But there's a problem with the design. We're repeating some work.
Please identify some points at which we repeat work.
We mean it.
That is, stop here and ask yourself, "Where does the code duplicate work?"
Are you done?
Are you sick of these pauses? If more students paused when we asked questions, we wouldn't have to insert these kinds of notes.
Anyway, work is being repeated in at least three ways.
- First, we call
(char-downcase ch)as many as five times. It feels like we should be able to do it once. - Second, we call
(string->list str)twice. It feels like we should be able to do it once. This may even have a bigger effect than the five calls tochar-downcasebecause building lists can be "expensive". - Third, we ask if many letters are a vowel twice, once when counting the vowels and once when counting the consonants.
That last problem is going to be hard to deal with, particularly as we
try to keep our code readable. So let's focus on the first two. Can
we avoid calling (char-downcase ch) so many times?
Yes, you can fill in the annoying delay now.
Here's one approach: We can decompose the task into two tasks. We'll write one procedure that determines if a character is a lowercase vowel.
;;; (lower-case-vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a lowercase vowel.
(define lower-case-vowel?
(lambda (ch)
(or (char=? ch #\a)
(char=? ch #\e)
(char=? ch #\i)
(char=? ch #\o)
(char=? ch #\u))))
That seems straightforward enough, doesn't it? Our vowel? procedure
can then call that other procedure using a letter converted to lowercase.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(lambda (ch)
(lower-case-vowel? (char-downcase ch))))
About as easy to read as before. A little more typing on our part. And we've saved some computation. In fact, this kind of decomposition is a strategy programmers use fairly frequently: Rewrite procedures that do a sequence of steps by using one procedure for each step.
In fact, now that we've phrased it as a sequence of steps, we can take advantage of composition.
;;; (vowel? ch) -> boolean? ;;; ch : char? ;;; Determine whether ch is a vowel. (define vowel? (o lower-case-vowel? char-downcase))
Detour: Why are we repeating the documentation each time we show
you a new imlementation of the vowel? predicate? Mostly to
remind you that there's a value to writing documentation, even
when our procedures are short and simple.
Will anyone ever use lower-case-vowel? by itself? Possibly.
But if not, we should try something else. Is there something
else?
Racket's let Expressions
There is. Racket provides let expressions as a way to bind
values to names. A let-expression contains a binding list and
a body. The body can be any expression, or any sequence of
expressions, to be evaluated with the help of the local name bindings.
The binding list takes the form of a parentheses enclosing zero or
more binding expressions of the form (name value).
That precise definition may have been a bit confusing, so here's the
general form of a let expression
(let ([name1 exp1]
[name2 exp2]
...
[namen expn])
body
As shorthand, we call each of the name-expression pairs of a
let-expression a binder.
When the Racket evaluator encounters a let-expression, it begins
by evaluating all of the expressions inside its binding specifications.
Then the names in the binding specifications are bound to those
values. Next, the expressions making up the body of the let-expression
are evaluated, in order. The value of the last expression in the
body becomes the value of the entire let-expression. Finally, the
local bindings of the names are cancelled. (Names that were unbound
before the let-expression become unbound again; names that had
different bindings before the let-expression resume those earlier
bindings.)
What do we mean by "bound"? As you may recall, the Racket evaluator maintains a table of names and corresponding values. We call that association a "binding". When the evaluator encounters a name, it looks up the binding in the table.
Here's one way to write vowel? with let and
without helpers.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(lambda (ch)
(let ([lc (char-downcase ch)])
(or (char=? lc #\a)
(char=? lc #\e)
(char=? lc #\i)
(char=? lc #\o)
(char=? lc #\u)))))
Important! Note that even though binding lists and binding
specifications start with parentheses, they are not procedure
calls; their role in a let-expression simply to give names to
certain values while the body of the expression is being evaluated.
The outer parentheses in a binding list are structural -- they are
there to group the pieces of the binding list together.
As we've seen, using a let-expression often simplifies an expression
that contains two or more occurrences of the same subexpression.
The programmer can compute the value of the subexpression just once,
bind a name to it, and then use that name whenever the value is
needed again. Sometimes this speeds things up by avoiding such
redundancies as the recomputation of values. In other cases, there
is little difference in speed, but the code may be a little clearer.
Comparing let and define
You may have missed it, but there are a few subtle and important
issues with the use of let rather than define to name values
and procedures. One difference has to do with the availability
(or scope) of the name. Values named by define are available
essentially everywhere in your program. In contrast, values named by
let are available only within the let expression. (In case you were
wondering, the term scope has nothing to do with the mouthwash.)
In addition, local variables (given by let) and global variables (given by our
standard use of define) affect previous uses of the name differently (or at
least appear to). When we do a new top-level define, we permanently replace
the old value associated with the name. That value is no longer accessible. In
contrast, when we use let to override the value associated with a name, as
soon as the let binding is finished, the previous association is restored.
Finally, there's a benefit to using let instead of define according to
the principle of information hiding. Evidence suggests that programs
work better if the various pieces do not access values relevant primarily
to the internal workings of other pieces. If you use define for your
names, they are accessible (and therefore modifiable) everywhere. Hence,
you enforce this separation by policy or obscurity. In contrast, if
you use let to define your local names, these names are completely
inaccessible to other pieces of code. We return to this issue in our
discussion of the ordering of let and lambda below.
Our mental model of computation and let
Now that we've discussed how let works at a high-level, let's consider how let behaves precisely within the context of our mental model of computation.
For example, consider the following expression:
(let ([x (+ 1 1)]
[y (* 2 3)]
[z (- 8 5)])
(+ x y z))
We think about evaluating a let-expression in multiple stages.
- First, we evaluate the expression of each binding of the
letto values. - We then (a) substitute the resulting value for the binder's corresponding variable everywhere that variable occurs in the body of the
letand (b) substitute the body of theletfor the overalllet-expression. - Finally, we continue evaluating the substituted
let-body as normal.
Let's see how this works in our example above.
First, we must evaluate each of the binding expressions in turn:
(let ([x (+ 1 1)]
[y (* 2 3)]
[z (- 8 5)])
(+ x y z))
--> (let ([x 2]
[y (* 2 3)]
[z (- 8 5)])
(+ x y z))
--> (let ([x 2]
[y 6]
[z (- 8 5)])
(+ x y z))
--> (let ([x 2]
[y 6]
[z 3])
(+ x y z))
Now, we must substitute values-of-bound-variables in the body of the let and then substitute the let-body itself.
From the above expression, we will substitute 2 for x, 6 for y, and 3 for z everywhere in the expression (+ x y z).
The resulting expression is what the entire let evaluates to.
From there, we evaluate the expression normally to arrive at a final result.
--> (let ([x 2]
[y 6]
[z 3])
(+ x y z))
--> (+ 2 6 3)
--> 11
Note that we only substitute for the variable as it appears in the body of the let.
We do not substitute for the variable if it occurs outside of the body!
For example, if we nest a let-binding in larger computation:
(+ (let ([x 1])
(* x 5))
(- x 1))
There are two occurrences of x here:
(* x 5): this is the body of theletso we will eventually substitute1for it.(- x 1): this is the second argument to+and is outside the body of thelet, so we do not substitute for it. Presumably, thisxis bound by a different construct, e.g., adefine.
Sequencing bindings with let*
Sometimes we may want to name a number of interrelated things. For
example, suppose we wanted to square the average of a list of numbers.
(It may not sound all that interesting, but it's something that people do
sometimes). Since computing the average involves summing values, we may
want to name three different things: the total (the sum of the values),
the count (the number of values), the mean (the average of the values). We
can nest one let-expression inside another to name both things.
(let ([total (reduce + values)]
[count (length values)])
(let ([mean (/ total count)])
(* mean mean)))
One might be tempted to try to combine the binding lists for the nested
let-expressions, thus:
; Combining the binding lists doesn't work!
(let ([total (reduce + values)]
[count (length values)]
[mean (/ total count)])
(* mean mean))
This approach won't work. (Try it and see!). It's important to
understand why not. The problem is as follows. Within one binding list,
all of the expressions are evaluated before any of the names are
bound. Specifically, Scheme will try to evaluate both (reduce + values)
and (/ total count) before binding either of the names total and
mean; since (/ total count) can't be computed until total and
count have value, an error occurs.
The takeaway message is thus: You have to think of the local bindings coming into existence simultaneously rather than one at a time.
Because one often needs sequential rather than simultaneous binding,
Scheme provides a variant of the let-expression that rearranges the
order of events: If one writes let* rather than let, each binding
specification in the binding list is completely processed before the
next one is taken up:
; Using let* instead of let works!
(let* ([total (reduce + values)]
[count (length values)]
[mean (/ total count)])
(* mean mean))
The star in the keyword let* has nothing to do with multiplication. Just
think of it as an oddly shaped letter. It means do things in sequence,
rather than all at once. While someone probably knows the reason to
use * for that meaning, the authors of this text do not.
let* and our mental model
Our intuition is that let processes each of its bindings independently and in parallel.
In contrast, let* processes each of its bindings sequentially so that later bindings can refer to earlier bindings.
This is precisely how our corresponding mental model of let* works.
In contrast to let:
-
First, we process the bindings in-order by:
- Evaluating the expression of the binding
let*to a value. - Substituting that value everywhere that binder occurs in each successive binding and the body of the
let*. - Peel off the processed binding, reducing the number of binders by one.
- Evaluating the expression of the binding
-
Once we are done processing all the binders, note that we should have substituted for all of variables bound by the
let*(similarly tolet). We then substitute the body of theletfor the overalllet*-expression. -
Finally, we continue evaluating the substituted
let-body as normal.
Here is a simple let* expression and how it evaluates step-by-step in this model:
(let* ([x 5]
[y (* x 2)]
[z (+ x y)])
(+ x y z))
--> (let* ([y (* 5 2)]
[z (+ 5 y)])
(+ 5 y z))
--> (let* ([y 10]
[z (+ 5 y)])
(+ 5 y z))
--> (let* ([z (+ 5 10)])
(+ 5 10 z))
--> (let* ([z (+ 15)])
(+ 5 10 z))
--> (+ 5 10 15))
--> 30
Positioning let relative to lambda
In the examples above, we've tended to do the naming within the body of the procedure. That is, we write:
(define proc
(lambda (params)
(let (...)
exp)))
However, Scheme also lets us choose an alternate ordering. We can instead
put the let before (i.e., outside of) the lambda.
(define proc
(let (...)
(lambda (params)
exp)))
Why would we ever choose to do so? Let us consider an example. Suppose that we regularly need to convert years to seconds. (About a decade ago, Prof. Rebelsky said, "When you have sons between the ages of 5 and 12, you'll understand.") You might begin with
(define years-to-seconds
(lambda (years)
(* years 365.24 24 60 60)))
This produce does correctly compute the desired result. However, it is a bit hard to read. For clarity, you might want to name some of the values.
(define years-to-seconds
(lambda (years)
(let* ([days-per-year 365.24]
[hours-per-day 24]
[minutes-per-hour 60]
[seconds-per-minute 60]
[seconds-per-year (* days-per-year hours-per-day
minutes-per-hour seconds-per-minute)])
(* years seconds-per-year))))
(display (years-to-seconds 10))
We have clarified the code, although we have also lengthened it a
bit. However, as we noted before, a second goal of naming is to avoid
re-computation of values. Unfortunately, even though the number of seconds
per year never changes, we compute it every time that someone calls
years-to-seconds. How can we avoid this re-computation? One strategy
is to move the bindings to define statements.
(define days-per-year 365.24)
(define hours-per-day 24)
(define minutes-per-hour 60)
(define seconds-per-minute 60)
(define seconds-per-year
(* days-per-year hours-per-day minutes-per-hour seconds-per-minute))
(define years-to-seconds
(lambda (years)
(* years seconds-per-year)))
However, such a strategy is a bit dangerous. After all, there is nothing to prevent someone else from changing the values here.
(define days-per-year 360) ; Some strange calendar, perhaps in Indiana.
...
> (years-to-seconds 10)
311040000
What we'd like to do is to declare the values once, but keep them local
to years-to-seconds. The strategy is to move the let outside the
lambda.
(define years-to-seconds
(let* ([days-per-year 365.24]
[hours-per-day 24]
[minutes-per-hour 60]
[seconds-per-minute 60]
[seconds-per-year (* days-per-year hours-per-day
minutes-per-hour seconds-per-minute)])
(lambda (years)
(* years seconds-per-year))))
(years-to-seconds 10)
As you will see in the lab, it is possible to empirically verify that the bindings occur only once in this case, and each time the procedure is called in the prior case.
One moral of this story is *whenever possible, move your bindings
outside the lambda!. Let's return to the vowel?
procedure we wrote above.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(lambda (ch)
(let ([lc (char-downcase ch)])
(or (char=? lc #\a)
(char=? lc #\e)
(char=? lc #\i)
(char=? lc #\o)
(char=? lc #\u)))))
That code is still somewhat repetitious. After all, we're doing the
same thing for each of the cases: comparing. For starters, we can
use lists and their associated functions to reduce the clutter of
the 5-way or call.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(lambda (ch)
(let ([lc (char-downcase ch)]
[vowels (string->list "aeiou")])
(apply (lambda (b1 b2) (or b1 b2))
(map (section char=? lc _) vowels)))))
(vowel? #\t)
(vowel? #\e)
(vowel? #\0)
But that definition requires Scheme to build the list every time
we call the vowel? procedure. It may not
matter if we do that once, or twice, or even a hundred times. But
when we're tallying a list of 42,000 elements, e.g., comparing vowels
to consonants in The Wizard of Oz, that's a lot of
extra work. Hence, we might more sensibly write the following.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(let ([vowels (string->list "aeious")])
(lambda (ch)
(let ([lc (char-downcase ch)])
(apply (lambda (b1 b2) (or b1 b2))
(map (section char=? lc _) vowels))))))
(vowel? #\t)
(vowel? #\e)
(vowel? #\0)
Unfortunately, it is not always possible to move the bindings outside of
the lambda. In particular, if your let-bindings use parameters, then you
need to keep them within the body of the lambda.
;;; (vowel? ch) -> boolean?
;;; ch : char?
;;; Determine whether ch is a vowel.
(define vowel?
(let ([vowels (string->list "aeious")]
[lc (char-downcase ch)])
(lambda (ch)
(apply (lambda (b1 b2) (or b1 b2))
(map (section char=? lc _) vowels)))))
(vowel? #\t)
(vowel? #\e)
(vowel? #\0)
If you try to run this, it will complain that it doesn't know what ch
is. (Or, worse yet, it will use some other ch that bears no relation
to the input to the procedure).
Local procedures
As you may have noted, let behaves somewhat like define in that
programmers can use it to name values. But we've used define to name
more than values; we've also used it to name procedures. Can we also use
let for procedures?
Yes, one can use a let- or let*-expression to create a local name
for a procedure. And we name procedures locally for the same reason that
we name values, because it speeds and clarifies the code.
(define hypotenuse-of-right-triangle
(let ([square (lambda (n) (* n n))])
(lambda (first-leg second-leg)
(sqrt (+ (square first-leg) (square second-leg))))))
Regardless of whether square is also defined outside this definition
(e.g., as a procedure that draws squares), the local binding gives
it the appropriate meaning within the lambda-expression that describes
what hypotenuse-of-right-triangle does.
Note, once again, that there are two places one might define square
locally. We can define it before the lambda (as above) or after the lambda
(as below). In the first case, the definition is done only once. In the
second case, it is done every time the procedure is executed.
(define hypotenuse-of-right-triangle
(lambda (first-leg second-leg)
(let ([square (lambda (n) (* n n))])
(sqrt (+ (square first-leg) (square second-leg))))))
So, which we should you do it? If the helper procedure you're defining
uses any of the parameters of the main procedure, it needs to come after
the lambda. Otherwise, it is generally a better idea to do it before
the lambda. As you practice more with let, you'll find times that each
choice is appropriate. It may be difficult at first, but it will become
clearer as time goes on.
Self checks
Check 1: Exploring let (‡)
What are the values of the following let-expressions? You may use
Scamper to help you answer these questions, but be sure you can
explain how it arrived at its answers.
a.
(let ([tone "fa"]
[call-me "al"])
(string-append call-me tone "l" tone))
b.
(let ([total (+ 8 3 4 2 7)])
(let ([mean (/ total 5)])
(* mean mean)))
c.
(let* ([total (+ 8 3 4 2 7)]
[mean (/ total 5)])
(* mean mean))
d.
(let ([total (+ 8 3 4 2 7)]
[mean (/ total 5)])
(* mean mean))
e.
(let ([inches-per-foot 12.0]
[feet-per-mile 5280])
(let ([inches-per-mile (* inches-per-foot feet-per-mile)])
(* inches-per-mile inches-per-mile)))
Check 2: Comparing let and let*
-
You may have discovered that Check 1b and Check 1c are equivalent. Which do you prefer? Why?
-
Rewrite Check 1e to use
let*.
Check 3: Ratios, revisited (‡)
-
Rewrite
v2c-ratiousing a helper procedure to avoid the redundant call tostring->list. -
Rewrite
v2c-ratiousingletto avoid the redundant call tostring->list.