Blocks in Ruby

A fundamental Ruby idiom explained.

When we talk about blocks in Ruby, we’re not usually talking about code blocks — or blocks of statements — as we might with other languages. We’re talking about a special syntax in Ruby, and one of its idioms. I’ll be discussing blocks in this article, plus a little about procs and lambdas.

Ruby’s blocks are always associated with methods, which are sets of recallable procedures. Blocks can’t get along very well by themselves. They are dependent on methods, which normally feed data to them. Without that data, a block can’t do anything useful. It needs a parent to look after it.

Blocks are anonymous and are sometimes referred to as nameless functions. Blocks are like methods within another method that grab data from an enclosing method. If all this is unfamiliar to you, it may not make sense. Keep reading and I’ll do my best to clear things up for you.

Block syntax

Here is an example that uses a block with the each method from Ruby’s built-in Array class. The each method is an iterator. An iterator munches data, usually in sequence, and with a little help, can actually do something useful with that data. A block doesn’t have to be attached to an iterator, though that is how blocks are often used.

First, we’ll create an array containing the names of the Western states in the U.S., and then iterate over that array with Array’s each method:
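A minimal sketch of that code might look like the following. The state names are an illustrative subset chosen for this example (an assumption, not the article's original list), and the block simply prints each element.

```ruby
# Illustrative subset of Western states (assumed for this sketch).
west_states = ["California", "Oregon", "Washington", "Nevada"]

# each hands one element at a time to the block as the parameter e.
west_states.each do |e|
  puts e
end
```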

The block parameter e is surrounded by vertical bars. The parameter could have any name you want. (I tend to make mine short.) This particular block uses the parameter locally to keep track of each element in the array west_states, and later uses it to do something with each element of the array, in this case, tidily printing strings to standard output.

You can write a block with do and end, as shown, or with a pair of braces, as is most commonly done. The braces actually have higher precedence than do/end, and the syntax is more concise, as you can see:
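A sketch of the brace form, assuming the same kind of west_states array (the names in it are illustrative):

```ruby
# Illustrative subset of Western states (assumed for this sketch).
west_states = ["California", "Oregon", "Washington", "Nevada"]

# Same iteration as a do/end block, written on one line with braces.
west_states.each { |e| puts e }
```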

Multiple parameters

A block may use more than one parameter. Multiple parameters are separated by commas. Here we’ll iterate over a hash with Hash’s own each method, where multiple parameters make sense:
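As a rough sketch, a hash of states and capitals (hypothetical data picked for this example) can be walked with a two-parameter block, one parameter for the key and one for the value:

```ruby
# Hypothetical sample data: state names mapped to their capitals.
capitals = { "Oregon" => "Salem", "Idaho" => "Boise", "Utah" => "Salt Lake City" }

# Hash#each passes each key-value pair to the block's two parameters.
capitals.each { |state, capital| puts "#{capital} is the capital of #{state}." }
```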

By the way, hashes are handy containers for key-value pairs, as you might have guessed. Also, each has a synonym in Hash: each_pair.

Life without blocks

What happens if you call Array’s or Hash’s each method without a block? Well, iterators expect blocks. Without one, the each method simply returns an enumerator, nothing more.

Iterator methods like each don’t make much sense without blocks. For example, the upto or downto methods from Integer are fairly useless without blocks. Compare these calls, for example:
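A sketch of such calls with blocks attached; the ranges 1 to 3 and 3 to 1 are arbitrary choices for illustration:

```ruby
# Count up from 1 to 3, printing each integer.
1.upto(3) { |i| print i, " " }
puts

# Count down from 3 to 1, printing each integer.
3.downto(1) { |i| print i, " " }
puts
```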

With these:
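A sketch of the blockless versions, again with arbitrary bounds. The variable names up and down are introduced here only to hold the returned objects:

```ruby
# With no block, upto and downto do no iteration of their own;
# each call just hands back an Enumerator.
up = 1.upto(3)
down = 3.downto(1)

p up.class    # Enumerator
p down.class  # Enumerator
```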

Nothing doing with the last two, except returned enumerators. Sort of like watching grass grow.

Scope

In Ruby 1.9 or later, if you use as a block parameter a name that already belongs to a variable in the containing scope, the block assigns that parameter each successive value from the object, but the outer variable’s value is unchanged, as you see here (the to_a method converts the range to an array):
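A small sketch of this behavior, assuming an outer variable named x set to 10 (both the name and the value are arbitrary). Some older interpreters print a shadowing warning when run with warnings enabled, but the result is the same:

```ruby
x = 10

# The block parameter x is local to the block and shadows the outer x.
(1..3).to_a.each { |x| print x, " " }
puts

# The outer x was never touched by the block.
puts x  # 10
```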

So don’t worry about variable and parameter names colliding in such instances, unless you are using a pre-1.9 interpreter.

The yield statement

As you know by now, a block must follow a method call. But something you might not know is that any method call may be followed by a block, and you can invoke code in such a block with a yield statement. We don’t always see yield at work — it is part of the underlying, implicit control structure of iterator methods. But here we’ll use it explicitly.

A yield statement executes a block associated with a method. I’ll use some really simple code from my recent book Ruby Pocket Reference, 2nd Edition to illustrate.

The following method, gimme, contains only a single yield statement and isn’t very exciting:
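A minimal sketch of such a method, consisting of nothing but yield:

```ruby
# gimme does one thing: run the block attached to the call.
def gimme
  yield
end
```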

What does gimme do so far? Give gimme a call and find out (I’m doing this in irb, Ruby’s homegrown interactive programming environment):
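In irb the bare call prints an error. The same thing can be sketched in a script by rescuing the exception so the program keeps running; the begin/rescue wrapper is an addition for this example, not part of the irb session:

```ruby
def gimme
  yield
end

# Calling gimme with no block raises LocalJumpError,
# because yield has nothing to run.
begin
  gimme
rescue LocalJumpError => e
  puts "#{e.class}: #{e.message}"
end
```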

Uh oh. This error showed up because yield’s job is to execute the block that is associated with the method, and that’s missing in the code. Avoid this error by using the block_given? method from Kernel. Redefine gimme with an if statement:
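A sketch of the guarded version; the message printed in the else branch is illustrative wording:

```ruby
# Only yield when a block was actually supplied.
def gimme
  if block_given?
    yield
  else
    puts "I'm blockless!"
  end
end
```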

Try gimme again with a very simple block (not an iterator!) and without:
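A sketch of both calls, repeating the guarded definition so the example runs on its own. The strings are illustrative:

```ruby
def gimme
  if block_given?
    yield
  else
    puts "I'm blockless!"
  end
end

# With a block: the block's code runs.
gimme { puts "Say hi to the people." }

# Without a block: the else branch runs instead of raising an error.
gimme
```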

Now redefine gimme to contain two yields, and then call it with a block:
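A sketch with two yields, assuming the same guard as before. Each yield runs the block once, so the block's output appears twice:

```ruby
def gimme
  if block_given?
    yield
    yield
  else
    puts "I'm blockless!"
  end
end

# The block runs once per yield, so this line prints two times.
gimme { puts "Say hi again." }
```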

Another thing you ought to know is that after yield executes, control goes back to the statement immediately following it. There’s certainly more to say about yield, but I’ll leave it at that.
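One way to see that flow of control is to record the order of events. In this sketch, trace and log are hypothetical names invented for the example:

```ruby
# Record what happens before, during, and after the yield.
def trace(log)
  log << "before yield"
  yield
  log << "after yield"   # control resumes here once the block finishes
end

log = []
trace(log) { log << "in block" }
p log
```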

Do blocks have return values?

Just a note here, in closing, about return values and blocks. Blocks don’t really have return values, not in the same way their parent methods can. If you use a return statement in a block, the containing method will return, not the block. A block yields the value of its last expression. You don’t need to use return in a block, nor should you.
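Both points can be sketched with two small methods. The names double_it and first_even are hypothetical, made up for this illustration:

```ruby
# yield evaluates to the value of the block's last expression.
def double_it
  result = yield
  result * 2
end

p double_it { 21 }  # 42

# A return inside a block returns from the enclosing method,
# not merely from the block.
def first_even(numbers)
  numbers.each do |n|
    return n if n.even?
  end
  nil  # reached only when no even number was found
end

p first_even([1, 3, 4, 5])  # 4
```

The second method works, but it shows why return in a block deserves care: it ends first_even immediately, skipping the rest of the iteration.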

Blocks, procs, and lambdas

A proc is a way to store a procedure in Ruby. Procs are often short one-liners, though not always. One reason I’m bringing them up here is that a proc is not a proc without a block in Ruby.

First, a little background. A proc is a first-class object that comes complete with context. As a first-class object, a proc can be created at runtime, stored in data structures, passed as a parameter, and so on. To create a proc, you can call Proc::new, Kernel#lambda, or Kernel#proc.
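A short sketch of the three creation methods side by side. The variable names and the doubling block are arbitrary choices for illustration; the lambda? method reports which flavor each object is:

```ruby
# Three ways to wrap the same block in a Proc object.
from_new    = Proc.new { |x| x * 2 }
from_proc   = proc     { |x| x * 2 }
from_lambda = lambda   { |x| x * 2 }

# All three are instances of Proc, and all can be stored and called later.
[from_new, from_proc, from_lambda].each do |pr|
  puts "#{pr.class}, lambda? #{pr.lambda?}, call(4) => #{pr.call(4)}"
end
```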

The term lambda comes from Alonzo Church’s lambda calculus, which famously influenced the development of the Lisp programming language and more recent functional programming languages. Lambda logic can be found in a number of programming languages, including Lisp, Python, Swift, C#, and Ruby, among others. Generally, lambdas are anonymous functions that can be written inline and easily discarded.

What’s the difference between procs and lambdas? Lambdas behave more like methods and procs behave more like blocks, but both are instances of the Proc class. For brevity, I’ll only show a lambda here.

When creating a lambda with the methods mentioned, a block is required. Kernel’s lambda method, for example, expects a block. Like Proc.new, a call to lambda returns a Proc object, although the two are not strictly equivalent: a lambda checks its argument count, and a return inside it exits only the lambda. Here is a call to lambda, which of necessity includes a block, followed by a call to the new proc:
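A minimal sketch of such a pair of calls; the variable name count and the array inside the block are illustrative assumptions:

```ruby
# lambda takes a block and returns a Proc object.
count = lambda { [1, 2, 3, 4, 5].size }

# Run the stored code with call.
puts count.call  # 5
```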

By the way, since 1.9, you can use the following simplified, lambda literal syntax, with the same result:
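A sketch of the literal form, using the same illustrative block and a hypothetical variable named count:

```ruby
# The -> literal builds a lambda without calling Kernel#lambda.
count = -> { [1, 2, 3, 4, 5].size }

puts count.call  # 5
```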

There’s much to learn about lambdas. I just wanted to show, briefly, how blocks are used with procs. A fuller treatment of procs merits another article.

Summary

Let me wrap up with a brief summary of blocks. Blocks are essentially nameless functions that provide a concise way to iterate over objects. An iterator method such as each, called without a block, returns only an enumerator. Blocks can take zero, one, or several parameters. In addition, a block does not have a return value like a method. It yields the value of its last expression. Finally, stored procedures in Ruby (procs) use blocks as well.

Thanks for reading. Happy coding.


Note: If you’d like to get more detail, Mike suggests reading section 5.4 on blocks and section 6.5 on procs and lambdas in The Ruby Programming Language by David Flanagan and Yukihiro Matsumoto, plus chapter 8 on blocks in Lucas Carlson’s Ruby Cookbook. Both are from O’Reilly.

Public domain studs image via Pixabay.
