Building APIs with Swagger

Getting an API design right demands far more than just figuring out which calls should do what. Public APIs — APIs meant to be used by people other than their creators — present a special set of challenges that can inform all API design. Even private APIs often find themselves with unexpected users, and can last far longer than was planned. Apigee faced the special challenge of creating a marquee API, an API for managing its APIs.

What comes first? The API or the code? Who is the API really for, and how important is the long-term maintenance of the API? Where does documentation fit? Answer these questions, and you can find the right approach.

First, we needed an API

Starting in early 2011, Apigee set out to redesign its product for managing APIs — a project that was expected to have a lot of requirements given the history of running the product in production.

Of course, given that “API” is literally in Apigee’s name, it was imperative that whatever customers wanted to do with the product, they could do it using an API. Furthermore, the API couldn’t be an afterthought – it had to be as carefully designed as any other part of the user experience.

In the end, was the API project a success? Not completely. The resulting API is usable and serves its purpose well, but it’s not consistent.

In particular, there was trouble closing the loop between the team implementing the API and the people designing the API. By the time the API was coded, it didn’t match the text document that the design team had created, and switching back and forth between the “code” and “text” conceptions of the API design introduced errors and inconsistency.

As a result, in the process of improving an API management product, a new approach to closing the loop between design and code was born, based on the Swagger API specification format.

We can do better!

The first objective in the new effort was to continue to support Apigee’s philosophy of API design:

API design is important. It is the language that developers use to communicate with the API.
The style of APIs that Apigee designs is centered around well-known URIs and verbs.
Documentation is important and should be a first-class citizen.

At the same time, as the team worked more closely with Node.js, it began to see Node.js as an opportunity to quickly build APIs for a variety of situations.

After leveraging Node for internal projects and incorporating it into the Apigee product stack, it became evident that while Node isn’t the right choice for everything, it’s ideal for quickly building network-oriented code that performs well.

The final, and perhaps most important, piece of the puzzle was engaging with the Swagger community, and beginning to work with version 2.0 of Swagger. Swagger is a community-driven specification format for APIs. A Swagger document describes all the URI paths for an API, all the query parameters, the request and response bodies — basically everything that a client needs to know in order to successfully make API calls.

Apigee also worked closely with Reverb — the company that created Swagger — and others to evolve Swagger. Whereas the original Swagger was usually JSON that was generated from Java code annotations, Swagger 2.0 is a “human writable” format that lets a developer specify everything in the YAML format.
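
To make that description concrete, here is a minimal sketch of a Swagger 2.0 definition. It is written as a JavaScript object rather than YAML so that it can be run directly; the structure is the same. The pet API, its paths, and the operationId values are hypothetical, and listOperations is a helper invented for this illustration.

```javascript
// A minimal Swagger 2.0 definition, written as a JavaScript object.
// The pet API, its parameters, and the operationId values are hypothetical.
const petApi = {
  swagger: '2.0',
  info: { title: 'Pet API (hypothetical)', version: '1.0.0' },
  basePath: '/v1',
  paths: {
    '/pets': {
      get: {
        operationId: 'listPets',
        parameters: [
          { name: 'limit', in: 'query', type: 'integer', required: false }
        ],
        responses: { '200': { description: 'A list of pets' } }
      }
    },
    '/pets/{id}': {
      get: {
        operationId: 'getPet',
        parameters: [
          { name: 'id', in: 'path', type: 'string', required: true }
        ],
        responses: {
          '200': { description: 'The pet' },
          '404': { description: 'Pet not found' }
        }
      }
    }
  }
};

// HTTP verbs that Swagger 2.0 allows as operation keys under a path.
const VERBS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch'];

// List every operation a client could call, as "VERB path" strings.
function listOperations(def) {
  const operations = [];
  for (const path of Object.keys(def.paths)) {
    for (const verb of VERBS) {
      if (def.paths[path][verb]) {
        operations.push(verb.toUpperCase() + ' ' + (def.basePath || '') + path);
      }
    }
  }
  return operations;
}

console.log(listOperations(petApi));
// prints [ 'GET /v1/pets', 'GET /v1/pets/{id}' ]
```

Everything a client needs (paths, verbs, parameters, responses) lives in one data structure, which is what makes the document usable by tools as well as by people.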

Evaluating the alternatives

In order to understand the approach, it’s important to first understand the alternatives.

Most existing frameworks for creating APIs fall into one of two basic categories:

The code defines the API
The API generates the code

Annotated source code: The code defines the API

The first category is represented by Java-based frameworks such as JAX-RS, and extensions such as the original Swagger for Java. In these frameworks, the developer writes code and then annotates it to specify additional attributes of each API call (such as the names and types of its parameters, descriptions, and additional validation rules).

For instance, here is an example of an API call written in Java using the Swagger framework:

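import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;
// Swagger annotations (the package is com.wordnik.swagger.annotations in swagger-core 1.3 and earlier)
import io.swagger.annotations.*;

@Path("/my-resource")
@Api(value = "/my-resource",
    description = "REST API for operations on admin",
    produces = MediaType.APPLICATION_JSON)
@Produces({ MediaType.APPLICATION_JSON })
public class MyResource {
    // @GET and the "/{uuid}" sub-path are assumed here so that @PathParam has something to bind to
    @GET
    @Path("/{uuid}")
    @ApiOperation(value = "Get specific element",
        httpMethod = "GET",
        notes = "Fetch the element of the collection",
        response = Response.class)
    @ApiResponses(value = {
        @ApiResponse(code = 200, message = "Element found"),
        @ApiResponse(code = 404, message = "Element not found"),
        @ApiResponse(code = 500, message = "Server error due to encoding"),
        @ApiResponse(code = 400, message = "Bad request: decoding error"),
        @ApiResponse(code = 412, message = "Prereq: Required data not found")
    })
    public Response get(
            @ApiParam(value = "UUID of the element", required = true)
            @PathParam("uuid") String uuid) {
        // Placeholder body; a real implementation would look up the element
        return Response.ok().build();
    }
}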

Once the code is written, a separate tool introspects the source code to generate documentation, client-side code, and other artifacts. (Sometimes this tool runs at runtime, as with the original Swagger, and sometimes it runs at deployment time or compile time, but the effect is the same.)

The advantage of this approach is that the code and documentation are never out of sync, as long as the documentation is always re-generated (and updated on the web site or wherever) whenever the code changes.

However, with this approach, there is no formal mechanism for the developer of the code to have a “conversation” with other parties regarding the API, other than via the source code itself.

There’s also not a great mechanism for the technical writer to tie quality documentation to the structure of the API. If the writing team wants to use the generated documentation as the basis of the “real” documentation, then that means that the tech writers need access to the code base in order to update the bits of documentation that are kept in the code.

With this kind of tool, the only real option is to have all parties collaborate on the code base itself, and to keep abreast of any changes (and if necessary, to stop them before going live). This works particularly poorly in closed source situations, or even in open source situations when the code that runs the API is sufficiently large or complex that the average “user” cannot be expected to keep track of it.

It is similarly cumbersome if the technical writers don’t want to have to learn how to build and test the code in order to check in documentation changes.

Also, what if you don’t want the documentation to match the code? Perhaps there are parts of the API that you don’t want to document, at least not right away. Or perhaps there is a need to change the docs without doing a code release. All of these things end up leading to the conclusion that, for “real products” at least, the API docs can’t simply be generated from the code and then put up on the website for everyone to see.

Generated source code: The API generates the code

The second category is represented by “IDL” (Interface Definition Language) systems such as SOAP (with WSDL), CORBA, and the many RPC systems that have been developed over the years.

In these types of systems, the interface is formally defined in a separate file, and then used to generate client and server-side “stubs” that connect the bits sent over the network to actual code written by a developer. Developers on both sides then incorporate their stubs into the source code that they build and ship.

The advantages of this approach are performance and simplicity for the developer. Since the stub code is generated at compile time, the generator can take care to make it efficient and pre-compile it, and starting it up at runtime is fast since there is no need to parse the IDL. Furthermore, once the developer specifies the IDL, the interface that she needs to code to in order to make the stubs invoke her code is typically simple.

However, with generated stub code, the possibility of the code falling out of sync with the specification is still present, since the synchronization only happens when the code is re-generated. Good RPC systems make this simple, whereas more primitive systems that generate code and expect the user to modify it are much harder to maintain.

In addition, the same documentation problems as before still exist. If the IDL is annotated to contain the docs, and the “real” docs are then generated from the IDL, how is everything kept in sync? By re-generating the stubs and re-building the product just because the docs were changed? If not, is there a risk that the re-generation is missed the next time a “real” change is made, resulting in clients and servers that don’t interoperate?

Plus, dealing with generated code in a modern software development process is painful. Do you manually run the stub generator and check in the results? If so, you had better not forget to run it, and remember never to modify the generated code manually. Or do you make the build system run it on every build? That may mean creating and testing custom build steps for whatever build system you use.

One advantage of this mechanism is the performance gain received by building the stubs at build time, rather than when the system starts up. That made a lot of sense in the 1990s. But with today’s immensely faster CPUs, the time needed to parse and validate an IDL file and generate a runtime representation is just not significant, and languages like JavaScript (and even Java) are dynamic enough that they can work without loads of generated code.

Node.js itself is a great example; even a simple Node.js-based server loads and compiles tens of thousands of lines of JavaScript code when it is first started, and yet it is rare for the startup of a Node.js-based application to take more than one or two seconds. (And yet, even a simple Java app, fully compiled, takes seconds if not more to start, but I digress.)

No connection at all: The code is the API

Many other server-side frameworks, especially popular Node.js-based frameworks like Express, do not have the concept of an API description at all. For instance, in the Express framework, the developer simply wires JavaScript functions to URI paths and verbs using a set of function calls, and then Express handles the rest.

This is a simple example of code written in the Express framework:

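var express = require('express')
var api = express()

// GET method route
api.get('/', function (req, res) {
    res.send('GET request to the homepage')
})

// POST method route
api.post('/', function (req, res) {
    res.send('POST request to the homepage')
})

// Start the server (port 3000 is an arbitrary choice)
api.listen(3000)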

These frameworks are very nice for quickly building APIs and apps without a lot of pre-planning and configuration – that is why developers love them. However, they do not offer any model for automatically generating documentation or for sharing the API design with a larger community. They start great, but to sustain a complex, evolving API and collaborate with thousands of developers, a more formal description mechanism is needed.

So then why not write code like the example above, which admittedly every Node.js developer in the world knows how to write, and then annotate it with Swagger later? Because that approach results in an API defined in two places – once in Swagger and once in the code. While the code is obviously correct, what if it doesn’t implement the API design that everyone agreed on after carefully following the design principles? This is the issue described at the beginning: closing the loop between design and code teams.

A different philosophy using Swagger

Based on the team’s experience and on experience with popular frameworks, a fourth approach was proposed: the API design drives the code.

This idea resulted in the following philosophy:

The API must be designed first.
The artifact that represents the API design must drive the API runtime.
The API design will change, and the framework must make it possible to adapt quickly without letting the code, design, and documentation fall out of sync.

Saying the design “drives” the code means just that the document that describes the API design is parsed and turned into a set of data structures that the runtime uses, in real time, to classify, validate, and route each incoming API call.
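
As a rough sketch of what "drives" means, the following dependency-free example compiles a design into a routing table once at startup and then consults that table for every call. It is an illustration under stated assumptions, not swagger-node's implementation: the greeting API, the handlers object, and the compileRoutes and dispatch functions are all hypothetical.

```javascript
// Hypothetical API design; in practice this would be loaded from the Swagger file.
const design = {
  paths: {
    '/greetings/{name}': {
      get: { operationId: 'greet' }
    }
  }
};

// Handlers are looked up by operationId; the design decides which one runs.
const handlers = {
  greet: (params) => 'Hello, ' + params.name
};

// Run once at startup: turn each path template into a regular expression.
// (This sketch assumes templates contain no regex metacharacters.)
function compileRoutes(def) {
  const routes = [];
  for (const [template, verbs] of Object.entries(def.paths)) {
    const names = [];
    const source = template.replace(/\{([^}]+)\}/g, (match, name) => {
      names.push(name);
      return '([^/]+)';
    });
    for (const [verb, operation] of Object.entries(verbs)) {
      routes.push({
        verb: verb.toUpperCase(),
        pattern: new RegExp('^' + source + '$'),
        names: names,
        operationId: operation.operationId
      });
    }
  }
  return routes;
}

// Run for every call: classify the request against the design, then route it.
function dispatch(routes, verb, path) {
  for (const route of routes) {
    const match = route.verb === verb && route.pattern.exec(path);
    if (match) {
      const params = {};
      route.names.forEach((name, i) => {
        params[name] = decodeURIComponent(match[i + 1]);
      });
      return { status: 200, body: handlers[route.operationId](params) };
    }
  }
  // Calls that the design does not describe never reach handler code.
  return { status: 404, body: 'Not found' };
}

const routes = compileRoutes(design);
console.log(dispatch(routes, 'GET', '/greetings/Ada'));  // { status: 200, body: 'Hello, Ada' }
console.log(dispatch(routes, 'POST', '/greetings/Ada')); // { status: 404, body: 'Not found' }
```

Because no code is generated and nothing is wired by hand, changing a path or verb in the design changes the server's behavior the next time it starts.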

This bit is the part that most people misunderstand. This approach is not “model-driven,” in which there is some separate artifact that is generated from the code, or that is used to generate code. Rather, the API design is consumed every time the API server is started and used to decide how to process every incoming API call. If there were existing systems that worked this way in the past, the Apigee team is not aware of them.

There are several advantages to this mechanism and very few costs. For instance, although the API definition is validated every time the API server is started, in this implementation the API server starts as quickly as any Node.js app.

Most importantly, with this approach it is not possible for the definition of an API and the implementation to fall out of sync.

The resulting bit of technology is called “swagger-node.” It consists of a validation component and a runtime component.

The validation component is a Node.js module that parses and validates the Swagger 2.0 API definition and turns it into an easily navigable data structure. Since it’s written in JavaScript, the same code is used to validate the API design in the server code and in the interactive Swagger Editor, which allows developers to see how their API documentation renders as they type.
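
A much-reduced sketch of that kind of validation might look like the following. It checks only a handful of rules from the Swagger 2.0 specification (the required swagger, info, and paths fields, the leading slash on paths, and the required responses field on each operation). The validateDefinition function and the sample definitions are hypothetical and far less thorough than a real validator. Since it uses no Node-specific APIs, the same function could run in a server or in a browser-based editor.

```javascript
// HTTP verbs that Swagger 2.0 allows as operation keys under a path.
const OPERATION_KEYS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch'];

// Return a list of human-readable problems. An empty list means the
// definition passed these (deliberately incomplete) checks.
function validateDefinition(def) {
  const errors = [];
  if (def.swagger !== '2.0') {
    errors.push('swagger must be the string "2.0"');
  }
  if (!def.info || typeof def.info.title !== 'string') {
    errors.push('info.title is required');
  }
  if (!def.info || typeof def.info.version !== 'string') {
    errors.push('info.version is required');
  }
  if (!def.paths || typeof def.paths !== 'object') {
    errors.push('paths is required');
    return errors;
  }
  for (const [path, item] of Object.entries(def.paths)) {
    if (!path.startsWith('/')) {
      errors.push('path "' + path + '" must begin with "/"');
    }
    for (const verb of OPERATION_KEYS) {
      if (item[verb] && !item[verb].responses) {
        errors.push(verb.toUpperCase() + ' ' + path + ': responses is required');
      }
    }
  }
  return errors;
}

// Two hypothetical definitions: one valid under these checks, one not.
const goodDefinition = {
  swagger: '2.0',
  info: { title: 'Example', version: '1.0.0' },
  paths: { '/status': { get: { responses: { '200': { description: 'OK' } } } } }
};
const badDefinition = {
  swagger: '2.0',
  info: { title: 'Example' },
  paths: { 'status': { get: {} } }
};

console.log(validateDefinition(goodDefinition)); // []
console.log(validateDefinition(badDefinition));  // three problems reported
```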

The runtime component uses the validated data structure to wire the API design into whatever web framework the developer is using. Since there are so many web frameworks for Node.js (as I write this, someone is no doubt getting ready to post a new one to NPM and GitHub), swagger-node works with the most popular ones, including Express and HAPI.

For instance, when swagger-node is used with Express it works as “middleware” that plugs into the HTTP call processing chain, validates each API call, and routes it to some Node.js code that actually handles the API call.
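
The middleware idea can be sketched without any framework, because Express and Connect middleware is just a function that takes (req, res, next). The example below is an assumption-laden illustration rather than swagger-node's code: the operations table, its requiredQuery field, and the apiMiddleware and fakeCall helpers are all hypothetical.

```javascript
// Hypothetical, already-validated operation table keyed by "VERB path".
const operations = {
  'GET /hello': {
    requiredQuery: ['name'],
    controller: (req, res, query) => {
      res.statusCode = 200;
      res.end('Hello, ' + query.get('name'));
    }
  }
};

// Build a middleware function with the (req, res, next) signature that
// Express and Connect use.
function apiMiddleware(table) {
  return function (req, res, next) {
    // The base URL only satisfies the URL parser; its value is arbitrary.
    const url = new URL(req.url, 'http://localhost');
    const operation = table[req.method + ' ' + url.pathname];
    if (!operation) {
      // Not part of the API design: hand the call to the next middleware.
      return next();
    }
    const missing = operation.requiredQuery.filter(
      (name) => !url.searchParams.has(name)
    );
    if (missing.length > 0) {
      // Validation failed, so the controller is never invoked.
      res.statusCode = 400;
      return res.end('Missing required query parameter(s): ' + missing.join(', '));
    }
    return operation.controller(req, res, url.searchParams);
  };
}

// Exercise the middleware without a server, using stand-in request and
// response objects that provide only what the sketch touches.
function fakeCall(method, url) {
  const res = { statusCode: null, body: null, end(text) { this.body = text; } };
  let passedOn = false;
  apiMiddleware(operations)({ method, url }, res, () => { passedOn = true; });
  return { statusCode: res.statusCode, body: res.body, passedOn };
}

console.log(fakeCall('GET', '/hello?name=Ada')); // 200, handled by the controller
console.log(fakeCall('GET', '/hello'));          // 400, rejected by validation
console.log(fakeCall('GET', '/other'));          // passed on to the next middleware
```

In an Express application, a function like this would be installed with app.use(apiMiddleware(operations)), ahead of any other handlers.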

Pulling it together

In addition to the basic API design, the API definition document can also include additional API metadata.

For instance, the Swagger 2.0 specification allows the API definition to be annotated with security information, such as what flavor of OAuth to use for particular API calls. It also allows for vendor-specific extensions. These can be used to specify additional information about the API contract, additional documentation fields, or information about policies that apply to the API traffic.

Based on these concepts, we used swagger-node as the basis for Apigee 127. “127” supports annotations in the Swagger document for:

OAuth authentication, with the ability to require different “scopes,” or no authentication at all, on a call-by-call basis
API key validation
Response caching
Spike arresting (service-level traffic limits designed to arrest out-of-control API clients)
Quotas (application and user-specific traffic limits designed to address business requirements and allow APIs to be monetized)
API analytics (which are gathered at runtime and pushed asynchronously to the Apigee cloud)
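
To illustrate how an annotation in the API definition can replace hand-written policy code, here is a small sketch of a quota driven by a vendor extension. Swagger 2.0 permits extension fields whose names begin with "x-"; the "x-example-quota" name, its allow and perSeconds fields, and the createQuota function are invented for this example and are not Apigee 127's actual syntax.

```javascript
// Hypothetical definition fragment. The "x-example-quota" extension and its
// fields are invented for this sketch.
const annotated = {
  paths: {
    '/reports': {
      get: {
        operationId: 'listReports',
        'x-example-quota': { allow: 2, perSeconds: 60 }
      }
    }
  }
};

// Create a checker that enforces the quota declared on one operation.
// The current time (in milliseconds) is passed in so the logic is easy to test.
function createQuota(operation) {
  const quota = operation['x-example-quota'];
  const counts = new Map(); // client id -> { windowStart, used }
  return function isAllowed(clientId, now) {
    if (!quota) {
      return true; // no annotation means no limit
    }
    const windowMs = quota.perSeconds * 1000;
    let entry = counts.get(clientId);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First call from this client, or the previous window has expired.
      entry = { windowStart: now, used: 0 };
      counts.set(clientId, entry);
    }
    if (entry.used >= quota.allow) {
      return false;
    }
    entry.used += 1;
    return true;
  };
}

const checkReports = createQuota(annotated.paths['/reports'].get);
console.log(checkReports('app-1', 0));     // true: first call in the window
console.log(checkReports('app-1', 1000));  // true: second call
console.log(checkReports('app-1', 2000));  // false: over the quota of 2
console.log(checkReports('app-1', 61000)); // true: a new window has started
```

The handler code for the operation never mentions the quota; changing the limit means editing the definition, not the code.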

By using Apigee 127, a software developer for the Node.js platform can design an API, handle non-functional requirements without writing additional code, and quickly jump between the API definition and the code using the very short “compile”-edit-debug loop that Node.js enables.

Others have used a similar approach in other languages. For instance, “swagger-inflector” is a new project that uses the same philosophy, but for Java rather than Node.js.

swagger-node, Apigee 127, swagger-inflector, and others are available on NPM and GitHub and have been used by many developers to build productive APIs. Please try them out and share how they can be better!

Acknowledgements

Ideas are cheap, but execution isn’t. The people who actually created swagger-node and the Swagger Editor are Scott Ganyo, Jeff West, Jeremy Whitlock, and Mohsen Azimi, and they were guided by Marsh Gardiner and Ed Anuff.

This post is part of our ongoing exploration into the explosion of web tools and approaches.

Public domain plug image via Pixabay.