21st century smarter government is 'data-centric' and 'digital first,' says US CIO

US CIO Steven VanRoekel says that machine-readable open data must be the 'new default' in government.

Any nation’s top government IT executive has a tough gig in the 21st century. The United States chief information officer, for instance, has an immense budget to manage: an estimated $80 billion in annual federal IT spending.

US CIO Steven VanRoekel (@StevenVDC), who started work in the White House just over eight months ago, must address regulatory compliance on privacy and security, and find wasteful spending.

As the nation’s federal CIO, he has inherited a staggering challenge: evolve the nation’s aging IT systems toward a 21st century model of operations. In the age of big data, he and everyone who works with him must manage a lot of petabytes, and do much more with less. He must find ways to innovate to meet the needs of the federal government and the increased expectations of citizens who transact with cutting-edge IT systems in their personal and professional lives.

When he was named to the position, he told the New York Times: “We’re trying to make sure that the pace of innovation in the private sector can be applied to the model that is government.”

From adjusting to the needs of an increasingly mobile federal workforce to moving to the cloud to developing a strategy for big data, it’s safe to say that VanRoekel has a lot on his plate. When he was named to the post, it was also safe to say that there were reasons to be hopeful about his prospects. Under VanRoekel, FCC.gov got a long overdue overhaul to reboot as an open government platform. In the process, he and his team tapped into open source, the cloud, and collective intelligence.

He brought a dot-com mentality to the FCC, including a perspective that “everything should be an API” that caught some tech observers’ eyes. He worked with an innovative new media team that established a voice for @FCC on social media, where there had been none before, and an FCC.gov/live livestream that automatically detected the device a viewer used to access it.

VanRoekel is the man who told me in April that “the experiences that live outside of FCC.gov should interact back into it. In a perfect world, no one should have to visit the FCC website.” Instead, he said, you’d go to your favorite search engine or favorite app, and open data from the FCC’s platform would be baked into it.

“If we think of citizens as shareholders, we can do a lot better,” he said. “Under the Administrative Procedure Act, agencies will get public comments that enlighten decisions. When citizens care, they should be able to give government feedback, and government should be able to take action. We want to enable better feedback loops to enable that to happen.”

After VanRoekel spoke at the FOSE conference this month, I walked with him to the Old Executive Office Building, next to the White House, to dig a bit deeper into some of the broad strokes he outlined in his speech that morning. (The images below come from pictures taken during his presentation.)

The Office of Management and Budget is widely expected to release a strategy on mobile and data-centric government in the near future. Our interview, which follows, touches upon many of the issues outlined above and provides some insight into the strategic thinking of one of the key players in the Washington tech policy world.

The Washington Post reported that some half a million BlackBerry devices are still in use in government. Is most of the federal workforce still on that mobile platform in 2012?

Steven VanRoekel: I’d say it’s predominantly BlackBerry.

I understand, however, that iPhones are now usable in the Executive Office Building Suite. Do you have an iOS device?

Steven VanRoekel: Yes. And yes, an iPad and my iPhone. They’re part of a “bring your own device pilot” that we’ve been running. We also have Droid clients. It’s been live for about a month or so. Hundreds of people [in the Executive Office alone] are using it. It’s very popular.

What are the biggest opportunities for the federal government in its use of mobile right now? What role will open data play in making government a platform?

Steven VanRoekel: Mobile is representative of a larger opportunity because of the influence of consumerization of technology on the IT space and the role of the CIO. The inflection point we’re finding ourselves in will lead, I think, to a bigger phenomenon.

We first began looking at two reform strategies: a web reform strategy and a mobile strategy. When we put everything out on the table, we realized the kingpin to both strategies was open data.

It takes the parts that I described [at the FOSE Conference in DC]:

  1. An open data foundation
  2. A new way of thinking about the application layer
  3. The platform layer that consumes open data and does important things
  4. The presentation layer, and doing that in a way that’s more consistent

The way we’ve built applications, solutions, and everything that government’s been involved in, has been in silos. Everybody builds their own and does their own thing. There’s an opportunity with this inflection point.

Does that mean government mobile application development must change?

Steven VanRoekel: Mobile’s the best example of where we can push across the boundaries and have more of a consistent (but device-agnostic) view across government, that actually leads to a real shift in the way we deliver solutions and applications. I think it applies both inside and outside of government: There’s also the “citizen side of mobile,” too.

When you’re carrying a mobile device, you have the power of every federal agency in your pocket. How do we turn that to good? How do we take all of these physical things and move them into the world of virtual in a way that’s going to be meaningful for citizens?

[Image: decoupling content delivery]

The federal government must fundamentally shift how it thinks about digital information and data. Rather than primarily thinking about the final presentation and tightly coupling that presentation with the underlying data (content, information, etc.) — whether it’s a web page or a mobile app — we must employ a data-centric approach that ensures our data is available through multiple channels without developing costly and separate processes for presenting each channel. If we do open data right, we will do web and mobile right at the same time.
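As a rough illustration of the data-centric idea described above, the following Python sketch keeps one record as the single source of truth and derives each channel from it. The record fields and the three renderer functions are hypothetical, chosen only for this example; they do not reflect any actual government schema or system.

```python
import json
from html import escape

# Hypothetical record; the field names are illustrative assumptions,
# not an actual government schema.
RECORD = {
    "agency": "Example Agency",
    "title": "Office hours",
    "body": "Open weekdays 9 to 5.",
}


def to_json(record):
    """Machine-readable channel: what an API endpoint might return."""
    return json.dumps(record, sort_keys=True)


def to_html(record):
    """Web channel: the same data wrapped in markup."""
    return "<article><h1>{}</h1><p>{}</p></article>".format(
        escape(record["title"]), escape(record["body"])
    )


def to_text(record, width=40):
    """Small-screen channel: the same data trimmed to one short line."""
    line = "{}: {}".format(record["title"], record["body"])
    return line if len(line) <= width else line[: width - 3] + "..."


if __name__ == "__main__":
    # One record, three presentations; no channel owns its own copy of the data.
    for render in (to_json, to_html, to_text):
        print(render(RECORD))
```

Adding a new channel in this arrangement means writing one more renderer, while the underlying record and the process that produces it stay unchanged.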

Mobile is also a huge opportunity for us to take the “innovate with less” approach. That’s when we go around and look at mobile implementation across government. Use the USDA [for example]: more than 1,000 mobile contracts. Now they’re down to three, with hundreds of millions of dollars saved just in a year.

We went down to the city of Atlanta. We said, “Okay, let’s not pick D.C., let’s pick another place where there are federal agencies and just go look at what they spend on unlimited data plans on their mobile devices.” Between the lowest and the highest unlimited data plan on the same device, there’s an $81 difference in price. That just shows us that there’s low-hanging fruit out there to save taxpayers money and then to take that savings and pour it back into innovation.

As an aside, an unfortunate outcome of the mobile revolution has been for government agencies to treat mobile differently; they develop different content through different processes. In many cases, this creates redundancies. A data-centric approach decouples information from its presentation. If we instead focus on divorcing data from how and where it’s presented, we solve both problems at the same time. Luckily, we are not far down the road on this front.

Are you thinking about privacy and security on mobile devices?

Steven VanRoekel: Privacy and security are the foundation for everything we do. Today’s technology landscape has made it more important than ever to take into account privacy and security of government devices and data. We must embed security, privacy, and data protection into the entire life cycle of technologies, and adopt new solutions that will enable consistent security in an evolving world.

How are you thinking about supporting open data, with respect to creating open standards and, in particular, with respect to improving data quality? The last issue continues to be a notable issue across government, no matter what level of government you’re in.

Steven VanRoekel: To improve the quality of our data, we need to start digital from the beginning. Our data is produced just like most things in government — within the silos of programs and agencies. We gather, procure or produce data without thinking beyond the immediate need for the data, or without thinking about how to make it accessible for other government agencies or the public.

If we start all these processes with standardization around the way we collect it, we will promote quality through the life cycle of the data. Many of the issues with the quality of data come from human error at the point of turning paper information into digital information.

It’s a multi-step process:

First, we need to improve the quality of our data by making government services digital from the start. We have made good progress here — online tax filing continues to grow and is becoming the default way people file their taxes, and we have brought both Social Security and passport applications online in this Administration.

Second is more consistent use of metadata tags and data standards to increase interoperability of data for use not only within government programs, but by citizens and the private sector.

Third, we need to make our information more usable by exposing data through APIs and web services, and providing citizen developers the tools they need to put our data to work, to join us as partners in building better services for the public on top of government data — promoting quality through data’s life cycle.

Data quality will remain an issue unless we improve the input side as well as the use and consumption side. Anybody who’s carrying a big three-ringed binder to perform his or her job, or carrying a clipboard to collect data is prime real estate, in my mind, for someone who needs a smarter device for that data collection.

We were actually contemplating something like this when we shipped broadbandmap.gov at the FCC.

I wanted to do a validation layer of the map to understand the data that was being submitted was actually from carriers. [I asked,] “Can we validate against their marketing claims, in some sense of where their coverage maps are?”

So we started down the road of doing a typical procurement to say, “Okay, we should hire a contractor to go out and test some stuff for us and figure that out.” And, of course, it came back as a multimillion dollar contract and was not going to get us the richness and geographically dispersed data that we wanted. So then, in a second step I’ve really never talked about, we started to explore how much it would cost us to actually put a smart device in every postal truck in this country. As it drives around, it just assesses what’s going on from an infrastructure standpoint and sniffs the network and gives us a geo point.

So you wanted to give postal trucks sensor packages like Google Street View cars?

Steven VanRoekel: Exactly. As they drive around, they’d give us the sense of the 3G coverage, drop zones and different things like that. Could we pull that off? So we started to explore that. It was innovative — and I was excited about it because I think if you set that up as a platform, it’d be pretty neat to be able to do some data-gathering — but it was cost prohibitive.

So then we said, “let’s build mobile apps to go do this.” And we built Droid and iPhone speed measurement apps, along with a desktop version that you could run on your home computer.

We had millions of data points come back to us. It cost something like $50,000 to build this whole infrastructure in apps. There were questions about the data quality. I mean, if you run the speed app here [in the office] versus standing 20 feet that way, outside, you’re going to get a different result. But what we noticed was that by looking at different points of data and starting to build relationships between the points of data, you could start to get a really meaningful map put together. As a decision metric, it was pretty powerful. And it was super low cost.
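One plausible way to get a usable map out of noisy, crowdsourced readings like these is to group nearby measurements and summarize each group robustly. The Python sketch below is an assumption about how that could work, not a description of the FCC's actual method: it bins readings into a hypothetical latitude/longitude grid and keeps the median speed for cells with enough samples, so a single odd reading does not dominate.

```python
import math
from collections import defaultdict
from statistics import median


def grid_cell(lat, lon, cell_deg=0.01):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def summarize(readings, cell_deg=0.01, min_samples=3):
    """Group (lat, lon, mbps) readings by cell; keep medians of well-sampled cells.

    cell_deg and min_samples are illustrative defaults, not tuned values.
    """
    cells = defaultdict(list)
    for lat, lon, mbps in readings:
        cells[grid_cell(lat, lon, cell_deg)].append(mbps)
    return {
        cell: median(values)
        for cell, values in cells.items()
        if len(values) >= min_samples
    }


if __name__ == "__main__":
    # Made-up sample readings: three close together (one an outlier), one elsewhere.
    sample = [
        (38.895, -77.036, 2.0),
        (38.896, -77.037, 2.5),
        (38.897, -77.036, 40.0),
        (38.905, -77.036, 9.0),
    ]
    print(summarize(sample))
```

The median is used here rather than the mean because individual readings vary with where the phone happens to be, which is the data quality concern raised in the answer above.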

What about giving citizens or data consumers, particularly open government watchdogs, developers and civic startups, the ability to communicate with government about specific datasets? For instance, what if there’s a dispute about energy data that’s disclosed by a utility using the new Green Button. Would you see citizens creating or using their own energy measurement apps? Is that even something that government can be or should be enabling citizens to do?

Steven VanRoekel: I haven’t thought a lot about it. One feature we have thought about — and that we actually implemented — is in the open source version of Data.gov that we just showed last week. It’s user feedback on data entry.

Basically, you could look at data that the government’s providing and then give feedback on individual line items or the datasets. In the workflow, we actually have reporting mechanisms that go back to the publisher of the data and say, “This data needs to be corrected” or “Something’s wrong here.”

That’s a bit of this. Now, how do you do that at scale? If you’re the Treasury Department and you’re respecting privacy but putting out tax data or something in millions and millions of rows, how are you combing through that? How are you finding that?

I think that our first goal here is to start to get some standardization in the schema and delivery mechanisms of this data. We’re not going to put forth an effort to actually redefine all government schema and try to create a standards effort to do everything, nor will we have Data.gov be the schema catalog for government. It’s insurmountable to do that given what we’ve got at the government.

I think what we can do is agree on a middle-level metadata structure for all government data and have Data.gov be the metadata catalog for government data and resources. So as a developer, you would come to Data.gov and search for a data set. We’d point you to it and, upon pointing, you would then discover how to connect to it in a meaningful way.
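The catalog idea can be sketched in a few lines of Python. Everything in this example is hypothetical: the `CatalogEntry` fields, the sample entries, and the `agency.example` URLs are assumptions made for illustration, not the Data.gov metadata standard. The point is only that the catalog stores descriptions and pointers, while the data itself stays with the publishing agency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Hypothetical mid-level metadata record; all field names are assumptions."""
    title: str
    agency: str
    keywords: tuple
    access_url: str   # where the agency itself hosts the data
    data_format: str  # for example "json", "xml" or "csv"


# Made-up entries with placeholder URLs.
CATALOG = [
    CatalogEntry("Broadband availability", "FCC", ("broadband", "coverage"),
                 "https://agency.example/broadband", "json"),
    CatalogEntry("Population by county", "Census Bureau", ("population", "county"),
                 "https://agency.example/population", "csv"),
]


def search(catalog, term):
    """Return entries whose title or keywords contain the term, ignoring case."""
    needle = term.lower()
    return [
        entry for entry in catalog
        if needle in entry.title.lower()
        or any(needle in keyword.lower() for keyword in entry.keywords)
    ]


if __name__ == "__main__":
    # A developer searches the catalog, then connects to the agency's own endpoint.
    for entry in search(CATALOG, "broadband"):
        print(entry.agency, entry.access_url, entry.data_format)
```

Because the catalog holds only metadata, an agency can change its own schema without anyone having to update a central copy of the data.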

So data itself would still “live” on an agency’s website?

Steven VanRoekel: Right, and the schema can change over time — like census years, for example. Every time you hit a census year, the schema attempts to change because there’s some new demographic that we were asked to follow.

We don’t want to maintain that from a centralized repository, but a metadata standard would be good.

One thing to point you to is the report on health IT by the President’s Council of Advisors on Science and Technology. On page 51, there’s a great description of this metadata layer, covering both respect for privacy and security and ways of describing data. That is really the direction I’m taking on thinking about Data.gov.

Data.gov and the federal government working toward open data is still very much a new thing. There’s a significant challenge, however, represented by scanning and digitization. What are the lessons learned, in terms of what has been done so far, and how are you going to improve upon the record to date?

Steven VanRoekel: One is that Data.gov was a great cultural tool that shocked the system. It got people to pay attention to the value of open data and what it can do.

Some advocates credit Data.gov with catalyzing the launch of dozens of other open data platforms around the United States and world in the past two years. Do you agree?

Steven VanRoekel: Totally. It was the right first step. The right next step is to think about how we create a developer extension to this data in a way that’s more discoverable and more usable in real time. That’s what it will take for open data to grow. I think of enriching the [development] community’s effort because of the role they play to help synthesize data for people. Not everyone who has an interest in data is a developer who’s going to fire up an API key manager and dive in.

OK. Let’s talk a bit about open data and U.S. collaboration with the Indian government on “open sourcing” Data.gov. What problems will that solve for governments and countries?

Steven VanRoekel: The importance to other countries is that this has the voice of the United States behind it. It was announced by President Obama at the United Nations, during the Open Government Partnership launch. Driving that forward is very important.

The next step of this is starting to get other countries folded into the development process. There’s more work to do. I think there are more things we can do. Localization is one example. We’ve built extensibility in there for localization.

The code is built on Drupal, right? Will there be modules that can be added to this over time? How does open source relate to collaboration around open government?

Steven VanRoekel: That’s right. I think there’s a sort of “Brand America” applied to this in a way that I think will be very positive for the rest of the world.

The second part of this is, why open source: There’s no licensing restrictions. You can get it out to the rest of the world. That’s a key part of it.

There’s also a culture around open source that is brought to bear on this effort. We want to collaborate across multiple bodies, multiple entities, and get lots of people involved, focusing on continuous improvement. That’s the spirit of open source, where you’ve got different communities of people really getting together and doing great work. The bottom line on this project is doing it in a way where we can hand it off to others and they’ll build the next great feature, and keep the ball moving on it.

So, what can we learn from all of the challenges we’ve seen around digitizing data?

Steven VanRoekel: I think the first motion we need to have is, first and foremost, establishing the new default. Just say, on a going forward basis, we need to convert these systems. It’s what I did at the FCC, right? Declare “machine readable” as the new default.

We went and did a bunch of the hard work to build middleware, to translate Web services and data, to “XML-ize” all of the data we had on the back end.

As we deal all the time with churn, investment in systems, and build out new initiatives, there’s an opportunity to start to move the ball forward on all of this. We need to establish the new default. I think that actually extends into Congress as well. I’ve had conversations saying, “Okay, when you’re doing a bill that calls for the collection of data, in terms of education data, the public good, public welfare or other things, why don’t we put some boilerplate language into it that says ‘make this data machine readable.’ Let’s do that in some way.” If it’s financial data, maybe XBRL is a good direction to take.

I don’t want to pre-describe something in law that would then give vendor preference, but I do think “machine readable” and similar concepts are loose enough that we can actually be very vendor-neutral in the way we approach this while getting a long way in data quality on both the collection side and on the output side.

Just a couple of years ago, I went out and got passports for all of my kids because we were going on an international trip. I filled out the form online, printed it off, and then you have to submit the paper PDF to people. In eighth grade, I took the space out of my last name to debug a BASIC program I was writing because it would break on the space in my name. It was a name parser. You start at the right and you move left. For “Mary Anne’s” and others in my class, there were always spaces in the first name but none in the last name. So I took the space out of my last name, and the app worked great. When I got married to my wife, I legally changed it, so now I have no space in my last name.

So does that mean you’re a machine readable CIO? Is that how that works?

Steven VanRoekel: That’s right. All my kids henceforth and generations going forward are named for a debug of a BASIC program on an Apple IIe.

So your wife puts a presentation layer on you before you go out in public in the morning?

Steven VanRoekel: Exactly. So when we did these passports, we handed in the paper applications for my three kids. Two of them came back with spaces in the last name and one didn’t. And they were all spelled exactly the same, with no space. If you remember, on the passport form, there are no caps. It’s not a big “V,” big “R”; it’s just all one word. So a human made the assumption that there’s a space in those names.

Now, however, you can apply for a passport completely electronically and that works.

That’s a step forward for passport processing, certainly, but what about the serious issue of scanning analog government documents?

Steven VanRoekel: We are taking a “looking forward, looking back” approach to improving access to government digital services.

If you start to push quality on the frontend, you’re going to get there. I do think we need to look at scanning. We need to look at opening up these documents.

I think we need to keep looking at the value of data and then focus on the most valuable archives to bring into the digital world. All of the documents were being released as image-based PDFs when I showed up [at the FCC]. I forced the team to go back and use some software you could run with OCR [optical character recognition] to convert image-based PDFs.

Our focus is on creating a new default so that all data is machine readable and easily accessible moving forward. When we make good progress there, then we will be looking back and prioritizing the most strategic and cost-effective ways to import legacy information into the new paradigm. The recent release of the 1940 Census shows some promise in this area.

When you’re managing director of the FCC, you can force your staff to go back and convert the PDFs. When you’re CIO of the United States, can you force federal agency workers to do the same thing?

Steven VanRoekel: I do have policy authority to do things like that, but you have to be thoughtful about it.

Federal government workers have been hearing policy directives from OMB for decades. It doesn’t mean they always do it.

Steven VanRoekel: Right. But you also have to own all of those things. You have to be thoughtful about what things cost. How are you going to run that forward? What’s the value at the end of the day? I think we need to look at those options, think about what we’re doing, and be smart about it. Citizens don’t like you to go scan paper for paper’s sake. I think we need to have a specific strategy.

Much of the time, government can’t work with the most nimble, modern startups because they just don’t fit the procurement specs. Some of the most innovative startups won’t work with government because most of them can’t tolerate a nine-month cycle, right? What substantive ways are you making either of these areas work better?

Steven VanRoekel: Right. They can’t afford it.

On the human resources side, aside from working with Aneesh formerly and now Todd on “Entrepreneurs in Residence” and driving a scale-out plan for that, last October, I launched the Presidential Technology Fellows program. I think it’s going to be a great catalyst to start to get younger people into government who have expertise that is not the norm in government.

What I was most delighted about is that we launched the program in combination with the PMF [Presidential Management Fellow] program, where more than 5,000 people entered the top funnel. Now they’re going through a vetting process. We still don’t know the size of the pool that will come out of the other end of the process, but that’s going to be open to be resourced in the next few months. We’ll let you know when that happens.

I thought that part of the big job here would be talking agencies into hiring these people, where we might need to arbitrarily create demand to meet the supply. What I was most delighted about is every agency came forward with multiple requests for hiring presidential technology fellows. I’m really excited to see what this is going to do.

I’ve also got commitments from the CIOs in government to put these people on startup projects that focus on things like modular and agile development, including up-and-coming social media projects. We also encourage people who have UI design, user experience expertise and others to join this process. We put some of the elements of that into the vetting engines so that we get some of those people coming through.

The federal IT skills gap and a more general IT productivity gap came up this morning. If you look at what the Consumer Financial Protection Bureau needs right now, they want to hire a lot of devs to help them work with unstructured data. Can they attract developers with the skills that they need for that?

Steven VanRoekel: I think so. The thing we promote in the Technology Fellows program is access to challenges and opportunities that you would never have outside of the walls of government. Look at the sheer scale of government.

We are the single largest consumer of technology, single-entity consumer of technology, in the world. We have technology running on a guy’s desk over there to running on the International Space Station and everywhere in between. There is technology that we cover running in all of those places. Every six months, part of the Technology Fellows program enables you to rotate into really interesting and fascinating things to go focus and work on. I think that’s unparalleled in the scope of opportunity.

Working in a company as large as Microsoft, the sun never sets on the Microsoft network. I mean, field offices are all around the globe, right? And it pales in comparison to the opportunities that are in the federal government.

Can you share an example of an improvement to procurement?

Steven VanRoekel: We have a huge opportunity to leverage the buying power of the federal government to make it easier and more cost effective to introduce new technologies into the federal environment.

Take mobile contracts, for example: the USDA had more than 1,000 different contracts for mobile and wireless service. It recently condensed all contracts under three blanket purchase agreements for a cost savings of 18%. Our goal is to replicate this model across government.

We are currently working on government-wide contract vehicles to make it easier for agencies to get the best dollar value for their spend on mobile contracts.

We are also looking at modular contracting — allowing more flexibility for smaller, nimble vendors to provide service, like component-development, agile, etc., to meet the mission needs of government. Buying power and speed of deployment equals “Shared First”; modular equals “Future First.”

This interview was edited and condensed for clarity.
