- What American Startups Can Learn From the Cutthroat Chinese Software Industry — It follows that the idea of “viral” or “organic” growth doesn’t exist in China. “User acquisition is all about media buys. Platform-to-platform in China is war, and it is fought viciously and bitterly. If you have a Gmail account and send an email to, for example, NetEase’s 163.com, the dominant local web player, it will most likely go to spam or junk folders regardless of your settings. Just to get an email to go through to your inbox, the company sending the email needs to have a special partnership.” This entire article is a horror show.
- White House Hangout Maker Movement (Whitehouse) — During the Hangout, Tom Kalil will discuss the elements of an “all hands on deck” effort to promote Making, with participants including: Dale Dougherty, Founder and Publisher of MAKE; Tara Tiger Brown, Los Angeles Makerspace; Super Awesome Sylvia, Super Awesome Maker Show; Saul Griffith, Co-Founder, Otherlab; Venkatesh Prasad, Ford.
- Municipal Codes of DC Freed (BoingBoing) — more good work by Carl Malamud. He’s specifically providing data for apps.
- The Modern Malware Review (PDF) — 90% of fully undetected malware was delivered via web-browsing; It took antivirus vendors 4 times as long to detect malware from web-based applications as opposed to email (20 days for web, 5 days for email); FTP was observed to be exceptionally high-risk.
Anonymized phone data isn't as anonymous as we thought, a CFPB API, and NYC's "geek squad of civic-minded number-crunchers."
Mobile phone mobility traces ID users with only four data points
A study published this week in Scientific Reports, Unique in the Crowd: The privacy bounds of human mobility, shows that location data from mobile phones poses an anonymity risk. Jason Palmer reported at the BBC that researchers at MIT and the Catholic University of Louvain reviewed 15 months’ worth of phone records for 1.5 million people and were able to identify “mobility traces,” or “evident paths of each mobile phone,” using only four locations and times to positively identify a particular user. Yves-Alexandre de Montjoye, the study’s lead author, told Palmer that “[t]he way we move and the behaviour is so unique that four points are enough to identify 95% of people.”
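The intuition behind the result can be sketched with a toy simulation. Everything below is illustrative and invented (user counts, cell counts, and trace lengths are not the study’s): the point is only that the fraction of users pinned down uniquely climbs quickly as you add (location, time) points.

```python
import random

random.seed(42)

N_USERS = 1000     # simulated subscribers (the real study used 1.5 million)
N_CELLS = 50       # coarse antenna locations
N_HOURS = 24       # one-hour timestamps
TRACE_LEN = 20     # observed (cell, hour) points per user

# Each user's mobility trace: a set of (antenna cell, hour) points.
traces = [
    {(random.randrange(N_CELLS), random.randrange(N_HOURS)) for _ in range(TRACE_LEN)}
    for _ in range(N_USERS)
]

def unique_fraction(k):
    """Fraction of users whose k randomly chosen trace points match no other user."""
    unique = 0
    for i, trace in enumerate(traces):
        sample = set(random.sample(sorted(trace), k))
        # The user is identified if no other trace contains all k sampled points.
        if not any(sample <= other for j, other in enumerate(traces) if j != i):
            unique += 1
    return unique / N_USERS

for k in (1, 2, 4):
    print(f"{k} point(s): {unique_fraction(k):.0%} uniquely identifiable")
```

Even in this crude model, one point identifies almost no one, while four points single out nearly everyone, which is the shape of the effect the researchers measured on real call records.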
Chinese Lessons, White House Embraces Makers, DC Codes Freed, and Malware Numbers
Sensor journalism will augment our ability to understand the world and hold governments accountable.
When I went to the 2013 SXSW Interactive Festival to host a conversation with NPR’s Javaun Moradi about sensors, society and the media, I thought we would be talking about the future of data journalism. By the time I left the event, I’d learned that sensor journalism had long since arrived and been applied. Today, inexpensive, easy-to-use open source hardware is making it easier for media outlets to create data themselves.
“Interest in sensor data has grown dramatically over the last year,” said Moradi. “Groups are experimenting in the areas of environmental monitoring, journalism, human rights activism, and civic accountability.” His post on what sensor networks mean for journalism sparked our collaboration after we connected in December 2011 about how data was being used in the media.
At a SXSW panel on “sensoring the news,” Sarah Williams, an assistant professor at MIT, described how the Spatial Information Design Lab at Columbia University had partnered with the Associated Press to independently measure air quality in Beijing.
Prior to the 2008 Olympics, the coaches of the Olympic teams had expressed serious concern about the impact of air pollution on the athletes. That, in turn, put pressure on the Chinese government to take substantive steps to improve those conditions. While the Chinese government released an index of air quality, explained Williams, they didn’t explain what went into it, nor did they provide the raw data.
The Beijing Air Tracks project arose from the need to determine what the conditions on the ground really were. AP reporters carried sensors connected to their cellphones to detect particulate and carbon monoxide levels, enabling them to report air quality conditions back in real-time as they moved around the Olympic venues and city. Read more…
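The core of such a pipeline is small: read a line from the sensor, attach a timestamp and a position, and ship a structured record. The sketch below is hypothetical — the AP’s actual hardware and wire format weren’t published, so the `PM=…,CO=…` field names and values are invented for illustration.

```python
import json

def parse_reading(raw, lat, lon, timestamp):
    """Turn a hypothetical 'PM=82.5,CO=4.2' sensor line into a geotagged record.

    Field names (PM for particulates in µg/m³, CO for carbon monoxide in ppm)
    are assumptions for this sketch, not the AP's real format.
    """
    fields = dict(kv.split("=") for kv in raw.split(","))
    return {
        "timestamp": timestamp,            # when the phone took the reading
        "lat": lat, "lon": lon,            # where (from the phone's GPS)
        "pm_ugm3": float(fields["PM"]),    # particulate concentration
        "co_ppm": float(fields["CO"]),     # carbon monoxide concentration
    }

record = parse_reading("PM=82.5,CO=4.2", 39.9929, 116.3883, "2008-08-08T12:00:00Z")
print(json.dumps(record))
```

In a live setup, a loop would emit one such record per reading and POST it to a collection server, which is what made the AP’s real-time reporting possible as reporters moved around the venues.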
Chicago CIO Brett Goldstein is experimenting with social coding for a different kind of civic engagement.
GitHub has been gaining new prominence as the use of open source software in government grows.
Earlier this month, I included a few thoughts from Chicago’s chief information officer, Brett Goldstein, about the city’s use of GitHub, in a piece exploring GitHub’s role in government.
While Goldstein says that Chicago’s open data portal will remain the primary means through which Chicago releases public sector data, publishing open data on GitHub is an experiment that will be interesting to watch, in terms of whether it affects reuse or collaboration around it.
In a followup email, Goldstein, who also serves as Chicago’s chief data officer, shared more about why the city is on GitHub and what they’re learning. Our discussion follows.
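Mechanically, publishing a dataset this way is just an ordinary Git workflow — the repository name and data below are invented for illustration (Chicago’s real releases live under the city’s own GitHub organization):

```shell
# Hypothetical example: versioning a small open-data extract in Git.
mkdir -p crime-extract && cd crime-extract
git init -q
printf 'date,ward,incidents\n2013-03-01,5,12\n2013-03-01,6,9\n' > crimes.csv
git add crimes.csv
git -c user.name="data-portal" -c user.email="data@example.org" \
    commit -q -m "Add March 1 crime extract"
# From here, `git push` to a public GitHub repository publishes the data;
# anyone can then fork it, diff revisions, or send corrections as pull requests.
```

That last property — forks, diffs, and pull requests on data, not just code — is what distinguishes the GitHub experiment from a conventional download portal.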
The collaborative coding site hired a "government bureaucat."
GitHub’s profile has been rising recently, from a Wired article about open source in government, to its high profile use by the White House and within the Consumer Financial Protection Bureau. This March, after the first White House hackathon in February, the administration’s digital team posted its new API standards on GitHub. In addition to the U.S., code from the United Kingdom, Canada, Argentina and Finland is also on the platform.
“We’re reaching a tipping point where we’re seeing more collaboration not only within government agencies, but also between different agencies, and between the government and the public,” said GitHub head of communications Liz Clinkenbeard, when I asked her for comment. Read more…
Harvey Lewis shares insights from Deloitte UK's research on the open data economy.
There’s increasing interest in the open data economy from the research wings of consulting firms. Capgemini Consulting just published a new report on the open data economy. McKinsey’s Global Institute is following up its research on big data with an inquiry into open data and government innovation. Deloitte has been taking a long look at open data business models. Forrester says open data isn’t (just) for governments anymore, with more research on the way. If Bain & Company doesn’t update its work on “data as an asset” this year to meet inbound interest in open data from the public sector, it may well find itself in the unusual position of lagging the market for intellectual expertise.
In January, I interviewed Harvey Lewis, the research director for the analytics department of Deloitte U.K. Lewis, who holds a doctorate in hypersonic aerodynamics, has been working for nearly 20 years on projects in the public sector, defense industry and national security. Today, he’s responsible for applying an analytical eye to consumer businesses, manufacturing, banking, insurance and the public sector. Over the past year, his team has been examining the impact of open data releases on the economy of the United Kingdom. The British government’s embrace of open data makes such research timely. Read more…
More federally funded research and data will be made freely available to the public online.
As part of the response, John Holdren, the director of the White House Office of Science and Technology Policy, released a memorandum today directing agencies with “more than $100 million in research and development expenditures to develop plans to make the results of federally funded research publicly available free of charge within 12 months after original publication.”
The Obama administration has been considering access to federally funded scientific research for years, including a report to Congress in March 2012. The relevant e-petition, which had gathered more than 65,000 signatures, had gone unanswered since May of last year.
The U.S. Department of Veterans Affairs launched a new innovation center to solve big problems.
Few areas are as emblematic of a nation’s values as how it treats the veterans of its wars. As improved battlefield care saves soldiers from injuries that would have been lethal in past wars, more grievously injured veterans survive to come home to the United States.
Upon return, however, the newest veterans face many of the challenges that previous generations have encountered, ranging from re-entering the civilian workforce to rehabilitating broken bodies and treating traumatic brain injuries. As they come home, they are encumbered by more than scars and memories. Their war records are missing. When they apply for benefits, they’re added to a growing backlog of claims at the Department of Veterans Affairs (VA). And even as the raw number of claims grows to nearly 900,000, the average time to process them is also rising. According to Aaron Glantz of the Center for Investigative Reporting, veterans now wait an average of 272 days for their claims to be processed, with some dying in the interim.
The growth in the VA backlog should be seen in the context of a decade of war in Afghanistan and Iraq, along with improved outreach and more open standards for post-traumatic stress disorder, Agent Orange exposure in Vietnam and “Gulf War” illness, which allow more veterans who were previously unable to make a claim to file.
While new teams and technologies are being deployed to help with the backlog, a recent report (PDF) from the Office of the Inspector General of the Veterans Administration found that new software deployed around the country that was designed to help reduce the backlog was actually adding to it. While high error rates, disorganization and mishandled claims may be traced to issues with training and implementation of the new systems, the transition from paper-based records to a digital system is proving to be difficult and deeply painful to veterans and families applying for benefits. As Andrew McAfee bluntly put it more than two years ago, these kinds of bureaucratic issues aren’t just a problem to be fixed: “they’re a moral stain on the country.”
Given that context, the launch of a new VA innovation center today takes on a different meaning. The scale and gravity of the problems that the VA faces demand true innovation: new ideas, technology or methodologies that challenge and improve upon existing processes and systems, improving the lives of people or the function of the society that they live within. Read more…
The White House added a community for the "smart disclosure" of consumer data to Data.gov.
On Monday morning, the Obama administration launched a new community focused on consumer data at Data.gov. While there was no new data to be found among the 507 datasets listed there, it was the first time that smart disclosure had an official home in federal government.
“Smart disclosure means transparent, plain language, comprehensive, synthesis and analysis of data that helps consumers make better-informed decisions,” said Christopher Meyer, the vice president for external affairs and information services at Consumers Union, the nonprofit that publishes “Consumer Reports,” in an interview. “The Obama administration deserves credit for championing agency disclosure of data sets and pulling it together into one web site. The best outcome will be widespread consumer use of the tools — and that remains to be seen.”
You can find the new community at Consumer.Data.gov or data.gov/consumer. Both URLs forward visitors to the same landing page, where they can explore the data, past challenges, and external resources on the topic, along with a page about smart disclosure, blog posts, forums and feedback.
“Analyzing data and giving plain language understanding of that data to consumers is a critical part of what Consumer Reports does,” said Meyer. “Having hundreds of data sets available on one (hopefully) easy-to-use platform will enable us to provide even more useful information to consumers at a time when family budgets are tight and health care and financial ‘choices’ have never been more plentiful.”
Leading experts on data-driven storytelling came together in our recent Google+ Hangout.
Over the past year, I’ve been investigating data journalism. In that work, I’ve found no better source for understanding the who, where, what, how and why of what’s happening in this area than the journalists who are using and even building the tools needed to make sense of the exabyte age. Yesterday, I hosted a Google Hangout with several notable practitioners of data journalism. Video of the discussion is embedded below:
Over the course of the discussion, we talked about what data journalism is, how journalists are using it, the importance of storytelling, ethics, the role of open source and “showing your work” and much more.