"Java" entries

Power of the platforms

Uncertainty is a feature, not a bug.

Image: CC BY 2.0 NASA's Earth Observatory via Wikimedia Commons  http://commons.wikimedia.org/wiki/File:Southern_Lights.jpg#mediaviewer/File:Southern_Lights.jpg

After decades of work on programming, we finally got a development environment with massive reach and tremendous power. Somehow, though, the web isn’t centered on a comprehensive programming environment. The web succeeded with a (severely) lowest-common denominator, specification-driven approach that let it grow with time, technology, and multiple communities, across multiple platforms.

Almost two decades ago, I was all excited about Java. Write applets once, run anywhere, with libraries to make sure it all came out the same wherever anywhere might be. Java is still a powerhouse, but it all worked out differently than I expected. Even in Java’s early years, before the Java news was filled with security bulletins, applets felt like a strange mix with their surrounding web pages. Creating an applet demanded programmers to build every detail. Even with Java’s ever-improving libraries, creating a Java applet that did much was an intense experience focused on programming.

Java wasn’t the only comprehensive way to build web apps, of course. Flash demanded programming, but its values always incorporated design, action, and well, flash, in ways that meshed well with the way people built sites. Flash kept growing and growing before its ecosystem took a fatal hit from the iPhone as HTML5 offered replacements for some of its key strengths. I mostly notice Flash these days because it asks me to update it regularly and because pages tell me when it’s crashed.

Compared to either of those rich environments, web technology is a tangled mess. The early web was functional but unstyled, with no behavior beyond navigating among pages. That? That would dominate client-side computing? Read more…

Where apps end and the system begins

Exploring the system calls and control flow that underpin high-level languages.

Editor’s note: Sometimes we can forget how important it is to truly understand how a system works, and how the nuts and bolts affect our everyday activities. Marty Kalin asks us to dive a little deeper and strengthen our computational thinking abilities.

It’s clear that applications need system resources to execute: a processor, memory, and usually I/O devices such as the keyboard and screen. It’s less clear how applications gain access to these shared resources, which are under operating system (OS) control. The OS, like any good manager, is efficient and unobtrusive as it handles resource requests from applications. Let’s take a look at how applications interact with the OS, in both routine and dramatic fashion.

Consider what happens when a print statement executes. Here’s a Ruby example:

puts 'Hello, world!'

The Ruby puts statement wraps a call to a high-level I/O function in the standard C library (in this case, printf), which acts as the interface between resource-requesting applications and resource-granting OS routines. In this example, the screen is the requested resource. The standard library interacts seamlessly with the OS, which also is written in C with some assembly language. The library function printf is high-level because, as the f in the name indicates, the function can format the bytes to be written as integers, floating-point values, and character strings such as Hello, world!. In systems-speak, the Ruby application and the C library function execute in user space, which does not bestow the rights and privileges needed to control system resources such as the screen.

Read more…

Java 8 functional interfaces

Getting to know various out-of-the-box functions such as Consumer, Predicate, Supplier

In the first part of this series, we learned that lambdas are a type of functional interface – an interface with a single abstract method. The Java API has many one-method interfaces such as Runnable, Callable, Comparator, ActionListener and others. They can be implemented and instantiated using anonymous class syntax. For example, take the ITrade functional interface. It has only one abstract method that takes a Trade object and returns a boolean value – perhaps checking the status of the trade or validating the order or some other condition.

@FunctionalInterface
public interface ITrade {
  public boolean check(Trade t);
}

In order to satisfy our requirement of checking for new trades, we could create a lambda expression, using the above functional interface, as shown here:

ITrade newTradeChecker = (Trade t) -> t.getStatus().equals("NEW");

// Or we could omit the input type setting:
ITrade newTradeChecker = (t) -> t.getStatus().equals("NEW");

Read more…

5 ways developers win with PaaS

Powering your app with open source and OpenShift

Getting Started with OpenShift As a software developer, you are no doubt familiar with the process of abstracting away unnecessary detail in code — imagine if that same principle were applied to application hosting. Say hello to Platform as a Service (PaaS), which enables you to host your applications in the cloud without having to worry about the logistics, leaving you to focus on your code. This post will discuss five ways in which PaaS benefits software developers, using the open source OpenShift PaaS by Red Hat as an example.

No More Tedious Config Tasks

Most of us don’t become developers to do system administration, but when you are running your own infrastructure you end up doing exactly that. A PaaS can take that pain away by handling pesky config and important security updates for you. As a bonus, it makes your sys admin happy too by allowing you to provision your own environment for that killer new app idea you want to tinker with, rather than nagging them for root access on a new VM.

On OpenShift, it goes like this: let’s say you decide you want to test an idea for a Java app, using Tomcat and PostgreSQL (yes, we could argue about the merits of those choices, but work with me here). You can spin that up with a one-line terminal command:

rhc app create myawesomeapp tomcat-7 postgresql-9.2 -s

That -s on the end is telling the platform to make the app auto-scaling, which I will elaborate on later; yes, that’s all it takes. RHC (Red Hat Cloud) is just a Ruby Gem wrapping calls to the OpenShift REST API. You could also use the OpenShift web console or an IDE plugin to do this, or call the API directly if that’s how you roll. The key technologies in play here are just plain old Git and SSH — there’s nothing proprietary.

Read more…

Six Degrees of Kevin Bacon in six languages

20 years of efficiently computing Bacon numbers

apollo1

The Oracle at Delphi spoke just one language, a cryptic one that priests “compiled” into ancient Greek. The Oracle of Bacon—the website that plays the Six Degrees of Kevin Bacon game for you—has, in its 20-year existence, been written in six languages. Read on for the history and the reasons why.

1995-1999: C

The original version of the Oracle of Bacon, written by Brett Tjaden in 1995, was all C. The current version, my stewardship of it, and my revision control history only go back to 1999, so that’s where I’ll start. In 1999, I rewrote the Oracle… still entirely in C. Expensive shortest-path and spell-check algorithms? Definitely in C. String processing to build the database? Also C! Presentation layer to parse CGI parameters and generate HTML? C here, too!

Why C? The rationale for the algorithmic component was straightforward: the Oracle of Bacon ran on a slow, shared Unix machine that other people were using to get actual work done. Minimizing CPU and memory resource requirements was the polite thing to do. I needed a compiled language that let me optimize time and space extensively. The loops all counted down, not up, because comparing against zero was fractionally faster on SPARC. It had to be C.

But why were the offline string processing and the CGIs in C? Mostly, I think, to reuse code from the other parts of the code base and from previous projects I’d written when C was the only language I knew.

2004-2005: Perl

As the site added features, I got tired of writing code to generate HTML in C. I wrote new CGIs, then rewrote existing CGIs, in Perl. Simply put, writing the CGIs in an interpreted language made me more productive. I had hash tables and vectors built into the language and CGI support a simple “use” statement away. I didn’t have to compile on one server and then deploy to another—I could edit the CGIs right there on the web server. Good deployment practices it wasn’t, but it made me more productive as a programmer, and the performance of the CGIs didn’t matter all that much.

Read more…

Java 8, now what?

What you'll need to know to start your Java 8 migration process today

There was recently a thread on the London Java Community mailing list about when people should think about adopting Java 8. Lambdas, an improved collections library, new date and time support, and a host of under-the-hood tweaks, add up to a lot of compelling reasons for people to upgrade. There’s still a lot of confusion over when and how to accomplish it, though, so here’s a helpful guide.

When will Java 8 be released?

The GA (General Availability) release of Oracle’s JRE and JDK, which is probably the JVM that you’re using, released March 18th. It may take other JVM vendors a while to release their implementations if you aren’t using an OpenJDK or the Oracle JDK.

So I should just upgrade on release date, right?

That would be a very brave move to make. A huge amount of resources go into testing Java 8 and ensuring that things will work out of the box on the day of release. However, the massive ecosystem of Java libraries means that not everything can be tested to destruction in time. It’s incredibly likely that there will be outstanding bugs upon release. You should expect update releases a month or two after GA, they’ll solve the major problems.

It’s also important to think about what libraries or frameworks your application depends on. If you’re just writing plain old Java then an existing codebase is likely to work fine. If, on the other hand, you depend on a library or framework that tries to do something clever then you may run into problems.

Read more…

WORA Can Be Better Than Native

Targeting the highest common denominator

Some would claim that native is the best approach, but that looks at existing WORA tools/communities, which mostly target cost saving. In fact, even native Android/iOS tools produce rather bad results without deep platform familiarity. Native is very difficult to properly maintain in the real world and this is easily noticeable by inspecting the difficulties we have with the ports of Codename One, this problem is getting worse rather better as platforms evolve and fragment. E.g. Some devices crash when you take more than one photo in a row, some devices have complex issues with http headers, and many have issues when editing text fields in the “wrong position”.

There are workarounds for everything, but you need to do extensive testing to become aware of the problem in the first place. WORA solutions bring all the workarounds and the “ugly” code into their porting layer, allowing developers to focus on their business logic. This is similar to Spring/Java EE approaches that addressed complexities of application server fragmentation.

Read more…

Think Functionally to Simplify Code

Let the environment do more of the work

Functional programming keeps growing. While it has long been a popular topic in academic circles, and many of my CS-educated friends wonder why it took me so long to discover it, the shift in approach that functional programming requires made it a hard sell in the commercial world. As our computers have become more and more powerful and our problems more complex, functional programming approaches and environments seem better able to shoulder those loads.

Neal Ford, Meme Wrangler at ThoughtWorks, has been showing developers how to shift from classic imperative models to cleaner functional approaches. I was lucky to get to talk with him at OSCON, and we’ve also posted his OSCON talk, with many more concrete code examples.

Read more…

From BASIC to HyperTalk to JavaScript to Rails to Erlang

Every programming experience teaches

I’ve never formally trained to be a programmer, outside of occasional conference workshops and a week of XSL tutorials. In some ways, that’s terrible, because it’s taken me about thirty years to learn what some friends of mine appear to have learned in four. I’ve written some code that goes way beyond spaghetti, though fortunately the worst of it was probably when I was 15.

On the bright side, when I look past my many mistakes, I can see what I learned from a large number of various different experiences, and the pieces they helped me see. It’s a little easier to tell this story through the parts than it might be through a formal curriculum.

My parents’ FORTRAN books
I was reading computer books—dry ones—before I even got to play. I have vague memories about program structure, but mostly I learned that knowledge sticks better if it includes hands-on work, and not just a book.
Sinclair ZX81
1K of memory! The sheer thrill of seeing my creations on screen was amazing. I had just enough logic to get things done, and leave myself puzzled. The Sinclair community seemed focused on making great small things. I learned simple logic in BASIC and that sometimes it takes a hack to get things done.
Applesoft BASIC
After Sinclair BASIC, Applesoft seemed vast. Much of what I did was transfer what I’d done on the Sinclair (itself a lesson in platform-shifting). As I settled, I started writing larger and larger programs, eventually forcing myself to restructure everything into subroutines…with global variables, of course.
6502 Assembly
I knew there was more than BASIC. My early adventures with assembly language were mostly about graphics, and didn’t work all that well, but I picked up two key things: recursion and the importance of registers.

Read more…

Scaling People, Process, and Technology with Python

OSCON 2013 Speaker Series

NOTE: If you are interested in attending OSCON to check out Dave’s talk or the many other cool sessions, click over to the OSCON website where you can use the discount code OS13PROG to get 20% off your registration fee.

Since 2009, I’ve been leading the optimization team at AppNexus, a real-time advertising exchange. On this exchange, advertisers participate in real-time auctions to bid on individual ad impressions. The highest bid wins the auction, and that advertiser gets to show an ad. This allows advertisers to carefully target where they advertise—maximizing the effectiveness of their advertising budget—and lets websites maximize their ad revenue.

We do these auctions often (~50 billion a day) and fast (<100 milliseconds). Not surprisingly, this creates a lot of technical challenges. One of those challenges is how to automatically maximize the value advertisers get for their marketing budgets—systematically driving consumer engagement through ad placements on particular websites, times of day, etc.—and we call this process “optimization.” The volume of data is large, and the algorithms and strategies aren’t trivial.

In order to win clients and build our business to the scale we have today, it was crucial that we build a world-class optimization system. But when I started, we didn’t have a scalable tech stack to process the terabytes of data flowing through our systems every day, and we didn't have the team to do any of the required data modeling.

People

So, we needed to hire great people fast. However, there aren’t many veterans in the advertising optimization space, and because of that, we couldn’t afford to narrow our search to only experts in Java or R or Matlab. In order to give us the largest talent pool possible to recruit from, we had to choose a tech stack that is both powerful and accessible to people with diverse experience and backgrounds. So we chose Python.

Python is easy to learn. We found that people coding in R, Matlab, Java, PHP, and even those who have never programmed before could quickly learn and get up to speed with Python. This opened us up to hiring a tremendous pool of talent who we could train in Python once they joined AppNexus. To top it off, there’s a great community for hiring engineers and the PyData community is full of programmers who specialize in modeling and automation.

Additionally, Python has great libraries for data modeling. It offers great analytical tools for analysts and quants and when combined, Pandas, IPython, and Matplotlib give you a lot of the functionality of Matlab or R. This made it easy to hire and onboard our quants and analysts who were familiar with those technologies. Even better, analysts and quants can share their analysis through the browser with IPython.

Process

Now that we had all of these wonderful employees, we needed a way to cut down the time to get them ramped up and pushing code to production.

First, we wanted to get our analysts and quants looking at and modeling data as soon as possible. We didn’t want them worrying about writing database connector code, or figuring out how to turn a cursor into a data frame. To tackle this, we built a project called Link.

Imagine you have a MySQL database. You don’t want to hardcode all of your connection information because you want to have a different config for different users, or for different environments. Link allows you to define your “environment” in a JSON config file, and then reference it in code as if it is a Python object.

 { "dbs":{
  "my_db": {
   "wrapper": "MysqlDB",
   "host": "mysql-master.123fakestreet.net",
   "password": "",
   "user": "",
   "database": ""
  }
 }}

Now, with only three lines of code you have a database connection and a data frame straight from your mysql database. This same methodology works for Vertica, Netezza, Postgres, Sqlite, etc. New “wrappers” can be added to accommodate new technologies, allowing team members to focus on modeling the data, not how to connect to all these weird data sources.

In [1]: from link import lnk
 
In [2]: my_db = lnk.dbs.my_db
 
In [3]: df = my_db.select('select * from my_table').as_dataframe()
 

Int64Index: 325 entries, 0 to 324
Data columns:
id    325 non-null values
user_id   323 non-null values
app_id   325 non-null values
name    325 non-null values
body    325 non-null values
created   324 non-null values

By having the flexibility to easily connect to new data sources and APIs, our quants were able to adapt to the evolving architectures around us, and stay focused on modeling data and creating algorithms.

Second, we wanted to minimize the amount of work it took to take an algorithm from research/prototype phase to full production scale. Luckily, with everyone working in Python, our quants, analysts, and engineers are using the same language and data processing libraries. There was no need to re-implement an R script in Java to get it out across the platform.
Read more…