- Hospital Hacking (Bloomberg) — interesting for both lax regulation (“The FDA seems to literally be waiting for someone to be killed before they can say, ‘OK, yeah, this is something we need to worry about,’ ” Rios says.) and the extent of the problem (Last fall, analysts with TrapX Security, a firm based in San Mateo, Calif., began installing software in more than 60 hospitals to trace medical device hacks. […] After six months, TrapX concluded that all of the hospitals contained medical devices that had been infected by malware.). It may take a Vice President’s defibrillator being hacked for things to change. Or would anybody notice?
- Cybersecurity and Data Science — pointers to papers on different aspects of using machine learning and statistics to identify misuse and anomalies.
- Multi-Agent Systems — undergraduate textbook covering distributed systems, game theory, auctions, and more. Electronic version as well as printed book.
"distributed systems" entries
Comparing different orchestration tools.
Most software systems evolve over time. New features are added and old ones pruned. Fluctuating user demand means an efficient system must be able to quickly scale resources up and down. Demands for near-zero downtime require automatic failover to pre-provisioned backup systems, normally in a separate data centre or region.
On top of this, organizations often have multiple such systems to run, or need to run occasional tasks such as data-mining that are separate from the main system, but require significant resources or talk to the existing system.
When using multiple resources, it is important to make sure they are used efficiently (not sitting idle) but can still cope with spikes in demand. Balancing cost-effectiveness against the ability to scale quickly is a difficult task that can be approached in a variety of ways.
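One of the simplest ways to approach that balance is a proportional scaling rule: pick a target utilization, then grow or shrink the number of instances so that observed utilization moves back toward the target. The sketch below is only an illustration under stated assumptions. The function name, the 60% default target, and the minimum and maximum bounds are hypothetical and not taken from any particular tool.

```python
import math


def desired_replicas(current, observed_pct, target_pct=60,
                     min_replicas=1, max_replicas=20):
    """Suggest an instance count that moves utilization toward the target.

    current      -- number of instances running now
    observed_pct -- average utilization across them, as a percentage
    target_pct   -- utilization we would like to sit at (assumed default)
    """
    if current < 1 or target_pct <= 0:
        raise ValueError("need at least one instance and a positive target")
    # Scale the instance count by the ratio of observed to target load,
    # rounding up so we never undershoot the capacity we need.
    raw = math.ceil(current * observed_pct / target_pct)
    # Clamp: the floor keeps headroom for spikes, the ceiling caps cost.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90))   # busy: suggests scaling up
print(desired_replicas(4, 30))   # quiet: suggests scaling down
```

The minimum bound is where the "cope with spikes" side of the trade-off lives, and the maximum bound is where the cost side lives. A real system would also smooth the utilization signal and limit how often it acts on it.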
All of this means that the running of a non-trivial system is full of administrative tasks and challenges, the complexity of which should not be underestimated. It quickly becomes impossible to look after machines on an individual level; rather than being patched and updated one by one, machines must be treated identically. When a machine develops a problem it should be destroyed and replaced, rather than nursed back to health.
Various software tools and solutions exist to help with these challenges. Let’s focus on orchestration tools, which make all the pieces work together: they work with the cluster to start containers on appropriate hosts and connect them. Along the way, we’ll consider scaling and automatic failover, which are important features.
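To make "start containers on appropriate hosts" concrete, here is a toy placement routine. It is a minimal sketch, not how any particular orchestration tool works: the `Host` class, the `place` function, and the memory-only greedy strategy (always choose the host with the most free memory that can fit the container) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical cluster node, tracked only by free memory."""
    name: str
    free_mem_mb: int
    containers: list = field(default_factory=list)


def place(containers, hosts):
    """Greedily assign (name, mem_mb) containers to hosts.

    Returns a dict mapping container name to host name, or to None
    when no host has enough free memory left.
    """
    placement = {}
    for cname, mem_mb in containers:
        candidates = [h for h in hosts if h.free_mem_mb >= mem_mb]
        if not candidates:
            placement[cname] = None  # nothing fits: leave unscheduled
            continue
        # Pick the emptiest host so load spreads across the cluster.
        best = max(candidates, key=lambda h: h.free_mem_mb)
        best.free_mem_mb -= mem_mb
        best.containers.append(cname)
        placement[cname] = best.name
    return placement


hosts = [Host("a", 1024), Host("b", 2048)]
result = place([("web", 512), ("db", 1536), ("cache", 1024)], hosts)
print(result)
```

Real orchestrators weigh far more than memory (CPU, ports, affinity rules, failure domains), and they also re-run placement when a host disappears, which is where automatic failover comes in.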
Find your way through OSCON with these four learning paths.
The open source movement has been with us for almost two decades, and it’s clear that open source is now a de facto choice for software engineers across the globe. The content that you’ll find at OSCON is a reflection of that fact.
The open source world and OSCON itself are vast. With 48 sessions over two days and a bonus day with 11 workshops to choose from, you’ll no doubt have some tough choices to make when you attend the event. Keeping that in mind, I put together four learning paths that encompass the hot topics and important transitions we’re covering at OSCON.
The O'Reilly Radar Podcast: Astrid Atkinson on optimization, and Kelsey Hightower on distributed computing.
Subscribe to the O’Reilly Radar Podcast to track the technologies and people that will shape our world in the years to come.
In this week’s episode, O’Reilly’s Mac Slocum talks to Astrid Atkinson, director of software engineering at Google, about the delicate balance of managing complexity in distributed systems and her experience working on-call rotations at Google.
Here are a few snippets from their chat:
I think it’s often really hard for organizations that are scaling quickly to find time to manage complexity in their systems. That can be really a trap, because if you’re really always just focused on the next deadline or whatever, and never planning for what you’re going to live with when you’re done, then you might never find the time.
You can only optimize what you pay attention to, and so if you can’t see what your system is doing, if you can’t see whether it’s working, it’s not working.
I used to get paged awake at two in the morning. You go from zero to Google is down. That’s a lot to wake up to.