Fog Creek’s Remote Work Policy — In the absence of new information, the assumption is that youâ€™re producing. When you step outside the HQ work environment, you should flip that burden of proof. The burden is on you to show that youâ€™re being productive. Is that because we donâ€™t trust you? No. Itâ€™s because a few normal ways of staying involved (face time, informal chats, lunch) have been removed.
MillWheel (PDF) — a framework for building low-latency data-processing applications that is widely used at Google. Users specify a directed computation graph and application code for individual nodes, and the system manages persistent state and the continuous ď¬‚ow of records, all within the envelope of the frameworkâ€™s fault-tolerance guarantees. From Google Research.
Probabilistic Scraping of Plain Text Tables — the method leverages topological understanding of tables, encodes it declaratively into a mixed integer/linear program, and integrates weak probabilistic signals to classify the whole table in one go (at sub second speeds). This method can be used for any kind of classification where you have strong logical constraints but noisy data.