Linkblog #4

Articles

Talks

  • One Hacker Way by Erik Meijer – an interesting, expressive and somewhat controversial talk about how software development and the software business are organized.

Projects

  • Orleans. It was bound to happen: Microsoft released their actor framework for distributed systems, which includes many interesting features such as cluster membership. If you want more meat, here is the paper.
  • Electron. You probably use the Atom text editor, and you probably know that it is in fact Node.js + Webkit. Now GitHub has released Electron, the framework that Atom is built with, so you can easily write almost-native applications in JS.
  • Hubot – another GitHub project (not as new as Electron). It lets you create a bot that can join your work chat (HipChat, Slack, etc.; see the list of adapters), react to messages, carry out your commands and more. In one of the projects I take part in, we use it for status checks, metrics monitoring and fun.
  • Motion sensing using the Doppler effect – an amazing idea of using the Doppler effect to detect the user’s hand movements, right in the browser. However, I failed to run it :)

And a bit of humour for you →

Linkblog #3

Articles

  • NOSQL Patterns – an article about various patterns you can meet in NoSQL systems: data partitioning, replication, cluster membership, consistency models, etc.
  • NoSQL Data Modeling Techniques. If the previous article is about the internals of NoSQL systems, this one is closer to usage. It covers data modeling techniques that are useful for storing your data in a NoSQL store, such as denormalization, aggregation, storing hierarchies, etc.
  • Facebook’s Mystery Machine: End-to-end Performance Analysis of Large-scale Internet Services – an overview of a paper that describes Facebook’s approach to analyzing (internal) service performance, along with some interesting findings. In short, they parse the logs of internal services, which tells them how long each part of a request takes. The approach reportedly requires no additional instrumentation.
  • A series of articles about implementing monads in C# with a bit of theory. The series shows that there is no mystery in monads and that the monad pattern can be easily implemented in a non-functional language like C#. That might be useful for understanding monads without diving into a language like Haskell. Follow the links at the end of each article to get to the next part.
  • Java Garbage Collection Distilled – an explanation of the garbage collectors used in the HotSpot JVM and OpenJDK, GC tradeoffs, and GC monitoring and tuning.
  • Akka Cluster Load Balancing – an approach to adaptive load balancing in an Akka cluster based on free heap space metrics. The author has implemented a custom actor router that directs workload to the actor on the node with the smallest heap.

Talks

Projects

  • Bazel – Google has open-sourced Bazel, its build tool, which is used to build most of its projects. (Marked as alpha.)
  • facebook-tunnel – have free Facebook traffic? You can get to your internets through Facebook chat :) An amusing proof-of-concept.

Podcasts

  • IoT Podcast – a new podcast about the Internet of Things. There are two episodes so far.

Courses

  • Principles of Reactive Programming – the new iteration of Martin Odersky’s, Erik Meijer’s and Roland Kuhn’s course about reactive programming, reactive streams, actors, Akka, etc. The material is promised to be updated and improved. It starts on 13 April. I have already finished the course, but it will be interesting to see how it has changed.
    If you are going to take the course, drop me a couple of lines; it would be interesting to discuss things.

And a funny picture for you →

The Bloom filter

In many software engineering problems, we have a set and need to determine whether some value belongs to it. If the maximum possible cardinality of the set (that is, the total number of elements we might have to consider) is small, the solution is straightforward: store the set explicitly (for instance, as a red-black tree), update it when necessary and check whether it contains the elements we are interested in. But what if the maximum cardinality is large, or we need to work with many such sets simultaneously? Or what if the set membership test is an expensive operation?

Suppose we want to know if an element belongs to a set. We have decided that it is acceptable to get false positive answers (the answer is “yes”, but the element is not actually in the set) with probability p, and not acceptable to get false negatives (the answer is “no”, but the element is actually in the set). The data structure that can help us in this situation is called the Bloom filter.

A Bloom filter (proposed by Burton Howard Bloom in 1970) consists of a bit array of m bits (all initially set to 0) and k different hash functions. Each hash function maps a value to a single integer, which is used as a position in the bit array.

Look at this picture from Wikipedia:
[Image: Bloom filter]
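
To make the definition above concrete, here is a minimal sketch in Python. It assumes the standard Bloom filter operations: adding a value sets the k bits chosen by the hash functions, and a membership test checks that all k bits are set. The class name `BloomFilter`, the method names and the way the k hash functions are simulated (SHA-256 salted with the function index) are my assumptions for illustration, not something prescribed by the definition.

```python
import hashlib


class BloomFilter:
    """A bit array of m bits plus k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        if m <= 0 or k <= 0:
            raise ValueError("m and k must be positive")
        self.m = m
        self.k = k
        self.bits = bytearray((m + 7) // 8)  # all m bits start at 0

    def _positions(self, value: str):
        # Assumption: k "different" hash functions are simulated by
        # prefixing the value with the function index before hashing.
        data = value.encode("utf-8")
        for i in range(self.k):
            digest = hashlib.sha256(i.to_bytes(4, "big") + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, value: str) -> None:
        # Set the k bits selected by the hash functions.
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, value: str) -> bool:
        # "Yes" only if all k bits are set. This may be a false positive,
        # but an added value can never produce a false negative.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


bf = BloomFilter(m=1024, k=3)
bf.add("apple")
bf.add("banana")
print("apple" in bf)   # True: values that were added are always found
print("cherry" in bf)  # most likely False, but a false positive is possible
```

Note that the filter stores only bits, never the values themselves, which is why its size does not depend on how large the individual elements are.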
