Much like reading a book, reading and writing code sequentially has been my primary way of understanding the subject matter. From the top down, I initialize some things, write some methods or functions that get called depending on certain parameters, and then execute, delivering whatever output I desire. Perhaps there's a loop somewhere in there for data that isn't reliable (like data over TCP), but generally this is how it goes.

Almost a year ago I wrote my BitTorrent client in Ruby in this particular way. Generally, it works like this:

  1. A torrent file gets read and the data gets interpreted.
  2. Using the interpreted data, a connection is made to either an HTTP or a UDP tracker URL.
  3. On a successful response, a list of Peer objects is constructed.
  4. With those peers, requests for connection are made over TCP.
  5. Regardless of connection status, a loop begins that terminates either when all peers refuse the connection or when the file in question has been successfully downloaded. Below is the meat of my client program:
# this method is called while iterating over a list of *active* peers; each socket is
# checked for any incoming data using Ruby's non-blocking IO
def messager(index, blk)
  if @buffer.empty?
    recv # more data gets downloaded
  else
    # the buffer is a String object, and depending upon the 5th byte
    # (which usually determines what the message is), a choice is made.
    case @buffer[4]
    when nil
      # *probably* a KEEP_ALIVE message; basically the peer is acknowledging you, but isn't
      # ready yet.
      @buffer.slice!(0..3) if @buffer[0..3] == KEEP_ALIVE
    when HANDSHAKE
      # initial transmission of data from peer.
      parse_handshake
    when BITFIELD
      # a description of what chunks of data they have concerning the file(s)
      bitfield.each_with_index do |bit, i|
        blk.call(i, @socket) if bit == '1'
      end

      send_interested if @buffer.empty?
    when HAVE
      # similar to bitfield, but much smaller
      if @buffer.bytesize < 9
        recv
      else
        # see this curried proc? It's lame and confusing. But it works.
        have { |i| blk.call(i, @socket) }
      end
      send_interested if @buffer.empty?
    when INTERESTED
      parse_interested(index)
    when PIECE
      # probably the most complicated part. Not all pieces are created equal.
      parse_piece(index)
    when CANCEL
      @socket.close
    else
      recv
      send(KEEP_ALIVE)
    end
  end
end

TL;DR: Whatever is in the TCP pipeline gets interpreted and, depending on several conditions, a method/function is called; then the loop continues on with another connection. This approach works, but as the program grew more complex, the massive loop began causing me a lot of headaches. Pretty soon I was flipping between multiple files/Ruby objects, trying to track down bugs, adding features, and proceeding to rip my hair out.

There’s another way to do this.

A bit of a tangent, but I love Sinatra, the web-app-building Ruby gem. Got a GET request heading to your root address?

get "/" do
  # render something!
end

As simple as this is, there's a lot going on underneath that allows such a construction to work well and efficiently. Much of it has to do with writing a program that is event-driven. I don't know the exact underpinnings of how Sinatra works, but I have a basic understanding of the idea. To illustrate, I'll first sketch the general shape of that idea, then implement it in my BitTorrent client.
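
Here's that sketch - a toy, made-up Router class (emphatically not how Sinatra is actually implemented): a handler block gets stored under a key when it is declared, and it is only looked up and called when a matching request arrives.

class Router
  def self.routes
    @routes ||= {}
  end

  # registering: stash the block under ["GET", path]; nothing runs yet
  def self.get(path, &blk)
    routes[["GET", path]] = blk
  end

  # later, when a request arrives, find the matching block and call it
  def self.dispatch(verb, path)
    handler = routes[[verb, path]]
    handler ? handler.call : "404 Not Found"
  end
end

Router.get "/" do
  "render something!"
end

Router.dispatch("GET", "/") # => "render something!"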

The Setup

Principally, there are three Ruby objects in play here:

  1. PeerSession - inherits methods and properties from TCPSocket, with some custom ones added in.
  2. PieceReactor - emits events depending on what data is coming in from the PeerSession objects
  3. PeerEvent - a module providing the callback machinery (on and emit), which PieceReactor extends
class PeerSession < TCPSocket
  attr_accessor :current_piece
  # bunch of code handling the connection and interpreting the incoming bytes
end

class PieceReactor
  extend PeerEvent # I'll cover PeerEvent in a moment

  on :handshake do |peer, msg|
    # On a handshake response, execute this block!
  end

  on :interested do |peer, msg|
    # On an interested response, execute this block!
  end

  # other "on X" events

  def connect
    # initializes the PeerSession objects (@peers), which are passed to #tick in a loop
  end

  def tick
    @peers.each do |peer|
      peer.peer_write
      peer.read_messages do |msg|
        # on and emit live on the class (because of extend PeerEvent),
        # so from this instance method we reach them through self.class
        case msg[:id]
        when 1
          self.class.emit(:interested, peer, msg)
        when 4
          self.class.emit(:have, peer, msg)
        when 5
          self.class.emit(:bitfield, peer, msg)
        when 7
          peer.current_piece = msg
          self.class.emit(:piece, peer, msg)
        when :handshake
          self.class.emit(:handshake, peer, msg)
        when :keep_alive
          self.class.emit(:keep_alive, peer, msg)
        when :wait
          # do nothing
        end
      end
    end
  end
end

What are on and emit doing? Where are they defined? Why am I extending PieceReactor with PeerEvent?

So, on is a PeerEvent method. By itself, it doesn't execute anything - why? Because its only job is to stash blocks away in a collection:

def callbacks
  @callbacks ||= Hash.new { |hash, key| hash[key] = [] }
end

callbacks serves as a collection of proc objects. To fill the collection, you write out on calls like the ones I mentioned above, and each block gets saved as a proc waiting to be called.

def on(type, &blk)
  callbacks[type] << blk
  self
end

emit is the endpoint and the most crucial part - when something specific happens, the symbol that was registered with on is used to look up the matching procs in the collection and call them, passing along any arguments they require.

def emit(type, *args)
  callbacks[type].each do |c|
    c.call(*args)
  end
end
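
Put together, the whole module is tiny. Here's a self-contained sketch (the Notifier class and the :progress event are made up purely for this demo and aren't part of the client) showing how extending a class with PeerEvent lets it declare handlers in its body and fire them later:

module PeerEvent
  def callbacks
    @callbacks ||= Hash.new { |hash, key| hash[key] = [] }
  end

  def on(type, &blk)
    callbacks[type] << blk
    self
  end

  def emit(type, *args)
    callbacks[type].each { |c| c.call(*args) }
  end
end

# a throwaway class, purely for demonstration
class Notifier
  extend PeerEvent

  # registered when the class is loaded, called only when emitted
  on :progress do |piece_index, total|
    puts "downloaded piece #{piece_index} of #{total}"
  end
end

Notifier.emit(:progress, 3, 1337) # prints "downloaded piece 3 of 1337"

Because on and emit live on the class thanks to extend, the handlers declared in the class body and the code that fires them share the same callbacks hash - which is also why tick above reaches emit through self.class.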

The great advantage of this strategy isn't speed but a programmer's peace of mind - it is easier to compose and localize actions in one specific place, rather than creating custom objects for every case and having them strewn across multiple files. It feels cleaner, and it's easier to debug. On top of this, by extending PieceReactor with the PeerEvent module, the important variables stay exposed and readily accessible to the proc objects. This can significantly reduce the headaches caused by trying to glue things together just to make it work. I hope to delve deeper into this topic later on!