Electric thoughts for my conservative loved ones

Australia’s political debate was recently given a high-voltage shock. The Morrison Liberal-National government was preparing to call an election. Bill Shorten was diligently keeping his mouth shut, knowing that Labor was on autopilot to a likely victory.

Then, in his budget reply speech, Bill mentioned electric cars. Of all the topics he raised, that one seemed to spark the most discourse.

On that day, as I read the speech, I was sitting in an electric car, or “EV”, doing 120km/h on a Florida freeway. Like Bill, the car was quite literally on autopilot, driving itself entirely. My partner Jessica sat in the driver’s seat, monitoring the car’s performance as it watched for humans doing crazy human things.

This particular freeway connects Trump Country and… the rest. Florida is a purple state, with gun-toting, gas-guzzling gated communities a short drive away from  urbane, medium density developments whose residents commute by train.

Out here, I have experienced first-hand the pain of families and friends being torn apart by the hysterical partisanship Trump cultivates. Healing these rifts is tough. Dinner conversations are delicate affairs, where third rails (immigration, healthcare, trade, and more) must be carefully avoided.

The EV debate in Australia has surprised me, because the tone feels like that which I’ve found in Trumpland. Rather than engaging with the science and economics of EVs, the debate appears to be morphing into a cultural one. Internal combustion engines (ICE) represent some sort of honourable status quo. EVs represent a fad, an “other”, at worst a conspiracy designed to screw the unwary out of their excellent ICEs.

At left, the Model 3 giving a graphical representation of its situational awareness. At right, the simultaneous view out the windshield

As the keeper of a V8 BMW, which I adore, and as someone who has lived with EVs on the East & West coasts of the United States, I can assure you that the conspiracy is operating in reverse.

EVs are, pound for pound, faster, safer, less complicated, more comfortable, cheaper to run, able to carry more cargo, and more nimble in their handling than ICE cars. Anyone who tells you otherwise has not spent any significant time in a Tesla Model 3. Even the best German ICE engineering cannot overcome the fundamental design advantages of removing a huge, heavy lump of finely-tuned moving parts from the vehicle.

Regardless of whether they are good for the environment, EVs are simply better cars. Yet even when fed with electricity generated by the dirtiest coal power stations, they are vastly less carbon-intensive per unit of distance travelled.

Their only drawback, their sole weakness, is the price of their lithium-ion batteries. Yet that price is steadily falling, and the point at which an EV becomes cheaper than its vastly more complicated ICE equivalent is inevitable.

Charging infrastructure is ubiquitous. Even a standard 110V plug in the US can comfortably deliver 7km/h of range at 10 amps, more than enough to cover a light ICE duty cycle. Installing a relatively inexpensive three-phase wall charger delivers ~35km/h at home. The Tesla Supercharger network delivers an astonishing 800km/h of range and up, and Superchargers are everywhere, even in Australia. No matter where you charge, the price of electricity per kilometre travelled is far below that of petrol.

Ubiquitous superchargers can deliver 800km/h in range

I am fortunate to be a digital nomad, criss-crossing the world and observing a new city almost every week. Compared to what I see on my travels, the dialogue in Australia around energy is like something out of the Stone Age.  Out here, in the rest of the world, the pace of change is astonishing. There is not a single petrol powered motorcycle left on Beijing’s streets. EVs are now the most popular premium vehicle in the United States and European Union. Car parks are full of chargers, while freeways are peppered with EVs driving themselves.

Meanwhile in Australia, smug News Corp pundits are making idiotic quips about EV shortcomings from… the 90s.

If Bill Shorten wants to shave a few percentage points off my disposable income while he articulates a semi-coherent vision for jolting Australia out of its coal-induced stupor, then yes, I’m ok with that. The rest of the world is moving on, at EV speed. Meanwhile, the Liberal-National coalition is running Facebook scare-campaigns promoting the primacy of the ICE.

We are fortunate in Australia to have two broadly competent major parties, both with a history of “not screwing it up too badly” over the past 40 years. When one of them is running such an appallingly backward-looking campaign, it is time to give the other one a chance.

Time for Australia to plug in

How to install OpenSSL & stunnel on MacOS

When travelling behind the Great Firewall of China,  I wanted a copy of OpenSSL and stunnel on my machine. Googling “install stunnel macos” gives a bunch of answers that involve the word “brew”.

OpenSSL and stunnel are open-source C programs, which means we can compile them from source. Doing so is not difficult, but it is a bit fiddly, and I think that fiddliness can dissuade people and cause them to unnecessarily reach for a bloated package manager.

All the information in this blog post is “as of February 2019”, and applies to stunnel 5.50, OpenSSL 1.1.1, and macOS 10.14.

stunnel depends on OpenSSL, so we will compile and install OpenSSL first.

Compiling & Installing OpenSSL on MacOS

stunnel is going to look for ssl.h, and to make it available we need to compile OpenSSL with the shared flag. We probably don’t want to spray OpenSSL all over the system, so we will use --prefix to specify an install location other than the default /usr/local.

$ ./Configure darwin64-x86_64-cc shared --prefix=/Users/hugh/somedir --openssldir=/Users/hugh/somedir no-ssl3

Now you can run make, followed by make install. Many lines of output later, you will have a copy of OpenSSL installed under the --prefix directory you specified.

Compiling & Installing stunnel on MacOS

OpenSSL at the ready, we can now move on to stunnel.   We will tell stunnel about our newly minted copy of OpenSSL using the ./configure command:

$ ./configure --with-ssl=/Users/hugh/somedir --prefix=/Users/hugh/somedir

If the shared flag was not specified when compiling OpenSSL, then this is where you will hit the much-googled:

checking for TLS directory... not found
configure: error: 
Could not find your TLS library installation dir
Use --with-ssl option to fix this problem

We did use shared, so we are good to go! Hit that make button, followed by make install, and you are the proud owner of a copy of stunnel compiled with OpenSSL!

Asynchronous Object Initialisation in Swift

Baby birds, rockets, freshly roasted coffee beans, and … immutable objects. What do all these things have in common? I love them.

An immutable object is one that cannot change after it is initialised. It has no variable properties. This means that when using it in a program, my pea brain does not have to reason about the state of the object. It either exists, fully ready to complete its assigned duties, or it does not.

Asynchronous programming presents a challenge to immutable objects. If the creation of an object requires network I/O, then execution must carry on after we have decided to create the object, with the object only becoming available some time later.

As an example, let’s consider the Transaction class inside Amatino Swift. Amatino is a double entry accounting API, and Amatino Swift allows macOS & iOS developers to build finance capabilities into their applications.

To allow developers to build rich user-interfaces, it is critical that Transaction operations be smoothly asynchronous. We can’t block interface rendering while the Amatino API responds! To lower the cognitive load imposed by Amatino Swift, Transaction should be immutable.

We’ll use a simplified version of Transaction that only contains two properties: transactionTime and description. Let’s build it out from a simple synchronous case, to a full fledged asynchronous case.

class Transaction {
  let description: String
  let transactionTime: Date 
  
  init(description: String, transactionTime: Date) {
    self.description = description
    self.transactionTime = transactionTime
  }
}

So far, so obvious. We can instantly initialise Transaction. In real life, Transaction is not initialised with piecemeal values, it is initialised from decoded JSON data received from an HTTP request. That JSON might look like this:

{
  "transaction_time": "2008-08",
  "description": "short Lehman Bros. stock"
}

And we can decode that JSON into our Transaction class like so:

/* Part of Transaction definition */
enum JSONObjectKeys: String, CodingKey {
  case txTime = "transaction_time"
  case description = "description"
}

init(from decoder: Decoder) throws {
  let container = try decoder.container(
    keyedBy: JSONObjectKeys.self
  )
  description = try container.decode(
    String.self,
    forKey: .description
  )
  let dateFormatter = DateFormatter()
  dateFormatter.dateFormat = "yyyy-MM" //...
  let rawTime = try container.decode(
    String.self,
    forKey: .txTime
  )
  guard let txTime: Date = dateFormatter.date(
    from: rawTime
  ) else {
    throw DecodingError.dataCorruptedError(
      forKey: .txTime,
      in: container,
      debugDescription: "Unparseable transaction_time"
    )
  }
  transactionTime = txTime
  return
}

Whoah! What just happened! We decoded a JSON object into an immutable Swift object. Nice! That was intense, so let’s take a breather and look at a cute baby bird:

Break time is over! Back to it: Suppose at some point in our application, we want to create an instance of Transaction. Perhaps a user has tapped ‘save’ in an interface. Because the Amatino API is going to (depending on geography) take ~50ms to respond, we need to perform an asynchronous initialisation.

We can do this by giving our Transaction class a static method, like this one:

/* A small error type used when the response contains no data */
enum TransactionError: Error {
  case noDataToDecode
}

static func create(
  description: String,
  transactionTime: Date,
  callback: @escaping (Error?, Transaction?) -> Void
) throws {
  /* dummyHTTP() stands in for whatever HTTP request
     machinery you use to make an HTTP request. */
  dummyHTTP() { (data: Data?, error: Error?) in
    guard error == nil else { 
      callback(error, nil)
      return
    }
    guard let dataToDecode: Data = data else {
      callback(TransactionError.noDataToDecode, nil)
      return
    }
    let transaction: Transaction
    do {
      transaction = try JSONDecoder().decode(
        Transaction.self,
        from: dataToDecode
      )
    } catch {
      callback(error, nil)
      return
    }
    callback(nil, transaction)
    return
  }
}

This new Transaction.create() method follows these steps:

  1. Accepts the parameters of the new transaction, and a function to be called once that transaction is available: the callback, of type (Error?, Transaction?) -> Void. Because something might go wrong, that function might receive an error (Error?), or it might receive a Transaction (Transaction?)
  2. Makes an HTTP request, receiving optional Data and Error in return, which are used in a closure. In this example, dummyHTTP() stands in for whatever machinery you use to make your HTTP requests. For example, check out Apple’s guide to making HTTP requests in Swift
  3. Looks for the presence of an error, or the absence of data and, if they are found, calls back with those errors: callback(error, nil)
  4. Attempts to decode a new instance of Transaction and, if successful, calls back with that transaction:callback(nil, transaction)
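
For illustration, here is a hypothetical call site, perhaps triggered by that ‘save’ tap. This is a sketch only: the description and date are placeholder values, and error handling is reduced to comments.

do {
  try Transaction.create(
    description: "short Lehman Bros. stock",
    transactionTime: Date()
  ) { (error, transaction) in
    guard error == nil, let transaction = transaction else {
      // Surface the error to the user and bail out
      return
    }
    // The Transaction is fully initialised and immutable: update the UI
    print(transaction.description)
  }
} catch {
  // Handle any error thrown before the HTTP request began
}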

The end result? An immutable object. We don’t have to reason about whether or not it is fully initialised, it either exists or it does not. Consider an alternative, wherein the Transaction class tracks internal state:

class Transaction {
  var HTTPRequestInProgress: Bool
  var hadError: Bool? = nil
  var description: String? = nil
  var transactionTime: Date? = nil

  init(
    description: String,
    transactionTime: Date,
    callback: @escaping (Error?, Transaction?) -> Void
  ) {
    HTTPRequestInProgress = true
    dummyHTTP() { (data: Data?, error: Error?) in 
       /* Look for errors, try decoding, set
          `hadError` as appropriate */
       self.HTTPRequestInProgress = false
       callback(nil, self)
       return
    }
  }
}

Now we must reason about all sorts of new possibilities. Are we trying to utilise a Transaction that is not yet ready? Have we guarded against nil when utilising a Transaction that is ostensibly ready?  Down this path lies a jumble of guard statements, if-else clauses, and sad baby birdies.

Don’t make the baby birdies sad, asynchronously initialise immutable objects! 💕

Further Reading

– Hugh

Lessons from releasing a personal project as a commercial product

Aliens. It all begins with aliens. Rewind to San Francisco, and a game developer named Unknown Worlds.  Unknown Worlds is awesome.  We’re chilled out, but we create wonderful products. The games we make bring joy to millions of people around the world. The founders, Charlie and Max, are just the coolest and most inspirational blokes.

Before Unknown Worlds, I was at KPMG. A bean-counter, not a programmer. I couldn’t tell computers what to do. But now, making games, I was surrounded by people who could.

I was so inspired by Brian Cronin, Dushan Leska, Jonas Bötel,  Steve An, and others. They were gods. They would sit in a trance for days, occasionally typing incantations on their keyboards, and eventually show us some amazing new game feature. I was in awe.

Dushan would say to me: ‘Just automate something you do every day. It will be hard, you will have to learn a lot, but it will teach you how to write code‘. So I did.

I hold Dushan (mostly) responsible for this mess

At KPMG I spent a lot of time doing battle with Microsoft Excel.  There is nothing fundamentally wrong with Excel. The problem is that it is an extremely generalised tool, and the work we were doing was not generalised. Too much time was spent copying and pasting data, sanitising data, shuffling data by hand.

When I arrived at Unknown Worlds, I started monitoring our sales. I channeled my inner KPMG and created glorious spreadsheets with pretty graphs. It was an awfully manual process. So, on Dushan’s advice, I started automating it.

The process was agonisingly slow. I would devote time after work, on weekends, at lunches: I had no teacher. Once I got going though, I was hooked. Tasks that used to take us hours at KPMG evaporated in moments in the hands of the machine. I felt like a magician.

With great power comes great responsibility. Soon I was writing code in our games. I thought I was pretty damn clever. Some of the stuff I wrote was super cool, one feature even got me invited to speak at Game Developer’s Conference. But damn, most of it was hot garbage.

Working on Subnautica taught me that mediocre programmers are dangerous to the health of large projects. Also dangerous: Reaper Leviathans.

There is nothing more dangerous on a big software project than a mediocre programmer. We’re like a radioactive prairie dog on heat: Running around contaminating codebases with bugs, indecipherable intent, zero documentation, no testing, and poor security.

Eventually I learned enough to realise I needed to ban myself from our game’s codebases. I was desperate to be better: I wanted to be able to contribute to Unknown Worlds games in a sustainable, positive way. One day I read a recommendation: Create a personal project. A project you can sculpt over a long period of time, learning new skills and best practices as you go.

Channeling Dushan again, I decided to start an accounting software project. Accounting software gives me the shits. As I learned more about code, I realised that most accounting software is shit. And it’s near impossible to integrate the big accounting software packages into other software.

How many software startups can you fit in one frame?

Piece by piece, after hours, over weekends, and at any time a healthier person would take a holiday, I put together a beast I called Amatino. It was always supposed to be something small. A side project that I would use myself. Haha… ha. Oh dear.

Today Amatino is available to anyone. It’s a globally-distributed, high-performance, feature-rich accounting wet dream. You can actually subscribe to Amatino and real money will arrive in my bank account. That’s just fucking outrageous!

Still can’t believe this is a real screenshot

Even better, I’ve achieved my original goal. I feel comfortable digging around in code on Unknown Worlds games, and am no longer a dangerous liability to our code quality. I can finally do some of what I saw Max, Charlie, Dushan, Steve, Jonas and Brian doing all those years ago.

Along the way I picked up a few lessons.

Lesson 1: Do it

Creating your own product is utterly exhilarating and mind expanding. I’m about as artistic as an Ikea bar stool, but I imagine this is how artists feel when they make art. It just feels great.

Lesson 2: Keep your day job

Alright, maybe quit your day job if it doesn’t make you happy. But if you are happy, keep at it. Over the past years I’ve given Unknown Worlds 100% and more. Unknown Worlds makes me super happy. To build Amatino simultaneously, I had to develop discipline: Every night, every weekend, every holiday, code like hell.

Spend enough time around Max (L) and Charlie (R), the founders of Unknown Worlds, and you will be inspired to do cool stuff

There are many benefits. First, you don’t lose contact with your work mates. Charlie, Max, Scott, Brandt, and many others are constant inspirations to me. Second, you don’t have to worry about funding, because you have a job. Third, you are kept grounded.

I think if I didn’t spend all day making games, Amatino would have sent me insane. I would have lacked direction, and woke up not knowing what to do. Instead, I worked on making games, structured my day around Unknown Worlds, and devoted focused, intense energy to Amatino when possible.

Lesson 3: Your partner comes first

No matter how important a milestone is, or how deep in thought you are, or how good you think your ideas are, you drop everything for your partner. You lift up your partner, you encourage your partner, you support your partner. Every day, without fail, without exception.

This was a hard lesson to learn. It is the most important lesson.

Without Jessica, Amatino would not have happened. And it is precisely because she took me away from Amatino that she helped. The ritual of cooking for her, sharing meals with her, going on dates with her, doing household chores with her, listening attentively to her thoughts, concerns, and dreams. All these things take immense time, time you might wish to devote to your project instead.

You must not make that trade. It is a false economy. Your productivity will suffer, your health and emotional wellbeing will suffer. The energy you devote to your partner instead of your project will come back to you tenfold and more.

Don’t bore your partner to death by constantly talking about your project. Most importantly, don’t put off big life decisions because you think the time will be right after your project is released.

Don’t put off the big decisions!

Lesson 4: Eat well, exercise, and don’t get drunk

You all hear this enough elsewhere. You have a day job, a personal project, and perhaps a partner too: You cannot waste time recovering from the ingestion of cognitive impediments.  Any social value you get from being drunk is utterly dwarfed by the opportunity cost of brain-cells not functioning at peak efficiency.

Your mates might give you hell for this. Don’t worry, they will still love you in the long run.

Lesson 5: Ignore the framework brigade

“I’m building a Dockerized cloud Node app with React-native frontend on GCP powered by the blockchain.” Don’t be those people. Learn from first principles. Start with abstract design thought, not a list of software for your ‘stack’. Don’t be afraid to build your own systems.

Reach for third-party dependencies judiciously and only where absolutely necessary. Learn by dabbling in languages where you need to allocate your own memory, while leveraging the speed boost that comes with those in which you don’t. Build computers. Tinker with them.

You will learn a lot from building, breaking, and upgrading your own computers. This one was maybe me taking it a bit too far

Hot tip: If your elevator pitch contains the brand name of a third party dependency, you are violating Lesson 5.

Lesson 6: Be humble

Maybe some people get ahead in life by being arrogant, self-assured dickheads. In fact, I am sure that is true. If you want to build and release a product, you need to check your ego at the door.

Suck in information from everyone and everything around you. Approach the world with unabridged, unhinged curiosity. Even when you don’t agree with someone, give them your undivided attention and listen, don’t talk. Consider their advice most especially if it conflicts with your own assumptions.

Good luck!

Playing doctor with a Linux system

Monitoring Linux system health is a route to peace of mind. When a fleet of machines is serving an application, it is comforting to know that they are each and collectively operating within hardware performance limits.

There are countless libraries, tools, and services available to monitor Linux system health. It is also very easy to acquire system health information directly, allowing construction of a bespoke health monitoring subsystem.

There are five critical metrics of system health:

  1. Available memory
  2. CPU utilisation
  3. Network I/O (data transmission and receipt)
  4. Available disk space
  5. Disk I/O (reads and writes to disk)

Let’s take a look at how we can examine each one. This article is written from the perspective of Ubuntu 16.04, but many of the commands are available across Linux distributions. Some of them only require the Linux kernel itself.

Available Memory

We can get available memory using the free command. On its own, free will give us a bunch of columns describing the state of physical and swap memory.

$ free
       total  used  free  shared  buff/cache  available
Mem:   498208 47676 43408 5568    407124      410968
Swap:       0     0     0

There’s a lot going on here. What we are looking for, in plain English, is ‘how much memory is available to do stuff’. If such a number was low, we would know that the system was in danger of running out of memory.

The number we want is counterintuitively not the one in the column labelled ‘free’. That column tells us how much memory the system is not using for anything at all. Linux uses memory to cache regularly accessed files, and for other purposes that don’t preclude its allocation to a running program.

What we want is column 7, ‘available’. We can get just that number by using grep and awk. We can also use the -m flag to return results in megabytes, rather than kibibytes, making the output more readable.

$ free -m | grep 'Mem:' | awk '{print $7}'
400

That’s much better! A single integer representing how many megabytes of memory are available for the system to do things.

On its own, this is not very useful. You are not going to go around SSH’ing to every box in your fleet, running commands and noting numbers down on a piece of paper. The magic happens when the output is combined with some program that can collate all the data. For example, in Python, we could use the Subprocess module to run the command then store the number:

"""memory.py"""
import subprocess

command = "free -m | grep 'Mem:' | awk '{print $7}'"
memory_available = int(subprocess.getoutput(command))

CPU Utilisation

To monitor Linux system CPU utilisation, we can use the top command. top produces a whole bunch of output measuring the CPU utilisation of every process on the system. To get an overall sense of system health, we can zero in on the third line:

$ top
//
%Cpu(s):  0.3 us,  0.3 sy,  0.0 ni, 99.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
//

Four numbers are of use to us: those labelled us, sy, id, and wa, which indicate the proportion of CPU time allocated to user processes, system processes, idling, and I/O wait respectively.

To acquire these numbers programmatically, we need to adjust top‘s output slightly.  We’ll use a few flags:

  • -b: Run in a non-interactive mode. Top will return to the shell rather than running indefinitely
  • -n: Sample for a specified number of iterations. We will use two iterations, and take numbers from the second.
  • -d: Delay time between iterations. We will supply a non-zero number so that top acquires data over some time

The whole command will be:

$ top -d 3 -b -n 2 | grep "%Cpu"

In Python, we can execute the command and split the output into individual floating point numbers. To do so, we take advantage of the fixed-width position of top’s output.

"""cpu.py"""
import subprocess

command = 'top -d 3 -b -n 2 | grep "%Cpu"'
output = subprocess.getoutput(command)
data = output.split('\n')[1]
cpu_user = float(data[8:13])
cpu_system = float(data[17:22])
cpu_idle = float(data[35:40])
cpu_io_wait = float(data[44:49])

Network I/O

All the hardware in the world won’t save you if your network connection can’t keep up. Monitoring transmit and receive volumes is, fortunately, pretty easy. The kernel provides us with a convenient window onto network activity, in /sys/class/net.

$ ls /sys/class/net
eth0 lo tun0

On this example system, /sys/class/net contains three network interfaces. An ethernet adapter eth0, the local loopback lo, and a vpn tunnel adapter tun0.

How you proceed to gather the information available about these interfaces is going to depend heavily on your situation. The following technique satisfies a couple of assumptions:

  1. We don’t know the number or disposition of network interfaces in advance
  2. We want to gather transmit / receive statistics for all interfaces except the local loopback
  3. We know that the local loopback interface name alone will always start with the character l.

These assumptions might not apply to you. Even if they don’t, you might be able to apply some of the techniques used herein to your situation.

Inside each interface, there is a statistics directory containing a wealth of information.

$ ls /sys/class/net/tun0/statistics
collisions        rx_packets
multicast         tx_aborted_errors
rx_bytes          tx_bytes
rx_compressed     tx_carrier_errors
rx_crc_errors     tx_compressed
rx_dropped        tx_dropped
rx_errors         tx_errors
rx_fifo_errors    tx_fifo_errors
rx_frame_errors   tx_heartbeat_errors
rx_length_errors  tx_packets
rx_missed_errors  tx_window_errors

To get a general overview of network activity, we will zero in on rx_bytes and tx_bytes.


$ cat /sys/class/net/tun0/statistics/rx_bytes
11880392069
$ cat /sys/class/net/tun0/statistics/tx_bytes
128763654271

These integer counters tick upwards from, effectively, system boot. To sample network traffic, you can take readings of the counters at two points in time. The counters will wrap, so if you have a very busy or long-lived system, you should account for potential wrapping.

Here is a Python program that samples current network activity in kilobytes per second.

"""network.py - sample snippet"""
//
root = 'cat /sys/class/net/'
root += interface + '/statistics/'
rx_command = root + 'rx_bytes'
tx_command = root + 'tx_bytes'
start_rx = int(subprocess.getoutput(rx_command))
start_tx = int(subprocess.getoutput(tx_command))
time.sleep(seconds)
end_rx = int(subprocess.getoutput(rx_command))
end_tx = int(subprocess.getoutput(tx_command))
rx_delta = end_rx - start_rx
tx_delta = end_tx - start_tx
if rx_delta < 0:
   rx_delta = 0
if tx_delta < 0:
   tx_delta = 0
rx_kbs = int(rx_delta / seconds / 1000)
tx_kbs = int(tx_delta / seconds / 1000)
//

Note that this program includes a hard coded interface, tun0. To gather all interfaces, you might loop through the output of ls and exclude the loopback interface.  For purposes that will become clearer later on, we will store each interface name as a dictionary key.

"""network.py - interface loop snippet"""
//
output = subprocess.getoutput('ls /sys/class/net')
all_interfaces = output.split('\n')
data = dict()
for interface in all_interfaces:
   if interface[0] == 'l':
      continue
   data[interface] = None
//

On a system with multiple interfaces, it would be misleading to measure the traffic across each interface in sequence. Ideally we would sample each interface at the same time. We can do this by sampling each interface in a separate thread. Here is a Python program that ties everything together and does just that. The above two snippets, “sample” and “interface loop”, should be included where annotated.

"""network.py"""
import subprocess
import time
from multiprocessing.dummy import Pool as ThreadPool

DEFAULT_SAMPLE_SECONDS = 2

def network(seconds: int) -> {str: (int, int)}:
   """
   Return a dictionary, in which each string
   key is the name of a network interface,
   and in which each value is a tuple of two
   integers, the first being sampled transmitted
   kb/s and the second received kb/s, averaged
   over the supplied number of seconds.

   The local loopback interface is excluded.
   """
   # 
   # Include 'interface loop' snippet here
   #
   
   def sample(interface) -> None:
      #
      # Include 'sample' snippet here
      #
      data[interface] = (tx_kbs, rx_kbs)
      return

   pool = ThreadPool(len(data))
   arguments = [key for key in data]
   _ = pool.map(sample, arguments)
   pool.close()
   pool.join()
   return data

if __name__ == '__main__':
   result = network(DEFAULT_SAMPLE_SECONDS)
   output = 'Interface {iface}: {rx} rx kb/s'
   output += ', {tx} tx kb/s'
   for interface in result:
      print(output.format(
         iface=interface,
         rx=result[interface][1],
         tx=result[interface][0]
      ))

Running the whole thing gives us neat network output for all interfaces:

$ python3 network.py
Interface tun0: 10 rx kb/s, 64 tx kb/s
Interface eth0: 54 rx kb/s, 25 tx kb/s

Of course, printing is fairly useless. We can import the module and function elsewhere:

"""someprogram.py"""
from network import network as network_io

TX_DANGER_THRESHOLD = 5000  # kb/s
sample = network_io(2)
for interface in sample:
   tx = sample[interface][0]
   if tx > TX_DANGER_THRESHOLD:
      pass  # Raise alarm
# Do other stuff with sample

Disk Space

After all that hullaballoo with network I/O, disk space monitoring is trivial. The df command gives us information about free disk usage:

$ df
Filesystem     1K-blocks    Used Available Use% Mounted on
udev              239812       0    239812   0% /dev
tmpfs              49824    5540     44284  12% /run
/dev/xvda1       8117828 3438396   4244156  45% /
tmpfs             249104       0    249104   0% /dev/shm
tmpfs               5120       0      5120   0% /run/lock
tmpfs             249104       0    249104   0% /sys/fs/cgroup
tmpfs              49824       0     49824   0% /run/user/1000

This is a bit of a mess. We want column four, ‘available’, for the partition you wish to monitor, which in this case is /dev/xvda1. The picture will get much messier if you have more than one partition on the system. In the case of a system with one partition, you will likely find it mounted at /dev/somediskname1. Common disk names include:

  • sd: SATA and virtualised SATA disks
  • xvd: Xen virtual disks. You will see this if you are on EC2 or other Xen based hypervisors
  • hd: IDE and virtualised IDE disks

The final letter will increment upwards with each successive disk. For example, a machine’s second SATA disk would be sdb. An integer partition number is appended to the disk name. For example, the third partition on a machine’s third Xen virtual disk would be xvdc3.

You will have to think about how best to deal with getting the data out of df. In my case, I know that all machines on my network are Xen guests with a single partition,  so I can safely assume that /dev/xvda1 will be the partition to examine on all of them. A command to get the available megabytes of disk space on those machines is:

$ df -m | grep "^/" | awk '{print $4}'
4145

The grep phrase "^/" will grab every line beginning with "/". On a machine with a single partition, this will give you that partition, whether the disk is sd, xvd, hd, and so on.

Programmatically acquiring the available space is then trivial. For example, in Python:

"""disk.py"""
import subprocess

command = 'df -m | grep "^/" | awk \'{print $4}\''
free = int(subprocess.getoutput(command))

Disk I/O

A system thrashing its disks is a system yielding unhappy users. /proc/diskstats contains data that allow us to monitor disk I/O. Like df, /proc/diskstats output is a messy pile of numbers.

$ cat /proc/diskstats
//
202       1 xvda1 2040617 57 50189642 1701120 3799712 2328944 85759400 1637952 0 1064928 3338520
//

Column 6 is the number of sectors read, and column 10  is the number of sectors written since, effectively, boot. On a long lived or shockingly busy system these numbers could wrap. To measure I/O per second, we can sample these numbers over a period of time.

Like with disk space monitoring, you will need to consider disk names and partition numbers. Because I know this system will only ever have a single xvd disk with a single partition, I can safely hardcode xvda1 as a grep target:

$ cat /proc/diskstats | grep "xvda1" | awk '{print $6, $10}'
50192074 85761968
Once we have the number of sectors read and written, we can multiply the change over a sample period by the sector size to get I/O in bytes per second. To get the sector size, we can use the fdisk command, which requires root privileges:

$ sudo fdisk -l | grep "Sector size" | awk '{print $4}'
512

On a machine with more than one disk, you will need to think about getting sector sizes for each disk.
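
One possible approach, sketched below, is to read each disk’s reported sector size from the kernel’s /sys/block interface rather than calling fdisk, which also avoids the need for root privileges. The loop-device filtering here is an assumption you may wish to adjust.

"""sectors.py"""
import os

def sector_sizes() -> dict:
   """Map each block device name to its reported sector size."""
   sizes = dict()
   for device in os.listdir('/sys/block'):
      if device.startswith('loop'):
         continue  # skip loopback block devices
      path = '/sys/block/' + device + '/queue/hw_sector_size'
      with open(path) as size_file:
         sizes[device] = int(size_file.read().strip())
   return sizes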

Here’s a Python program that ties all that together:

"""diskio.py"""
import subprocess
import time

seconds = 2

command = 'sudo fdisk -l | grep'
command += '"Sector size" | awk \'{print $4}\''
sector_size = int(subprocess.getoutput(command))
command = 'cat /proc/diskstats | grep "xvda1"'
command += ' | awk \'{{print $6, $10}}\''

sample = subprocess.getoutput(command)
start_read = int(sample.split(' ')[0])
start_write int(sample.split(' ')[1])

time.sleep(seconds)

sample = subprocess.getoutput(command)
end_read = int(sample.split(' ')[0])
end_write = int(sample.split(' ')[1])

delta_read = end_read - start_read * sector_size
delta_write = end_write - start_write * sector_size
read_kb_s = int(delta_read / seconds / 1000)
write_kb_s = int(delta_write / seconds / 1000)

A Bespoke Suit

Now that we’ve collected all these data, we can decide what to do with them. I like to gather up all the data into a json package and shoot them off to a telemetry aggregating machine elsewhere on the network. From there it is a hop, skip and a jump to pretty graphs and fun SQL queries.
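
For example, a small collector along the following lines could bundle one round of samples into JSON and POST it off-box. This is a sketch only: the aggregator URL is hypothetical, and it assumes the snippets above have been wrapped into functions that return their respective values.

"""telemetry.py"""
import json
import urllib.request

# Hypothetical endpoint on a telemetry aggregating machine
AGGREGATOR_URL = 'http://telemetry.example.internal/ingest'

def report(memory_mb, cpu_idle, disk_mb, network_sample) -> None:
   """Send one round of samples to the aggregator as JSON."""
   payload = json.dumps({
      'memory_available_mb': memory_mb,
      'cpu_idle_percent': cpu_idle,
      'disk_available_mb': disk_mb,
      'network_kb_s': network_sample
   }).encode('utf-8')
   request = urllib.request.Request(
      AGGREGATOR_URL,
      data=payload,
      headers={'Content-Type': 'application/json'}
   )
   urllib.request.urlopen(request, timeout=5)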

By gathering the data yourself, you have the freedom to store, organise, and present the data as you see fit. Sometimes it is most appropriate to reach for a third-party tool; at other times, a bespoke solution gives unique and powerful insight.

Architecting a WiFi Hotspot on a Remote Island

Internet access on Lord Howe Island is very limited. The island is extremely remote. I am intensely interested in providing affordable, accessible, and reliable internet connections to residents and guests.

The ‘Thornleigh Farm’ Internet (internally code-named ‘Nike’) is a newly launched service that offers public internet access on Lord Howe Island. Here are some of the architectural choices I made in developing the service.

Island Cloud

WiFi hotspot solutions require a server that acts to authenticate, authorise, and account for network access. The current industry trend appears to be toward doing this in the ‘cloud’ – i.e. in remote data-centres.

Such a solution is not suitable for Lord Howe Island, because of satellite latency. Signals travel well over 70,000 kilometres through space between transmitting and receiving stations, yielding a practical minimum latency of around 600ms, often higher. This high latency creates a crappy customer experience during sign-on.

Instead, Nike utilises local servers for network control. Power is very expensive on Lord Howe Island, which led to a choice of low-voltage Intel CPUs for processing. Two dual-core Intel ‘NUC’ machines serve as hypervisors for an array of network control virtual machines.

Intel NUC machines and Ubiquiti switching equipment

Going local means replicating infrastructure we take for granted in the cloud. Nike utilises local DNS (Bind9), database (Postgres), cache (Redis), and web (NGINX) servers. It’s like stepping back in time, and really makes you appreciate Amazon Web Services (AWS)!

DNS Spaghetti

Bringing the Nike application “island-side” meant dealing extensively with the Domain Name System (DNS). Local requests to the application domain, thornleighfarm.com, need to be routed locally or via satellite depending on their purpose.

For example, new clients are served the Nike purchase page from a local server. Clients of the Thornleigh Farm Store, which offers food orders, are served from an AWS server via satellite.

A local Bind9 DNS captures all thornleighfarm.com domain traffic on our network, and punts it to the local Nginx server. Nginx then chooses to proxy the request to local applications, or to the external thornleighfarm.com AWS Route 53 DNS, depending on the request path.

An island-side client receiving content served from island servers

This request spaghetti has some cool effects: Clients requesting thornleighfarm.com/internet receive an information page when off the island, and a purchase page when they are on it.

Client Identification

From the outset, I wanted to avoid requiring user accounts. Older customers in particular react very poorly to needing to create a new user account, set a password, remember it, and so on.

Also, I am a privacy psychopath and I want to collect the absolute bare minimum customer data necessary to provide the service.

Instead, Nike identifies clients by device Media Access Control (MAC) address. This is uniquely possible on the Thornleigh network because all public clients are on the same subnet. The Nike application can get the MAC associated with a particular IP address in real-time by making a request to the network router.

Part of the Nike codebase that identifies clients by MAC

A small custom HTTP API runs on our Ubiquiti Edgemax router; it looks up a given IP address in the router’s tables and returns the associated MAC if available.
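
As an illustrative sketch only (the router address, endpoint path, and response shape below are hypothetical, not the real Edgemax API), a lookup from the application side might look something like this:

"""mac_lookup.py"""
import json
import urllib.request

# Hypothetical address and path for the router's lookup API
ROUTER_API = 'http://192.168.1.1:8080/arp'

def mac_for_ip(ip_address: str) -> str:
   """Ask the router which MAC address is bound to an IP address."""
   request = urllib.request.Request(ROUTER_API + '?ip=' + ip_address)
   with urllib.request.urlopen(request, timeout=2) as response:
      payload = json.loads(response.read().decode('utf-8'))
   return payload['mac']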

Payments

Stripe is an amazing payments provider, full-stop. Their API is fantastically well documented, customer service brilliant, and tools of exceptional quality. They pay out every day, and offer low fees. I cannot recommend them highly enough.

Nike ran into a minor problem with the Stripe Checkout system: it does not work in Android WebViews. Android uses WebViews for captive-portal sign-on in a manner analogous to the Apple Captive Network Assistant, sandboxing the public WiFi DNS-capture page. In Android’s case, the sandboxing is strict enough to kill Checkout.

Stripe Elements inside the MacOS Captive Network Assistant

This problem was easily solved by moving to Stripe Elements, and building a simple custom payments form utilising existing Nike styling.

Layer 1

Deploying physical network infrastructure on Lord Howe Island presents a few challenges. First, power is scarce. Second, regulatory approvals for any sort of island-modifying work are very difficult to obtain.

The property that serves as the nexus for Nike, Thornleigh Farm, is hidden inside a shield of palm forest. It is not possible to broadcast any meaningful signal out of the property, though we do offer the Public Internet network across the farm for the use of farm customers.

Fortunately, the property includes a glorious old boat-shed sitting on Lagoon Beach. Even more fortunately, an old copper conduit runs under the forest between farm and boat-shed. This enabled the installation of an optical fibre. The shed then acts as the southernmost network node.

Ubiquiti NanoBeam AC Gen2 radios provide multiple radio links in the Nike Layer 1 network

A 5GHz link then penetrates a treeline to the north, linking to another island business, with whom we have joined forces, and who serve as the northernmost node.

All in all, a mixture of Cat6 copper, 5GHz point-to-point radios, and optical fibre connects the satellite dishes with our server room and then on to the boat sheds on the beach.

Access Control

The Thornleigh Farm network is mostly built from Ubiquiti Unifi equipment. The WiFi networks, including the Nike ‘Public Internet’ network, are controlled by the proprietary Unifi Controller (UC), running on a local virtual machine.

The UC has a publicly documented API that ostensibly allows fairly fine grained manipulation of client network access. In practice, the documentation is not developer-friendly, and interacting with the UC was the most difficult part of the project outside construction of the physical network.

For a while, I flirted with deploying a fully custom system utilising open-source RADIUS and ChilliSpot software. This path did not bear fruit, and I settled back on bashing through the UC API.

An example of some of the calculations that occur while authorising a client

Nike functions as a Python application that interfaces with the UC whenever it needs to authorise, de-authorise, or check usage by a client. Data usage tracking is handled by custom code and stored in our local Postgres database.

The custom implementation allows us to do some fun stuff, like offer refunds of partial usage, and allow customers to stack multiple data packs on top of each other. Nike continuously updates the UC whenever a client’s remaining quota changes, and then the UC internally handles disconnecting the client when they exceed their quota.

Final Thoughts

Isolation, latency, and high operating costs make Lord Howe Island a difficult environment in which to deploy public internet. The internet is, however, an increasingly crucial element of modern life. Participation in modern economic activity requires a reliable connection, and I hope that in the long term Nike can provide a valuable service to residents and guests of Lord Howe Island.

If you’d like to discuss the project, hit me up on Twitter.

Intel Processors in the 2017 Macbook Pro

WWDC just rolled around, and techy news sites are running their usual breathless summary articles. They list the high-level features of the new Macbook Pros, but never seem to give any technical information about the processors beyond superficial and meaningless ‘GHz’ numbers.

This is probably appropriate for most of their readers. Apple knows that its customers don’t usually care about the difference between an i7-5557U and an i7-7660U, so they don’t provide model numbers on their site, either.

It’s possible to figure out which processors are lurking behind the marketing vomit by visiting Intel’s ARK knowledge base. Cross-reference timings like base and maximum clock frequencies, and voilà: you’ve got yourself technical data upon which to base purchasing decisions.

Here are all the processors Apple has put in their Macbook Pro machines this WWDC:

Macbook Pro 13


Drone Mapping

Having a bird’s-eye perspective on a problem facilitates better decision making. Quite literally, being able to look down upon a geographic area allows one to craft better plans. Modern mapping services like Google Maps grant views with exceptional clarity.

Google’s photogrammetric mapping is astoundingly good – Here’s the Lower Haight district in San Francisco

Such services are not available on Thornleigh Farm – located, as it is, on remote Lord Howe Island. This is understandable: Internet giants like Google are hardly going to expend their resources producing hyper-detailed maps of a remote island in the Tasman Sea.

A freely available satellite survey of Lord Howe Island provided on Google Maps

The best resolution I’ve been able to get from freely available satellite surveys is generally about 30cm / pixel. This isn’t enough to usefully inform decision making. Enter consumer-grade drone technology. Using a DJI Mavic Pro drone, I’ve been able to produce aerial surveys with a resolution as fine as 1cm / pixel.

An individual image from a DJI Mavic Pro flying over Thornleigh Farm

Images can be stitched together using Adobe Photoshop’s inbuilt Photomerge functionality. I took approximately 200 photos and then used Photomerge to stitch them all together at once. On my laptop, this took only a few hours. The result is so intimately detailed that I don’t want to post it here, out of respect for our privacy on the farm.

Drones like the DJI Mavic Pro are not cheap. But they are significantly cheaper than orbiting satellites, and presumably cheaper than the vehicles Google uses to produce photogrammetric maps. Yet they allow a small business on a remote island to produce fantastic, detailed aerial surveys to inform better decision making. If anyone ever tells you drones are useless toys – Point them to Thornleigh Farm.