It’s Time To Change Section 44

Ludlam, Waters, Canavan, Roberts, Joyce – The Section 44 pain train is rolling through an ever-growing list of representatives and senators. It is time for this absurdity to stop. Section 44, specifically subsection (i), reflects an outmoded, irrelevant view of what it means to be an Australian citizen. It is actively harmful to our ability to grow and prosper as a nation.

Any person who –

Is under any acknowledgement of allegiance, obedience, or adherence to a foreign power, or is a subject or a citizen or entitled to the rights or privileges of a subject or citizen of a foreign power

[…]

shall be incapable of being chosen or of sitting as a senator or a member of the House of Representatives.

The members embroiled in the dual-citizenship fiasco hail from across the political spectrum. Regardless of our persuasions, we can surely agree that all these members are patriots, acting for the best interests of the Australian people, even if they disagree about how those interests are best served.

Australia is a diverse nation of immigrants. This is our great strength, the secret sauce that has propelled us to great wealth, peace, and prosperity. We should want our parliament to reflect our diversity, to contain bridges to the world’s peoples. Such bridges, manifesting in dual-citizenships, are tools to allow our parliament to better act in our collective interest.

If senator Canavan’s mother signed him up to be an Italian citizen without his knowledge, if senator Waters immigrated here from Canada as an infant, we should savour and welcome and support their links to these foreign lands. We should welcome senators Canavan, Waters, and other dual-citizens as a strength, a representation in the legislature of our collective, diverse, immigrant selves.

The alternative, the status quo, is to admit a great insecurity about our way of life. We suggest that a dual-citizenship, however tenuous, is corruption enough to sway a member to act against Australia. Forget foreign spies, bribery, infiltration, inducements. No: It is enough that an infant was born in a Canadian hospital; that infant is likely to be a traitor!

We grow stronger by embracing the ties that bind us to our fellow creatures around this Earth. We have grown wealthy, safe, strong, and prosperous through such embraces, while more insular, inward-looking nations struggle with the limitations such insulation imposes.

Trade, treaties, the movement of people, the flow of finance, the exchange of ideas: An Australia more tightly bound to Canada, to Italy, or to any other nation is a stronger Australia. Members of parliament with ties to these and other lands are a source of strength, not weakness. It’s time to amend the constitution, re-write section 44(i), and embrace our own strength.

Bigint vs Numeric Performance in PostgreSQL 9.6


Suppose we have an application that stores two big lists of numbers in a relational database. These numbers represent money – both the number of money units in each transaction, and the price of that money.

In response to a client query, we must multiply the big list of transaction values by the big list of prices, and sum a running total of those multiples. We assume we can’t pre-calculate and store the running values, and that we must supply the client with strings, perhaps because they are communicating with us via a web API. Suppose clients expect responses in real time.

Storing money requires absolute precision. We can’t use floats, for obvious reasons. So we would probably use an arbitrary precision type. If we were using PostgreSQL 9.6, as this article assumes, then we might choose to use a numeric type to store both values and prices.
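
For illustration, the schema might look something like this. The table and column names here are hypothetical; the experiments below generate their data on the fly rather than reading a real table.

CREATE TABLE transactions (
    row_id  bigserial PRIMARY KEY,
    amount  numeric NOT NULL,  -- money units in the transaction, e.g. 12045.42
    price   numeric NOT NULL   -- price of that money, e.g. 4.4352
);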

Numerics give us arbitrary precision, but come with some baggage. Numerics are binary-coded decimals, and CPUs can’t operate on them directly with hardware arithmetic instructions. Whereas two 64-bit integers fit neatly into CPU registers and can be summed with a single instruction, two numerics need to go through a bunch of software shuffling to perform a calculation.

Therefore, if we need to perform lots of calculations quickly, numerics might slow us down when compared to using a 64-bit integer. In PostgreSQL, a 64-bit integer is called a bigint.

Hypothesis: Performing all calculations on integers of type bigint, rather than binary coded decimals of type numeric, will vastly reduce computation times when responding to our client’s queries. Let’s assume the following:

  • A price is stored as a bigint with an implicit precision of four decimal places. For example, a price of £4.4352 is stored as 44352 (NB the absence of the decimal point).
  • An amount is stored as a bigint with an implicit precision of two decimal places. This suits currencies with two decimal places of minor unit precision, such as the US Dollar (USD). For example, an amount of $12,045.42 is stored as 1204542 (a round-trip example follows below).
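
Encoding and decoding is just multiplication or division by the appropriate power of ten. Here’s a quick round trip of the two examples above:

SELECT
    (4.4352 * 10000)::bigint          AS price_encoded,   -- 44352
    (12045.42 * 100)::bigint          AS amount_encoded,  -- 1204542
    (44352 / 10000.0)::numeric(15,4)  AS price_decoded,   -- 4.4352
    (1204542 / 100.0)::numeric(17,2)  AS amount_decoded;  -- 12045.42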

The value of any transaction is therefore the price of the money (44352) times the amount of money units (1204542) divided by the extra orders of magnitude introduced by the implicit decimal places.

In our example, the price has introduced four decimal places, and the amount has introduced two decimal places, for a total of six. Therefore, the calculation is as follows:

44352 * 1204542 / 10^6

This yields 53423.846784, which may be nicely formatted for client consumption as £53,423.85.

So, to get to our true value with explicit decimal places, we need to perform division. In PostgreSQL, dividing two values of type bigint performs integer division and loses precision. For example, select (10::bigint / 3::bigint) yields 3. Because we are dealing with money, that is not acceptable.

Numerics, as arbitrary precision types, can retain precision on division. We will cast to a numeric, and divide by another numeric of appropriate size for our implied decimal places.
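
You can see the difference side by side:

SELECT
    10::bigint  / 3::bigint   AS bigint_division,   -- 3 (remainder discarded)
    10::numeric / 3::numeric  AS numeric_division;  -- 3.3333... (fractional part retained)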

Here’s the example above written in Postgres 9.6 SQL:

SELECT 
    to_char(interim_value::numeric(15,4) / 1000000::numeric(15,4), 'FM999,999,999,999.99') AS str_representation
FROM (
    SELECT
        amount * price AS interim_value
    FROM (
        SELECT
            1204542::bigint AS amount,
            44352::bigint AS price
    ) AS random_data
) AS calculation

Is this really faster than using binary coded decimals? To answer that question, we’ll create a big set of random data that we can operate on.

We will assume a largish organisation that processes 250 transactions per day, every day, for 10 years, or 912,500 transactions, and round it up to 1,000,000 for luck. We will assume these transaction amounts are in USD, with two decimal places of implied precision, and are sized somewhere between $0 and $200. Each amount is processed at a price that converts the USD to Pounds Sterling (GBP). Let’s assume (somewhat absurdly) that each GBP price is somewhere between £0 and £10 with four decimal places of implied precision.

SELECT
    generate_series (1, 1000000) as row_id,
    (random() * 20000)::bigint AS amount,
    (random() * 100000)::bigint AS price

My machine features an i7-5557U CPU, and it takes ~670ms to generate these data, which look like this:

row_id   | amount | price  
---------+--------+--------
       1 |  14179 |  48998
       2 |  16948 |  56369
       3 |  12760 |  16965
       4 |  17177 |    977
       5 |  11632 |  38872
       6 |  18370 |  44416
       7 |  14370 |  89625

Now, let’s perform the necessary calculations. For each of the million rows, we need to multiply the amount by the price, and then sum those multiples into a running balance column.

SELECT 
    value, 
    SUM(value) OVER (ORDER BY row_id) AS balance
FROM (
    SELECT 
        row_id, amount * price AS value
    FROM (
        SELECT 
            generate_series (1,1000000) as row_id,
            (random()*20000)::bigint AS amount, 
        (random()*100000)::bigint AS price
    ) AS random_data
) AS raw_values

This takes ~1,550ms on my i7-5557U. Less the ~670ms to generate the data yields a calculation time of ~880ms. Computers are so cool. That’s just 880 nanoseconds per row! Here’s some sample output:

   value   |    balance     
-----------+----------------
  15987240 |       15987240
   5282332 |       21269572
  32625180 |       53894752
 125860125 |      179754877
   3753301 |      183508178
  79953750 |      263461928
  74762297 |      338224225

These huge integers, with their implicit decimal places, are not very useful. They are certainly not human readable. We need to convert them into decimal numbers with explicit decimal places. To do so, we will cast them to type numeric. Finally, recall that our clients require string output, so we will convert the numeric into a nicely formatted string using the to_char() function.

SELECT
    to_char(value::numeric / 1000000::numeric, 'FM999,999,999,990.90') as str_value,
    to_char(balance::numeric / 1000000::numeric, 'FM999,999,999,990.90') as str_balance
FROM (
    SELECT 
        value, 
        SUM(value) OVER (ORDER BY row_id) AS balance
    FROM (
        SELECT 
            row_id, amount * price AS value
        FROM (
            SELECT
                generate_series (1, 1000000) as row_id,
                (random() * 20000)::bigint AS amount, 
                (random() * 100000)::bigint AS price
        ) AS random_data
    ) AS raw_values
) AS balances

Bam! Approximately 2,700ms. Less generation (~670ms) and calculation (~880ms), that’s ~1,150ms. All that bigint::numeric type casting and numeric / numeric division adds a good chunk of change. Here is some sample output:

 string_value | string_balance 
--------------+----------------
 1,029.03     | 1,029.03
 85.97        | 1,115.00
 291.78       | 1,406.78
 670.64       | 2,077.42
 62.33        | 2,139.75
 142.92       | 2,282.68
 2.41         | 2,285.09
 31.01        | 2,316.10

Alright, let’s try the same thing using numeric the whole way through. Then we will compare the difference. First, we’ll generate the random data. We’ll give the numeric columns enough digits of precision to approach the 19 decimal digits we get from a PostgreSQL bigint.

SELECT
    generate_series (1, 1000000) as row_id,
    (random() * 200)::numeric(17,2) AS amount, 
    (random() * 1)::numeric(15,4) AS price

Generating the random data takes ~2,200ms, and looks as follows:

 row_id  | amount | price  
---------+--------+--------
       1 |  84.33 | 0.8137
       2 | 178.67 | 0.0793
       3 | 115.89 | 0.1935
       4 |   5.61 | 0.9763
       5 |  32.45 | 0.8218
       6 | 154.49 | 0.4451
       7 | 127.43 | 0.3493

Now the basic balance calculation, wherein we multiply amount by price, and then sum a running total of those multiples:

SELECT 
    value, 
    SUM(value) OVER (ORDER BY row_id) AS balance
FROM (
    SELECT 
        row_id, amount * price AS value
    FROM (
        SELECT
            generate_series (1, 1000000) as row_id,
            (random() * 200)::numeric(17,2) AS amount, 
            (random() * 1)::numeric(15,4) AS price
    ) AS random_data
) AS raw_values

That takes ~3,550ms. Less generation time (~2,200ms) yields a calculation time of ~1,350ms. The output looks like this:

   value    |     balance     
------------+-----------------
  32.893335 |       32.893335
   9.424890 |       42.318225
  46.160686 |       88.478911
 134.282533 |      222.761444
  19.844650 |      242.606094
   8.255778 |      250.861872
   2.859764 |      253.721636

Finally, the complete query. This time, we already have a list of values of type numeric, so we don’t need to perform the initial typecasting step we did in the bigint version. Instead, we just use to_char() to create pretty strings.

SELECT
    to_char(value, 'FM999,999,999,990.90') as str_value,
    to_char(balance, 'FM999,999,999,990.90') as str_balance
FROM (
    SELECT 
        value, 
        SUM(value) OVER (ORDER BY row_id) AS balance
    FROM (
        SELECT 
            row_id, amount * price AS value
        FROM (
            SELECT
                generate_series (1, 1000000) as row_id,
                (random() * 200)::numeric(17,2) AS amount, 
                (random() * 1)::numeric(15,4) AS price
        ) AS random_data
    ) AS raw_values
) AS balances

The full query, in numeric form, takes ~4,200ms. Less generation (2,200ms) and calculation (~1,350ms) time, the string processing takes ~650ms.

As expected, the natively numeric query is much faster at generating strings: ~650ms vs ~1,150ms, or 56% of the native bigint query time.

However, the bigint query wins in the calculation stakes: ~880ms vs ~1,350ms, or 65% of the numeric time. That means that, excluding the random data generation time, the two queries run in pretty much equal time: ~2,030ms for the bigint, and ~2,000ms for the numeric.

Of course, we must consider the elephant in the room: Maximum range. Using the bigint technique, we give up a huge chunk of headroom every time we perform a calculation. With an amount carrying two implied decimals and a price carrying four, we lose six digits of range.

A PostgreSQL bigint can store a maximum positive value of 9,223,372,036,854,775,807, or several quintillion units. Less the six digits we use for implied decimals, that falls to roughly nine trillion units.

That puts the bigint calculation in a bind. Say we have a query that requires a further calculation step: Perhaps price is calculated as the product of two numbers with four decimal places each, e.g. 2.4561 * 1.3258, giving a price with eight implied decimals. That is a common case for currencies, and would cause our maximum balance value to fall to around nine hundred million units.
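
As a rough sanity check, those ceilings can be computed directly; the divisors are just the implied decimal places expressed as powers of ten:

SELECT
    9223372036854775807::numeric / 1000000      AS max_balance_6_implied_dp,   -- ~9.2 trillion units
    9223372036854775807::numeric / 10000000000  AS max_balance_10_implied_dp;  -- ~922 million units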

Based on the above analysis, we might reasonably assume that such a calculation-heavy query denominated in bigint might pull away from a numeric version. However, the loss of maximum balance value might preclude its use for clients with big balance numbers.

As is the way of these things, I’ve now got more questions than I did when I started. What I want to know now is:

  • Does changing the numeric precision definition, e.g. numeric(9,4) instead of numeric(15,4), affect PostgreSQL calculation performance?
  • We ignored retrieval speed in this analysis. When the data is coming off the disk, do numeric and bigint cause materially different read times?
  • Does adding an extra calculation step cause the bigint method to pull away from numeric?
  • Could bigint and numeric be substituted depending on client needs, allowing small clients to benefit from speed while big ones benefit from arbitrary precision?

Questions for another day.

Chat to me about this article anytime on Twitter.

Intel Processors in the 2017 Macbook Pro

WWDC just rolled around, and techy news sites are running their usual breathless summary articles. They list the high-level features of the new Macbook Pros, but never seem to give any technical information about the processors beyond superficial and meaningless ‘GHz’ numbers.

This is probably appropriate for most of their readers. Apple knows that its customers don’t usually care about the difference between an i7-5557U and an i7-7660U, so they don’t provide model numbers on their site, either.

It’s possible to figure out what processors are lurking behind the marketing vomit by visiting Intel’s ARK knowledge base. Cross-reference release timing with base and maximum clock frequencies, and voilà: you’ve got yourself technical data upon which to base purchasing decisions.

Here are all the processors Apple has put in their Macbook Pro machines this WWDC:

Macbook Pro 13

 

What should boys make of Wonder Woman?

Boys, let’s be selfish for a moment. You’ve probably read a lot of great commentary about Wonder Woman from a feminist perspective. Let’s talk about it from the male perspective, about what she means for us.

Action movies serve up a plethora of fantastic male heroes: They might be tongue-in-cheek, like Keanu Reeves as John Wick. Or serious but self-aware, like Daniel Craig as James Bond. Or just plain serious, unrelenting, and indefatigable à la Denzel Washington in Man on Fire. These men are superheroes, in one way or another.

They are physically strong, attractive, and capable. They are emotionally tortured by what they’ve seen and done. They are smart. They are fighters. They are who we imagine ourselves to be, should we ever find ourselves in a position where we must save the world from great evil.

What of Wonder Woman? Consider her aggression, her anger, her fire, ferocity, and fury in the face of threats to those she loves. Revel in her will to win.

These are the qualities we associate with our Bonds, Wicks, and Creasys when they are warriors for good and, in their best moments, inspiring. Yet for heterosexual males, bar the odd surprise-semi when Daniel Craig gets out of the ocean, they’re not blood-pumping.

The worst elements of male chauvinism present weak females as the ideal: Men fight, women stay home, keep themselves clean and pretty, and submit to the dominant male upon his return from the battlefield. In their less inspiring moments, our male superheroes act out this ideal with some token female co-star. These females are often physically stunning, and framed by the camera as the most tantalising of treats.

Yet they cannot hold a candle to Wonder Woman. Wonder Woman oozes sexuality. Not because of her physical appearance, as great as those legs may be, but because she fights. She roars. She charges into battle and defeats her foes with unrelenting, unashamed application of physical force. And she quotes Greek literature.

Some men might find the weak, protected female form attractive. Their desires dominate our media landscape. You and I know they’re missing out. Give me a woman that stands with me on the battlefield. One that fights alongside me, bleeds with me, stands tall beside me in the face of danger, screams with me into the jaws of impending doom.

She’s the one you want to kiss passionately after a great victory. She’s the one you trust to be your companion through life, because she has literally saved your life, as you have hers. She’s strong, attractive, capable, emotionally tortured, smart, a fighter – Just like you, because in this story, you are the hero too. You are a team. An ass-kicking, sexy, battle-winning team.

Boys, in our own selfish interests, we should be cheering on Wonder Woman. We should be cheering women-only screenings, we should be whooping in the cinema, and we should be encouraging artists to create more films like it. We should be thrilled to indulge in fantasies where both ourselves and our partners are superheroes.

Of course, we must also stop thinking with our dicks, being selfish, and acknowledge all the other reasons why Wonder Woman is important. Other writers have argued those reasons much better than I ever could. This was just between us men. Wonder Woman is not just a victory for women in film, it’s a victory for all of us as people.

Drone Mapping

Having a bird’s-eye perspective on a problem facilitates better decision making. Quite literally, being able to look down upon a geographic area allows one to craft better plans. Modern mapping services like Google Maps grant views with exceptional clarity.

Google’s photogrammetric mapping is astoundingly good – Here’s the Lower Haight district in San Francisco

Such services are not available on Thornleigh Farm – located, as it is, on remote Lord Howe Island. This is understandable: Internet giants like Google are hardly going to expend their resources producing hyper-detailed maps of a remote island in the Tasman Sea.

A freely available satellite survey of Lord Howe Island provided on Google Maps

The best resolution I’ve been able to get from freely available satellite surveys is generally about 30cm / pixel. This isn’t enough to usefully inform decision making. Enter consumer-grade drone technology. Using a DJI Mavic Pro drone, I’ve been able to produce aerial surveys with a resolution as fine as 1cm / pixel.

An individual image from a DJI Mavic Pro flying over Thornleigh Farm

Images can be stitched together using Adobe Photoshop’s inbuilt Photomerge functionality. I took approximately 200 photos and then used Photomerge to stitch them all together at once. On my laptop, this took only a few hours. The result is so intimately detailed that I don’t want to post it here, out of respect for our privacy on the farm.

Drones like the DJI Mavic Pro are not cheap. But they are significantly cheaper than orbiting satellites, and presumably cheaper than the vehicles Google uses to produce photogrammetric maps. Yet they allow a small business on a remote island to produce fantastic, detailed aerial surveys to inform better decision making. If anyone ever tells you drones are useless toys – Point them to Thornleigh Farm.

Exchange-traded index funds are our friends

Horsies, doggies and casinos are a great analogy for stock markets. There are a lot of testosterone-amped men running around spinning yarns about how they know which horse will win, and why. The reality is, if anyone actually has scientific evidence suggesting strongly which horse will win, they are betting big and keeping quiet.

So it goes with shares. No one knows. Anyone that says they know is benevolently ignorant, or malevolently misleading.

As small scale investors, we are ripe targets. Just like intoxicated punters studying the form, we can be lured with promises of a quick buck, a certain return. There is no such thing – But there is big money in telling us there is, because if we trade lots, we pay lots of fees, and feed lots of payouts. So we are up against it.

Yet we still want a place to park our money for the long term, where it can work for us. This is where exchange-traded index funds are our friends. Instead of betting a single horse will win, they are a bet that some horses will win, some horses will lose, but that everyone will have a pretty good time and come back next year.

Pick a broad enough fund, and sit on it long enough, and you have an extremely high probability of achieving two objectives: Keep your capital safe, and earn a useful return. You won’t get rich, but you won’t get fucked, and there are a great many people out there with a serious interest in you being fucked.