If you have ever worked for an organisation looking to measure productivity by tracking KPIs or OKR targets, you may have thought to yourself some variation of the following:

When a measure becomes a target, it ceases to be a good measure

This is known as Goodhart’s Law, named after the British economist Charles Goodhart. The reasoning is that the measure in question starts out correlated with true (unobservable) productivity, but once it becomes a target people begin to maximise it directly. Eventually the measure becomes untethered and keeps rising with no increase in true productivity itself.

A fascinating new working paper investigates the text of mandatory regulatory filings of companies. The authors find that the language used has drifted over time, which is not unexpected over the course of many years. But the authors further claim that the cause is that the writers of these filings are deliberately trying to convey a positive sentiment to automated trading systems that read them using natural language processing.

The authors measure how often the documents in question are downloaded by machines compared to humans. Among the companies whose filings were downloaded more often by machines, the filings trended towards language that automated sentiment analysis evaluates as positive. While the drift in language used in company filings is indeed evocative of Goodhart’s Law, there is even more at play.

Finding Alpha

Finance is a perfect testbed for observing the dynamics underlying Goodhart. Classical economics states that any useful and accessible information should be perfectly incorporated into a fair market price: as soon as some knowledge about the demand for oil, or the prospects for a company’s growth, is made available, market participants instantly buy or sell in response, driving the price to where it ‘should’ be in light of that new information. These idealistic assumptions don’t hold perfectly, and the extent to which a strategy does find information that gives an edge is measured by its alpha, a quantity ruthlessly hunted down by traders.
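To make alpha a little more concrete: one common textbook definition (Jensen's alpha, used here purely as an illustration, since nothing above commits to a particular definition) is the intercept left over when a strategy's excess returns are regressed on the market's excess returns. The sketch below is a minimal version under those assumptions. The function name `estimate_alpha_beta`, the per-period `risk_free_rate` parameter and the sample returns are all hypothetical.

```python
from statistics import mean


def estimate_alpha_beta(strategy_returns, market_returns, risk_free_rate=0.0):
    """Ordinary least squares fit of excess strategy returns on excess market returns.

    Returns (alpha, beta): beta is the slope (market exposure) and alpha is the
    intercept (average return not explained by that exposure).
    All inputs are per-period returns; risk_free_rate is an assumed constant.
    """
    if len(strategy_returns) != len(market_returns):
        raise ValueError("return series must have the same length")
    if len(strategy_returns) < 2:
        raise ValueError("need at least two periods to fit a line")

    ys = [r - risk_free_rate for r in strategy_returns]
    xs = [r - risk_free_rate for r in market_returns]
    x_bar, y_bar = mean(xs), mean(ys)

    var_x = sum((x - x_bar) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("market returns are constant; beta is undefined")

    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    beta = cov_xy / var_x
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Made-up data: a strategy built to earn 0.2% per period on top of
# 1.5 times whatever the market does.
market = [0.01, -0.02, 0.03, 0.00]
strategy = [0.002 + 1.5 * m for m in market]

alpha, beta = estimate_alpha_beta(strategy, market)
print(f"alpha={alpha:.4f}, beta={beta:.2f}")
```

On real return series the fit is noisy, and an apparent alpha tends to shrink once other traders act on the same information.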

Thus modern finance is an endless game of cat and mouse to sniff out any information that others might have overlooked that could be incorporated into a trading strategy. Each of these hidden pockets of epistemological treasure also represents a pocket of monetary treasure, but for a finite time only. Once that information is incorporated into others’ trading strategies, the alpha tends to zero.

Not surprisingly, alpha has been eked out of historical weather patterns, of shorter cables between the computers executing trades in stock exchanges, and of quickly scanning news stories. For some time the majority of trading volume has been generated by automated systems driven by algorithms, rather than the quaint human traders shouting across a trading floor.

Press releases, quarterly earnings reports and other news all have the potential to, in effect, immediately reprice assets. It used to be that the fastest analyst to scan the incoming article in their terminal for keywords, and determine whether it contained good news or bad news, could get ahead of the market. This is where a natural language system can be trained to derive the sentiment of the new document and trigger a buy or sell order in response faster than any human trader.
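As a rough illustration of the keyword scanning described above, here is a toy lexicon-based sentiment scorer that maps a document to a buy, sell or hold signal. It is a sketch under loud assumptions: the word lists, the `threshold` value and the function names are invented for this example, and real systems (including whatever the paper's subjects are writing for) are far more sophisticated.

```python
import re

# Hypothetical, deliberately tiny word lists; real sentiment dictionaries
# contain thousands of entries.
POSITIVE_WORDS = {"growth", "strong", "improved", "record", "profit"}
NEGATIVE_WORDS = {"loss", "decline", "impairment", "litigation", "weak"}


def sentiment_score(text):
    """Return (positive hits - negative hits) / total words, or 0.0 for empty text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    positive = sum(token in POSITIVE_WORDS for token in tokens)
    negative = sum(token in NEGATIVE_WORDS for token in tokens)
    return (positive - negative) / len(tokens)


def trading_signal(text, threshold=0.05):
    """Map a document to 'BUY', 'SELL' or 'HOLD' using an assumed score threshold."""
    score = sentiment_score(text)
    if score > threshold:
        return "BUY"
    if score < -threshold:
        return "SELL"
    return "HOLD"


print(trading_signal("Record profit and strong growth this quarter"))
print(trading_signal("Revenue saw a decline"))
print(trading_signal("Revenue saw a change"))
```

The last two calls differ by a single word, yet the signal moves from sell to hold. A writer who knows which words the machine is counting can shift the outcome without changing the underlying facts, which is exactly the opening for Goodhart-style drift.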

When Machines Cease to be Only Tools

Changing the writing style of financial disclosures would not be the first example of human behaviour being changed by tools that humans have created. Examples range from the automated looms which drew the inhabitants of rural England into cities during the Industrial Revolution, to the producers of early online content tweaking their copy to include keywords favoured by search engines.

What is different, and profound, here is that the human writers of financial disclosures have learnt to modify how a cultural product is produced for the benefit of and primary consumption by machines.

As a result, the product that is produced resembles not the normal natural language used to communicate between humans but something else. Natural language is rich and fascinating but it is also ambiguous and often redundant. It could be argued that this is key to its charm and efficacy: it can inspire with imagery and subtlety, but it also allows for diplomacy by obfuscating a more direct but painful message. Computers are neither ambiguous nor redundant. By writing documents in such a way that computers can better understand us, we start to use something other than natural language. Instead these documents are closer to a verbose structured data set for an algorithm to ingest, or a set of programming instructions in a strange new computing paradigm.

The more experience one has with deploying software, the more skeptical one will be of a vision of humanity being taken over by an omnipresent SkyNet-like system. Likewise, more familiarity with AI techniques makes the prospect of an omnipotent artificial intelligence more laughable. Instead, the deployment of modern technology into society is more subtle.

Where humans and machines work together, an interface arises. As automated tools, sensors and devices continue to be embedded deeper into the economic, social and cultural fabric of our lives, this is in a way a story of an ever-metamorphosing interface. That interface might take the form of an open-source programming language running on a Raspberry Pi, swipes on an iPad or commands shouted at an Alexa. But for previous generations that boundary could have been a highly specialised engineer hurriedly replacing a blown vacuum tube in an ENIAC machine. Just how the line moves and bends is constrained not only by the limits of physics on the machine side. On our side, our societal norms, regulation and our willingness to contort also play a role. The relentless march of the market and the allure of modern machines have shown explicitly that we will reprogram ourselves to mould to this invisible boundary.
