"There are a lot of machine-learning models currently being deployed ... and there was no sense of accountability"
Inioluwa Deborah Raji

Inioluwa Deborah Raji, one of MIT Technology Review's Innovators Under 35 for 2020, first came into contact with the biases embedded in artificial intelligence models when she interned for a machine learning startup during a summer break from college. Clarifai was using its machine learning models to help clients avoid images that are "not safe for work". It turned out that the model was picking out images of people of colour at a far higher rate than those of white people.

This came down to the data sets used to train the model: the attributes constituting "not safe for work" had been learned from porn images, while the safe examples came from stock photos. As it happens, porn is more diverse - the model had simply learned to associate darker skin with indecent content more strongly than it should have.
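The failure mode described above is measurable before any model is trained: if an attribute (here, skin tone) co-occurs with one label far more often than with another, a classifier will tend to learn that spurious association. A minimal sketch of that kind of data audit, using entirely hypothetical records:

```python
from collections import Counter

def label_rate_by_attribute(dataset):
    """dataset: list of (attribute, label) pairs.
    Returns the rate of 'nsfw' labels within each attribute group."""
    totals, nsfw = Counter(), Counter()
    for attribute, label in dataset:
        totals[attribute] += 1
        if label == "nsfw":
            nsfw[attribute] += 1
    return {a: nsfw[a] / totals[a] for a in totals}

# Hypothetical, purely illustrative training records:
dataset = [
    ("darker_skin", "nsfw"), ("darker_skin", "nsfw"), ("darker_skin", "safe"),
    ("lighter_skin", "nsfw"), ("lighter_skin", "safe"), ("lighter_skin", "safe"),
]
rates = label_rate_by_attribute(dataset)
# A large gap between groups signals the training set will teach the
# model to use skin tone as a proxy for the label.
```

If the "nsfw" rate for one group is double that of another, as in this toy data, the model has every statistical incentive to pick up the correlation.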

This triggered a deep dive into the state of training data sets used by institutions and technology firms, where she discovered equally egregious imbalances.

Raji would go on to produce seminal findings with MIT researcher Joy Buolamwini as part of the latter's master's thesis, Gender Shades. The research conclusively revealed that gender classifiers from Microsoft, IBM and Megvii misclassified dark-skinned women at error rates up to 34.4 percentage points higher than those for light-skinned men. To grasp the significance of this, it's worth remembering that Microsoft and Amazon sell facial recognition technologies to police forces in the US.
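The methodological point behind Gender Shades is that a single headline accuracy figure hides subgroup failures; error rates have to be computed per demographic group and compared. A minimal sketch of that disaggregated evaluation, on hypothetical predictions (the real study used a purpose-built balanced benchmark):

```python
def subgroup_error_rates(records):
    """records: list of (subgroup, true_label, predicted_label) tuples.
    Returns each subgroup's error rate."""
    totals, errors = {}, {}
    for group, truth, pred in records:
        totals[group] = totals.get(group, 0) + 1
        if truth != pred:
            errors[group] = errors.get(group, 0) + 1
    return {g: errors.get(g, 0) / totals[g] for g in totals}

# Hypothetical outputs from a gender classifier:
records = [
    ("lighter_male", "male", "male"),
    ("lighter_male", "male", "male"),
    ("darker_female", "female", "male"),    # misclassification
    ("darker_female", "female", "female"),
]
rates = subgroup_error_rates(records)
# The audit metric is the gap between the worst- and best-served groups,
# not the overall accuracy.
gap = rates["darker_female"] - rates["lighter_male"]
```

In this toy data the overall accuracy is 75%, yet one subgroup sees a 50-point-higher error rate - exactly the kind of disparity an aggregate figure conceals.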

Raji and Buolamwini's findings influenced the US National Institute of Standards and Technology to include a racial bias test in its annual audit of facial recognition technologies, an important victory that went beyond typical soft laws.

Soft laws

In a paper published by Nature in September 2019, Anna Jobin, Marcello Ienca and Effy Vayena scoped the existing corpus of documents containing ethical guidelines or principles for AI. These documents constitute soft-law - or non-legislative policy - efforts by public and private sector institutions to promote the integration of ethics and values into the use of AI; unlike hard law, they are not legally binding. Such efforts have proven somewhat effective in the field of medicine.

Their research discovered 84 such documents after a thorough screening process. One of the most interesting findings of their study is that 88% of the guidelines were released after 2016, which is the clearest indicator of the recency of this trend. Another thing to note is the geographic concentration in economically developed countries, with 21 documents coming from the US, 13 from the UK, and 19 from the EU:

Concentration of guidelines in economically developed countries is in line with where AI is most developed

The most valuable insight, however, is the convergence of the guidelines around certain ethical values and principles. Eleven common ethical values and principles emerged from the analysis once semantic and thematic proximity were taken into account: transparency; justice and fairness; non-maleficence; responsibility; privacy; beneficence; freedom and autonomy; trust; sustainability; dignity; and solidarity.

The degree to which the guidelines converge around Transparency, Justice and Fairness is striking. This analysis focuses on Justice and Fairness (though I recommend Cathy O'Neil's Weapons of Math Destruction for those of you interested in how the 'black box' nature of AI models can perpetuate inequities in society).

The Justice and Fairness principle encompasses the mitigation of unwanted biases and discrimination, and respect for diversity and equality.

In the last month alone, Amazon, IBM and Microsoft have either halted sales of facial recognition technology to the police or abandoned research into the technology altogether, in the knowledge that, in their current state, these models can have deleterious effects on society. Other manifestations of biases and discrimination include: Amazon's infamous use of a machine-learning model for recruitment which was biased against women; Google's algorithm disproportionately flagging black users' tweets as hate speech; Apple's credit card giving higher credit limits to men than women.

Is it any wonder, then, that AI bias becoming a top regulatory concern was one of CB Insights' 2020 Tech Trends predictions? As the graph below illustrates, mentions of AI bias in the news have been rising sharply since 2016, the same period in which most of the existing guidelines were written.

The mentions of bias in AI began to climb after 2016, with a sharp rise at the tail end of 2019

As Jobin, Ienca and Vayena discovered, private companies contributed 19 of the 84 guidelines on ethical principles in AI, a clear sign of the endemic 'self-regulation' mindset prevalent in Big Tech today. Of all of the tech companies that have issued statements on the importance of eliminating bias from AI models, IBM's was perhaps the boldest, naming AI bias control as one of the 2018 innovations that will change our lives within 5 years.

However, as the writers at CB Insights point out, the probability of a shift from self-regulation to legislative regulation of bias in AI and machine learning is high. And so it has proved. In February, the European Commission proposed a set of strict rules for the development and use of artificial intelligence. These include liability and certification checks of the underlying data for "high-risk" AI (I'll let you define that) and obligating the use of European data for European algorithms. The EU's High Level Expert Group on Artificial Intelligence has similarly written about the importance of "trustworthy" AI which safeguards the integrity of public and private sector institutions (by detecting and proving discrimination).

Soft laws becoming hard laws seems inevitable. Companies large and small will have two primary options: managing AI bias in-house using tools or outsourcing this to specialist startups. When considered in the wider context of AI deployment expected over the next decade, the market for both could be huge.

The market for AI ethics

The charts below, from PwC's quarterly MoneyTree report, demonstrate just how much money has gone into AI over the last few years (and this is the US alone).

Approximately $35.8bn was invested into AI in 2019, which seems small compared to estimates of potential impact posited by McKinsey:

In each of these sectors, systemic biases in AI models could be far more disastrous than the biases of individuals. The soft laws will go some way to pulling a market for AI ethics solutions into being, but the spectre of hard laws on the horizon will be the most important inflection point.

To close, let's take a look at some companies that are already innovating around these problems.

AI.Reverie (US) - AI.Reverie's synthetic data provides diverse, scalable images and scenarios that reduce bias and help algorithms generalise more accurately, ensuring that the training data for models is balanced. Raised $5.6m.

Veritone (US) - Deep-learning models require anywhere from 5,000 to 10 million labeled examples to attain human-level accuracy. With facial recognition technologies, Veritone's aiWARE breaks down training sets into smaller models that are representative of the population and interpolates to create similar examples. This has the potential to solve the problem of the data sets used for facial recognition technologies. Raised $65m.

Pymetrics (US) - Pymetrics uses a de-biasing tool called AuditAI which follows the Uniform Guidelines on Employee Selection Procedures to generate a mathematical test that determines whether a model is bias-free. The methodology can be applied to any use case, but Pymetrics focuses on recruitment. Raised $56.6m.
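The best-known statistical test in the Uniform Guidelines is the "four-fifths rule": a selection procedure shows adverse impact if any group's selection rate falls below 80% of the most-selected group's rate. A minimal sketch of the rule itself - this is an illustration, not Pymetrics' actual code:

```python
def adverse_impact_ratios(selected, applicants):
    """selected / applicants: dicts mapping group name -> counts.
    Returns each group's selection rate relative to the best-served group;
    a ratio below 0.8 flags potential adverse impact under the
    four-fifths rule."""
    rates = {g: selected[g] / applicants[g] for g in applicants}
    best = max(rates.values())
    return {g: rates[g] / best for g in rates}

# Hypothetical hiring funnel:
ratios = adverse_impact_ratios(
    selected={"group_a": 50, "group_b": 30},
    applicants={"group_a": 100, "group_b": 100},
)
flagged = [g for g, r in ratios.items() if r < 0.8]
```

Here group_b is selected at 60% of group_a's rate, below the four-fifths threshold, so the procedure would be flagged for further statistical scrutiny.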

Fiddler Labs (US) - Fiddler offers an explainable AI platform that augments explainability techniques in the public domain, including Shapley Values and Integrated Gradients, ensuring black boxes don't remain black boxes. The solution also helps companies assess biases in their models and comply with industry regulations. Raised $13.2m.
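Integrated Gradients, one of the public-domain techniques mentioned above, attributes a prediction to input features by averaging the model's gradients along a straight path from a baseline input to the actual input. A minimal sketch on a toy differentiable model (not Fiddler's implementation):

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a Riemann sum:
    attribution_i = (x_i - baseline_i) * mean gradient along the path."""
    alphas = np.linspace(0.0, 1.0, steps)
    path_grads = np.array(
        [grad_fn(baseline + a * (x - baseline)) for a in alphas]
    )
    return (x - baseline) * path_grads.mean(axis=0)

# Toy model: f(x) = w . x, whose gradient is constant (w), so the
# approximation is exact here.
w = np.array([2.0, -1.0, 0.5])
grad_fn = lambda x: w
x = np.array([1.0, 1.0, 1.0])
baseline = np.zeros(3)

attributions = integrated_gradients(grad_fn, x, baseline)
total = attributions.sum()  # equals f(x) - f(baseline): "completeness"
```

The completeness property - attributions summing to the difference between the model's output at the input and at the baseline - is what makes the technique useful for auditing: every unit of the prediction is accounted for by some feature.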

Monday must-reads

1. An Inconvenient Fact: Private Equity Returns & The Billionaire Factory by Ludovic Phalippou - Ludovic Phalippou has written many papers on the Private Equity industry's lacklustre returns, but this one may be his most acerbic critique of the industry yet. Phalippou finds PE produced little to no outperformance against benchmarks since 2005, which isn't egregious in itself until you factor in the hundreds of billions in performance fees that have swelled the number of PE billionaires from 3 to 22 in the same time period.

2. 2020 | MIT Technology Review Innovators Under 35. The MIT Technology Review team produce an annual list profiling some of the world's most innovative young minds. This year's list features solutions for brain-computer interfaces, racial biases in facial recognition technologies, EV batteries, solar panel efficiency and much more. I encourage you to read the previous lists too to spot some patterns and changes over time.

3. China's unmanned hotels ride 'contactless' wave - fascinating and thought-provoking case study of how Internet of Things devices have made humans entirely redundant in a chain of hotels. The model works, too, with 90% occupancy rates, not to mention the better margins and lower capital expenditure.

4. Mounting pressure on Apple's App Store, as Rakuten joins Basecamp, Spotify and others in calling out the App Store's policy of taking a 30% cut of all payments made within apps downloaded from the App Store. This story really lit Twitter alight when David Heinemeier Hansson, CTO of Basecamp, accused Apple of moving the goalposts on its new HEY email service. These stories come hot on the heels of the European Commission announcing antitrust probes into the App Store and Apple Pay. Smart people sit all along the spectrum on this issue, which always makes for interesting debates.

5. A list of impact investing platforms compiled by Dealroom - A blend of impact investing companies have flowered over the last few years, each attempting to position itself as the unique platform that truly delivers positive social and environmental returns with your money. Digging past each company's rhetoric and into the actual funds offered is always a revealing exercise.

Deals of the week

Onna (EU) is a knowledge integration platform for enterprises that binds together the knowledge collectively held between all workplace apps. If you've ever worked at a large corporation, you've undoubtedly used a smorgasbord of tools through your day - just imagine being able to run a single search across all of the knowledge stored in these apps. You can probably understand why Atomico led the $27m Series B for Onna.

Separately, the employee benefits space is hotting up, with a flurry of deal announcements in the last couple of weeks. Brightside (US), a financial care platform for employees, raised a $35m Series A led by Andreessen Horowitz. Just a few days before that, Origin (US) raised $12m for a financial planning platform for employees, with Felicis Ventures leading the round. In the UK, a healthcare benefits platform called Peppy raised £1.7m in a round led by Outward VC. The signs are that verticalised employee benefits platforms are the trend going forward, as opposed to holistic platforms like Perkbox.

Quotidian Quote

"I have spent my life judging the distance between American reality and the American dream" - Bruce Springsteen