Something we think a lot about at SharpestMinds is how to create an incentive structure that aligns our business goals with the goals of our users. This has led us to embrace income share agreements as a central part of our business model. The ethical implications of AI are something I also think a lot about, and I’ve noticed some parallels between the problem of aligning incentives in a business model and the value-alignment problem in AI research.
A major concern about the possibility of artificial super-intelligence is the value-alignment problem. How do we ensure that highly intelligent AI systems have goals and behaviors that align with human values?
The concern is that if an AI’s goals are not aligned with humanity’s values, it will pose an existential threat to our species. To give the classic toy example: if you gave a sufficiently powerful AI the single goal of maximizing paper clips, the end result would be a universe full of paper clips and devoid of all human life.
The objections to worrying about the value-alignment problem seem to come in two flavours:
1. Super-intelligent AI, or AGI (artificial general intelligence), is either not possible or too far away in the future to worry about.
2. Concerns about AGI are based on misunderstandings about technology, culture, and the nature of intelligence.
The arguments from (1) are easily refuted, in my opinion. There’s nothing in our theories of the mind or physics to suggest AGI is impossible, and progress towards it is likely to continue at an exponential rate. To ignore the problem because it’s too far in the future may be human nature, but it’s a mistake (see: climate change).
The arguments from (2) sound like: “Anything with that level of intelligence would never make silly mistakes like turning the universe into paper clips,” or, “It’s ludicrous to assume that an AGI could ever obtain such god-like powers without human help,” or the classic, “We can just pull the plug if it turns evil.”
I lack the expertise for good rebuttals to the above. I’ll leave that to thinkers like Stuart Russell and Max Tegmark. But, while I think the philosophical and technical debates around AGI are worth having, I believe they are a kind of red herring when it comes to the value-alignment problem. In the present, with no hints of AGI around the corner, we are already facing the value-alignment problem with a different type of intelligence: corporations.
Large corporations are, in a sense, machines that pursue goals independent from the goals of any particular employee or customer. Aligning these goals with humanity’s values is already a source of friction. Quite often, maximizing shareholder value (the typical goal of a corporation) does not align with human values.
To give a somewhat tired, but relevant, example, look at Facebook. Mark Zuckerberg did not set out to build a platform to increase political polarization, start an epidemic of depression and anxiety, and incite genocides. It’s safe to assume that nobody who works there wants these things either, but the goal of maximizing shareholder value, coupled with an ad-based revenue model, has led to bad incentives and created a Facebook machine that values our attention over our well-being.
This has led to machine learning algorithms, now a major component of the Facebook machine, being programmed with the seemingly innocuous goal of “surface content that users want to see.” It’s a great example of a failure of value alignment: this goal, executed by algorithms with no understanding of human nature, has unintended consequences that do not align with our values.
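To make the misalignment concrete, here is a toy sketch (all post titles, scores, and field names below are hypothetical, not Facebook’s actual system): a ranker that optimizes the proxy metric “predicted clicks” will happily promote outrage-bait, because nothing in the objective encodes well-being.

```python
# Toy illustration of proxy-objective misalignment in a content ranker.
# All data and field names are invented for the example.

posts = [
    {"title": "Local charity hits fundraising goal", "predicted_clicks": 0.04, "outrage_bait": False},
    {"title": "You won't BELIEVE what this politician said", "predicted_clicks": 0.12, "outrage_bait": True},
    {"title": "New community garden opens downtown", "predicted_clicks": 0.03, "outrage_bait": False},
]

def rank_feed(posts):
    # The objective "surface content users want to see", operationalized
    # as predicted click-through rate. Well-being, accuracy, and
    # polarization appear nowhere in this function.
    return sorted(posts, key=lambda p: p["predicted_clicks"], reverse=True)

feed = rank_feed(posts)
for post in feed:
    print(post["title"])
```

The outrage-bait post lands at the top of the feed: the proxy metric (clicks) diverges from the true goal (content users actually value), and the algorithm faithfully optimizes the former.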
We need to think more carefully about the incentives driving our companies and institutions. Companies tend to act like independent agents, prioritizing the incentive structures of their business models over the greater good. Furthermore, these incentives are now being programmed into constantly improving machine learning algorithms. These algorithms are far from super-intelligent AI, but they can still exert a lot of influence. We should not wait for AGI to be feasible to start thinking about the value-alignment problem. It is already very real.