Friday, February 09, 2024

One Man’s View On What We Can Do To Control AI

This appeared last week:

How we can control AI

By Eric Schmidt

The Wall Street Journal

12:26PM January 31, 2024

Today’s large language models, the computer programs that form the basis of artificial intelligence, are impressive human achievements.

Behind their remarkable language capabilities and impressive breadth of knowledge lie extensive swathes of data, capital and time.

Many take more than $US100m ($152m) to develop and require months of testing and refinement by humans and machines. They are refined, up to millions of times, by iterative processes that evaluate how close the systems come to the “correct answer” to questions and improve the model with each attempt.

What’s still difficult is to encode human values. That currently requires an extra step known as Reinforcement Learning from Human Feedback, in which programmers use their own responses to train the model to be helpful and accurate. Meanwhile, so-called “red teams” provoke the program in order to uncover any possible harmful outputs. This combination of human adjustments and guardrails is designed to ensure alignment of AI with human values and overall safety. So far, this seems to have worked reasonably well.
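As a rough illustration of that feedback loop (a toy sketch only, not any lab's real pipeline; the example answers, scores and update rule are all invented for illustration), the basic idea looks something like this in Python:

# Toy illustration of the RLHF idea described above: humans rank pairs of
# model answers, a simple "reward" score is fitted to those rankings, and
# the model is then nudged toward higher-scoring answers.
# Everything here is hypothetical.

import random

# Hypothetical candidate answers a base model might produce for one prompt.
candidates = ["helpful, accurate answer", "evasive answer", "harmful answer"]

# Step 1: human feedback, encoded as pairwise preferences (preferred, rejected).
human_preferences = [
    ("helpful, accurate answer", "evasive answer"),
    ("helpful, accurate answer", "harmful answer"),
    ("evasive answer", "harmful answer"),
]

# Step 2: fit a trivial reward score per answer -- raised when the answer is
# preferred, lowered when it is rejected. (Repeated passes stand in for the
# article's "millions" of refinement iterations.)
reward = {c: 0.0 for c in candidates}
for preferred, rejected in human_preferences * 100:
    reward[preferred] += 0.01
    reward[rejected] -= 0.01

# Step 3: the "policy" samples answers in proportion to their reward,
# a stand-in for the gradient updates a real RLHF step would apply.
def sample_answer():
    weights = [max(reward[c], 0.001) for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

print(sorted(reward.items(), key=lambda kv: -kv[1]))
print(sample_answer())

Real systems replace the hand-rolled scores with a learned reward model and gradient updates to billions of parameters, but the shape of the loop is the same.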

But as models become more sophisticated, this approach may prove insufficient. Some models are beginning to exhibit polymathic behaviour: They appear to know more than just what is in their training data and can link concepts across fields, languages, and geographies. At some point they will be able to, for example, suggest recipes for novel cyberattacks or biological attacks — all based on publicly available knowledge.

There’s little consensus around how we can rein in these risks. The press has reported a variety of explanations for the tensions at OpenAI in November, including that behind the then-board’s decision to fire chief executive Sam Altman was a conflict between commercial incentives and the safety concerns core to the board’s non-profit mission.

Potential commercial offerings like the ability to fine-tune the company’s ChatGPT program for different customers and applications could be very profitable, but such customisation could also undermine some of OpenAI’s basic safeguards in ChatGPT.

Tensions like this one around AI risk will only become more prominent as models get smarter and more capable.

We need to adopt new approaches to AI safety that track the complexity and innovation speed of the core models themselves.

While most agree that today’s programs are generally safe for use and distribution, can our current safety tests keep up with the rapid pace of AI advancement? At present, the industry has a good handle on the obvious queries to test for, including personal harms and examples of prejudice. It’s also relatively straightforward to test for whether a model contains dangerous knowledge in its current state.

What’s much harder to test for is what’s known as “capability overhang” — meaning not just the model’s current knowledge, but the derived knowledge it could potentially generate on its own.

Red teams have so far shown some promise in predicting models’ capabilities, but upcoming technologies could break our current approach to safety in AI. For one, “recursive self-improvement” is a feature that allows AI systems to collect data and get feedback on their own and incorporate it to update their own parameters, thus enabling the models to train themselves. This could result in, say, an AI that can build complex system applications (e.g., a simple search engine or a new game) from scratch.
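To make that loop concrete, here is a schematic sketch (hypothetical throughout; no real system reduces its capability to a single number) of a model that sets its own tasks, scores its own attempts, and folds the feedback back into its parameters with no human in the loop:

# Schematic sketch of "recursive self-improvement": the system generates its
# own training signal and updates its own parameters. Purely illustrative.

import random

parameters = {"skill": 0.1}          # stand-in for model weights

def generate_task_and_attempt(params):
    """The model proposes a task for itself and attempts it."""
    difficulty = random.random()
    success = params["skill"] > difficulty
    return difficulty, success

def self_improve(params, rounds=1000):
    for _ in range(rounds):
        difficulty, success = generate_task_and_attempt(params)
        # Automated feedback: successes reinforce the parameters,
        # and near-misses still teach a little.
        if success:
            params["skill"] += 0.001
        elif difficulty - params["skill"] < 0.05:
            params["skill"] += 0.0005
    return params

print(self_improve(parameters))      # capability drifts upward with no human oversight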

But the full scope of the new capabilities that recursive self-improvement could enable is not yet known.

Another example would be “multi-agent systems,” where multiple independent AI systems are able to co-ordinate with each other to build something new.

Having just two AI models from different companies collaborating will be a milestone we’ll need to watch for. This so-called “combinatorial innovation,” where systems are merged to build something new, will be a threat simply because the number of combinations will quickly exceed the capacity of human oversight.
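The arithmetic behind that oversight worry is easy to check. The model counts below are purely illustrative, but they show how quickly the number of possible collaborations outruns any human review process:

# How fast combinations grow: pairwise collaborations among N models grow as
# N*(N-1)/2, and arbitrary groupings grow exponentially. Illustrative counts only.

from math import comb

for n_models in (10, 100, 1000):
    pairs = comb(n_models, 2)                    # two-model collaborations
    triples = comb(n_models, 3)                  # three-model collaborations
    groupings = 2 ** n_models - n_models - 1     # every possible multi-model grouping
    print(n_models, "models:", pairs, "pairs,", triples, "triples,",
          format(groupings, ".3e"), "groupings")

Even at the pairwise level, review effort grows quadratically; allow arbitrary groupings and it grows exponentially.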

Short of pulling the plug on the computers doing this work, it will likely be very difficult to monitor such technologies once these breakthroughs occur. Current regulatory approaches are based on individual model size and training effort, and on passing increasingly rigorous tests, but these techniques will break down as the systems become orders of magnitude more powerful and potentially elusive. AI regulation will need to evolve to identify and govern new emergent capabilities and the scaling of those capabilities.

Europe has so far attempted the most ambitious regulatory regime with its AI Act, imposing transparency requirements and varying degrees of regulation based on models’ risk levels. It even accounts for general-purpose models like ChatGPT, which have a wide range of possible applications and could be used in unpredictable ways. But the AI Act has already fallen behind the frontier of innovation, as open-source AI models — which are largely exempt from the legislation — expand in scope and number.

President Biden’s recent executive order on AI took a broader and more flexible approach, giving direction and guidance to government agencies and outlining regulatory goals, though without the full power of the law that the AI Act has. For example, the order gives the National Institute of Standards and Technology basic responsibility to define safety standards and evaluation protocols for AI systems, but does not require that AI systems in the US “pass the test.” Further, both Biden’s order and Europe’s AI Act lack intrinsic mechanisms to rapidly adapt to an AI landscape that will continue to change quickly and often.

I recently attended a gathering in Palo Alto organised by the Rand Corp. and the Carnegie Endowment for International Peace, where key technical leaders in AI converged on an idea: The best way to solve these problems is to create a new set of testing companies that will be incentivised to out-innovate each other — in short, a robust economy of testing.

…..

Eric Schmidt is the former CEO and executive chairman of Google and cofounder of the philanthropy Schmidt Sciences, which funds science and technology research.

The Wall Street Journal

The full article is here, with suggestions on how to check whether the AIs are getting ahead of themselves and us!:

https://www.theaustralian.com.au/business/the-wall-street-journal/how-we-can-control-ai/news-story/454fae637233dada7d0adf7b6ff99541

Worth a read to get a handle on the issues surrounding all this!

David.
