Google has taken its next step in artificial intelligence with the launch of Gemini, an AI model trained to behave in human-like ways that’s likely to intensify the debate about the technology’s potential promises and perils.
Since the launch of OpenAI’s ChatGPT roughly a year ago, Google has been racing to produce AI software that rivals what its Microsoft-backed rival has introduced.
Google and its parent company Alphabet have added a portion of the Gemini model to their AI chatbot Bard, and said they planned to release more advanced versions of Gemini through Bard early in 2024.
Alphabet said it had three versions of Gemini, each designed to use a different amount of processing power. The most powerful version is designed to run in data centres, and the smallest is said to run efficiently on mobile devices.
On some phones, Gemini will be able to quickly summarise recordings made on the device and provide automatic replies on messaging services, starting with WhatsApp, according to Alphabet.
The AI will work only in English throughout the world at first, although Google executives assured reporters during a briefing that the technology would eventually expand into other languages without difficulty.
Gemini will also eventually be infused into Google’s dominant search engine, although the timing of that transition hasn’t been spelled out yet.
The technology’s problem-solving skills are being touted by Google as being especially adept in maths and physics, fuelling hopes among AI optimists that it may lead to scientific breakthroughs that improve life for humans.
But an opposing side of the AI debate worries about the technology eventually eclipsing human intelligence, resulting in the loss of millions of jobs and perhaps even more destructive behaviour, such as amplifying misinformation or triggering the deployment of nuclear weapons.
“We’re approaching this work boldly and responsibly,” Google CEO Sundar Pichai wrote in a blog post. “That means being ambitious in our research and pursuing the capabilities that will bring enormous benefits to people and society, while building in safeguards and working collaboratively with governments and experts to address risks as AI becomes more capable.”
Gemini’s arrival is likely to up the ante in an AI competition that has been escalating for the past year between Google, OpenAI and Microsoft, Google’s long-time industry rival.
Backed by Microsoft’s financial muscle and computing power, OpenAI was already deep into developing its most advanced AI model, GPT-4, when it released the free ChatGPT tool late last year. That AI-fuelled chatbot rocketed to global fame, bringing buzz to the commercial promise of generative AI and pressuring Google to push out Bard in response.
Just as Bard was arriving on the scene, OpenAI released GPT-4 in March and has since been building in new capabilities aimed at consumers and business customers, including a feature unveiled in November that enables the chatbot to analyse images. It’s been competing for business against other rival AI startups such as Anthropic and even its partner, Microsoft, which has exclusive rights to OpenAI’s technology in exchange for the billions of dollars that it has poured into the startup.
The alliance so far has been a boon for Microsoft, which has seen its market value climb by more than 50 per cent so far this year, primarily because of investors’ belief that AI will turn into a gold mine for the tech industry. Alphabet has also been riding the same wave with its market value rising about 45 per cent so far this year.
With Gemini coming out, OpenAI may find itself trying to prove its technology remains smarter than Google’s.
In a virtual press conference, Google declined to share Gemini’s parameter count, one measure, though not the only one, of a model’s complexity. A white paper released on Wednesday said the most capable version of Gemini outperformed GPT-4 on multiple-choice exams, primary school maths and other benchmarks, but acknowledged ongoing struggles in getting AI models to achieve higher-level reasoning skills.
Some computer scientists see limits in how much can be done with large language models, which work by repeatedly predicting the next word in a sentence and are prone to fabricating information, errors known as hallucinations.
“We made a ton of progress in what’s called factuality with Gemini. So Gemini is our best model in that regard. But it’s still, I would say, an unsolved research problem,” said Eli Collins, vice-president of product at Google DeepMind.