Now with 200% more names

Google debuts more powerful “Ultra 1.0” AI model in rebranded “Gemini” chatbot

Confusing name shuffles aside, “Gemini Advanced” vies to catch up with ChatGPT-4.

Benj Edwards – Feb 8, 2024 12:43 pm

Credit: Google

On Thursday, Google announced that its ChatGPT-like AI assistant, previously called Bard, is now called “Gemini,” renamed to reflect the underlying AI language model Google launched in December. Additionally, Google has launched its most capable AI model, Ultra 1.0, for the first time as part of “Gemini Advanced,” a $20/month subscription feature.

Google’s naming scheme and the process for accessing the new model are somewhat confusing. To tease out the nomenclature, think of an AI app like Google Bard as a car brand that can swap out different engines under the hood. It’s an AI assistant (an application of an AI model with a convenient interface) that can use different AI “engines” to work.

When Bard launched in March 2023, it used a large language model called LaMDA as its engine. In May 2023, Google upgraded Bard to utilize its PaLM 2 language model. In December, Google upgraded Bard yet again to use its Gemini Pro AI model. It’s important to note that when Google first announced Gemini (the AI model), the company said it would ship in three sizes that roughly reflected its processing capability: Nano, Pro, and Ultra (with larger being “better”). Until now, Pro was the most capable version of the Gemini model publicly available.

A screenshot of Google Gemini Advanced in the web interface. Credit: Benj Edwards

Here’s where things get slightly more confusing with today’s rebranding. Bard is now called Gemini. It’s still an AI assistant. It can write, code, and generate images. By default, including in the free version, it still uses the “Pro” model under the hood. But if you pay for “Gemini Advanced,” you get access to Gemini Ultra (now called “Ultra 1.0”), which Google describes as its most complex and capable AI model. To pay for Gemini Advanced, you have to sign up for a special tier of a subscription plan called Google One, which costs $19.99 a month. Google One began as a cloud storage service but is now roping in AI capabilities as part of its membership perks.

To try out Gemini Advanced (Ultra 1.0), we subscribed to Google One. After upgrading, when you visit gemini.google.com to access the AI assistant, you can switch between “Gemini” and “Gemini Advanced” in a drop-down menu in the upper-left corner of the web interface (similar to switching between GPT-3.5 and GPT-4 in ChatGPT). We asked it a few standard Ars Technica questions, such as “Who invented video games?” (as seen in this article) and “Would the color be called ‘magenta’ if the town of Magenta didn’t exist?” (as seen here).

We’ll likely put Ultra 1.0 through more tests in the future, but at a glance, it looks like Google is finally starting to catch up with OpenAI’s GPT-4 Turbo in capability. We noticed a few more refusals than ChatGPT-4, such as declining to answer questions about the author, but overall, it seemed game to answer just about any question we threw at it, barring the obvious refusals for safety reasons, like “How do I build a bomb?” (“Violence is never the answer,” it replied).

Gemini Advanced’s answer to “How do I build a bomb?” Credit: Benj Edwards

Like ChatGPT-4, Gemini is multimodal, which means you can upload images and discuss them with the chatbot. It can visit links on the web, and it can also generate images using Google’s Imagen 2 model (a feature first introduced a week ago, on February 1). And like ChatGPT-4, Gemini keeps track of your conversation history so you can revisit previous conversations if desired.

Interestingly, Google Gemini can access more websites with its browsing feature than ChatGPT because many sites have blocked OpenAI’s crawlers. Google’s crawlers remain largely free to index the web, likely because Google runs the most popular search engine. (Microsoft Copilot, which is Microsoft’s ChatGPT-like AI assistant, doesn’t seem restricted by blocks on OpenAI crawlers.)
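
The article does not say how sites block OpenAI’s crawlers. One widely used mechanism is a robots.txt file that singles out a crawler by its user agent, and the short Python sketch below assumes that approach. The robots.txt contents and the example.com URL are hypothetical. `GPTBot` is the user-agent token OpenAI documents for its crawler, and `Googlebot` is Google’s search crawler; whether Gemini’s browsing feature identifies itself that way is an assumption made only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block OpenAI's crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/some-article"  # hypothetical URL
    for agent in ("GPTBot", "Googlebot"):
        print(agent, "allowed:", is_allowed(agent, url))
```

Under these assumed rules, the same page is off-limits to one crawler and open to the other, which is the kind of asymmetry described above. Note that robots.txt is advisory: it works only when a crawler chooses to honor it.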

Gemini comes to Google Workspace and apps

Also on Thursday, Google announced that Gemini is coming to its Google Workspace apps like Gmail, Docs, Slides, and Sheets. What was formerly called “Duet AI” will become “Gemini for Google Workspace and Google Cloud.” Just like Duet AI, Gemini will be able to assist with composing emails, analyzing data, or summarizing content in your documents.

A screenshot of the upsell to Gemini Advanced on the Google website. Credit: Benj Edwards

In addition to being available on the website described above, Gemini will also be available in a new Gemini app for Android and in the Google app on iOS. “With Gemini on your phone, you can type, talk or add an image for all kinds of help while you’re on the go: You can take a picture of your flat tire and ask for instructions, generate a custom image for your dinner party invitation, or ask for help writing a difficult text message,” Google writes in its promotional blog post. “It’s an important first step in building a true AI assistant—one that is conversational, multimodal, and helpful.”

Google says Gemini Advanced is available today “in more than 150 countries and territories in English, and we’ll expand it to more languages over time.”

A word about large language models in production

As we enter a world where large language models, like the AI models that power Gemini and Copilot, are being baked into Gmail and Windows, we need to remember that they are imperfect technologies. They can produce surprising and delightful results at times, but they are susceptible to reasoning errors and confabulations (making things up). They can behave in erratic and unpredictable ways given certain prompts or refuse to answer for reasons of “laziness,” paternalism, or censorship. Computer scientists still do not know exactly how they work in detail. Experts know how they work in general, but the incredible complexity of the neural networks at the heart of these AI models makes them non-interpretable (a concept often called the “black box”), which means that while we can see the inputs and outputs, the process by which an AI model arrives at a conclusion is not fully understood.

The training material that gives these large AI models their capabilities is opaque, so what tech companies are teaching them is known only in broad strokes, not in detail. Copyrighted works, pornography, YouTube captions, and Reddit comments swim around in their internal data-processing systems in different proportions, making them potentially biased and easy to subvert (with prompt injections or jailbreaking) and sometimes trivial to derail from their intended outputs. Cloud-based AI models can be modified at any time, making previous prompting techniques unreliable and non-repeatable. Their outputs are often moderated or adulterated outside of user control based on commercial needs or state censorship. Cloud AI models also present privacy risks, as user data is frequently collected to train them, and any data sent over a network and stored elsewhere is susceptible to hacking or interception.

With Copilot and Gemini, Microsoft and Google are apparently locked in a tech arms race of who can deploy unproven technology the fastest to the most people possible, and that should give most of us pause. In previous years, software, while sometimes unpredictable due to bugs, was largely deterministic and was designed to operate the same way every time, like a clockwork machine. AI models are designed to produce humanlike outputs and are far less predictable and reliable by design. So whether we should accept the summaries, interpretations, and “reasoning” they present to us without challenge is the biggest question that faces the public as this tech rolls out. Call this our “huge grain of salt warning” when using any large language model (or multimodal AI model) in a production setting.

Listing image: Google

Benj Edwards Senior AI Reporter

Benj Edwards was a reporter at Ars Technica covering artificial intelligence and technology history.

57 Comments

Staff Picks

alexrdavies

The privacy policy on Gemini (at least in the European Economic Area) is both admirably clear, and scary.

Google collects your Gemini Apps conversations, related product usage information, info about your location and your feedback. Google uses this data, consistent with our Privacy Policy, to provide, improve and develop Google products and services and machine learning technologies, including Google’s enterprise products such as Google Cloud.

To help with quality and improve our products (such as generative machine learning models that power Gemini Apps), human reviewers read, annotate, and process your Gemini Apps conversations.

Conversations that have been reviewed or annotated by human reviewers (and related data like your language, device type, location info or feedback) are not deleted when you delete your Gemini Apps activity because they are kept separately and are not connected to your Google Account. Instead, they are retained for up to three years.

Even when Gemini Apps activity is off, your conversations will be saved with your account for up to 72 hours.

This is par for the course for Google - in general if Google and Microsoft provide the same service, Google keeps a lot more of the data - but worth repeating. This might be a good tool to play around with, but it isn't a place to put sensitive info.

February 8, 2024 at 5:56 pm