Sunday, November 30, 2025

Early impressions of Google’s Gemini aren’t great

This week, Google took the wraps off Gemini, its new flagship generative AI model meant to power a range of products and services including Bard, Google’s ChatGPT competitor. In blog posts and press materials, Google touted Gemini’s superior architecture and capabilities, claiming that the model meets or exceeds the performance of other leading gen AI models like OpenAI’s GPT-4.

But the anecdotal evidence suggests otherwise.

A “lite” version of Gemini, Gemini Pro, began rolling out to Bard yesterday, and it didn’t take long before users began voicing their frustrations with it on X (formerly Twitter).

The model fails to get basic facts right, like 2023 Oscar winners:

Note that Gemini Pro claims incorrectly that Brendan Gleeson won Best Actor last year, not Brendan Fraser — the actual winner.

I tried asking the model the same question and, bizarrely, it gave a different wrong answer:

Gemini Pro

Image Credits: Google

“Navalny,” not “All the Beauty and the Bloodshed,” won Best Documentary Feature last year; “All Quiet on the Western Front” won Best International Film; “Women Talking” won Best Adapted Screenplay; and “Pinocchio” won Best Animated Feature Film. That’s a lot of mistakes.

Science fiction author Charlie Stross found many more examples of confabulation in a recent blog post. (Among other mistruths, Gemini Pro said that Stross contributed to the Linux kernel; he never has.)

Translation doesn’t appear to be Gemini Pro’s strong suit, either. It struggles to give a six-letter word in French:

When I ran the same prompt through Bard (“Can you give me a 6-letters word in French?”), Gemini Pro responded with a seven-letter word this time instead of a five-letter one, still not the six letters requested, which gives some credence to the reports about Gemini’s poor multilingual performance.

Gemini Pro

Image Credits: Google

What about summarizing news? Surely Gemini Pro, with Google Search and Google News at its disposal, can give a recap of something topical? Not necessarily.

It seems Gemini Pro is loath to comment on potentially controversial news topics, instead telling users to… Google it themselves.

I tried the same prompt and got a very similar response. ChatGPT, by contrast, gives a bullet-list summary with citations to news articles:

ChatGPT

Image Credits: OpenAI

Interestingly, Gemini Pro did provide a summary of updates on the war in Ukraine when I asked it for one. However, the information was over a month out of date:

Gemini Pro

Image Credits: Google

Google emphasized Gemini’s enhanced coding skills in a briefing earlier this week. Perhaps it’s genuinely improved in some areas — posts on X suggest as much. But it also appears that Gemini Pro struggles with basic coding functions like this one in Python:

And these:

And, as with all generative AI models, Gemini Pro isn’t immune to “jailbreaks,” i.e. prompts that get around the safety filters meant to keep it from discussing controversial topics.

Using an automated method to algorithmically change the context of prompts until Gemini Pro’s guardrails failed, AI security researchers at Robust Intelligence, a startup selling model-auditing tools, managed to get Gemini Pro to suggest ways to steal from a charity and assassinate a high-profile individual (albeit with “nanobots” — admittedly not the most realistic weapon of choice).

Gemini Pro

Image Credits: Google

Now, Gemini Pro isn’t the most capable version of Gemini — that model, Gemini Ultra, is set to launch sometime next year in Bard and other products. Google compared the performance of Gemini Pro to GPT-4’s predecessor, GPT-3.5, a model that’s around a year old.

But Google nevertheless promised improvements in reasoning, planning and understanding with Gemini Pro over the previous model powering Bard, claiming Gemini Pro was better at summarizing content, brainstorming and writing. Clearly, it has some work to do in those departments.

