News & Trends  | 14 Feb 2024

Gemini - the next era of AI at Google

What is the new AI capable of?

Ann-Julie Granzer

Gemini marks the beginning of a new era in artificial intelligence at Google. With it, Google signals its two-pronged approach to AI development: on the one hand, integration into everyday devices and services; on the other, advanced AI solutions for complex problems. Gemini itself is a language model, not an app or a front end, which is why Bard serves as the user interface that provides access to the Gemini models.

What is the significance of Gemini in the context of AI development?

Gemini represents a significant advance in AI technology that will affect almost all Google products, according to Google CEO Sundar Pichai and Demis Hassabis, CEO of Google DeepMind. It is considered Google's largest research and development project and is seen as the company's future in AI. What makes it a leap forward is its multimodality: Gemini has been trained to work not only with text, but also with audio, images and video. This sets it apart from earlier models such as Google's LaMDA, which were trained only on text data.

What capabilities does Google Gemini have?

Gemini consists of different models that are tailored to different use cases: Gemini Nano for Android devices, Gemini Pro for Google AI Services and Bard, and Gemini Ultra for data centers and direct enterprise applications.

Gemini Nano is an efficient model for running tasks directly on a device with Android 14, such as the Pixel 8 Pro. On-device AI enables new smartphone functions, such as quick summaries of recordings in the Recorder app or smart replies in chat apps such as WhatsApp.

Gemini Pro is used to process a wide range of tasks, which is why it has already been integrated into Google Bard to improve information processing and planning. Since December 13, 2023, developers and enterprise customers can access the Pro version via the Gemini API or Google Cloud Vertex AI.
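For developers, access via the Gemini API comes down to sending a prompt to a model endpoint. The following is a minimal sketch, not an official client: it assumes the public REST endpoint and the model name `gemini-pro` as they were documented when the API launched, and the helper names `build_request` and `generate` as well as the environment variable `GEMINI_API_KEY` are hypothetical choices made for this example.

```python
import json
import os
import urllib.request

# Assumption: REST endpoint and model name as documented for the Gemini API at launch.
API_URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro:generateContent"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent request for a text prompt."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_URL}?key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, api_key: str) -> str:
    """Send the request and return the text of the first candidate answer."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as response:
        payload = json.load(response)
    return payload["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    # Hypothetical variable name; a request is only sent if a key is present.
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(generate("Summarize what Gemini Pro is in one sentence.", key))
    else:
        print("Set GEMINI_API_KEY to send a real request.")
```

Keeping the request construction separate from the network call makes it possible to inspect the payload without an API key. Teams already on Google Cloud would more likely go through Vertex AI, which uses its own authentication and client libraries.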

The largest and most powerful model, built for highly complex tasks, is Gemini Ultra. According to Google, it is the first model to outperform human experts on the Massive Multitask Language Understanding (MMLU) benchmark. Google attributes this result to Gemini's ability to analyze a task carefully before answering and then generate a specific answer. Gemini Ultra is also reported to outperform the latest version of GPT-4 in disciplines such as mathematics and code. As extensive trust and safety checks are still being carried out with external partners, Gemini Ultra is not yet available to consumers.

All three models are multimodal, i.e. they can interpret and combine information from different types of input. Particularly noteworthy are Gemini's reasoning and conceptual capabilities, which run significantly faster and more efficiently on the Tensor Processing Units v4 and v5e that Google developed in-house. The full capabilities of the Gemini models are not yet available in all products, but Google promises to introduce them in the near future.

How does Gemini address safety?

Gemini was developed with a focus on responsibility and safety. According to Google, it has undergone the most comprehensive safety evaluations of any Google AI model to date, with a particular focus on toxicity and bias. As part of this, Google developed dedicated safety classifiers, for example to flag and filter out violent content. To identify blind spots in its own evaluation approach, Google works with external partners and uses benchmarks such as RealToxicityPrompts during training. This set of prompts covers varying degrees of toxicity and was developed at the Allen Institute for AI.

How does Gemini position itself in the market?

Gemini competes directly with other large language models such as OpenAI's GPT-4, but differentiates itself through its integration with various Google devices and platforms. By providing models of different sizes, Google addresses the everyday needs of consumers as well as the demanding requirements of developers and businesses. Gemini is already available in Bard and through Vertex AI, a developer platform fully managed by Google, with a wider rollout to other services and applications planned.

What does the future hold for Google Gemini?

The outlook for Google Gemini is promising. Google plans to roll out the model globally by integrating it into core products such as Google Search and the Search Generative Experience (SGE), its advertising products and the Chrome browser. Gemini Ultra, which is intended for highly complex tasks, is still in development and will be available to developers in early 2024 after extensive trust and safety testing. Gemini is currently only available in English, but support for other languages is planned, which will further increase the model's global reach and influence.


Ann-Julie Granzer

Ann-Julie Granzer works as a Junior SEO Consultant at diva-e in Munich. There she supports the SEO team by providing website analyses and devising optimisations for customer projects. She also creates the monthly search newsletter, SEO news, takes on organisational tasks and much more. Her favourite thing to do is combine her knowledge from her degree in financial management with SEO.
