Exploring Gemini and Android AI Core

Nishi Ajmera

What is Gemini?

Gemini is a family of multimodal generative AI models developed by Google. A multimodal model can work with more than one type of input, such as text, images, video, and audio, within a single prompt. Depending on the model variant you choose, Gemini models accept text, or text and images, in the prompt and return text responses.

Gemini 1.0 comes in three sizes:

Gemini Ultra - Google describes this as its largest and most capable model, intended for highly complex tasks. It is still under development.

Gemini Pro - Google has introduced two versions of Gemini Pro: gemini-pro and gemini-pro-vision. gemini-pro takes text as input and generates text as output. gemini-pro-vision takes images and text as input and generates text as output.

Gemini Nano - the most efficient model, built for on-device tasks. It runs on capable Android devices (currently the Pixel 8 Pro).
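To make the difference between the two Gemini Pro variants concrete, here is a minimal Python sketch that picks a model name from the kinds of input a prompt contains. The model names and their input types come from the descriptions above; the `MODEL_INPUTS` table and the `pick_model` helper are hypothetical names used only for illustration and are not part of any Google SDK.

```python
# Input types each Gemini Pro variant accepts, as described in this article.
# MODEL_INPUTS and pick_model are illustrative names, not a Google API.
MODEL_INPUTS = {
    "gemini-pro": {"text"},
    "gemini-pro-vision": {"text", "image"},
}


def pick_model(input_types):
    """Return the first variant whose accepted inputs cover the prompt."""
    needed = set(input_types)
    # Try the text-only model first, then the text-and-image model.
    for name in ("gemini-pro", "gemini-pro-vision"):
        if needed <= MODEL_INPUTS[name]:
            return name
    raise ValueError(f"No Gemini Pro variant listed here accepts: {sorted(needed)}")


print(pick_model(["text"]))           # text-only prompt
print(pick_model(["text", "image"]))  # prompt that includes an image
```

A text-only prompt resolves to gemini-pro, and a prompt that includes an image resolves to gemini-pro-vision. Any other input type raises an error, because neither variant is described here as accepting it.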

How does Gemini Nano work?

Running these models on-device has several benefits. Sensitive data can be processed without leaving the device. The model works offline, without an internet connection. It also saves cost, because the model is already installed as part of the operating system. Integrating AI capabilities into the Android platform layer has a lot of potential. Let's look at how it works.

Android AICore provides access to foundation models that run on-device. To use them, you need the Google AI Edge SDK, which provides an API for the Gemini Nano model. You can also fine-tune these models using LoRA (low-rank adaptation). LoRA is an efficient fine-tuning method: instead of updating all of the model's weights, it keeps the original weights frozen and trains a small pair of low-rank matrices whose product is added to them, so only a tiny fraction of the parameters is learned for the task at hand.
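To see why low-rank adaptation is cheap, here is a minimal pure-Python sketch of the general LoRA idea, not of how AICore applies it internally. It assumes a frozen weight matrix W of shape d x k and two small trainable matrices, B (d x r) and A (r x k), with the adapted weight computed as W + (alpha / r) * B·A. The function names and the example sizes are illustrative assumptions.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the adapter rank.

    W stays frozen; only B (d x r) and A (r x k) would be trained.
    """
    r = len(a)            # A has r rows
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d, k, r):
    """Return (adapter parameter count, full matrix parameter count)."""
    return r * (d + k), d * k


# Illustrative sizes: one 4096 x 4096 layer with a rank-8 adapter.
adapter, full = trainable_params(4096, 4096, 8)
print(adapter, full, adapter / full)  # 65536 16777216 0.00390625
```

With these example sizes the adapter holds about 0.4% as many parameters as the full matrix, which is what makes this kind of fine-tuning practical on constrained hardware.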

AICore is currently available on Pixel 8 Pro devices, and a few apps already use Gemini Nano through AICore, such as the Pixel voice recorder and Gboard.

In conclusion, the Gemini family of multimodal generative AI models has a lot of potential. Because the models accept several types of input and generate text responses, they can be used in a wide range of applications. Running Gemini Nano on-device adds offline access, cost savings, and local processing of sensitive data. With AICore already available on the Pixel 8 Pro and a few apps using Gemini Nano through it, it will be interesting to see how this technology develops.

Written by

Nishi Ajmera

I am a full-stack developer passionate about emerging technologies in web development. I have gained expertise in various programming languages, frameworks, and tools. Apart from my professional work, I am a public speaker and have delivered multiple sessions at Google Developer Group talks and JavaScript meetups. When I am not coding or reading, I can be found exploring new places and cuisines. I believe that traveling and trying new things is a great way to keep my mind fresh and inspired.