# Google Gemini API

## Docs

- [Get API key](https://gemini-api.apidog.io/doc-965852.md)
- [Release notes](https://gemini-api.apidog.io/doc-965853.md)
- [Libraries](https://gemini-api.apidog.io/doc-965854.md)
- [Run Gemini on Google Cloud](https://gemini-api.apidog.io/doc-965855.md)
- Model Capabilities [Overview](https://gemini-api.apidog.io/doc-965856.md)
- Model Capabilities [Long context](https://gemini-api.apidog.io/doc-965857.md)
- Model Capabilities [Structured output](https://gemini-api.apidog.io/doc-965858.md)
- Model Capabilities [Document understanding](https://gemini-api.apidog.io/doc-965859.md)
- Model Capabilities [Image understanding](https://gemini-api.apidog.io/doc-965860.md)
- Model Capabilities [Video understanding](https://gemini-api.apidog.io/doc-965861.md)
- Model Capabilities [Audio understanding](https://gemini-api.apidog.io/doc-965862.md)
- Models [All models](https://gemini-api.apidog.io/doc-965863.md)
- Models [Pricing](https://gemini-api.apidog.io/doc-965864.md)
- Models [Rate limits](https://gemini-api.apidog.io/doc-965865.md)
- Models [Billing info](https://gemini-api.apidog.io/doc-965866.md)
- Safety [Safety settings](https://gemini-api.apidog.io/doc-965867.md)
- Safety [Safety guidance](https://gemini-api.apidog.io/doc-965868.md)

## API Docs

- Model Capabilities > Text generation [Text input](https://gemini-api.apidog.io/api-16240700.md): The simplest way to generate text using the Gemini API is to provide the model with a single text-only input (see the first sketch after this list).
- Model Capabilities > Text generation [Image input](https://gemini-api.apidog.io/api-16240701.md): The Gemini API supports multimodal inputs that combine text and media files, for example generating text from combined text and image input (sketch below).
- Model Capabilities > Text generation [Streaming output](https://gemini-api.apidog.io/api-16240702.md): By default, the model returns a response after completing the entire text generation process. You can achieve faster interactions by using streaming to return instances of [`GenerateContentResponse`](https://ai.google.dev/api/generate-content#v1beta.GenerateContentResponse) as they're generated (sketch below).
- Model Capabilities > Text generation [Multi-turn conversations](https://gemini-api.apidog.io/api-16240703.md): The Gemini SDK lets you collect multiple rounds of questions and responses into a chat. The chat format enables users to step incrementally toward answers and to get help with multipart problems. This SDK implementation of chat provides an interface to keep track of conversation history, but behind the scenes it uses the same [`generateContent`](https://ai.google.dev/api/generate-content#method:-models.generatecontent) method to create the response (sketch below).
- Model Capabilities > Text generation [Multi-turn conversations (Streaming)](https://gemini-api.apidog.io/api-16240704.md): You can also use streaming with chat, as shown in the chat sketch below.
- Model Capabilities > Text generation [Configuration parameters](https://gemini-api.apidog.io/api-16240705.md): Every prompt you send to the model includes parameters that control how the model generates responses. You can configure these parameters, or let the model use the default options (sketch below).
- Model Capabilities > Generate images [Generate images using Gemini](https://gemini-api.apidog.io/api-16240706.md): Gemini 2.0 Flash Experimental supports the ability to output text and inline images. This lets you use Gemini to conversationally edit images or generate outputs with interwoven text (for example, generating a blog post with text and images in a single turn). All generated images include a [SynthID watermark](https://ai.google.dev/responsible/docs/safeguards/synthid), and images in Google AI Studio include a visible watermark as well (sketch below).
- Model Capabilities > Generate images [Image editing with Gemini](https://gemini-api.apidog.io/api-16240707.md): To perform image editing, add an image as input. The linked example demonstrates uploading base64-encoded images. For multiple images and larger payloads, check the [image input](https://ai.google.dev/gemini-api/docs/vision#image-input) section.
- Model Capabilities > Generate images [Generate images using Imagen 3](https://gemini-api.apidog.io/api-16240708.md): The Gemini API provides access to [Imagen 3](https://deepmind.google/technologies/imagen-3/), Google's highest quality text-to-image model, featuring a number of new and improved capabilities (sketch below).
- Model Capabilities > Gemini thinking [Use thinking models](https://gemini-api.apidog.io/api-16240709.md): Models with thinking capabilities are available in [Google AI Studio](https://aistudio.google.com/prompts/new_chat?model=gemini-2.5-flash-preview-04-17) and through the Gemini API. Thinking is on by default in both the API and AI Studio because the 2.5 series models can automatically decide when and how much to think based on the prompt. For most use cases, it's beneficial to leave thinking on. But if you want to turn thinking off, set the `thinkingBudget` parameter to 0.
- Model Capabilities > Gemini thinking [Set budget on thinking models](https://gemini-api.apidog.io/api-16240710.md): The `thinkingBudget` parameter gives the model guidance on the number of thinking tokens it can use when generating a response. A larger budget typically allows more detailed thinking, which helps with more complex tasks. `thinkingBudget` must be an integer in the range 0 to 24576; setting it to 0 disables thinking (sketch below).
- Model Capabilities > Function calling [Function Calling with the Gemini API](https://gemini-api.apidog.io/api-16240711.md): Function calling lets you connect models to external tools and APIs. Instead of generating text responses, the model recognizes when to call specific functions and provides the parameters needed to execute real-world actions, acting as a bridge between natural language and real-world actions and data. Function calling has three primary use cases (sketch below).
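
## Code sketches

The sketches below illustrate the capabilities listed above. They are minimal, unofficial examples assuming the Python `google-genai` client library (`pip install google-genai`) and an API key from Google AI Studio; model IDs, file names, and prompts are placeholders to swap for your own.

The simplest call sends a single text-only prompt and reads back the generated text:

```python
from google import genai

# Assumes an API key obtained via the "Get API key" page above.
client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",  # placeholder model ID
    contents="Explain how large language models work in one paragraph.",
)
print(response.text)
```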
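
For multimodal input, `contents` can mix text with media parts. A sketch that sends a local image alongside a question, assuming a JPEG named `photo.jpg`:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

# Read the raw image bytes and wrap them in a Part with the matching MIME type.
with open("photo.jpg", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/jpeg"),
        "Describe what is in this photo.",
    ],
)
print(response.text)
```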
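
Streaming returns partial `GenerateContentResponse` chunks as they are produced instead of waiting for the full response. A sketch of the streaming variant of the same call:

```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")

# Each chunk carries the next slice of generated text.
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash",
    contents="Write a short story about a robot learning to paint.",
):
    print(chunk.text, end="")
```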
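
The chat interface accumulates conversation history for you while still calling `generateContent` underneath. A sketch covering both the blocking and streaming chat calls:

```python
from google import genai

client = genai.Client(api_key="YOUR_API_KEY")
chat = client.chats.create(model="gemini-2.0-flash")

# Each send_message call appends to the tracked history.
print(chat.send_message("I have 2 dogs in my house.").text)
print(chat.send_message("How many paws are in my house?").text)

# Streaming works on the same chat object.
for chunk in chat.send_message_stream("Explain how you counted."):
    print(chunk.text, end="")
```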
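
Generation parameters ride along on each request in a config object; anything left out falls back to the model defaults. A sketch with a few common knobs:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Suggest three names for a coffee shop.",
    config=types.GenerateContentConfig(
        temperature=0.9,        # higher = more varied sampling
        top_p=0.95,             # nucleus sampling cutoff
        max_output_tokens=200,  # hard cap on response length
    ),
)
print(response.text)
```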
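
For text-plus-image output with Gemini 2.0 Flash Experimental, the request asks for both response modalities, and the response parts are then a mix of text and inline image data. A sketch, with the model ID as an assumption; image editing works the same way, with the source image added to `contents` as an input part as in the image-input sketch above:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # assumed ID for Gemini 2.0 Flash Experimental
    contents="Create a picture of a lighthouse at dawn and describe it.",
    config=types.GenerateContentConfig(response_modalities=["TEXT", "IMAGE"]),
)

# The response interleaves text parts and inline image parts.
for part in response.candidates[0].content.parts:
    if part.text is not None:
        print(part.text)
    elif part.inline_data is not None:
        with open("lighthouse.png", "wb") as f:
            f.write(part.inline_data.data)
```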
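
Imagen 3 is exposed through a dedicated image-generation call rather than `generateContent`. A sketch, with the model ID as an assumption:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_images(
    model="imagen-3.0-generate-002",  # assumed Imagen 3 model ID
    prompt="A photorealistic red fox in a snowy forest at dusk",
    config=types.GenerateImagesConfig(number_of_images=1),
)

# Save each returned image to disk.
for i, generated in enumerate(response.generated_images):
    with open(f"fox_{i}.png", "wb") as f:
        f.write(generated.image.image_bytes)
```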
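
The thinking budget is set per request through the generation config: 0 turns thinking off entirely, and values up to 24576 give the model more room to reason. A sketch against the 2.5 Flash preview model named in the entry above:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",
    contents="A bat and a ball cost $1.10 total; the bat costs $1 more "
             "than the ball. What does the ball cost?",
    config=types.GenerateContentConfig(
        # thinking_budget=0 disables thinking; valid range is 0 to 24576.
        thinking_config=types.ThinkingConfig(thinking_budget=1024)
    ),
)
print(response.text)
```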
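
With function calling, you describe callable tools to the model and it decides when to invoke them with structured arguments. The Python client can derive the tool declaration from an ordinary function and run it automatically; `get_weather` here is a hypothetical stub:

```python
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")

def get_weather(city: str) -> str:
    """Return the current weather for a city (stubbed for this sketch)."""
    return f"It is sunny and 22 degrees Celsius in {city}."

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="What's the weather like in Paris right now?",
    # Passing a Python callable enables automatic function calling: the
    # model emits a function call, the SDK executes it, and the final
    # text answer incorporates the result.
    config=types.GenerateContentConfig(tools=[get_weather]),
)
print(response.text)
```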