The Gemini API supports multimodal inputs that combine text and media files. The following example shows how to generate text from combined text and image input.
Request Example
Shell
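# TEMP_JSON holds the path to a file containing the JSON request body.
# --data with a leading @ reads the body from that file (--data-raw would send the literal string).
curl --location -g --request POST 'https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash:generateContent?key=' \
--header 'Content-Type: application/json' \
--data "@$TEMP_JSON"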
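
The request above reads its body from a file, but the example does not show how that file is built. The following is a minimal sketch of one way to produce it, assuming a local JPEG image and a prompt with no characters that need JSON escaping. The helper name `build_request_json` is hypothetical and not part of any Gemini tooling; the body layout (`contents`, `parts`, `inline_data` with `mime_type` and `data`) follows the REST format for inline image data.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes a text + image request body to a temp file
# and prints the path of that file.
build_request_json() {
  local image_path="$1"   # path to a local image, assumed to be a JPEG
  local prompt="$2"       # plain text; assumed to need no JSON escaping
  local out_file image_b64

  out_file="$(mktemp)"

  # Base64-encode the image; tr removes the line wraps that some
  # base64 implementations insert, so the value stays on one line.
  image_b64="$(base64 < "$image_path" | tr -d '\n')"

  # One text part and one inline image part in a single content entry.
  cat > "$out_file" <<EOF
{
  "contents": [{
    "parts": [
      {"text": "$prompt"},
      {"inline_data": {"mime_type": "image/jpeg", "data": "$image_b64"}}
    ]
  }]
}
EOF

  printf '%s\n' "$out_file"
}
```

With this in place, `TEMP_JSON="$(build_request_json photo.jpg "Caption this image.")"` would set the variable used by the curl command, where `photo.jpg` is a placeholder file name. For prompts that contain quotes or backslashes, a JSON-aware tool would be a safer way to build the body than a heredoc.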