Gemini Pro Vision API

Multimodal understanding of text and images.

Documentation: https://ai.google.dev/tutorials/prompting_with_media

API entry from apis.yml

name: Gemini Pro Vision API
description: Multimodal understanding of text and images.
baseURL: https://generativelanguage.googleapis.com/v1beta/models/gemini-pro-vision
tags:
- Image Understanding
- Multimodal
- Vision
properties:
- type: Documentation
  url: https://ai.google.dev/tutorials/prompting_with_media
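
To illustrate how the baseURL in this entry might be used, here is a minimal Python sketch that builds a text-plus-image request. It is not taken from the entry itself: it assumes the `:generateContent` method suffix, the `contents`/`parts`/`inline_data` JSON layout, and the `x-goog-api-key` header used by the Generative Language REST API. The function name `build_request` and its parameters are hypothetical, and availability of the `gemini-pro-vision` model should be checked against the linked documentation.

```python
import base64
import json
import urllib.request

# baseURL copied from the apis.yml entry above.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/models/gemini-pro-vision"


def build_request(prompt, image_bytes, api_key, mime_type="image/jpeg"):
    """Build (but do not send) a request carrying one text part and one image part.

    Hypothetical helper: the ":generateContent" suffix, the body layout, and the
    API-key header are assumptions about the REST API, not part of the entry.
    """
    body = {
        "contents": [
            {
                "parts": [
                    {"text": prompt},
                    {
                        "inline_data": {
                            "mime_type": mime_type,
                            # Image bytes are sent inline as base64 text.
                            "data": base64.b64encode(image_bytes).decode("ascii"),
                        }
                    },
                ]
            }
        ]
    }
    return urllib.request.Request(
        url=f"{BASE_URL}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "x-goog-api-key": api_key},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request, which needs a valid API key and network access. Building the request separately from sending it keeps the payload easy to inspect before any call is made.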