Native video analysis using Google Gemini API. Upload and analyze video files — describe scenes, extract text/UI, answer questions about content, transcribe...
Analyze videos natively using Google Gemini's multimodal API. No frame extraction needed — Gemini processes video at 1 FPS with full motion, audio, and visual understanding.
# Analyze a video with default prompt (full description)
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4
# Ask a specific question
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/analyze.py /path/to/video.mp4 "What text is visible on screen?"
# Manage uploaded files
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py list
GOOGLE_AI_API_KEY=$GOOGLE_AI_API_KEY python3 {baseDir}/scripts/manage_files.py cleanup
MP4, AVI, MOV, MKV, WebM, FLV, MPEG, MPG, WMV, 3GP — up to 2GB per file.
| Task | Example Prompt |
|---|---|
| General description | (default — no prompt needed) |
| UI/text extraction | "What text and UI elements are visible?" |
| Tutorial summary | "Summarize the steps shown in this tutorial" |
| Bug report from video | "Describe what went wrong in this screen recording" |
| Meeting notes | "Summarize the key points discussed" |
| Content comparison | Upload 2 videos, ask for differences |
Set GOOGLE_AI_API_KEY in your environment or .env file. Get a free key at aistudio.google.com.
Default model: gemini-2.5-flash (fast, cheap, excellent vision). Override with --model gemini-2.5-pro for complex analysis.
See references/gemini-files-api.md for file upload limits, processing details, and advanced options.
Built by M. Abidi · LinkedIn · YouTube · GitHub · Book a Call
ZIP package — ready to use