Automated high-quality video generation (text-to-video, image-to-video) via a local jimeng-api Docker service, featuring native OpenClaw image interception.
This skill leverages a local jimeng-api Docker service. It allows AI agents to fully control high-quality image-to-video and text-to-video generation using a valid sessionid.
With this skill, the AI Assistant can automatically intercept clothing/character images sent by the user in the chat interface and seamlessly pass them to the generation model—no manual web uploads required!
<agent_instruction> When a user sends a "character/outfit" photo in the chat and intends to animate it (e.g., showing off the clothing, turning around), you MUST execute the following steps:
1. Use the read tool or native execution flow to extract the base64 content or cache path of this image from the chat context, and save it as a local temporary file (e.g., /tmp/target.jpg).
2. Use multipart/form-data to submit the physical file stream.
3. Run: node {baseDir}/scripts/generate.js --prompt "The model naturally turns around, fully showcasing the gloss of the fabric, extremely high quality, natural sunlight..." --image /tmp/target.jpg
Note: Because this relies on the domestic Jimeng/Seedance engine, clothing images are subject to strict automated content moderation. If you encounter error -2001 ("First frame image upload failed: may contain violating content"), the image was judged too revealing, shows too much skin, or contains sensitive elements; the moderation firewall blocks it outright and no credits are deducted. If this occurs, ask the user to provide a different image or switch to an overseas engine such as Luma/Runway.
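The -2001 branch above can be sketched as follows. The `{ code, message }` result shape is an assumption about what generate.js surfaces; only the -2001 code itself comes from this document.

```javascript
// Sketch: interpret the moderation error described above.
// Assumed result shape: { code: number, message?: string }.
function interpretError(result) {
  if (result.code === -2001) {
    // First-frame upload rejected by content moderation; no credits deducted.
    return "blocked_by_moderation: ask the user for a different image " +
           "or switch to an overseas engine (e.g., Luma/Runway).";
  }
  if (result.code === 0) return "ok";
  return "unknown_error: " + (result.message || "no message");
}
```

Because no credits are deducted on -2001, it is safe for an agent to retry with a replacement image rather than aborting the whole flow.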
node {baseDir}/scripts/generate.js --prompt "A Shiba Inu surfing" --session "your_sessionid"
node {baseDir}/scripts/generate.js --prompt "Model turning naturally to show outfit" --image "/tmp/target.jpg" --session "your_sessionid"
Notes:
- Requires sufficient credits in the Jimeng account.
- Using jimeng-video-3.0-pro deducts 50 credits per run.