Image Processing

Image Upload

Multimodal Models Only: Image processing requires models with vision capabilities. Currently, only Qwen3-VL 30B supports image and video inputs. Other models (DeepSeek, Llama, Qwen Coder) are text-only and cannot process images.

See the Model Catalog for complete model specifications and multimodal capabilities.

How It Works

Image processing works through the chat/completions endpoint using base64-encoded images. Images are sent as data URLs in the message content alongside your text prompt.
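
As a minimal sketch of the data URL format described above, the helper below wraps raw image bytes in a `data:` URL using only the Python standard library. The function name `to_data_url` is hypothetical and not part of any SDK; it only illustrates the string that ends up in the message content.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str) -> str:
    """Return a data URL of the form data:<mime>;base64,<payload>.

    Hypothetical helper for illustration; not part of the Tinfoil SDK.
    """
    payload = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{payload}"


if __name__ == "__main__":
    # Two placeholder bytes stand in for real image data.
    print(to_data_url(b"hi", "image/png"))
```

The resulting string is what goes in the `url` field of an `image_url` content part.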

Converting Images to Base64

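You can convert an image to base64 from the command line:

# Convert image to base64 and write it to a file (macOS syntax)
base64 -i image.jpg -o image_base64.txt

# Equivalent on Linux (GNU coreutils); -w 0 disables line wrapping
base64 -w 0 image.jpg > image_base64.txt

# Or copy the output directly to the clipboard (macOS)
base64 image.jpg | pbcopy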
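
Some `base64` implementations (GNU coreutils, for example) wrap their output at 76 columns by default, so a saved file can contain newlines that do not belong in a data URL. The sketch below, which assumes the base64 text has already been read into a string, strips whitespace and checks that the result decodes cleanly. The name `clean_base64` is hypothetical.

```python
import base64
import binascii


def clean_base64(text: str) -> str:
    """Remove whitespace from base64 text and verify that it is valid.

    Hypothetical helper for illustration only.
    """
    # split() with no arguments drops newlines, spaces, and tabs.
    compact = "".join(text.split())
    try:
        # validate=True rejects characters outside the base64 alphabet.
        base64.b64decode(compact, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"not valid base64: {exc}") from exc
    return compact


if __name__ == "__main__":
    print(clean_base64("aG\nk=\n"))
```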

API Usage

from tinfoil import TinfoilAI
import base64
import mimetypes
from pathlib import Path

# Initialize the client (the vision model is selected per request below)
client = TinfoilAI(
    api_key="<YOUR_API_KEY>",
)

# Read the image file and base64-encode its contents
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
    base64_image = base64.b64encode(image_file.read()).decode('utf-8')

# Determine the MIME type from the file name
mime_type, _ = mimetypes.guess_type(image_path)
if not mime_type or not mime_type.startswith('image/'):
    # Fallback based on file extension
    ext = Path(image_path).suffix.lower()
    mime_type_map = {
        '.jpg': 'image/jpeg',
        '.jpeg': 'image/jpeg',
        '.png': 'image/png',
        '.gif': 'image/gif',
        '.webp': 'image/webp'
    }
    mime_type = mime_type_map.get(ext, 'image/jpeg')

# Create completion with multimodal content
response = client.chat.completions.create(
    model="qwen3-vl-30b",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:{mime_type};base64,{base64_image}"
                }
            }
        ]
    }]
)

print(response.choices[0].message.content)

Best Practices

Image Size: For optimal performance, resize large images before processing (recommended maximum: 4096x4096 pixels).
Base64 Encoding: Ensure the image is correctly base64-encoded and that the data URL includes the correct MIME type.
Multiple Images: You can include multiple images in a single chat completion by adding multiple image_url objects to the content array.
Compression: Consider compressing large images to reduce payload size and improve response times.
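
The multiple-image and payload-size points can be sketched as follows. This is an illustrative example rather than SDK code: `build_image_message` and `encoded_size` are hypothetical names, and the message shape simply mirrors the content array used in the API Usage example. Base64 encodes every 3 input bytes as 4 output characters, so the encoded payload is roughly one third larger than the raw image.

```python
import base64


def build_image_message(prompt: str, images: list[tuple[str, bytes]]) -> dict:
    """Build one user message with a text part and one image_url part per image.

    `images` is a list of (mime_type, raw_bytes) pairs. Hypothetical helper.
    """
    content = [{"type": "text", "text": prompt}]
    for mime_type, raw in images:
        encoded = base64.b64encode(raw).decode("ascii")
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
        })
    return {"role": "user", "content": content}


def encoded_size(raw_len: int) -> int:
    """Length of the padded base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


if __name__ == "__main__":
    # Placeholder bytes stand in for real image data.
    message = build_image_message(
        "Compare these two images.",
        [("image/png", b"hi"), ("image/jpeg", b"abc")],
    )
    print(len(message["content"]))   # one text part plus two image parts
    print(encoded_size(3_000_000))   # base64 characters for a 3 MB image
```

The returned dictionary can be passed as an element of the `messages` list in a chat completion request.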
