# Virtual Try-On (VTO)

Diffusion model inference pipeline for garment try-on visualization.
The `vto` service handles the heavy lifting of running diffusion models for garment try-ons. It takes a person image and a garment image and renders the garment onto the person using a configurable inference backend.
## Architecture

```mermaid
graph LR
    CLIENT["Client"] --> GW["API Gateway"]
    GW --> VTO["VTO Service :8004"]
    VTO -- "simulated" --> STUB["Stub Response"]
    VTO -- "inline" --> GPU["Local GPU Worker"]
    VTO -- "remote" --> REMOTE["Remote Inference Cluster"]
```

## Endpoints
| Method | Path | Description |
|---|---|---|
| POST | `/api/v1/households/{id}/vto/jobs` | Create a try-on inference job |
| GET | `/api/v1/households/{id}/vto/jobs/{jobId}` | Poll job state and result images |
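Since job creation returns immediately and inference runs asynchronously, clients typically poll the GET endpoint until the job reaches a terminal state. A minimal polling sketch (the `fetch_job` callable and its signature are assumptions, injected so the loop stays transport-agnostic):

```python
import time

def poll_vto_job(fetch_job, job_id, interval_s=1.0, timeout_s=60.0):
    """Poll a VTO job until it reaches a terminal state.

    `fetch_job` is any callable that GETs
    /api/v1/households/{id}/vto/jobs/{jobId} and returns the parsed
    JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        if job["state"] in ("SUCCEEDED", "FAILED"):
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"VTO job {job_id} did not finish within {timeout_s}s")
```

In practice the interval should back off rather than stay fixed, but a constant delay keeps the sketch readable.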
## Create VTO Job

```json
{
  "memberId": "mem_123xyz",
  "personImageRef": "gs://anyaself/users/person_01.jpg",
  "garmentImageRef": "gs://anyaself/items/jacket_05.jpg",
  "category": "TOP",
  "poseVariant": "front",
  "maskHints": {},
  "sessionId": "ses_888xyz"
}
```

## Job Response
```json
{
  "jobId": "vto_abc123",
  "householdId": "h1",
  "memberId": "mem_123xyz",
  "requestedByUserId": "usr_1",
  "state": "SUCCEEDED",
  "category": "TOP",
  "personImageRef": "gs://...",
  "personImageUrl": "https://storage.googleapis.com/...",
  "garmentImageRef": "gs://...",
  "garmentImageUrl": "https://storage.googleapis.com/...",
  "poseVariant": "front",
  "cacheKey": "sha256_abc...",
  "cacheHit": false,
  "resultImageRefs": ["gs://anyaself/vto/result_01.jpg"],
  "resultImageUrls": ["https://storage.googleapis.com/..."],
  "qualityScore": 0.92,
  "warnings": [],
  "createdAt": "2026-03-08T10:00:00+00:00",
  "updatedAt": "2026-03-08T10:00:05+00:00"
}
```

## Job Lifecycle
See Data Models → VTO Job Lifecycle for the state diagram.
| State | Description |
|---|---|
| `QUEUED` | Job created, waiting for a worker |
| `RUNNING` | Inference in progress |
| `SUCCEEDED` | Result images available |
| `FAILED` | Model error, timeout, or quality below threshold |
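The table above implies a small state machine: a linear happy path with failure possible from either non-terminal state. A sketch of the allowed transitions (the transition set is inferred from the table, not a documented contract):

```python
# Allowed transitions between VTO job states. SUCCEEDED and FAILED are
# terminal, so their transition sets are empty.
VTO_TRANSITIONS = {
    "QUEUED": {"RUNNING", "FAILED"},
    "RUNNING": {"SUCCEEDED", "FAILED"},
    "SUCCEEDED": set(),
    "FAILED": set(),
}

def can_transition(current: str, target: str) -> bool:
    """Return True if a job may legally move from `current` to `target`."""
    return target in VTO_TRANSITIONS.get(current, set())
```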
## Caching
VTO jobs generate a deterministic `cacheKey` from the person image, garment image, category, and pose. If a matching result already exists, the job returns `cacheHit: true` with the existing result instantly.
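The idea can be sketched as a SHA-256 digest over the job's identifying inputs. This is an illustration only: the service's actual field order, separator, and encoding are not documented here, so this function will not reproduce real `cacheKey` values.

```python
import hashlib

def vto_cache_key(person_image_ref: str, garment_image_ref: str,
                  category: str, pose_variant: str) -> str:
    """Derive a deterministic cache key from the four cache inputs.

    Identical inputs always hash to the same key, so a repeated try-on
    request can be served from the existing result.
    """
    payload = "|".join([person_image_ref, garment_image_ref,
                        category, pose_variant])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Any change to one input (for example a different `poseVariant`) yields a different key and forces a fresh inference run.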
## Configuration

| Variable | Default | Description |
|---|---|---|
| `VTO_INFERENCE_BACKEND` | `simulated` | `simulated`, `inline`, or `remote` |
| `VTO_ENABLE_INLINE_WORKER` | `true` | Start the inline PyTorch worker thread |
| `VTO_QUALITY_THRESHOLD` | `0.0` | Minimum quality score to accept (`0.0` = accept all) |
| `PERSISTENCE_BACKEND` | `inmemory` | `firestore` or `inmemory` |
> [!NOTE]
> In local development, `VTO_INFERENCE_BACKEND=simulated` returns immediate fake results without a GPU. Set to `inline` for local GPU execution or `remote` for a cluster endpoint.
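A minimal sketch of how a service might validate this setting at startup (`resolve_backend` is a hypothetical helper, not the service's actual wiring; failing fast on an unknown value avoids silently falling back to the stub backend):

```python
import os

VALID_BACKENDS = ("simulated", "inline", "remote")

def resolve_backend(env=None) -> str:
    """Read VTO_INFERENCE_BACKEND from the environment mapping,
    defaulting to 'simulated', and reject unknown values."""
    env = os.environ if env is None else env
    backend = env.get("VTO_INFERENCE_BACKEND", "simulated")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown VTO_INFERENCE_BACKEND: {backend!r}")
    return backend
```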