# Virtual Try-On (VTO)

Diffusion model inference pipeline for garment try-on visualization.
The `vto` service handles the heavy lifting of running diffusion models for garment try-ons. It takes a person image and a garment image and renders the garment onto the person using a configurable inference backend.
## Architecture

```mermaid
graph LR
    CLIENT["Client"] --> GW["API Gateway"]
    GW --> VTO["VTO Service :8004"]
    VTO -- "simulated" --> STUB["Stub Response"]
    VTO -- "inline" --> GPU["Local GPU Worker"]
    VTO -- "remote" --> REMOTE["Remote Inference Cluster"]
```

## Endpoints
| Method | Path | Description |
|---|---|---|
| POST | `/api/v1/households/{id}/vto/jobs` | Create a try-on inference job |
| GET | `/api/v1/households/{id}/vto/jobs/{jobId}` | Poll job state and result images |
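Since job creation returns immediately and inference runs asynchronously, clients typically poll the GET endpoint until the job reaches a terminal state. A minimal polling sketch (the `fetch_job` callable and its signature are assumptions, injected so the loop stays transport-agnostic):

```python
import time

def poll_vto_job(fetch_job, job_id, interval_s=1.0, timeout_s=60.0):
    """Poll a VTO job until it reaches a terminal state.

    `fetch_job` is any callable that GETs
    /api/v1/households/{id}/vto/jobs/{jobId} and returns the parsed
    JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        job = fetch_job(job_id)
        if job["state"] in ("SUCCEEDED", "FAILED"):
            return job
        time.sleep(interval_s)
    raise TimeoutError(f"VTO job {job_id} did not finish within {timeout_s}s")
```

In practice the interval should back off rather than stay fixed, but a constant delay keeps the sketch readable.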
## Create VTO Job

```json
{
  "memberId": "mem_123xyz",
  "personImageRef": "gs://anyaself/users/person_01.jpg",
  "garmentImageRef": "gs://anyaself/items/jacket_05.jpg",
  "category": "TOP",
  "poseVariant": "front",
  "maskHints": {},
  "sessionId": "ses_888xyz"
}
```

## Job Response
```json
{
  "jobId": "vto_abc123",
  "householdId": "h1",
  "memberId": "mem_123xyz",
  "requestedByUserId": "usr_1",
  "state": "SUCCEEDED",
  "category": "TOP",
  "personImageRef": "gs://...",
  "personImageUrl": "https://storage.googleapis.com/...",
  "garmentImageRef": "gs://...",
  "garmentImageUrl": "https://storage.googleapis.com/...",
  "poseVariant": "front",
  "cacheKey": "sha256_abc...",
  "cacheHit": false,
  "resultImageRefs": ["gs://anyaself/vto/result_01.jpg"],
  "resultImageUrls": ["https://storage.googleapis.com/..."],
  "qualityScore": 0.92,
  "warnings": [],
  "createdAt": "2026-03-08T10:00:00+00:00",
  "updatedAt": "2026-03-08T10:00:05+00:00"
}
```

## Job Lifecycle
See Data Models → VTO Job Lifecycle for the state diagram.
| State | Description |
|---|---|
| `QUEUED` | Job created, waiting for a worker |
| `RUNNING` | Inference in progress |
| `SUCCEEDED` | Result images available |
| `FAILED` | Model error, timeout, or quality below threshold |
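The table above implies a small state machine: a linear happy path with failure possible from either non-terminal state. A sketch of the allowed transitions (the transition set is inferred from the table, not a documented contract):

```python
# Allowed transitions between VTO job states. SUCCEEDED and FAILED are
# terminal, so their transition sets are empty.
VTO_TRANSITIONS = {
    "QUEUED": {"RUNNING", "FAILED"},
    "RUNNING": {"SUCCEEDED", "FAILED"},
    "SUCCEEDED": set(),
    "FAILED": set(),
}

def can_transition(current: str, target: str) -> bool:
    """Return True if a job may legally move from `current` to `target`."""
    return target in VTO_TRANSITIONS.get(current, set())
```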
## Caching
VTO jobs generate a deterministic `cacheKey` from the person image, garment image, category, and pose. If a matching result already exists, the job returns `cacheHit: true` with the existing result instantly.
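The idea can be sketched as a SHA-256 digest over the job's identifying inputs. This is an illustration only: the service's actual field order, separator, and encoding are not documented here, so this function will not reproduce real `cacheKey` values.

```python
import hashlib

def vto_cache_key(person_image_ref: str, garment_image_ref: str,
                  category: str, pose_variant: str) -> str:
    """Derive a deterministic cache key from the four cache inputs.

    Identical inputs always hash to the same key, so a repeated try-on
    request can be served from the existing result.
    """
    payload = "|".join([person_image_ref, garment_image_ref,
                        category, pose_variant])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Any change to one input (for example a different `poseVariant`) yields a different key and forces a fresh inference run.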
## Configuration

| Variable | Default | Description |
|---|---|---|
| `VTO_INFERENCE_BACKEND` | `simulated` | `simulated`, `inline`, or `remote` |
| `VTO_ENABLE_INLINE_WORKER` | `true` | Start the inline PyTorch worker thread |
| `VTO_QUALITY_THRESHOLD` | `0.0` | Minimum quality score to accept (`0.0` = accept all) |
| `PERSISTENCE_BACKEND` | `inmemory` | `firestore` or `inmemory` |
> [!NOTE]
> In local development, `VTO_INFERENCE_BACKEND=simulated` returns immediate fake results without a GPU. Set to `inline` for local GPU execution or `remote` for a cluster endpoint.
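A minimal sketch of how a service might validate this setting at startup (`resolve_backend` is a hypothetical helper, not the service's actual wiring; failing fast on an unknown value avoids silently falling back to the stub backend):

```python
import os

VALID_BACKENDS = ("simulated", "inline", "remote")

def resolve_backend(env=None) -> str:
    """Read VTO_INFERENCE_BACKEND from the environment mapping,
    defaulting to 'simulated', and reject unknown values."""
    env = os.environ if env is None else env
    backend = env.get("VTO_INFERENCE_BACKEND", "simulated")
    if backend not in VALID_BACKENDS:
        raise ValueError(f"unknown VTO_INFERENCE_BACKEND: {backend!r}")
    return backend
```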