AnyaSelf Docs

# Virtual Try-On (VTO)

Diffusion model inference pipeline for garment try-on visualization.

The VTO service handles the heavy lifting of running diffusion-model inference for garment try-ons. It takes a person image and a garment image and composites the garment onto the person using a configurable inference backend.

## Architecture

```mermaid
graph LR
  CLIENT["Client"] --> GW["API Gateway"]
  GW --> VTO["VTO Service :8004"]
  VTO -- "simulated" --> STUB["Stub Response"]
  VTO -- "inline" --> GPU["Local GPU Worker"]
  VTO -- "remote" --> REMOTE["Remote Inference Cluster"]
```
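The three arrows out of the VTO service correspond to the `VTO_INFERENCE_BACKEND` values listed under Configuration. A minimal sketch of that dispatch, with hypothetical helper names (`run_local_gpu`, `call_remote_cluster` are illustrations, not the service's actual functions):

```python
from enum import Enum


class Backend(str, Enum):
    SIMULATED = "simulated"
    INLINE = "inline"
    REMOTE = "remote"


def run_local_gpu(job: dict) -> dict:
    """Placeholder for the inline PyTorch worker."""
    raise NotImplementedError


def call_remote_cluster(job: dict) -> dict:
    """Placeholder for the remote inference cluster call."""
    raise NotImplementedError


def dispatch(backend: str, job: dict) -> dict:
    """Route a VTO job to the configured inference path."""
    b = Backend(backend)  # raises ValueError on an unknown backend name
    if b is Backend.SIMULATED:
        # Stub response: immediate fake result, no GPU involved.
        return {"state": "SUCCEEDED", "resultImageRefs": ["gs://anyaself/vto/stub.jpg"]}
    if b is Backend.INLINE:
        return run_local_gpu(job)
    return call_remote_cluster(job)
```

Validating the backend name up front keeps a misconfigured deployment from failing silently mid-job.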

## Endpoints

| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/v1/households/{id}/vto/jobs` | Create a try-on inference job |
| GET | `/api/v1/households/{id}/vto/jobs/{jobId}` | Poll job state + result images |

### Create VTO Job

```json
{
  "memberId": "mem_123xyz",
  "personImageRef": "gs://anyaself/users/person_01.jpg",
  "garmentImageRef": "gs://anyaself/items/jacket_05.jpg",
  "category": "TOP",
  "poseVariant": "front",
  "maskHints": {},
  "sessionId": "ses_888xyz"
}
```
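Submitting this payload from a client can be sketched with the standard library alone; the base URL is an assumption for illustration, and the payload mirrors the example above:

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: your API gateway's base URL


def job_url(household_id: str) -> str:
    """Build the create-job endpoint path for a household."""
    return f"{BASE_URL}/api/v1/households/{household_id}/vto/jobs"


def create_vto_job(household_id: str, payload: dict) -> dict:
    """POST a try-on job and return the parsed job response."""
    req = urllib.request.Request(
        job_url(household_id),
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

In a real client you would also attach authentication headers, which the gateway presumably requires; those are omitted here.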

### Job Response

```json
{
  "jobId": "vto_abc123",
  "householdId": "h1",
  "memberId": "mem_123xyz",
  "requestedByUserId": "usr_1",
  "state": "SUCCEEDED",
  "category": "TOP",
  "personImageRef": "gs://...",
  "personImageUrl": "https://storage.googleapis.com/...",
  "garmentImageRef": "gs://...",
  "garmentImageUrl": "https://storage.googleapis.com/...",
  "poseVariant": "front",
  "cacheKey": "sha256_abc...",
  "cacheHit": false,
  "resultImageRefs": ["gs://anyaself/vto/result_01.jpg"],
  "resultImageUrls": ["https://storage.googleapis.com/..."],
  "qualityScore": 0.92,
  "warnings": [],
  "createdAt": "2026-03-08T10:00:00+00:00",
  "updatedAt": "2026-03-08T10:00:05+00:00"
}
```

## Job Lifecycle

See Data Models → VTO Job Lifecycle for the state diagram.

| State | Description |
|-------|-------------|
| QUEUED | Job created, waiting for worker |
| RUNNING | Inference in progress |
| SUCCEEDED | Result images available |
| FAILED | Model error, timeout, or quality below threshold |
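Since `SUCCEEDED` and `FAILED` are the two terminal states, a client polling the GET endpoint can loop until it sees one of them. A minimal sketch, parameterized over a `fetch_job` callable so it stays transport-agnostic:

```python
import time

TERMINAL_STATES = {"SUCCEEDED", "FAILED"}


def wait_for_job(fetch_job, interval_s: float = 2.0, timeout_s: float = 120.0) -> dict:
    """Poll until the job reaches a terminal state or the timeout elapses.

    fetch_job: zero-argument callable returning the latest job document,
    e.g. a GET on /api/v1/households/{id}/vto/jobs/{jobId}.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_job()
        if job["state"] in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still {job['state']} after {timeout_s}s")
        time.sleep(interval_s)
```

A `FAILED` job is returned rather than raised, so the caller can inspect `warnings` and decide whether to retry; the interval and timeout defaults are arbitrary.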

## Caching

VTO jobs derive a deterministic `cacheKey` from the person image, garment image, category, and pose variant. If a result with a matching key already exists, the job returns immediately with `cacheHit: true` and the cached result images.
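The exact key construction is not specified here; one plausible scheme, hashing the four inputs with SHA-256 (the `|` separator and field ordering are assumptions), looks like:

```python
import hashlib


def cache_key(person_ref: str, garment_ref: str, category: str, pose: str) -> str:
    """Deterministic cache key over the inputs the docs list.

    Same inputs always yield the same key, so repeated try-ons of the
    same person/garment/pose combination hit the cache.
    """
    material = "|".join([person_ref, garment_ref, category, pose])
    return "sha256_" + hashlib.sha256(material.encode("utf-8")).hexdigest()
```

Hashing object references rather than pixel data keeps key computation cheap, at the cost of missing the cache if the same image is re-uploaded under a new path.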

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `VTO_INFERENCE_BACKEND` | `simulated` | `simulated`, `inline`, or `remote` |
| `VTO_ENABLE_INLINE_WORKER` | `true` | Start the inline PyTorch worker thread |
| `VTO_QUALITY_THRESHOLD` | `0.0` | Minimum quality score to accept (`0.0` = accept all) |
| `PERSISTENCE_BACKEND` | `inmemory` | `firestore` or `inmemory` |

> [!NOTE]
> In local development, `VTO_INFERENCE_BACKEND=simulated` returns immediate fake results without a GPU. Set it to `inline` for local GPU execution or `remote` for a cluster endpoint.
