FastAPI Lifespan + Background Warmup

Standard pattern for FastAPI containers that have to wait on slow dependencies (RDS connect, Bedrock client init, S3 model pull) without the ALB health check turning red in the meantime and the containers ending up in restart loops.

First documented from the Icking rebuild. If the pattern lands in future FastAPI services, update this page accordingly.

The problem

A classic FastAPI lifespan blocks startup:

# ANTI-PATTERN
from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app):
    state.db_pool = make_pool(settings)
    await open_and_wait(state.db_pool)               # 5-30s
    state.bedrock = await init_bedrock_client()      # 2-10s
    state.bge_m3 = await pull_model_from_s3()        # 30-120s
    yield  # health endpoints only become available now

Consequences:

The container is unreachable for roughly 30-150s → the ALB marks it unhealthy → ECS restarts it → the restart races with the init → restart loop
A transient Bedrock 5xx during init crashes the whole container instead of degrading a single endpoint
Ops has no visibility into which dependency is currently hanging

The pattern

Step 1: Separate liveness from readiness

# app/routers/health.py
from fastapi import APIRouter, Request
from fastapi.responses import JSONResponse

# __version__ is assumed to be imported from the application package.
router = APIRouter()


@router.get("/healthz")
def healthz():
    """Liveness: ALWAYS 200 as soon as uvicorn is running. The ALB target group polls this.
    NO dependency check."""
    return {"status": "alive", "version": __version__}


@router.get("/readyz")
def readyz(request: Request):
    """Readiness: 503 until all dependency flags are True. Internal probes use this."""
    state = request.app.state.app_state
    checks = {
        "db_pool": state.db_pool_ready,
        "bedrock_runtime": state.bedrock_runtime_ready,
    }
    ready = all(checks.values())
    body = {
        "status": "ready" if ready else "not_ready",
        "version": __version__,
        "checks": checks,
    }
    # Do NOT put state.last_error into the body: it can leak the RDS hostname/user.
    # A generic marker is enough; the detail is in the structured log.
    if not ready and state.last_error is not None:
        body["error"] = "warmup_failed"
    return JSONResponse(
        status_code=200 if ready else 503,
        content=body,
    )
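
The decision logic of /readyz can also be factored into a pure function, which makes the "503 without leaking last_error" rule easy to unit-test without a running app. The following is a minimal sketch; the function name build_readyz_response and its signature are assumptions for illustration, not part of the a-icking code.

```python
from typing import Any, Mapping, Optional


def build_readyz_response(
    checks: Mapping[str, bool],
    last_error: Optional[str],
    version: str,
) -> tuple[int, dict[str, Any]]:
    """Return (status_code, body) for /readyz without leaking error details."""
    ready = all(checks.values())
    body: dict[str, Any] = {
        "status": "ready" if ready else "not_ready",
        "version": version,
        "checks": dict(checks),
    }
    # Only a generic marker goes into the body; the detail stays in the log.
    if not ready and last_error is not None:
        body["error"] = "warmup_failed"
    return (200 if ready else 503), body
```

The route handler would then only translate the returned tuple into a JSONResponse.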

Step 2: Warmup as asyncio.create_task

# app/main.py
import asyncio
import contextlib
from contextlib import asynccontextmanager


@asynccontextmanager
async def lifespan(app):
    state = AppState()
    app.state.app_state = state

    # Fire-and-forget background task: the yield happens IMMEDIATELY.
    warmup_task = asyncio.create_task(_warmup_dependencies(app, state))

    try:
        yield
    finally:
        warmup_task.cancel()
        with contextlib.suppress(asyncio.CancelledError):
            await warmup_task

        # Teardown: only what was actually initialized
        if state.db_pool is not None:
            await close_pool(state.db_pool)
        for ctx in (state.bedrock_runtime_ctx, state.bedrock_agent_runtime_ctx):
            if ctx is not None:
                try:
                    await ctx.__aexit__(None, None, None)
                except Exception as exc:
                    logger.warning("client_close_failed", error=str(exc))


async def _warmup_dependencies(app, state):
    """Background task: one try/except per dependency, flip the flag once it is ready."""
    pool = None
    try:
        pool = make_pool(settings)
        await open_and_wait(pool)
        # Assign state.db_pool only AFTER the wait succeeded, otherwise there is a cancel race
        state.db_pool = pool
        state.db_pool_ready = True
    except Exception as exc:
        state.last_error = f"db_pool_init_failed: {exc}"
        logger.error("db_pool_init_failed", error=str(exc))
        if pool is not None:
            with contextlib.suppress(Exception):
                await pool.close()

    # Bedrock client analogous ...
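
The "one try/except per dependency" idea can be sketched in isolation, without FastAPI or AWS. In the following sketch, DemoState, warm_one and the two fake init coroutines are hypothetical stand-ins, not names from the a-icking code. It shows that the caller continues immediately after create_task, and that one failing dependency does not prevent the others from becoming ready.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, Optional


@dataclass
class DemoState:
    """Hypothetical stand-in for AppState: one readiness flag per dependency."""
    ready: dict[str, bool] = field(default_factory=dict)
    last_error: Optional[str] = None


async def warm_one(state: DemoState, name: str, init: Callable[[], Awaitable[None]]) -> None:
    """Run one init step; a failure only marks this dependency as not ready."""
    state.ready[name] = False
    try:
        await init()
        state.ready[name] = True
    except Exception as exc:  # CancelledError is a BaseException and still propagates
        state.last_error = f"{name}_init_failed: {exc}"


async def warmup(state: DemoState) -> None:
    async def slow_db() -> None:
        await asyncio.sleep(0.05)  # stands in for the pool connect

    async def broken_bedrock() -> None:
        raise RuntimeError("simulated 5xx")

    await warm_one(state, "db_pool", slow_db)
    await warm_one(state, "bedrock_runtime", broken_bedrock)


async def run_warmup_demo() -> tuple[dict[str, bool], DemoState]:
    state = DemoState()
    task = asyncio.create_task(warmup(state))
    await asyncio.sleep(0)      # a real server would already answer /healthz here
    early = dict(state.ready)   # snapshot while the warmup is still running
    await task
    return early, state
```

Calling asyncio.run(run_warmup_demo()) returns a snapshot in which nothing is ready yet, plus the final state in which db_pool is ready and bedrock_runtime is not.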

Step 3: Side note on aioboto3 as an async context manager

aioboto3 clients are async context managers: they expect __aenter__ / __aexit__, not .close().

import aioboto3

session = aioboto3.Session()
ctx = session.client("bedrock-runtime", region_name=settings.bedrock_region)
client = await ctx.__aenter__()       # keep both: client + ctx
state.bedrock_runtime_ctx = ctx
state.bedrock_runtime_client = client

# Teardown:
await ctx.__aexit__(None, None, None)

client.close() is NOT the right API here: it works in some cases but leaks connections in others.
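
An alternative to calling __aenter__ / __aexit__ by hand is contextlib.AsyncExitStack from the standard library, which enters async context managers and later exits all of them in reverse order. The sketch below is not what the a-icking service does; FakeClientContext is a hypothetical stand-in for the object returned by session.client(...), so the example runs without AWS.

```python
import asyncio
from contextlib import AsyncExitStack


class FakeClientContext:
    """Hypothetical stand-in for the object returned by session.client(...)."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log

    async def __aenter__(self) -> str:
        self.log.append(f"enter:{self.name}")
        return f"client:{self.name}"

    async def __aexit__(self, exc_type, exc, tb) -> None:
        self.log.append(f"exit:{self.name}")


async def run_exit_stack_demo() -> list[str]:
    log: list[str] = []
    stack = AsyncExitStack()
    # enter_async_context awaits __aenter__ and registers __aexit__ for later.
    runtime = await stack.enter_async_context(FakeClientContext("bedrock-runtime", log))
    agent = await stack.enter_async_context(FakeClientContext("bedrock-agent-runtime", log))
    log.append(f"using:{runtime},{agent}")
    # Teardown: a single call exits every registered context in reverse order.
    await stack.aclose()
    return log
```

With this approach the application state would hold one stack instead of one ctx attribute per client, and the teardown would shrink to a single aclose() call.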

Step 4: Point the ALB target group at /healthz

In Terraform/CDK, point the ALB health check at /healthz (liveness), not at /readyz. Otherwise:

Bedrock has a 5-minute 5xx spike → /readyz returns 503 → the ALB rotates all targets out → the service stays down because no target is left that can withstand the Bedrock pings

/readyz is for internal probes (ECS deployment circuit breaker, manual curl, status dashboard).
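
The failure chain above follows from how a load balancer health check works: a target is marked unhealthy after a configured number of consecutive failed probes. The following is a simplified simulation of that rule, not AWS code; the function name and the threshold of 2 are assumptions for illustration.

```python
from typing import Optional


def first_unhealthy_probe(status_codes: list[int], unhealthy_threshold: int = 2) -> Optional[int]:
    """Return the index of the probe at which the target is marked unhealthy, or None."""
    consecutive_failures = 0
    for index, code in enumerate(status_codes):
        if code == 200:
            consecutive_failures = 0  # any success resets the streak
        else:
            consecutive_failures += 1
            if consecutive_failures >= unhealthy_threshold:
                return index
    return None


# /healthz never depends on Bedrock, so a spike does not change its answers.
healthz_probes = [200] * 10
# A dependency-checking endpoint answers 503 for the duration of the spike.
readyz_probes = [200, 503, 503, 503, 503, 200]

healthz_result = first_unhealthy_probe(healthz_probes)
readyz_result = first_unhealthy_probe(readyz_probes)
```

Under these assumptions healthz_result is None (the target stays in rotation), while readyz_result is 2: the target would be rotated out on the second failed probe, and the same would happen to every other target at the same time.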

Skip switches for tests

# tests/conftest.py
import pytest


@pytest.fixture
def app_instance():
    from app.main import app
    app.state.skip_db_in_dev = True      # skips the Postgres init
    app.state.skip_bedrock_in_dev = True  # skips the aioboto3 connect
    return app

In _warmup_dependencies, check whether these flags are set and skip the corresponding init. This lets unit tests run the full lifespan via TestClient without a real AWS connection.
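
A minimal sketch of that flag check, assuming the flags live directly on app.state as in the fixture above. The function name warmup_with_skips and the injected init_db / init_bedrock callables are hypothetical; the sketch also assumes that a skipped dependency leaves its readiness flag untouched, which the original does not specify.

```python
import asyncio
from types import SimpleNamespace


async def warmup_with_skips(app, state, init_db, init_bedrock) -> None:
    """Run each init step unless the matching skip flag is set on app.state."""
    # getattr with a default keeps production code working when no flag was set.
    if not getattr(app.state, "skip_db_in_dev", False):
        await init_db()
        state.db_pool_ready = True
    if not getattr(app.state, "skip_bedrock_in_dev", False):
        await init_bedrock()
        state.bedrock_runtime_ready = True


def run_skip_demo() -> tuple[list[str], SimpleNamespace]:
    calls: list[str] = []

    async def fake_db() -> None:
        calls.append("db")

    async def fake_bedrock() -> None:
        calls.append("bedrock")

    # Only the database init is skipped; the Bedrock flag is not set at all.
    app = SimpleNamespace(state=SimpleNamespace(skip_db_in_dev=True))
    state = SimpleNamespace(db_pool_ready=False, bedrock_runtime_ready=False)
    asyncio.run(warmup_with_skips(app, state, fake_db, fake_bedrock))
    return calls, state
```

Note that Starlette's TestClient only runs the lifespan when it is used as a context manager (with TestClient(app) as client: ...), so the flags have to be set before that block is entered.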

What to avoid

Anti-pattern	Why
Init in the lifespan BEFORE `yield`	blocks uvicorn startup, ALB health check red, restart loop
`/healthz` checks dependencies	ALB rotates the container out on a transient failure
`state.last_error` directly in the /readyz body	leaks the RDS hostname/user on Postgres errors
`client.close()` instead of `ctx.__aexit__` with aioboto3	connection leaks
Teardown without a `state.db_pool is not None` check	crashes if the warmup task failed before pool init
`warmup_task = asyncio.create_task(...)` without `task.cancel()` in the finally block	pending-task warning at shutdown

Tests

tests/unit/test_health.py in a-icking verifies:

/healthz returns 200 without auth
/readyz returns 200 when all flags are True
/readyz returns 503 when any flag is False, without last_error in the body
Security headers on all responses (HSTS, X-Frame-Options, etc.)

Sources

Implementation: ~/source/a-icking/inference-service/app/main.py
Health router: ~/source/a-icking/inference-service/app/routers/health.py
Audit insight on the ALB health check choice: architektur-audit
