Glossary

5.2 API Layer

Build a FastAPI wrapper around the serving engine with endpoints: - `POST /generate` -- Single completion request. - `POST /generate/stream` -- Streaming completion (SSE). - `POST /chat` -- Multi-turn chat completion (handles message formatting). - `GET /health` -- Health check (model loaded, GPU av

Learn More

Related Terms