OpenAI-compatible endpoint

The fleet and agentruntime speak the OpenAI chat completions API contract. Any server that implements /v1/chat/completions with function calling works as a drop-in backend.

What "OpenAI-compatible" means

The fleet sends POST /v1/chat/completions with:

model — the model name string (passed through from the role or --default-model)
messages — [{role, content}, ...] including system prompt, user turn, assistant tool-calls, and tool results
tools — JSON schema array (OpenAI function-calling format)
tool_choice — "auto" (let the model decide)
max_tokens — the per-run cap
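
To make the field list concrete, here is a minimal sketch of such a request body in Go, using only the standard library. The struct names, the buildRequest helper, the lookup_note tool, and the model name are hypothetical and chosen for illustration. This is not the fleet's actual code, and the message type omits the tool-call and tool-result fields a real conversation would carry.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Message is one entry of the messages array (tool-call fields omitted).
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// FunctionDef describes one callable function; Parameters is a JSON Schema.
type FunctionDef struct {
	Name        string          `json:"name"`
	Description string          `json:"description,omitempty"`
	Parameters  json.RawMessage `json:"parameters"`
}

// Tool follows the OpenAI function-calling shape.
type Tool struct {
	Type     string      `json:"type"`
	Function FunctionDef `json:"function"`
}

// ChatRequest mirrors the five fields listed above.
type ChatRequest struct {
	Model      string    `json:"model"`
	Messages   []Message `json:"messages"`
	Tools      []Tool    `json:"tools,omitempty"`
	ToolChoice string    `json:"tool_choice,omitempty"`
	MaxTokens  int       `json:"max_tokens,omitempty"`
}

// buildRequest assembles an illustrative request with one hypothetical tool.
func buildRequest(model, systemPrompt, userTurn string, maxTokens int) ChatRequest {
	schema := `{"type":"object","properties":{"title":{"type":"string"}},"required":["title"]}`
	return ChatRequest{
		Model: model,
		Messages: []Message{
			{Role: "system", Content: systemPrompt},
			{Role: "user", Content: userTurn},
		},
		Tools: []Tool{{
			Type: "function",
			Function: FunctionDef{
				Name:        "lookup_note", // hypothetical tool name
				Description: "Look up a note by title.",
				Parameters:  json.RawMessage(schema),
			},
		}},
		ToolChoice: "auto",
		MaxTokens:  maxTokens,
	}
}

func main() {
	req := buildRequest("example-model", "You are a helpful agent.", "Find my travel note.", 1024)
	body, err := json.MarshalIndent(req, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body)) // the JSON a backend would receive
}
```

A backend that accepts this JSON at /v1/chat/completions and understands the tools array should satisfy the request half of the contract.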

In return, the fleet expects a response that carries choices[0].message.content, choices[0].message.tool_calls[], and usage.prompt_tokens / usage.completion_tokens.
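
The response side can be sketched the same way. The Go types below decode only the fields named above. The type names, the parseResponse helper, and the sample payload are assumptions made for illustration, not the fleet's own types.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// ToolCall is one entry of choices[0].message.tool_calls.
type ToolCall struct {
	ID       string `json:"id"`
	Type     string `json:"type"`
	Function struct {
		Name      string `json:"name"`
		Arguments string `json:"arguments"` // a JSON document encoded as a string
	} `json:"function"`
}

// ChatResponse holds only the fields the contract above mentions.
type ChatResponse struct {
	Choices []struct {
		Message struct {
			Content   string     `json:"content"`
			ToolCalls []ToolCall `json:"tool_calls"`
		} `json:"message"`
	} `json:"choices"`
	Usage struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
	} `json:"usage"`
}

// parseResponse decodes a response body and rejects one without choices.
func parseResponse(raw []byte) (*ChatResponse, error) {
	var resp ChatResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	if len(resp.Choices) == 0 {
		return nil, errors.New("response has no choices")
	}
	return &resp, nil
}

// sample is a made-up payload; a null content decodes to an empty string.
const sample = `{
  "choices": [{"message": {"content": null, "tool_calls": [
    {"id": "call_1", "type": "function",
     "function": {"name": "lookup_note", "arguments": "{\"title\":\"trip\"}"}}
  ]}}],
  "usage": {"prompt_tokens": 12, "completion_tokens": 5}
}`

func main() {
	resp, err := parseResponse([]byte(sample))
	if err != nil {
		panic(err)
	}
	msg := resp.Choices[0].Message
	fmt.Println("tool calls:", len(msg.ToolCalls))
	fmt.Println("tokens:", resp.Usage.PromptTokens, resp.Usage.CompletionTokens)
}
```

A backend that omits the usage object would decode to zero token counts in this sketch; whether the fleet tolerates that is not stated here.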

Pointing the fleet at a backend

Backend	`--llm-base-url`
OpenAI (default)	(omit; SDK default)
OpenRouter	`https://openrouter.ai/api/v1`
Ollama	`http://localhost:11434/v1`
Any vendor	their `/v1` base URL
E2E llm-mock	`http://localhost:<mock-port>/v1`
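
Each base URL in the table already ends in /v1, so a client only needs to append /chat/completions. The sketch below shows that joining step under two assumptions: the SDK default is taken to be https://api.openai.com/v1, and a trailing slash on the configured value is tolerated. The chatCompletionsURL helper is hypothetical; in practice the client library performs this step.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultBaseURL is assumed to match the SDK default used when the flag is omitted.
const defaultBaseURL = "https://api.openai.com/v1"

// chatCompletionsURL returns the full endpoint for a configured base URL.
func chatCompletionsURL(baseURL string) string {
	if baseURL == "" {
		baseURL = defaultBaseURL
	}
	// Trim trailing slashes so the result never contains "//chat".
	return strings.TrimRight(baseURL, "/") + "/chat/completions"
}

func main() {
	for _, base := range []string{"", "https://openrouter.ai/api/v1", "http://localhost:11434/v1"} {
		fmt.Println(chatCompletionsURL(base))
	}
}
```

If a vendor documents its endpoint as .../v1/chat/completions, the value to pass as `--llm-base-url` is the part up to and including /v1.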

Fleet flags

--llm-base-url   TRIP2G_LLM_BASE_URL   OpenAI-compatible base URL
--llm-api-key    TRIP2G_LLM_API_KEY    API key (falls back to OPENAI_API_KEY)
--default-model  TRIP2G_DEFAULT_MODEL  model name used when a role omits `model:`
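
The API key fallback can be pictured as a first-non-empty lookup. The sketch below assumes the order is flag value, then TRIP2G_LLM_API_KEY, then OPENAI_API_KEY. Only the final fallback is stated above; the flag-over-environment precedence and both helper names are assumptions for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns the first argument that is not the empty string.
func firstNonEmpty(values ...string) string {
	for _, v := range values {
		if v != "" {
			return v
		}
	}
	return ""
}

// resolveAPIKey applies the assumed precedence: flag, fleet env var, OpenAI env var.
// getenv is injected so the lookup can be exercised without touching the real environment.
func resolveAPIKey(flagValue string, getenv func(string) string) string {
	return firstNonEmpty(
		flagValue,
		getenv("TRIP2G_LLM_API_KEY"),
		getenv("OPENAI_API_KEY"),
	)
}

func main() {
	key := resolveAPIKey("", os.Getenv)
	// Report only whether a key was found; never print the secret itself.
	fmt.Println("api key configured:", key != "")
}
```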

The fleet's agentruntime.NewOpenAILLM(apiKey, baseURL) wraps the official go-openai client and overrides the client's BaseURL when baseURL is non-empty.