# OpenAI-compatible
Any service that exposes the OpenAI Chat Completions API works through this client. Most modern inference runtimes target the OpenAI shape, so the same configuration covers a wide range of hosted and self-hosted providers.
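Concretely, "the OpenAI shape" means a POST to {baseUrl}/chat/completions with a model and a list of messages. The sketch below shows that wire format directly, assuming a local vLLM server (the URL and model name are placeholders); the client builds and sends this request for you, so this is for illustration only.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;

// The shared wire format: POST {baseUrl}/chat/completions with model + messages.
// Hosted services additionally require an Authorization: Bearer header.
using var http = new HttpClient();
var response = await http.PostAsJsonAsync(
    "http://localhost:8000/v1/chat/completions",
    new
    {
        model = "Qwen/Qwen2.5-7B-Instruct",
        messages = new[] { new { role = "user", content = "Hello" } }
    });
Console.WriteLine(await response.Content.ReadAsStringAsync());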
## Verified integrations
| Service | Base URL pattern | Auth | Docs |
|---|---|---|---|
| vLLM | http://localhost:8000/v1 | None or token | docs.vllm.ai |
| LM Studio | http://localhost:1234/v1 | None | lmstudio.ai/docs |
| Text Generation Inference (TGI) | http://localhost:3000/v1 | None | huggingface.co/docs/text-generation-inference |
| DeepSeek | https://api.deepseek.com/v1 | Bearer | platform.deepseek.com |
| Groq | https://api.groq.com/openai/v1 | Bearer | console.groq.com |
| Together AI | https://api.together.xyz/v1 | Bearer | docs.together.ai |
| Mistral La Plateforme | https://api.mistral.ai/v1 | Bearer | docs.mistral.ai |
| Anyscale | https://api.endpoints.anyscale.com/v1 | Bearer | anyscale.com/endpoints |
| Fireworks | https://api.fireworks.ai/inference/v1 | Bearer | fireworks.ai/docs |
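Because every row above speaks the same protocol, switching providers is a configuration change rather than a code change. A minimal sketch, assuming you read the endpoint, model, and key from environment variables (the variable names here are examples, not part of the library):

```csharp
using System;
using LogicGrid.Core.Llm;

// Provider is selected entirely by configuration; swap env vars to swap backends.
var llm = LlmClientBase.Compatible(
    baseUrl: Environment.GetEnvironmentVariable("LLM_BASE_URL")
             ?? "http://localhost:8000/v1",
    model: Environment.GetEnvironmentVariable("LLM_MODEL")
           ?? "Qwen/Qwen2.5-7B-Instruct",
    apiKey: Environment.GetEnvironmentVariable("LLM_API_KEY")); // null for local runtimes
```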
## Use it
There are three ways to instantiate the OpenAI-compatible client. The first two produce equivalent clients; the third hands you control of the underlying HttpClient.
### Option 1 — static factory (recommended)
```csharp
using LogicGrid.Core.Llm;

// vLLM
var vllm = LlmClientBase.Compatible(
    baseUrl: "http://localhost:8000/v1",
    model: "Qwen/Qwen2.5-7B-Instruct");

// LM Studio
var lmstudio = LlmClientBase.Compatible(
    baseUrl: "http://localhost:1234/v1",
    model: "lmstudio-community/Llama-3.2-3B-Instruct-GGUF");

// DeepSeek
var deepseek = LlmClientBase.Compatible(
    baseUrl: "https://api.deepseek.com/v1",
    model: "deepseek-chat",
    apiKey: Environment.GetEnvironmentVariable("DEEPSEEK_KEY"));

// Groq
var groq = LlmClientBase.Compatible(
    baseUrl: "https://api.groq.com/openai/v1",
    model: "llama-3.3-70b-versatile",
    apiKey: Environment.GetEnvironmentVariable("GROQ_KEY"));
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| baseUrl | string | (required) | Service base URL — usually ends in /v1. |
| model | string | (required) | Model identifier expected by the runtime. |
| apiKey | string? | null | Optional Bearer token. Local runtimes (vLLM, LM Studio, TGI) typically don't need one; hosted services (DeepSeek, Groq, Together) do. |
### Option 2 — direct construction
```csharp
using LogicGrid.Core.Providers;

var llm = new OpenAiCompatibleClient(
    baseUrl: "http://localhost:8000/v1",
    defaultModel: "Qwen/Qwen2.5-7B-Instruct",
    apiKey: null);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| baseUrl | string | (required) | Same as the factory's baseUrl. |
| defaultModel | string | (required) | The model used when the agent or call site doesn't override it. |
| apiKey | string? | null | Same as the factory's apiKey. |
The factory and the constructor produce equivalent clients. Use direct construction when you need an injected HttpClient (explained below) for retries, proxies, or testing.
### Option 3 — injected HttpClient
```csharp
using System.Net.Http;
using LogicGrid.Core.Providers;

var http = new HttpClient();
// http.DefaultRequestHeaders.Add("Authorization", "Bearer ...");

var llm = new OpenAiCompatibleClient(
    httpClient: http,
    baseUrl: "https://api.deepseek.com/v1",
    defaultModel: "deepseek-chat");
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| httpClient | HttpClient | (required) | Pre-configured client. Caller sets any Authorization header. |
| baseUrl | string | (required) | Same as above. |
| defaultModel | string | (required) | Same as above. |
This overload takes no apiKey — the caller is responsible for setting the Authorization: Bearer ... header on the supplied HttpClient (typically via IHttpClientFactory, a DelegatingHandler, or a test fake). Use it when auth is managed outside the client, or in unit tests with a mocked transport.
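One common way to manage auth outside the client is a DelegatingHandler that stamps the Bearer header onto every outgoing request. A sketch under that assumption (the handler is standard .NET; the constructor call matches Option 3 above):

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;
using LogicGrid.Core.Providers;

var http = new HttpClient(new BearerHandler(
    Environment.GetEnvironmentVariable("DEEPSEEK_KEY") ?? ""));
var llm = new OpenAiCompatibleClient(
    httpClient: http,
    baseUrl: "https://api.deepseek.com/v1",
    defaultModel: "deepseek-chat");

// Adds an Authorization: Bearer header to every request passing through.
sealed class BearerHandler : DelegatingHandler
{
    private readonly string _token;
    public BearerHandler(string token) : base(new HttpClientHandler()) => _token = token;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken ct)
    {
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", _token);
        return base.SendAsync(request, ct);
    }
}
```

The same handler slots into IHttpClientFactory registrations via AddHttpMessageHandler, which also gives you pooled connections and a natural place for retry policies.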
## Tool calling

Native tool calling depends on both the runtime and the model. Most OpenAI-compatible backends (vLLM, LM Studio, etc.) implement the OpenAI tool-call protocol, but the model itself has to actually emit tool calls. Stay on PromptSchemaStrategy (the default) and switch to native only after testing the specific runtime + model. See Tool calling strategy for the strategy reference and how to switch.
## Embeddings
Same two-way pattern. Works with TEI, vLLM running an embedding model, or any service that exposes /v1/embeddings.
### Option 1 — static factory (recommended)
```csharp
using LogicGrid.Memory.Embeddings;

var embedder = EmbeddingClientBase.Compatible(
    baseUrl: "http://localhost:3000", // TEI
    model: "BAAI/bge-large-en-v1.5");
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| baseUrl | string | (required) | Service base URL without the trailing /v1 — the client appends it. |
| model | string | (required) | Model identifier expected by the runtime. |
| dimensions | int | 0 | Vector size. 0 = auto-detect from the first response. |
| apiKey | string? | null | Optional Bearer token. |
### Option 2 — direct construction
Mirrors OpenAiCompatibleClient exactly: baseUrl first, then defaultModel.
```csharp
using LogicGrid.Memory.Embeddings;

var embedder = new OpenAiCompatibleEmbeddingClient(
    baseUrl: "http://localhost:3000",
    defaultModel: "BAAI/bge-large-en-v1.5",
    dimensions: 1024,
    apiKey: null);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| baseUrl | string | (required) | Server base URL without the trailing /v1. |
| defaultModel | string | (required) | Embedding model used when the call site doesn't override it. |
| dimensions | int | 0 | Vector size. 0 = auto-detect from first response. |
| apiKey | string? | null | Optional Bearer token. |
A second overload accepts a custom HttpClient for retries, proxies, or DI — the caller is responsible for any required Authorization header:
```csharp
using System.Net.Http;
using LogicGrid.Memory.Embeddings;

var http = new HttpClient();
// http.DefaultRequestHeaders.Add("Authorization", "Bearer ...");

var embedder = new OpenAiCompatibleEmbeddingClient(
    httpClient: http,
    baseUrl: "http://localhost:3000",
    defaultModel: "BAAI/bge-large-en-v1.5",
    dimensions: 1024);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| httpClient | HttpClient | (required) | Pre-configured client. Caller sets any Authorization header. |
| baseUrl | string | (required) | Same as above. |
| defaultModel | string | (required) | Same as above. |
| dimensions | int | 0 | Same as above. |
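For dependency-injection scenarios, the HttpClient overload pairs naturally with IHttpClientFactory. A sketch under that assumption (the "embeddings" client name and the TEI endpoint are examples, not library requirements):

```csharp
using System;
using System.Net.Http;
using Microsoft.Extensions.DependencyInjection;
using LogicGrid.Memory.Embeddings;

var services = new ServiceCollection();

// Named HttpClient managed by IHttpClientFactory: pooled handlers,
// and a natural hook for timeouts, auth headers, or retry policies.
services.AddHttpClient("embeddings", c =>
{
    // c.DefaultRequestHeaders.Add("Authorization", "Bearer ...");
    c.Timeout = TimeSpan.FromSeconds(30);
});

services.AddSingleton(sp => new OpenAiCompatibleEmbeddingClient(
    httpClient: sp.GetRequiredService<IHttpClientFactory>().CreateClient("embeddings"),
    baseUrl: "http://localhost:3000",
    defaultModel: "BAAI/bge-large-en-v1.5",
    dimensions: 1024));
```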
## Compatibility caveats
| Issue | What to do |
|---|---|
| Service rejects unknown fields | Some compat layers reject parameters they don't support (such as temperature or max_tokens) instead of ignoring them. Check the runtime's docs. |
| No usage field in response | Cost tracking will be zero. Expected on most local runtimes — they don't bill. |
| Tool calls return as plain text | Stay on PromptSchemaStrategy. |
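When debugging compatibility problems, a quick first check is whether the server answers the standard GET /v1/models endpoint, which most of the runtimes above expose. A raw-HTTP sketch (the localhost URL is a placeholder):

```csharp
using System;
using System.Net.Http;

// Connectivity and compat probe: list the models the server advertises.
using var http = new HttpClient();
// Hosted services need: http.DefaultRequestHeaders.Add("Authorization", "Bearer ...");
var json = await http.GetStringAsync("http://localhost:8000/v1/models");
Console.WriteLine(json); // JSON listing of model ids the server will accept
```

If this call fails or returns a non-OpenAI payload, fix the base URL and auth before investigating client-side configuration.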