OpenAI-compatible

Any service that exposes the OpenAI Chat Completions API works through this client. Most modern inference runtimes target the OpenAI shape, so the same configuration covers a wide range of hosted and self-hosted providers.
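All of these services share the same wire format: a POST to `{baseUrl}/chat/completions` carrying a `model` string and a `messages` array. A minimal sketch of that payload, serialized the way any OpenAI-compatible client would send it (the model name is a placeholder):

```csharp
using System;
using System.Text.Json;

// The OpenAI Chat Completions request body that every OpenAI-compatible
// service accepts. Field names come from the OpenAI spec.
var body = new
{
    model = "Qwen/Qwen2.5-7B-Instruct",
    messages = new[]
    {
        new { role = "system", content = "You are a helpful assistant." },
        new { role = "user", content = "Hello!" }
    },
    temperature = 0.7
};

string json = JsonSerializer.Serialize(body);
Console.WriteLine(json);
```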

Verified integrations

| Service | Base URL pattern | Auth | Docs |
|---|---|---|---|
| vLLM | `http://localhost:8000/v1` | None or token | docs.vllm.ai |
| LM Studio | `http://localhost:1234/v1` | None | lmstudio.ai/docs |
| Text Generation Inference (TGI) | `http://localhost:3000/v1` | None | huggingface.co/docs/text-generation-inference |
| DeepSeek | `https://api.deepseek.com/v1` | Bearer | platform.deepseek.com |
| Groq | `https://api.groq.com/openai/v1` | Bearer | console.groq.com |
| Together AI | `https://api.together.xyz/v1` | Bearer | docs.together.ai |
| Mistral La Plateforme | `https://api.mistral.ai/v1` | Bearer | docs.mistral.ai |
| Anyscale | `https://api.endpoints.anyscale.com/v1` | Bearer | anyscale.com/endpoints |
| Fireworks | `https://api.fireworks.ai/inference/v1` | Bearer | fireworks.ai/docs |

Use it

There are three ways to instantiate the OpenAI-compatible client.

Option 1 — factory method

```csharp
using LogicGrid.Core.Llm;

// vLLM
var vllm = LlmClientBase.Compatible(
    baseUrl: "http://localhost:8000/v1",
    model: "Qwen/Qwen2.5-7B-Instruct");

// LM Studio
var lmstudio = LlmClientBase.Compatible(
    baseUrl: "http://localhost:1234/v1",
    model: "lmstudio-community/Llama-3.2-3B-Instruct-GGUF");

// DeepSeek
var deepseek = LlmClientBase.Compatible(
    baseUrl: "https://api.deepseek.com/v1",
    model: "deepseek-chat",
    apiKey: Environment.GetEnvironmentVariable("DEEPSEEK_KEY"));

// Groq
var groq = LlmClientBase.Compatible(
    baseUrl: "https://api.groq.com/openai/v1",
    model: "llama-3.3-70b-versatile",
    apiKey: Environment.GetEnvironmentVariable("GROQ_KEY"));
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `baseUrl` | `string` | (required) | Service base URL — usually ends in `/v1`. |
| `model` | `string` | (required) | Model identifier expected by the runtime. |
| `apiKey` | `string?` | `null` | Optional Bearer token. Local runtimes (vLLM, LM Studio, TGI) typically don't need one; hosted services (DeepSeek, Groq, Together) do. |

Option 2 — direct construction

```csharp
using LogicGrid.Core.Providers;

var llm = new OpenAiCompatibleClient(
    baseUrl: "http://localhost:8000/v1",
    defaultModel: "Qwen/Qwen2.5-7B-Instruct",
    apiKey: null);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `baseUrl` | `string` | (required) | Same as the factory's `baseUrl`. |
| `defaultModel` | `string` | (required) | The model used when the agent or call site doesn't override it. |
| `apiKey` | `string?` | `null` | Same as the factory's `apiKey`. |

The factory and the constructor produce equivalent clients. Use the injected-HttpClient overload (Option 3, below) when you need control over the transport: retries, proxies, or testing.

Option 3 — injected HttpClient

```csharp
using System.Net.Http;
using LogicGrid.Core.Providers;

var http = new HttpClient();
// http.DefaultRequestHeaders.Add("Authorization", "Bearer ...");

var llm = new OpenAiCompatibleClient(
    httpClient: http,
    baseUrl: "https://api.deepseek.com/v1",
    defaultModel: "deepseek-chat");
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `httpClient` | `HttpClient` | (required) | Pre-configured client. Caller sets any Authorization header. |
| `baseUrl` | `string` | (required) | Same as above. |
| `defaultModel` | `string` | (required) | Same as above. |

This overload takes no apiKey — the caller is responsible for setting the Authorization: Bearer ... header on the supplied HttpClient (typically via IHttpClientFactory, a DelegatingHandler, or a test fake). Use it when auth is managed outside the client, or in unit tests with a mocked transport.
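One common way to manage auth outside the client is a `DelegatingHandler` that stamps the header on every request. `DelegatingHandler` is standard .NET; the `BearerTokenHandler` name and its pairing with `OpenAiCompatibleClient` are illustrative:

```csharp
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

// Attach the token once, in a handler, instead of at every call site.
var http = new HttpClient(
    new BearerTokenHandler("sk-...", new HttpClientHandler()));

// BearerTokenHandler is a name invented for this sketch.
sealed class BearerTokenHandler : DelegatingHandler
{
    private readonly string _token;

    public BearerTokenHandler(string token, HttpMessageHandler inner) : base(inner)
        => _token = token;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Overwrites any existing Authorization header with our Bearer token.
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", _token);
        return base.SendAsync(request, cancellationToken);
    }
}
```

In DI-based apps the handler can be registered once (e.g. via `IHttpClientFactory` and `AddHttpMessageHandler`) so the key never appears at call sites.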

Tool calling

Native tool calling depends on both the runtime and the model. Most OpenAI-compatible backends (vLLM, LM Studio, etc.) implement the OpenAI tool-call protocol, but the model itself has to actually emit tool calls. Stay on PromptSchemaStrategy (the default) and switch to native only after testing the specific runtime + model. See Tool calling strategy for the strategy reference and how to switch.
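For context, the protocol in question declares tools as JSON function schemas inside the request body. A sketch of that shape (the `get_weather` function is invented for illustration and is not part of LogicGrid):

```csharp
using System.Text.Json;

// A `tools` array in the OpenAI tool-call wire format: each entry is a
// function schema the model may choose to call.
var tools = new[]
{
    new
    {
        type = "function",
        function = new
        {
            name = "get_weather",
            description = "Get the current weather for a city.",
            parameters = new
            {
                type = "object",
                properties = new { city = new { type = "string" } },
                required = new[] { "city" }
            }
        }
    }
};

string json = JsonSerializer.Serialize(tools);
```

Whether the backend honors this array, and whether the model emits structured `tool_calls` in response, is exactly the runtime + model combination you need to test before switching off `PromptSchemaStrategy`.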

Embeddings

Embeddings follow the same pattern: a factory method or direct construction. Works with TEI (Text Embeddings Inference), vLLM running an embedding model, or any service that exposes /v1/embeddings.

Option 1 — factory method

```csharp
using LogicGrid.Memory.Embeddings;

var embedder = EmbeddingClientBase.Compatible(
    baseUrl: "http://localhost:3000", // TEI
    model: "BAAI/bge-large-en-v1.5");
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `baseUrl` | `string` | (required) | Service base URL without the trailing `/v1` — the client appends it. |
| `model` | `string` | (required) | Model identifier expected by the runtime. |
| `dimensions` | `int` | `0` | Vector size. 0 = auto-detect from the first response. |
| `apiKey` | `string?` | `null` | Optional Bearer token. |
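Once the embedder returns vectors, they are typically compared with cosine similarity. The arithmetic below is independent of the client; it is just a sketch of how returned `float[]` vectors might be scored:

```csharp
using System;

// Cosine similarity: 1.0 for identical directions, -1.0 for opposite,
// ~0 for unrelated vectors. Vectors must have the same length.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

var same     = Cosine(new float[] { 1, 0, 0 }, new float[] { 1, 0, 0 });  // 1.0
var opposite = Cosine(new float[] { 1, 0, 0 }, new float[] { -1, 0, 0 }); // -1.0
```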

Option 2 — direct construction

Mirrors OpenAiCompatibleClient exactly: baseUrl first, then defaultModel.

```csharp
using LogicGrid.Memory.Embeddings;

var embedder = new OpenAiCompatibleEmbeddingClient(
    baseUrl: "http://localhost:3000",
    defaultModel: "BAAI/bge-large-en-v1.5",
    dimensions: 1024,
    apiKey: null);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `baseUrl` | `string` | (required) | Server base URL without the trailing `/v1`. |
| `defaultModel` | `string` | (required) | Embedding model used when the call site doesn't override it. |
| `dimensions` | `int` | `0` | Vector size. 0 = auto-detect from first response. |
| `apiKey` | `string?` | `null` | Optional Bearer token. |

A second overload accepts a custom HttpClient for retries, proxies, or DI — the caller is responsible for any required Authorization header:

```csharp
using System.Net.Http;
using LogicGrid.Memory.Embeddings;

var http = new HttpClient();
// http.DefaultRequestHeaders.Add("Authorization", "Bearer ...");

var embedder = new OpenAiCompatibleEmbeddingClient(
    httpClient: http,
    baseUrl: "http://localhost:3000",
    defaultModel: "BAAI/bge-large-en-v1.5",
    dimensions: 1024);
```
| Parameter | Type | Default | Notes |
|---|---|---|---|
| `httpClient` | `HttpClient` | (required) | Pre-configured client. Caller sets any Authorization header. |
| `baseUrl` | `string` | (required) | Same as above. |
| `defaultModel` | `string` | (required) | Same as above. |
| `dimensions` | `int` | `0` | Same as above. |

Compatibility caveats

| Issue | What to do |
|---|---|
| Service rejects unknown fields | Some compatibility layers reject parameters they don't support (such as `temperature` or `max_tokens`) instead of ignoring them. Check the runtime's docs. |
| No `usage` field in response | Cost tracking will report zero. Expected on most local runtimes, since they don't bill. |
| Tool calls return as plain text | Stay on `PromptSchemaStrategy`. |
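If you need token counts anyway, parse the response defensively rather than assuming `usage` is present. A sketch, assuming the standard OpenAI response field names:

```csharp
using System;
using System.Text.Json;

// Treat an absent `usage` object as zero tokens instead of throwing,
// which is the behavior you want against local runtimes.
static int TotalTokens(string responseJson)
{
    using var doc = JsonDocument.Parse(responseJson);
    if (doc.RootElement.TryGetProperty("usage", out var usage) &&
        usage.TryGetProperty("total_tokens", out var total))
        return total.GetInt32();
    return 0; // no usage reported — expected on most local runtimes
}

var hosted = TotalTokens("{\"usage\":{\"total_tokens\":42}}"); // 42
var local  = TotalTokens("{\"choices\":[]}");                  // 0
```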