
Map-reduce admin

MapReduceAdmin<TOutput> applies one map agent to every item in a list (in parallel), then runs a reduce agent over all the mapped results.

The map agent runs once per input (inputs.Count map calls in total), concurrently. The reduce agent runs once, with all the mapped outputs joined together.

Example — summarise multiple emails, then synthesise

using LogicGrid.Core.Admins;
using LogicGrid.Core.Agents;
using LogicGrid.Core.Llm;

var llm = LlmClientBase.Ollama("llama3.2");

IAgent perEmail = new Agent<string>(
    name: "PerEmail",
    description: "Summarises one email.",
    systemPrompt: "Summarise this email in one sentence and tag it [urgent|fyi|action].",
    llm: llm);

IAgent digest = new Agent<string>(
    name: "Digest",
    description: "Combines email summaries into a daily digest.",
    systemPrompt: "Combine the summaries into a 4-bullet daily digest, ordered by priority.",
    llm: llm);

var admin = new MapReduceAdmin<string>(
    name: "InboxDigest",
    llmClient: llm,
    mapAgent: perEmail,
    reduceAgent: digest);

var emails = new List<string>
{
    "Hi team — kicking off the Q2 planning offsite next week …",
    "Reminder: please submit your timesheets by Friday …",
    "URGENT: prod alerting rule is misfiring on the auth service …",
    "Coffee chat anyone? I'm in the office Thursday …",
};

var ctx = new AgentContext().WithLogging();
var output = await admin.RunAsync(input: emails, ctx: ctx);

Console.WriteLine($"\n{output}");
09:30:01.118 [INF] [d8e3f009] Run started — admin: InboxDigest | task: 4 inputs
09:30:01.220 [INF] [d8e3f009] Parallel fan-out — 4 agents: PerEmail, PerEmail, PerEmail, PerEmail
09:30:04.402 [INF] [d8e3f009] [PerEmail] completed | output: Q2 planning offsite is next week. [fyi] | 3182ms | 88 tokens
09:30:04.520 [INF] [d8e3f009] [PerEmail] completed | output: Timesheets due by Friday. [action] | 3300ms | 76 tokens
09:30:04.611 [INF] [d8e3f009] [PerEmail] completed | output: Prod auth alerts are misfiring. [urgent] | 3391ms | 92 tokens
09:30:04.720 [INF] [d8e3f009] [PerEmail] completed | output: Optional coffee chat Thursday. [fyi] | 3500ms | 70 tokens
09:30:04.730 [INF] [d8e3f009] Parallel fan-in — 4 agents completed | 3510ms
09:30:08.512 [INF] [d8e3f009] [Digest] completed | output: • [urgent] Prod auth alerts are misfiring … | 3782ms | 220 tokens
09:30:08.520 [INF] [d8e3f009] Run completed — 5 agents, 5 LLM calls | 7402ms

Constructor

public MapReduceAdmin(
    string name,
    LlmClientBase llmClient,
    IAgent mapAgent,
    IAgent reduceAgent,
    AdminOptions? options = null,
    IAgentEventBus? eventBus = null)

RunAsync takes IList<string> instead of a single string.

Typed reduce output

Like every admin, MapReduceAdmin<TOutput> accepts a typed TOutput. When TOutput isn't string, the reduce agent's response is parsed as JSON and deserialized into your type. You can have any agent return typed output — admins included — by setting the appropriate generic argument and prompting the LLM to respond with matching JSON.

public sealed class DigestDoc
{
    public string Headline { get; set; } = "";
    public IList<string> Bullets { get; set; } = new List<string>();
}

IAgent digest = new Agent<DigestDoc>(
    name: "Digest",
    description: "Combines summaries into a structured digest.",
    systemPrompt: "Return JSON: { \"headline\": ..., \"bullets\": [...] }",
    llm: llm);

var admin = new MapReduceAdmin<DigestDoc>(
    name: "InboxDigest",
    llmClient: llm,
    mapAgent: perEmail,
    reduceAgent: digest);

DigestDoc doc = await admin.RunAsync(emails);
Console.WriteLine(doc.Headline);
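The parse-and-deserialize step described above amounts to something like this standalone System.Text.Json sketch (the library's actual parser and serializer options may differ). Note the case-insensitive option: the prompt asks for lowercase JSON keys while DigestDoc uses PascalCase properties.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// A reply shaped like the JSON the Digest system prompt asks for.
var reply = "{ \"headline\": \"Daily digest\", \"bullets\": [\"Fix prod alerts\", \"Submit timesheets\"] }";

// PropertyNameCaseInsensitive maps "headline" onto Headline, "bullets" onto Bullets.
var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
var doc = JsonSerializer.Deserialize<DigestDoc>(reply, options)!;

Console.WriteLine(doc.Headline);      // Daily digest
Console.WriteLine(doc.Bullets.Count); // 2

public sealed class DigestDoc
{
    public string Headline { get; set; } = "";
    public IList<string> Bullets { get; set; } = new List<string>();
}
```

If the model wraps its JSON in prose or code fences, deserialization like this will throw, so keep the system prompt strict about returning JSON only.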

Use it when

  • You have a list of items to process — emails, log lines, support tickets, code review comments, document chunks.
  • The same agent applies to every item.
  • A final summarisation step reads all the mapped outputs and produces a synthesis.

Don't use it when

  • The items aren't independent — earlier outputs change later prompts → Sequential or Graph.
  • You only have one input → use the map agent directly.

Bounding the burst with MaxParallelism

The map phase runs every input concurrently by default — a 50-item list against a hosted provider will fire 50 simultaneous calls and likely trip rate limits. Set AdminOptions.MaxParallelism to cap how many map calls run at once; the rest queue and start as earlier ones finish:

var admin = new MapReduceAdmin<string>(
    name: "InboxDigest",
    llmClient: llm,
    mapAgent: perEmail,
    reduceAgent: digest,
    options: new AdminOptions { MaxParallelism = 4 });

MaxParallelism = 0 (the default) means unlimited. The reduce phase is a single call and is unaffected.
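A cap like MaxParallelism is conventionally a semaphore-bounded fan-out. The following is a standalone sketch of that pattern in plain .NET (no LogicGrid types; the library's internals may differ):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Run map(item) over every input, at most maxParallelism at a time (0 = unlimited).
static async Task<IList<string>> BoundedMapAsync(
    IList<string> inputs,
    Func<string, Task<string>> map,
    int maxParallelism)
{
    using var gate = new SemaphoreSlim(
        maxParallelism > 0 ? maxParallelism : inputs.Count);

    var tasks = inputs.Select(async item =>
    {
        await gate.WaitAsync();           // queue until a slot frees up
        try { return await map(item); }
        finally { gate.Release(); }
    });

    return await Task.WhenAll(tasks);     // results keep input order
}

var results = await BoundedMapAsync(
    new List<string> { "a", "b", "c", "d" },
    async s => { await Task.Delay(10); return s.ToUpper(); },
    maxParallelism: 2);

Console.WriteLine(string.Join(",", results)); // A,B,C,D
```

The semaphore only throttles starts; every task is still created up front, which is fine for a few hundred items but worth revisiting for very large lists.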

Common pitfalls

  • Per-item cost. A 100-item map is 100+ LLM calls. Set AdminOptions.MaxBudgetUsd if you don't want unbounded spend.
  • Verbose logs. Map iterations are quiet by default (ShowMapProgress = false in LogicGridLoggerOptions); enable it only when you're debugging a specific run.
  • Reduce token blow-out. If the map outputs are large and the list is long, the reduce agent's input can exceed the model's context window. The next section shows the standard solution — hierarchical map-reduce.

Hierarchical map-reduce

When the reduce step would overflow the context window, run a second (or third) map-reduce on top of the first one's outputs. Each layer shrinks the data volume by roughly its batch size (20× in the example below). The pattern is recursive and composes naturally, because every admin's RunAsync returns a single string (or your typed output).

// Layer 1: batch the inputs into chunks, summarise each chunk
var batches = inputs
    .Chunk(20)                       // .NET 6+: System.Linq
    .Select(c => (IList<string>)c.ToList())
    .ToList();

var layer1 = new MapReduceAdmin<string>(
    name: "Layer1",
    llmClient: llm,
    mapAgent: perItem,               // summarises one item
    reduceAgent: batchSummariser);   // summarises 20 items

var batchSummaries = new List<string>();
foreach (var batch in batches)
    batchSummaries.Add(await layer1.RunAsync(batch));

// Layer 2: final reduction over the (now small) batch summaries
var layer2 = new MapReduceAdmin<string>(
    name: "Layer2",
    llmClient: llm,
    mapAgent: passThrough,           // identity-ish summariser
    reduceAgent: finalDigest);

string digestText = await layer2.RunAsync(batchSummaries);

For very large corpora, wrap the layer-1 admin inside an agent (or a custom admin) and chain layers programmatically until the volume fits the model's context window.
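The chain-until-it-fits loop can be sketched without any library types. Here Summarise is a stand-in for one layer's RunAsync call, and the 100-item corpus and batch size of 20 are illustrative:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for one map-reduce layer: collapse a batch into one summary string.
static string Summarise(IReadOnlyList<string> batch) =>
    $"[{batch.Count} items: {string.Join(" | ", batch.Select(s => s[..Math.Min(8, s.Length)]))}]";

IList<string> items = Enumerable.Range(1, 100).Select(i => $"item {i}").ToList();
const int batchSize = 20;

// Keep layering until the survivors fit in a single reduce call.
while (items.Count > batchSize)
{
    items = items
        .Chunk(batchSize)
        .Select(batch => Summarise(batch))
        .ToList();
}

string final = Summarise(items);     // one last reduce over the survivors
Console.WriteLine(items.Count);      // 5
Console.WriteLine(final);
```

With 100 inputs and a batch size of 20, one pass leaves 5 batch summaries, so the loop runs once; a 10,000-item corpus would take two passes. In the real thing, each Summarise call would be an admin run, so budget and rate-limit options apply per layer.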