Kairo Core (@kairo/core)
The @kairo/core package is the backbone and central nervous system of the Kairo framework. It provides the essential interfaces, abstract classes, and the orchestration engine (LmPipeline) that glues all the discrete components together.
Architecture Overview
Kairo is built on a highly modular, decoupled architecture. The LmPipeline sits at the center, mediating between the raw generation capabilities of AI models and the specialized tools/memory provided by Extensions.
1. LmPipeline (The Orchestrator)
The central director. It takes your application's input, routes it through installed extensions (to inject tools, retrieve memory, or transform inputs), queries the underlying LanguageModel, and manages the bidirectional flow of reasoning and tool executions.
2. Provider & LanguageModel
A Provider implements the connection logistics (such as API keys and base URLs) for a vendor, and acts as a factory providing one or multiple LanguageModel instances. The LanguageModel contains the actual logic for translating Kairo's unified message format into vendor-specific API structures and handling the streaming response (LanguageModelStreamPart).
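As an illustration of this split, the sketch below models a Provider as a factory holding connection config while each LanguageModel handles translation into a vendor request body. The types and names here (`ExampleProvider`, `toVendorRequest`, `headers`) are assumptions for illustration, not the actual @kairo/core definitions:

```typescript
// Hypothetical sketch of the Provider/LanguageModel split.
// These local types are assumptions, not the real @kairo/core API.
interface LanguageModel {
  id: string;
  // Translates Kairo-style messages into a vendor-specific request body.
  toVendorRequest(messages: { role: string; content: string }[]): unknown;
}

class ExampleProvider {
  constructor(private config: { apiKey: string }) {}

  // Connection logistics (auth headers, base URLs) live on the Provider...
  headers(): Record<string, string> {
    return { Authorization: `Bearer ${this.config.apiKey}` };
  }

  // ...while per-model translation lives on each LanguageModel it yields.
  get models(): LanguageModel[] {
    return [
      {
        id: "example-model",
        toVendorRequest: (messages) => ({ model: "example-model", messages }),
      },
    ];
  }
}
```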
3. Extensions (ExtensionProtocol)
Extensions are isolated plugins that attach to the pipeline. They intercept specific lifecycle stages (like tool-injection or memory-retrieval). Crucially, they run via an RPC protocol, meaning an extension can safely run in a separate process, an isolated Web Worker, or even a remote server.
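To make the plugin model concrete, here is a minimal sketch of an extension implementing one lifecycle hook. The `Message` and `ExtensionProtocol` shapes below are local stand-ins assumed for illustration, not the package's real type definitions:

```typescript
// Hypothetical local stand-ins for the real @kairo/core types.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ExtensionProtocol {
  // Hooks are optional: an extension implements only the stages it needs.
  enrichInput?(messages: Message[]): Promise<Message[]>;
  retrieveMemory?(messages: Message[]): Promise<Message[]>;
}

// A minimal extension that prepends a hidden system instruction.
const personaExtension: ExtensionProtocol = {
  async enrichInput(messages) {
    return [{ role: "system", content: "Answer concisely." }, ...messages];
  },
};
```

Because each hook is an async call taking and returning plain data, the same extension works unchanged whether the pipeline dispatches to it in-process or over an RPC transport.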
The Pipeline Lifecycle
To develop effective Extensions, it's critical to understand the order in which LmPipeline executes its hooks when you call generate():
- Input Enrichment (enrichInput): Modify the initial user prompt (e.g., adding hidden system instructions).
- Memory Retrieval (retrieveMemory): Search vector databases or history stores and prepend relevant context to the conversation.
- Context Pruning (pruneContext): Trim old or irrelevant messages if the context window limit is approached.
- Tool Injection (injectTools): Query extensions to dynamically append function schemas (e.g., pulling available MCP server tools).
- Model Configuration (configModel): Adjust temperature, top-p, or max tokens dynamically.
- --- Model Generation Starts ---
- Stream Transformation (streamTransform): As tokens flow back from the LLM, extensions can intercept, mask, or re-route specific chunks in real time.
- Tool Execution (executeTool): If the model chooses to call a tool, the pipeline pauses, delegates the exact function call back to the Extension that provided it, and awaits the result (fetchToolResult).
- --- Turn Completes ---
- Turn Refinement (refineTurn): Post-process the final generated message.
- Context Persistence (persistContext): Save the newly generated turn back into long-term storage or a database.
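The key property of this ordering is that each stage runs across every installed extension before the next stage begins. The loop below is a simplified sketch of that dispatch order (hook names mirror the lifecycle above, but this is not the real LmPipeline internals, which also handle streaming, tool loops, and RPC):

```typescript
// Simplified sketch of pre-generation hook ordering. Hook names mirror the
// lifecycle above; the signatures are illustrative assumptions.
type Hook = (messages: string[]) => string[];

interface LifecycleHooks {
  enrichInput?: Hook;
  retrieveMemory?: Hook;
  pruneContext?: Hook;
}

function runPreGeneration(hooks: LifecycleHooks[], input: string[]): string[] {
  // Stage-major order: every extension's enrichInput runs before any
  // extension's retrieveMemory, and so on.
  const stages: (keyof LifecycleHooks)[] = [
    "enrichInput",
    "retrieveMemory",
    "pruneContext",
  ];
  let messages = input;
  for (const stage of stages) {
    for (const h of hooks) {
      const fn = h[stage];
      if (fn) messages = fn(messages);
    }
  }
  return messages;
}
```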
Isomorphic by Design
@kairo/core is strictly environment-agnostic. It contains no Node.js-specific native modules (like fs or net), nor does it enforce a specific HTTP client.
This means you can import and run the core pipeline anywhere:
- Standard Node.js Servers
- Serverless Edge Functions (Cloudflare Workers, Vercel Edge)
- Directly in the Browser (making it ideal for Web-LLMs or local-first apps)
- React Native / Mobile
When an Extension requires native capabilities (such as reading the local filesystem for RAG), it can run as an ExtensionServer in a Node process while your UI runs the LmPipeline in the browser, connected via WebSocket or HTTP transports.
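One way to picture this split is that the pipeline depends only on a transport abstraction, so the same extension can sit in-process, behind a WebSocket, or behind HTTP. The `RpcTransport` interface and `LocalTransport` class below are illustrative assumptions, not the actual Kairo transport API:

```typescript
// Hypothetical transport abstraction: the pipeline issues RPC calls through a
// transport and never cares where the extension actually lives.
interface RpcTransport {
  call(method: string, params: unknown): Promise<unknown>;
}

// An in-process transport, useful for tests or browser-only setups. A
// WebSocket- or HTTP-backed implementation would satisfy the same interface.
class LocalTransport implements RpcTransport {
  constructor(
    private handlers: Record<string, (params: unknown) => unknown>
  ) {}

  async call(method: string, params: unknown): Promise<unknown> {
    const handler = this.handlers[method];
    if (!handler) throw new Error(`Unknown RPC method: ${method}`);
    return handler(params);
  }
}
```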
Usage Example
A foundational example combining a Provider and the Core Pipeline:
```typescript
import { LmPipeline } from "@kairo/core";
import { OpenAIProvider } from "@kairo/provider-openai";

// 1. Initialize the Provider with your configuration
const provider = new OpenAIProvider({ apiKey: process.env.OPENAI_API_KEY! });

// 2. Obtain a specific LanguageModel instance
// (e.g., finding the model matching 'gpt-4o')
const myModel = provider.models.find((m) => m.id === "gpt-4o");
if (!myModel) throw new Error("Model 'gpt-4o' not available from this provider");

// 3. Mount the pipeline
const pipeline = new LmPipeline({
  model: myModel,
});

// 4. Run the generation loop
async function run() {
  const stream = await pipeline.generate({
    messages: [
      { role: "user", content: "Explain quantum computing in one sentence." },
    ],
  });

  // Consume the WHATWG ReadableStream
  const reader = stream.getReader();
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (value.type === "text-delta") {
      process.stdout.write(value.textDelta);
    }
  }
}

run();
```