Server Setup

Configure your backend API for the Copilot SDK

Set up your backend API to handle chat requests from the Copilot SDK.


Overview

The Copilot SDK frontend connects to your backend API endpoint. Your server:

  1. Receives chat messages from the frontend
  2. Calls the LLM with your configuration
  3. Streams the response back to the client

Flow: Frontend (React UI) → POST /api/chat → Backend (Your API) → streamed response → Frontend

REST API Contract

Request

Endpoint: POST /api/chat

{
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}
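
For reference, here is how a client could exercise this contract manually with fetch (a minimal sketch; the Copilot SDK frontend sends this request for you):

// Manual request against the chat endpoint (the SDK normally does this)
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});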

Response

The SDK supports three response formats:

Text Stream

Simple text streaming for basic chat (no tools).

Content-Type: text/plain; charset=utf-8

Hello! How can I help you today?

Use result.toTextStreamResponse() to return this format.
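
If you ever need to consume this format without the SDK, the body can be read directly with a stream reader (a minimal Node-side sketch, reusing the res from the fetch example above):

// Read the text/plain stream chunk by chunk (the SDK does this for you)
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true }));
}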

Data Stream (SSE)

SSE format with structured events. Use it when you need tools, usage info, or step-by-step data.

Content-Type: text/event-stream

data: {"type":"text-delta","text":"Hello"}
data: {"type":"text-delta","text":"!"}
data: {"type":"finish","finishReason":"stop","usage":{"promptTokens":10,"completionTokens":5}}
data: [DONE]

Use result.toDataStreamResponse() to return this format.
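
If you parse this stream yourself rather than through the SDK, each data: line carries a JSON event until the [DONE] sentinel. A rough sketch, covering only the event types shown above:

// Parse one SSE line from the data stream (other event types may exist)
function handleSSELine(line: string) {
  if (!line.startsWith('data: ')) return;
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return; // end of stream
  const event = JSON.parse(payload);
  if (event.type === 'text-delta') {
    process.stdout.write(event.text);
  } else if (event.type === 'finish') {
    console.log('\nfinish:', event.finishReason, event.usage);
  }
}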

JSON (Non-Streaming)

A complete response in a single JSON object. Use it for batch processing, logging, or simpler integrations.

Content-Type: application/json

{
  "text": "Hello! How can I help you today?",
  "usage": {
    "promptTokens": 10,
    "completionTokens": 8,
    "totalTokens": 18
  }
}

Use generateText() or runtime.chat() to return this format.
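
On the client, this format needs no stream handling (a sketch, reusing the res from the fetch example above):

// Plain JSON consumption
const { text, usage } = await res.json();
console.log(text, usage);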


Framework Examples (Streaming)

Next.js (app/api/chat/route.ts)
import { streamText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toTextStreamResponse();
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

// Create runtime once at startup
const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

// Chat endpoint - one-liner with StreamResult API!
app.post('/api/chat', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { streamText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const { messages } = JSON.parse(body);

    const result = await streamText({
      model: openai('gpt-4o'),
      system: 'You are a helpful assistant.',
      messages,
    });

    const response = result.toTextStreamResponse();
    res.writeHead(200, Object.fromEntries(response.headers));

    const reader = response.body!.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      res.write(value);
    }
    res.end();
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Framework Examples (Non-Streaming)

For use cases where you need the complete response before returning (batch processing, logging, simpler integration), use the non-streaming approach.

Response Format

Content-Type: application/json

{
  "text": "Hello! How can I help you today?",
  "usage": {
    "promptTokens": 10,
    "completionTokens": 8,
    "totalTokens": 18
  }
}

Using generateText

Next.js (app/api/chat/route.ts)
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return Response.json({
    text: result.text,
    usage: result.usage,
  });
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

app.post('/api/chat', async (req, res) => {
  const { messages } = req.body;

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  res.json({
    text: result.text,
    usage: result.usage,
  });
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const { messages } = JSON.parse(body);

    const result = await generateText({
      model: openai('gpt-4o'),
      system: 'You are a helpful assistant.',
      messages,
    });

    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      text: result.text,
      usage: result.usage,
    }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Using Runtime chat()

The runtime also provides a chat() method for non-streaming responses:

Next.js (app/api/chat/route.ts)
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();

  const { text, messages, toolCalls } = await runtime.chat(body);

  return Response.json({
    text,
    messages,
    toolCalls,
  });
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.chat(req.body);

  res.json({
    text,
    messages,
    toolCalls,
  });
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const chatRequest = JSON.parse(body);

    const { text, messages, toolCalls } = await runtime.chat(chatRequest);

    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ text, messages, toolCalls }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Using stream().collect()

You can also collect a stream into a single response:

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.stream(req.body).collect();

  res.json({ text, messages, toolCalls });
});

When to use non-streaming:

  • Background processing or batch operations
  • When you need the full response before taking action
  • Simpler integration without SSE handling
  • Logging or analytics that need complete responses

With Tools

Add tools to let the AI call functions on your server:

Streaming (app/api/chat/route.ts)
import { streamText, tool } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
    tools: {
      getWeather: tool({
        description: 'Get current weather for a city',
        parameters: z.object({
          city: z.string().describe('City name'),
        }),
        execute: async ({ city }) => {
          // fetchWeatherAPI is a placeholder for your own data source
          const data = await fetchWeatherAPI(city);
          return { temperature: data.temp, condition: data.condition };
        },
      }),
      searchProducts: tool({
        description: 'Search the product database',
        parameters: z.object({
          query: z.string(),
          limit: z.number().optional().default(10),
        }),
        execute: async ({ query, limit }) => {
          // db is a placeholder for your database client
          return await db.products.search(query, limit);
        },
      }),
    },
    maxSteps: 5,
  });

  return result.toDataStreamResponse();
}

Use toDataStreamResponse() when using tools to stream structured events including tool calls and results.

Non-streaming (app/api/chat/route.ts)
import { generateText, tool } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
    tools: {
      getWeather: tool({
        description: 'Get current weather for a city',
        parameters: z.object({
          city: z.string().describe('City name'),
        }),
        execute: async ({ city }) => {
          const data = await fetchWeatherAPI(city);
          return { temperature: data.temp, condition: data.condition };
        },
      }),
      searchProducts: tool({
        description: 'Search the product database',
        parameters: z.object({
          query: z.string(),
          limit: z.number().optional().default(10),
        }),
        execute: async ({ query, limit }) => {
          return await db.products.search(query, limit);
        },
      }),
    },
    maxSteps: 5,
  });

  return Response.json({
    text: result.text,
    toolCalls: result.toolCalls,
    toolResults: result.toolResults,
    usage: result.usage,
  });
}

The response includes all tool calls and results:

{
  "text": "The weather in Tokyo is 22°C and sunny.",
  "toolCalls": [
    { "id": "call_123", "name": "getWeather", "args": { "city": "Tokyo" } }
  ],
  "toolResults": [
    { "toolCallId": "call_123", "result": { "temperature": 22, "condition": "sunny" } }
  ],
  "usage": { "promptTokens": 50, "completionTokens": 25, "totalTokens": 75 }
}

Runtime API (Advanced)

For more control over the server, use createRuntime() instead of streamText():

app/api/chat/route.ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
  agentLoop: {
    maxIterations: 20,  // Max tool call cycles
    debug: true,        // Enable debug logging
  },
  tools: [/* server-side tools */],
});

export async function POST(request: Request) {
  return runtime.handleRequest(request);
}

Runtime Configuration

| Option | Type | Description |
| --- | --- | --- |
| provider | AIProvider | Provider instance from createOpenAI(), createAnthropic(), etc. |
| model | string | Model ID (e.g., 'gpt-4o', 'claude-sonnet-4-20250514') |
| systemPrompt | string | Default system prompt |
| agentLoop.maxIterations | number | Max tool execution cycles (default: 20) |
| agentLoop.debug | boolean | Enable debug logging |
| tools | ToolDefinition[] | Server-side tools |
| toolContext | Record<string, unknown> | Context data passed to all tool handlers |
| debug | boolean | Enable request/response logging |
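
Putting the options together, a fully configured runtime might look like this (a sketch; values are illustrative):

import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

// Illustrative configuration exercising every option from the table above
const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
  agentLoop: { maxIterations: 20, debug: false },
  tools: [],                           // server-side ToolDefinition[]
  toolContext: { userId: 'user_123' }, // available to all tool handlers
  debug: process.env.NODE_ENV !== 'production',
});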

Server-Side Persistence

Use the onFinish callback to persist messages after each request:

app/api/chat/route.ts
export async function POST(request: Request) {
  return runtime.handleRequest(request, {
    onFinish: async ({ messages, threadId }) => {
      // Save to your database
      await db.thread.upsert({
        where: { id: threadId },
        update: { messages, updatedAt: new Date() },
        create: { id: threadId, messages },
      });
    },
  });
}

Tool Context

Pass authentication or context data to all tool handlers:

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  toolContext: {
    userId: 'user_123',
    tenantId: 'tenant_456',
  },
  tools: [
    {
      name: 'get_user_data',
      description: 'Get data for the current user',
      location: 'server',
      inputSchema: { type: 'object', properties: {}, required: [] },
      handler: async (args, context) => {
        // Access context.data.userId, context.data.tenantId
        // Also available: context.headers, context.request, context.threadId
        const user = await db.user.findById(context.data.userId);
        return { success: true, data: user };
      },
    },
  ],
});

Provider Examples

OpenAI

import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
});

Anthropic

import { createRuntime } from '@yourgpt/llm-sdk';
import { createAnthropic } from '@yourgpt/llm-sdk/anthropic';

const runtime = createRuntime({
  provider: createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
  model: 'claude-sonnet-4-20250514',
});

Requires the @anthropic-ai/sdk package: npm install @anthropic-ai/sdk

Google

import { createRuntime } from '@yourgpt/llm-sdk';
import { createGoogle } from '@yourgpt/llm-sdk/google';

const runtime = createRuntime({
  provider: createGoogle({ apiKey: process.env.GOOGLE_API_KEY }),
  model: 'gemini-2.0-flash',
});

The Google provider uses an OpenAI-compatible API and requires the openai package.


StreamResult API

The runtime.stream() method returns a StreamResult object with multiple ways to consume the response:

Response Methods

| Method | Framework | Description |
| --- | --- | --- |
| toResponse() | Next.js, Cloudflare, Deno | Returns a Web Response with SSE |
| toTextResponse() | Next.js, Cloudflare, Deno | Returns a text-only Response |
| pipeToResponse(res) | Express, Node.js | Pipes SSE to a ServerResponse |
| pipeTextToResponse(res) | Express, Node.js | Pipes text to a ServerResponse |
| toReadableStream() | Any | Returns the raw ReadableStream |
| collect() | Any | Collects the full result (non-streaming) |
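
For example, a Next.js route can return the SSE stream directly with toResponse() (a sketch, assuming a runtime created as in the Express example below):

// Next.js route handler streaming via toResponse()
export async function POST(req: Request) {
  const body = await req.json();
  return runtime.stream(body).toResponse();
}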

Express Example

server.ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
});

// One-liner streaming
app.post('/api/chat', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

// Or use the built-in handler
app.post('/api/chat-alt', runtime.expressHandler());

Collecting Results

For non-streaming use cases or logging:

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.stream(req.body).collect();

  console.log('Response:', text);
  console.log('Tool calls:', toolCalls);

  res.json({ response: text });
});

Event Handlers

Process events as they stream (similar to the Anthropic SDK):

const result = runtime.stream(body)
  .on('text', (text) => console.log('Text:', text))
  .on('toolCall', (call) => console.log('Tool:', call.name))
  .on('done', (final) => console.log('Done:', final.text))
  .on('error', (err) => console.error('Error:', err));

await result.pipeToResponse(res);

Environment Variables

Store your API keys in environment variables:

.env.local
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

Access them in your API route:

import { openai } from '@yourgpt/llm-sdk/openai';

// API key is read from OPENAI_API_KEY automatically
const model = openai('gpt-4o');

// Or pass the key explicitly
const modelWithKey = openai('gpt-4o', {
  apiKey: process.env.OPENAI_API_KEY,
});

CORS Configuration

For cross-origin requests (e.g., a frontend served from a different port):

Next.js (app/api/chat/route.ts)
export async function OPTIONS() {
  return new Response(null, {
    headers: {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Methods': 'POST, OPTIONS',
      'Access-Control-Allow-Headers': 'Content-Type',
    },
  });
}

export async function POST(req: Request) {
  // ... your handler

  const response = result.toTextStreamResponse();

  // Add CORS headers
  response.headers.set('Access-Control-Allow-Origin', '*');

  return response;
}
Node.js (server.ts)
createServer(async (req, res) => {
  // Handle preflight
  if (req.method === 'OPTIONS') {
    res.writeHead(204, {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Methods': 'POST, OPTIONS',
      'Access-Control-Allow-Headers': 'Content-Type',
    });
    res.end();
    return;
  }

  // Add CORS headers to response
  res.setHeader('Access-Control-Allow-Origin', '*');

  // ... your handler
});
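
If you use Express with the cors middleware (as in the earlier examples), the equivalent configuration might look like this (a sketch; tighten the origin for production):

import cors from 'cors';

// Allow only your frontend origin instead of '*'
app.use(cors({
  origin: 'http://localhost:3000',
  methods: ['POST', 'OPTIONS'],
  allowedHeaders: ['Content-Type'],
}));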

Error Handling

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    const result = await streamText({
      model: openai('gpt-4o'),
      messages,
    });

    return result.toTextStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);

    return Response.json(
      { error: 'Failed to process chat request' },
      { status: 500 }
    );
  }
}
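
An Express equivalent might wrap the streaming call like this (a sketch using the runtime from earlier examples; note that once streaming has started, headers can no longer be changed):

app.post('/api/chat', async (req, res) => {
  try {
    await runtime.stream(req.body).pipeToResponse(res);
  } catch (error) {
    console.error('Chat error:', error);
    if (!res.headersSent) {
      res.status(500).json({ error: 'Failed to process chat request' });
    } else {
      res.end(); // stream already started; just terminate it
    }
  }
});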

Request Validation

Validate incoming requests with Zod:

import { z } from 'zod';

const ChatRequestSchema = z.object({
  messages: z.array(z.object({
    role: z.enum(['user', 'assistant', 'system']),
    content: z.string(),
  })),
});

export async function POST(req: Request) {
  const body = await req.json();

  const parsed = ChatRequestSchema.safeParse(body);
  if (!parsed.success) {
    return Response.json(
      { error: 'Invalid request', details: parsed.error.errors },
      { status: 400 }
    );
  }

  const { messages } = parsed.data;
  // ... continue with validated data
}

Connecting Frontend

Point your frontend to your API endpoint:

app/providers.tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/chat">
      {children}
    </CopilotProvider>
  );
}

For a separate backend server:

<CopilotProvider runtimeUrl="http://localhost:3001/api/chat">
