Server Setup

Configure your backend API for the Copilot SDK

Set up your backend API to handle chat requests from the Copilot SDK.


Overview

The Copilot SDK frontend connects to your backend API endpoint. Your server:

  1. Receives chat messages from the frontend
  2. Calls the LLM with your configuration
  3. Streams the response back to the client

Flow: Frontend (React UI) → POST /api/chat → Backend (Your API) → streamed response → Frontend

REST API Contract

Request

Endpoint: POST /api/chat

{
  "messages": [
    { "role": "user", "content": "Hello!" }
  ]
}
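
For reference, here is how a client could exercise this contract manually with fetch (a minimal sketch; the Copilot SDK frontend sends this request for you):

// Manual request against the chat endpoint (the SDK normally does this)
const res = await fetch('/api/chat', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    messages: [{ role: 'user', content: 'Hello!' }],
  }),
});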

Response

The SDK supports three response formats:

Text Stream

Simple text streaming for basic chat (no tools).

Content-Type: text/plain; charset=utf-8

Hello! How can I help you today?

Use result.toTextStreamResponse() to return this format.
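
If you ever need to consume this format without the SDK, the body can be read directly with a stream reader (a minimal Node-side sketch, reusing the res from the fetch example above):

// Read the text/plain stream chunk by chunk (the SDK does this for you)
const reader = res.body!.getReader();
const decoder = new TextDecoder();
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  process.stdout.write(decoder.decode(value, { stream: true }));
}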

Data Stream (SSE)

SSE format with structured events. Use it when you need tools, usage info, or step-by-step data.

Content-Type: text/event-stream

data: {"type":"text-delta","text":"Hello"}
data: {"type":"text-delta","text":"!"}
data: {"type":"finish","finishReason":"stop","usage":{"promptTokens":10,"completionTokens":5}}
data: [DONE]

Use result.toDataStreamResponse() to return this format.
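
If you parse this stream yourself rather than through the SDK, each data: line carries a JSON event until the [DONE] sentinel. A rough sketch, covering only the event types shown above:

// Parse one SSE line from the data stream (other event types may exist)
function handleSSELine(line: string) {
  if (!line.startsWith('data: ')) return;
  const payload = line.slice('data: '.length);
  if (payload === '[DONE]') return; // end of stream
  const event = JSON.parse(payload);
  if (event.type === 'text-delta') {
    process.stdout.write(event.text);
  } else if (event.type === 'finish') {
    console.log('\nfinish:', event.finishReason, event.usage);
  }
}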

JSON (Non-Streaming)

A complete response in a single JSON object. Use it for batch processing, logging, or simpler integrations.

Content-Type: application/json

{
  "text": "Hello! How can I help you today?",
  "usage": {
    "promptTokens": 10,
    "completionTokens": 8,
    "totalTokens": 18
  }
}

Use generateText() or runtime.chat() to return this format.
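
On the client, this format needs no stream handling (a sketch, reusing the res from the fetch example above):

// Plain JSON consumption
const { text, usage } = await res.json();
console.log(text, usage);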


Framework Examples (Streaming)

Next.js (app/api/chat/route.ts)
import { streamText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return result.toTextStreamResponse();
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

// Create runtime once at startup
const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

// Chat endpoint - one-liner with StreamResult API!
app.post('/api/chat', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { streamText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const { messages } = JSON.parse(body);

    const result = await streamText({
      model: openai('gpt-4o'),
      system: 'You are a helpful assistant.',
      messages,
    });

    const response = result.toTextStreamResponse();
    res.writeHead(200, Object.fromEntries(response.headers));

    const reader = response.body!.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      res.write(value);
    }
    res.end();
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Framework Examples (Non-Streaming)

For use cases where you need the complete response before returning (batch processing, logging, simpler integration), use the non-streaming approach.

Response Format

Content-Type: application/json

{
  "text": "Hello! How can I help you today?",
  "usage": {
    "promptTokens": 10,
    "completionTokens": 8,
    "totalTokens": 18
  }
}

Using generateText

Next.js (app/api/chat/route.ts)
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  return Response.json({
    text: result.text,
    usage: result.usage,
  });
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

app.post('/api/chat', async (req, res) => {
  const { messages } = req.body;

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
  });

  res.json({
    text: result.text,
    usage: result.usage,
  });
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { generateText } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const { messages } = JSON.parse(body);

    const result = await generateText({
      model: openai('gpt-4o'),
      system: 'You are a helpful assistant.',
      messages,
    });

    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({
      text: result.text,
      usage: result.usage,
    }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Using Runtime chat()

The runtime also provides a chat() method for non-streaming responses:

Next.js (app/api/chat/route.ts)
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

export async function POST(req: Request) {
  const body = await req.json();

  const { text, messages, toolCalls } = await runtime.chat(body);

  return Response.json({
    text,
    messages,
    toolCalls,
  });
}
Express (server.ts)
import express from 'express';
import cors from 'cors';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(cors());
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.chat(req.body);

  res.json({
    text,
    messages,
    toolCalls,
  });
});

app.listen(3001, () => console.log('Server on http://localhost:3001'));
Node.js (server.ts)
import { createServer } from 'http';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
});

createServer(async (req, res) => {
  if (req.method === 'POST' && req.url === '/api/chat') {
    const body = await getBody(req);
    const chatRequest = JSON.parse(body);

    const { text, messages, toolCalls } = await runtime.chat(chatRequest);

    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ text, messages, toolCalls }));
  } else {
    res.writeHead(404);
    res.end();
  }
}).listen(3001);

function getBody(req: any): Promise<string> {
  return new Promise((resolve, reject) => {
    let data = '';
    req.on('data', (chunk: any) => (data += chunk));
    req.on('end', () => resolve(data));
    req.on('error', reject);
  });
}

Using stream().collect()

You can also collect a stream into a single response:

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.stream(req.body).collect();

  res.json({ text, messages, toolCalls });
});

When to use non-streaming:

  • Background processing or batch operations
  • When you need the full response before taking action
  • Simpler integration without SSE handling
  • Logging or analytics that need complete responses

With Tools

Add tools to let the AI call functions on your server:

Streaming (app/api/chat/route.ts)
import { streamText, tool } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await streamText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
    tools: {
      getWeather: tool({
        description: 'Get current weather for a city',
        parameters: z.object({
          city: z.string().describe('City name'),
        }),
        execute: async ({ city }) => {
          // fetchWeatherAPI is a placeholder for your own data source
          const data = await fetchWeatherAPI(city);
          return { temperature: data.temp, condition: data.condition };
        },
      }),
      searchProducts: tool({
        description: 'Search the product database',
        parameters: z.object({
          query: z.string(),
          limit: z.number().optional().default(10),
        }),
        execute: async ({ query, limit }) => {
          // db is a placeholder for your database client
          return await db.products.search(query, limit);
        },
      }),
    },
    maxSteps: 5,
  });

  return result.toDataStreamResponse();
}

Use toDataStreamResponse() when using tools to stream structured events including tool calls and results.

Non-streaming (app/api/chat/route.ts)
import { generateText, tool } from '@yourgpt/llm-sdk';
import { openai } from '@yourgpt/llm-sdk/openai';
import { z } from 'zod';

export async function POST(req: Request) {
  const { messages } = await req.json();

  const result = await generateText({
    model: openai('gpt-4o'),
    system: 'You are a helpful assistant.',
    messages,
    tools: {
      getWeather: tool({
        description: 'Get current weather for a city',
        parameters: z.object({
          city: z.string().describe('City name'),
        }),
        execute: async ({ city }) => {
          const data = await fetchWeatherAPI(city);
          return { temperature: data.temp, condition: data.condition };
        },
      }),
      searchProducts: tool({
        description: 'Search the product database',
        parameters: z.object({
          query: z.string(),
          limit: z.number().optional().default(10),
        }),
        execute: async ({ query, limit }) => {
          return await db.products.search(query, limit);
        },
      }),
    },
    maxSteps: 5,
  });

  return Response.json({
    text: result.text,
    toolCalls: result.toolCalls,
    toolResults: result.toolResults,
    usage: result.usage,
  });
}

The response includes all tool calls and results:

{
  "text": "The weather in Tokyo is 22°C and sunny.",
  "toolCalls": [
    { "id": "call_123", "name": "getWeather", "args": { "city": "Tokyo" } }
  ],
  "toolResults": [
    { "toolCallId": "call_123", "result": { "temperature": 22, "condition": "sunny" } }
  ],
  "usage": { "promptTokens": 50, "completionTokens": 25, "totalTokens": 75 }
}

Runtime API (Advanced)

For more control over the server, use createRuntime() instead of streamText():

app/api/chat/route.ts
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
  agentLoop: {
    maxIterations: 20,  // Max tool call cycles
    debug: true,        // Enable debug logging
  },
  tools: [/* server-side tools */],
});

export async function POST(request: Request) {
  return runtime.handleRequest(request);
}

Runtime Configuration

| Option | Type | Description |
| --- | --- | --- |
| provider | AIProvider | Provider instance from createOpenAI(), createAnthropic(), etc. |
| model | string | Model ID (e.g., 'gpt-4o', 'claude-sonnet-4-20250514') |
| systemPrompt | string | Default system prompt |
| agentLoop.maxIterations | number | Max tool execution cycles (default: 20) |
| agentLoop.debug | boolean | Enable debug logging |
| tools | ToolDefinition[] | Server-side tools |
| toolContext | Record<string, unknown> | Context data passed to all tool handlers |
| debug | boolean | Enable request/response logging |
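
Putting the options together, a fully configured runtime might look like this (a sketch; values are illustrative):

import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

// Illustrative configuration exercising every option from the table above
const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  systemPrompt: 'You are a helpful assistant.',
  agentLoop: { maxIterations: 20, debug: false },
  tools: [],                           // server-side ToolDefinition[]
  toolContext: { userId: 'user_123' }, // available to all tool handlers
  debug: process.env.NODE_ENV !== 'production',
});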

Server-Side Persistence

Use the onFinish callback to persist messages after each request:

app/api/chat/route.ts
export async function POST(request: Request) {
  return runtime.handleRequest(request, {
    onFinish: async ({ messages, threadId }) => {
      // Save to your database
      await db.thread.upsert({
        where: { id: threadId },
        update: { messages, updatedAt: new Date() },
        create: { id: threadId, messages },
      });
    },
  });
}

Tool Context

Pass authentication or context data to all tool handlers:

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
  toolContext: {
    userId: 'user_123',
    tenantId: 'tenant_456',
  },
  tools: [
    {
      name: 'get_user_data',
      description: 'Get data for the current user',
      location: 'server',
      inputSchema: { type: 'object', properties: {}, required: [] },
      handler: async (args, context) => {
        // Access context.data.userId, context.data.tenantId
        // Also available: context.headers, context.request, context.threadId
        const user = await db.user.findById(context.data.userId);
        return { success: true, data: user };
      },
    },
  ],
});

Provider Examples

OpenAI

import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
});

Anthropic

import { createRuntime } from '@yourgpt/llm-sdk';
import { createAnthropic } from '@yourgpt/llm-sdk/anthropic';

const runtime = createRuntime({
  provider: createAnthropic({ apiKey: process.env.ANTHROPIC_API_KEY }),
  model: 'claude-sonnet-4-20250514',
});

Requires the @anthropic-ai/sdk package: npm install @anthropic-ai/sdk

Google

import { createRuntime } from '@yourgpt/llm-sdk';
import { createGoogle } from '@yourgpt/llm-sdk/google';

const runtime = createRuntime({
  provider: createGoogle({ apiKey: process.env.GOOGLE_API_KEY }),
  model: 'gemini-2.0-flash',
});

The Google provider uses an OpenAI-compatible API and requires the openai package.


StreamResult API

The runtime.stream() method returns a StreamResult object with multiple ways to consume the response:

Response Methods

| Method | Framework | Description |
| --- | --- | --- |
| toResponse() | Next.js, Cloudflare, Deno | Returns a Web Response with SSE |
| toTextResponse() | Next.js, Cloudflare, Deno | Returns a text-only Response |
| pipeToResponse(res) | Express, Node.js | Pipes SSE to a ServerResponse |
| pipeTextToResponse(res) | Express, Node.js | Pipes text to a ServerResponse |
| toReadableStream() | Any | Returns the raw ReadableStream |
| collect() | Any | Collects the full result (non-streaming) |
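
For example, a Next.js route can return the SSE stream directly with toResponse() (a sketch, assuming a runtime created as in the Express example below):

// Next.js route handler streaming via toResponse()
export async function POST(req: Request) {
  const body = await req.json();
  return runtime.stream(body).toResponse();
}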

Express Example

server.ts
import express from 'express';
import { createRuntime } from '@yourgpt/llm-sdk';
import { createOpenAI } from '@yourgpt/llm-sdk/openai';

const app = express();
app.use(express.json());

const runtime = createRuntime({
  provider: createOpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  model: 'gpt-4o',
});

// One-liner streaming
app.post('/api/chat', async (req, res) => {
  await runtime.stream(req.body).pipeToResponse(res);
});

// Or use the built-in handler
app.post('/api/chat-alt', runtime.expressHandler());

Collecting Results

For non-streaming use cases or logging:

app.post('/api/chat', async (req, res) => {
  const { text, messages, toolCalls } = await runtime.stream(req.body).collect();

  console.log('Response:', text);
  console.log('Tool calls:', toolCalls);

  res.json({ response: text });
});

Event Handlers

Process events as they stream (similar to the Anthropic SDK):

const result = runtime.stream(body)
  .on('text', (text) => console.log('Text:', text))
  .on('toolCall', (call) => console.log('Tool:', call.name))
  .on('done', (final) => console.log('Done:', final.text))
  .on('error', (err) => console.error('Error:', err));

await result.pipeToResponse(res);

Environment Variables

Store your API keys in environment variables:

.env.local
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GOOGLE_API_KEY=...

Access them in your API route:

import { openai } from '@yourgpt/llm-sdk/openai';

// API key is read from OPENAI_API_KEY automatically
const model = openai('gpt-4o');

// Or pass the key explicitly
const modelWithKey = openai('gpt-4o', {
  apiKey: process.env.OPENAI_API_KEY,
});

CORS Configuration

For cross-origin requests (e.g., a frontend served from a different port):

Next.js (app/api/chat/route.ts)
export async function OPTIONS() {
  return new Response(null, {
    headers: {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Methods': 'POST, OPTIONS',
      'Access-Control-Allow-Headers': 'Content-Type',
    },
  });
}

export async function POST(req: Request) {
  // ... your handler

  const response = result.toTextStreamResponse();

  // Add CORS headers
  response.headers.set('Access-Control-Allow-Origin', '*');

  return response;
}
Node.js (server.ts)
createServer(async (req, res) => {
  // Handle preflight
  if (req.method === 'OPTIONS') {
    res.writeHead(204, {
      'Access-Control-Allow-Origin': '*',
      'Access-Control-Allow-Methods': 'POST, OPTIONS',
      'Access-Control-Allow-Headers': 'Content-Type',
    });
    res.end();
    return;
  }

  // Add CORS headers to response
  res.setHeader('Access-Control-Allow-Origin', '*');

  // ... your handler
});
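
If you use Express with the cors middleware (as in the earlier examples), the equivalent configuration might look like this (a sketch; tighten the origin for production):

import cors from 'cors';

// Allow only your frontend origin instead of '*'
app.use(cors({
  origin: 'http://localhost:3000',
  methods: ['POST', 'OPTIONS'],
  allowedHeaders: ['Content-Type'],
}));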

Error Handling

export async function POST(req: Request) {
  try {
    const { messages } = await req.json();

    const result = await streamText({
      model: openai('gpt-4o'),
      messages,
    });

    return result.toTextStreamResponse();
  } catch (error) {
    console.error('Chat error:', error);

    return Response.json(
      { error: 'Failed to process chat request' },
      { status: 500 }
    );
  }
}
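
An Express equivalent might wrap the streaming call like this (a sketch using the runtime from earlier examples; note that once streaming has started, headers can no longer be changed):

app.post('/api/chat', async (req, res) => {
  try {
    await runtime.stream(req.body).pipeToResponse(res);
  } catch (error) {
    console.error('Chat error:', error);
    if (!res.headersSent) {
      res.status(500).json({ error: 'Failed to process chat request' });
    } else {
      res.end(); // stream already started; just terminate it
    }
  }
});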

Request Validation

Validate incoming requests with Zod:

import { z } from 'zod';

const ChatRequestSchema = z.object({
  messages: z.array(z.object({
    role: z.enum(['user', 'assistant', 'system']),
    content: z.string(),
  })),
});

export async function POST(req: Request) {
  const body = await req.json();

  const parsed = ChatRequestSchema.safeParse(body);
  if (!parsed.success) {
    return Response.json(
      { error: 'Invalid request', details: parsed.error.errors },
      { status: 400 }
    );
  }

  const { messages } = parsed.data;
  // ... continue with validated data
}

Connecting Frontend

Point your frontend to your API endpoint:

app/providers.tsx
'use client';

import { CopilotProvider } from '@yourgpt/copilot-sdk/react';

export function Providers({ children }: { children: React.ReactNode }) {
  return (
    <CopilotProvider runtimeUrl="/api/chat">
      {children}
    </CopilotProvider>
  );
}

For a separate backend server:

<CopilotProvider runtimeUrl="http://localhost:3001/api/chat">
