AI Models / 2026-04-07 / Advanced

Rork Max × OpenAI Responses API: A Guide to Building Stateful AI Agents in Mobile Apps (2026)

A complete guide to implementing stateful AI agents in Rork Max apps using the OpenAI Responses API. Learn how to integrate built-in tools like web search, file search, and code interpreter via Cloudflare Workers, with practical monetization strategies for indie developers.



Setup and context: Why the Responses API Matters Now

In March 2025, OpenAI officially released the Responses API — a next-generation successor to the Assistants API that combines the simplicity of Chat Completions with the statefulness and tool integration capabilities that previously required complex infrastructure to build.

For indie developers building apps with Rork Max, this is a game-changer. Features that once required separate implementations for chat, file search, and web browsing can now be achieved through a single, unified API call.

In this guide, we'll walk through everything you need to integrate the Responses API into a Rork Max app: a stateful AI agent with a Cloudflare Workers backend and a polished React Native UI on the frontend. By the end, you'll have an architecture that's ready to ship.

Responses API vs. Chat Completions: What's Actually Different

The most significant innovation in the Responses API is server-side conversation state management. With Chat Completions, every API call required sending the entire conversation history. With the Responses API, you simply pass previous_response_id and the context continues seamlessly.

Here's what sets it apart:

Stateful threads: Save response.id and pass it as previous_response_id on the next call — no need to manage history arrays
Built-in tools: web_search_preview (live web search), file_search (vector store retrieval), and code_interpreter (sandboxed code execution) are available natively
Streaming support: Full Server-Sent Events streaming for real-time responses
Multimodal input: Text, images, and audio can all be passed as input

For indie developers, the most welcome change is that the complex thread and run management of the Assistants API is gone entirely. Powerful AI capabilities are now accessible with far less code.
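
To make the difference concrete, here is a minimal sketch that only shapes the two request bodies and does not call any API. The helper names (buildChatCompletionsBody, buildResponsesBody) and the sample ID resp_abc123 are hypothetical; input and previous_response_id are the fields discussed above.

```typescript
// Hypothetical request-body builders for illustration; they only shape JSON
// and do not call any API.
interface ChatTurn {
  role: 'user' | 'assistant';
  content: string;
}

interface ResponsesBody {
  model: string;
  input: string;
  previous_response_id?: string;
}

// Chat Completions style: the client resends the full history on every call.
function buildChatCompletionsBody(model: string, priorTurns: ChatTurn[], next: string) {
  const messages: ChatTurn[] = [...priorTurns, { role: 'user', content: next }];
  return { model, messages };
}

// Responses API style: only the new input plus a pointer to the previous response.
function buildResponsesBody(model: string, next: string, previousResponseId?: string): ResponsesBody {
  const body: ResponsesBody = { model, input: next };
  // The field is left out entirely on the first turn of a thread.
  if (previousResponseId) body.previous_response_id = previousResponseId;
  return body;
}

const earlierTurns: ChatTurn[] = [
  { role: 'user', content: 'Hi' },
  { role: 'assistant', content: 'Hello! How can I help?' },
];

const chatBody = buildChatCompletionsBody('gpt-4o', earlierTurns, 'Summarize our chat');
const responsesBody = buildResponsesBody('gpt-4o', 'Summarize our chat', 'resp_abc123');
```

Keep in mind that chaining with previous_response_id simplifies client code and shrinks the request, but it does not reduce token usage: earlier turns in the chain still count as input tokens for billing.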

Architecture: Rork + Cloudflare Workers + OpenAI

System Overview

In this setup, the Rork Max app (React Native / Expo) is the client, and the backend is a set of Next.js App Router route handlers deployed to Cloudflare Workers through @opennextjs/cloudflare. Here's how we structure the AI agent backend:

Rork Max App (React Native / Expo)
    │
    ▼
Cloudflare Workers (Edge Backend)
    │  ├── /api/agent/chat    ← Main conversation endpoint
    │  ├── /api/agent/files   ← File upload
    │  └── /api/agent/history ← Conversation history
    │
    ▼
OpenAI Responses API
    │  ├── web_search_preview ← Live web search
    │  ├── file_search        ← Vector store retrieval
    │  └── code_interpreter   ← Sandboxed code execution
    │
    ▼
Cloudflare KV (Thread ID and session state management)

KV Schema Design

We use Cloudflare KV to persist conversation thread state across requests:

// Thread ID storage key format
// key: thread:{userId}:{agentType}
// value: { responseId: string, createdAt: number, messageCount: number }
 
interface ThreadState {
  responseId: string;        // Latest response_id
  createdAt: number;         // Unix timestamp
  messageCount: number;      // Total messages (for cost management)
  agentType: 'general' | 'search' | 'analyst';
}
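
Values read back from KV are plain strings, so it helps to build keys and parse state in one place. The following is a minimal sketch under that assumption; threadKey and parseThreadState are hypothetical helper names, and the ThreadState shape is repeated so the block is self-contained.

```typescript
type AgentType = 'general' | 'search' | 'analyst';

interface ThreadState {
  responseId: string;
  createdAt: number;
  messageCount: number;
  agentType: AgentType;
}

const AGENT_TYPES: AgentType[] = ['general', 'search', 'analyst'];

// Builds the KV key in the thread:{userId}:{agentType} format.
function threadKey(userId: string, agentType: AgentType): string {
  return `thread:${userId}:${agentType}`;
}

// Parses a raw KV value, returning null for missing or malformed data
// so the caller can fall back to starting a fresh thread.
function parseThreadState(raw: string | null): ThreadState | null {
  if (!raw) return null;
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== 'object' || value === null) return null;

  const { responseId, createdAt, messageCount, agentType } = value as Record<string, unknown>;
  if (typeof responseId !== 'string') return null;
  if (typeof createdAt !== 'number' || typeof messageCount !== 'number') return null;

  const knownType = AGENT_TYPES.find(t => t === agentType);
  if (!knownType) return null;

  return { responseId, createdAt, messageCount, agentType: knownType };
}
```

Returning null instead of throwing means a corrupted entry degrades to a new conversation rather than a failed request.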

Step 1: Building the Cloudflare Workers Endpoint

Installing Dependencies

# Run from the root of your Rork project
npm install openai zod

Main Conversation Endpoint

// src/app/api/agent/chat/route.ts
 
import OpenAI from 'openai';
import { getCloudflareContext } from '@opennextjs/cloudflare';
 
interface ChatRequest {
  message: string;
  agentType?: 'general' | 'search' | 'analyst';
  resetThread?: boolean;
}
 
export async function POST(request: Request) {
  const { env } = await getCloudflareContext();
  const openai = new OpenAI({ apiKey: env.OPENAI_API_KEY });
 
  // Auth check — Premium members only
  const authHeader = request.headers.get('Authorization');
  if (!authHeader || !await verifyPremiumToken(authHeader, env)) {
    return Response.json({ error: 'Premium membership required' }, { status: 403 });
  }
 
  const userId = await getUserIdFromToken(authHeader);
  const body: ChatRequest = await request.json();
 
  // Retrieve thread state from KV
  const threadKey = `thread:${userId}:${body.agentType || 'general'}`;
  const threadStateStr = await env.KV.get(threadKey);
  const threadState: ThreadState | null = threadStateStr
    ? JSON.parse(threadStateStr)
    : null;
 
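  // Handle thread reset: clear the stored state and return without calling the model,
  // so a reset request does not consume tokens
  if (body.resetThread) {
    await env.KV.delete(threadKey);
    return Response.json({ reset: true });
  }
  const previousResponseId = threadState?.responseId;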
 
  // Agent-specific system prompts
  const systemPrompt = getSystemPrompt(body.agentType || 'general');
  const tools = getToolsForAgent(body.agentType || 'general');
 
  try {
    // Call Responses API with streaming
    const stream = await openai.responses.stream({
      model: 'gpt-4o',
      input: body.message,
      previous_response_id: previousResponseId,
      instructions: systemPrompt,
      tools: tools,
      stream: true,
    });
 
    const encoder = new TextEncoder();
    const readable = new ReadableStream({
      async start(controller) {
        let finalResponseId: string | undefined;
        let fullText = '';
 
        for await (const event of stream) {
          if (event.type === 'response.output_text.delta') {
            const chunk = `data: ${JSON.stringify({
              type: 'text_delta',
              delta: event.delta,
            })}\n\n`;
            controller.enqueue(encoder.encode(chunk));
            fullText += event.delta;
          }
 
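          // 'response.completed' is the final event of a successful Responses API stream
          if (event.type === 'response.completed') {
            finalResponseId = event.response.id;
 
            // Built-in tool invocations appear as '*_call' output items;
            // map them to the tool names the UI uses for badges
            const toolNames: Record<string, string> = {
              web_search_call: 'web_search_preview',
              file_search_call: 'file_search',
              code_interpreter_call: 'code_interpreter',
            };
            const toolUses = event.response.output
              .filter(item => item.type in toolNames)
              .map(item => ({ type: item.type, name: toolNames[item.type] }));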
 
            if (toolUses.length > 0) {
              const toolChunk = `data: ${JSON.stringify({
                type: 'tool_uses',
                tools: toolUses,
              })}\n\n`;
              controller.enqueue(encoder.encode(toolChunk));
            }
 
            const doneChunk = `data: ${JSON.stringify({
              type: 'done',
              responseId: finalResponseId,
            })}\n\n`;
            controller.enqueue(encoder.encode(doneChunk));
          }
        }
 
        // Persist thread state to KV (TTL: 7 days)
        if (finalResponseId) {
          const newState: ThreadState = {
            responseId: finalResponseId,
            createdAt: threadState?.createdAt || Date.now(),
            messageCount: (threadState?.messageCount || 0) + 1,
            agentType: body.agentType || 'general',
          };
          await env.KV.put(threadKey, JSON.stringify(newState), {
            expirationTtl: 60 * 60 * 24 * 7,
          });
        }
 
        controller.close();
      },
    });
 
    return new Response(readable, {
      headers: {
        'Content-Type': 'text/event-stream',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
      },
    });
  } catch (error) {
    console.error('OpenAI Responses API error:', error);
    return Response.json({ error: 'AI service error' }, { status: 500 });
  }
}
 
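function getToolsForAgent(agentType: string): OpenAI.Responses.Tool[] {
  // Typed explicitly so the literal `type` values are checked against the SDK's tool union
  const baseTools: OpenAI.Responses.Tool[] = [];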
  if (agentType === 'search' || agentType === 'general') {
    baseTools.push({ type: 'web_search_preview' });
  }
  if (agentType === 'analyst') {
    baseTools.push({ type: 'code_interpreter', container: { type: 'auto' } });
  }
  return baseTools;
}
 
function getSystemPrompt(agentType: string): string {
  const prompts = {
    general: `You are a premium assistant for Rork Lab.
Provide specific, practical advice on app development questions.
Include code examples where appropriate, and keep Rork's tech stack
(React Native / Expo / Cloudflare Workers) in mind.`,
 
    search: `You are a research assistant with access to real-time web data.
Use the web search tool to provide accurate, up-to-date information.
Always cite your sources with URLs and indicate how recent the information is.`,
 
    analyst: `You are a data analysis expert.
Use the code interpreter to analyze data provided by the user,
generate charts, and deliver clear statistical insights.`,
  };
  return prompts[agentType as keyof typeof prompts] || prompts.general;
}
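
One gap in the handler above is that an upstream failure in the middle of a stream is not reported to the client. A minimal sketch of one way to handle it, assuming the upstream events are reduced to the two types the route cares about: toSseFrame and toClientFrames are hypothetical helpers, and the 'error' frame is an addition to the protocol used so far.

```typescript
type ClientEvent =
  | { type: 'text_delta'; delta: string }
  | { type: 'done'; responseId: string }
  | { type: 'error'; message: string };

// Formats one event as a Server-Sent Events frame.
function toSseFrame(payload: ClientEvent): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Minimal stand-in for the upstream stream events (an assumption for this sketch).
type UpstreamEvent =
  | { type: 'response.output_text.delta'; delta: string }
  | { type: 'response.completed'; response: { id: string } };

// Converts upstream events into SSE frames. If the upstream throws midway,
// the client still receives a final 'error' frame instead of a silent cutoff.
async function* toClientFrames(upstream: AsyncIterable<UpstreamEvent>): AsyncGenerator<string> {
  try {
    for await (const item of upstream) {
      if (item.type === 'response.output_text.delta') {
        yield toSseFrame({ type: 'text_delta', delta: item.delta });
      } else if (item.type === 'response.completed') {
        yield toSseFrame({ type: 'done', responseId: item.response.id });
      }
    }
  } catch {
    // A generic message avoids leaking upstream error details to the client.
    yield toSseFrame({ type: 'error', message: 'AI service error' });
  }
}
```

Inside the ReadableStream's start callback, each yielded frame would be passed through the TextEncoder and enqueued, with controller.close() called once the generator finishes.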

Step 2: Building the Rork Frontend

Custom Hook: useAgentChat

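// hooks/useAgentChat.ts
// Note: React Native's built-in fetch does not expose response.body as a readable stream.
// In Expo SDK 52+, `import { fetch } from 'expo/fetch'` provides a streaming-capable fetch.
// On device, requests also need an absolute URL (e.g. `${API_BASE_URL}/api/agent/chat`).
// getAuthToken() is assumed to be provided by your app's auth layer.
import { useState, useCallback, useRef } from 'react';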
 
interface Message {
  id: string;
  role: 'user' | 'assistant';
  content: string;
  toolUses?: { type: string; name: string }[];
  timestamp: Date;
}
 
interface UseAgentChatOptions {
  agentType?: 'general' | 'search' | 'analyst';
  onError?: (error: Error) => void;
}
 
export function useAgentChat({ agentType = 'general', onError }: UseAgentChatOptions = {}) {
  const [messages, setMessages] = useState<Message[]>([]);
  const [isStreaming, setIsStreaming] = useState(false);
  const [currentStreamText, setCurrentStreamText] = useState('');
  const abortControllerRef = useRef<AbortController | null>(null);
 
  const sendMessage = useCallback(async (content: string) => {
    if (isStreaming) return;
 
    const userMessage: Message = {
      id: `user-${Date.now()}`,
      role: 'user',
      content,
      timestamp: new Date(),
    };
    setMessages(prev => [...prev, userMessage]);
    setIsStreaming(true);
    setCurrentStreamText('');
 
    const controller = new AbortController();
    abortControllerRef.current = controller;
 
    try {
      const response = await fetch('/api/agent/chat', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Authorization': `Bearer ${await getAuthToken()}`,
        },
        body: JSON.stringify({ message: content, agentType }),
        signal: controller.signal,
      });
 
      // Include the status code so callers can tell a 429 from a 403 or 500
      if (!response.ok) throw new Error(`API request failed (${response.status})`);
 
      const reader = response.body!.getReader();
      const decoder = new TextDecoder();
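      let accumulatedText = '';
      let toolUses: { type: string; name: string }[] = [];
      let buffer = ''; // Holds a partial SSE line when a frame is split across network chunks
 
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
 
        buffer += decoder.decode(value, { stream: true });
        const lines = buffer.split('\n');
        buffer = lines.pop() ?? ''; // Keep the trailing incomplete line for the next read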
 
        for (const line of lines) {
          if (!line.startsWith('data: ')) continue;
          const data = JSON.parse(line.slice(6));
 
          if (data.type === 'text_delta') {
            accumulatedText += data.delta;
            setCurrentStreamText(accumulatedText);
          } else if (data.type === 'tool_uses') {
            toolUses = data.tools;
          } else if (data.type === 'done') {
            const assistantMessage: Message = {
              id: `assistant-${Date.now()}`,
              role: 'assistant',
              content: accumulatedText,
              toolUses: toolUses.length > 0 ? toolUses : undefined,
              timestamp: new Date(),
            };
            setMessages(prev => [...prev, assistantMessage]);
            setCurrentStreamText('');
          }
        }
      }
    } catch (error) {
      if (error instanceof Error && error.name !== 'AbortError') {
        onError?.(error);
      }
    } finally {
      setIsStreaming(false);
      abortControllerRef.current = null;
    }
  }, [isStreaming, agentType, onError]);
 
  const stopStreaming = useCallback(() => {
    abortControllerRef.current?.abort();
    setIsStreaming(false);
    setCurrentStreamText('');
  }, []);
 
  const resetThread = useCallback(async () => {
    setMessages([]);
    await fetch('/api/agent/chat', {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': `Bearer ${await getAuthToken()}`,
      },
      body: JSON.stringify({ message: '---reset---', agentType, resetThread: true }),
    });
  }, [agentType]);
 
  return { messages, isStreaming, currentStreamText, sendMessage, stopStreaming, resetThread };
}
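
The event handling inside the read loop can also be expressed as a pure state transition, which is easier to unit test than logic embedded in a hook. A minimal sketch, assuming the same three event types; ChatState and applyStreamEvent are hypothetical names, and timestamps are left out for brevity.

```typescript
interface ToolUse {
  type: string;
  name: string;
}

interface ChatMessage {
  id: string;
  role: 'user' | 'assistant';
  content: string;
  toolUses?: ToolUse[];
}

type StreamEvent =
  | { type: 'text_delta'; delta: string }
  | { type: 'tool_uses'; tools: ToolUse[] }
  | { type: 'done'; responseId: string };

interface ChatState {
  messages: ChatMessage[];
  streamText: string;      // Text of the in-progress assistant reply
  pendingTools: ToolUse[]; // Tools reported for the in-progress reply
}

// Pure state transition: returns a new state and never mutates the input.
function applyStreamEvent(state: ChatState, event: StreamEvent): ChatState {
  switch (event.type) {
    case 'text_delta':
      return { ...state, streamText: state.streamText + event.delta };
    case 'tool_uses':
      return { ...state, pendingTools: event.tools };
    case 'done': {
      const reply: ChatMessage = {
        id: `assistant-${event.responseId}`, // Response IDs make stable, unique keys
        role: 'assistant',
        content: state.streamText,
        ...(state.pendingTools.length > 0 ? { toolUses: state.pendingTools } : {}),
      };
      return { messages: [...state.messages, reply], streamText: '', pendingTools: [] };
    }
  }
}
```

In the hook, this could back a useReducer call, so the streaming loop only dispatches parsed events.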

Chat UI Component

// components/AgentChatScreen.tsx
import React, { useState, useRef, useEffect } from 'react';
import {
  View, ScrollView, TextInput, TouchableOpacity,
  Text, StyleSheet, ActivityIndicator, Animated,
} from 'react-native';
import { useAgentChat } from '../hooks/useAgentChat';
 
type AgentType = 'general' | 'search' | 'analyst';
 
export function AgentChatScreen({ agentType = 'general' }: { agentType?: AgentType }) {
  const [inputText, setInputText] = useState('');
  const scrollViewRef = useRef<ScrollView>(null);
  const cursorOpacity = useRef(new Animated.Value(1)).current;
  const { messages, isStreaming, currentStreamText, sendMessage, stopStreaming } = useAgentChat({
    agentType, // Typed as a union, so no `as any` cast is needed
  });
 
  useEffect(() => {
    if (!isStreaming) {
      cursorOpacity.setValue(1);
      return;
    }
    const blink = Animated.loop(
      Animated.sequence([
        Animated.timing(cursorOpacity, { toValue: 0, duration: 500, useNativeDriver: true }),
        Animated.timing(cursorOpacity, { toValue: 1, duration: 500, useNativeDriver: true }),
      ])
    );
    blink.start();
    // Stop the loop when streaming ends or the screen unmounts
    return () => blink.stop();
  }, [isStreaming, cursorOpacity]);
 
  useEffect(() => {
    setTimeout(() => scrollViewRef.current?.scrollToEnd({ animated: true }), 100);
  }, [messages, currentStreamText]);
 
  const handleSend = async () => {
    if (!inputText.trim() || isStreaming) return;
    const text = inputText;
    setInputText('');
    await sendMessage(text);
  };
 
  return (
    <View style={styles.container}>
      <ScrollView ref={scrollViewRef} style={styles.list} contentContainerStyle={styles.listContent}>
        {messages.map((msg) => (
          <View key={msg.id} style={[styles.bubble, msg.role === 'user' ? styles.user : styles.assistant]}>
            {msg.toolUses?.map((tool, i) => (
              <View key={i} style={styles.badge}>
                <Text style={styles.badgeText}>
                  {tool.name === 'web_search_preview' ? '🔍 Web Search' :
                   tool.name === 'code_interpreter' ? '💻 Code' : tool.name}
                </Text>
              </View>
            ))}
            <Text style={[styles.text, msg.role === 'user' ? styles.userText : styles.aiText]}>
              {msg.content}
            </Text>
          </View>
        ))}
        {isStreaming && currentStreamText.length > 0 && (
          <View style={[styles.bubble, styles.assistant]}>
            <Text style={[styles.text, styles.aiText]}>
              {currentStreamText}
              <Animated.Text style={{ opacity: cursorOpacity }}>▌</Animated.Text>
            </Text>
          </View>
        )}
        {isStreaming && !currentStreamText && (
          <View style={[styles.bubble, styles.assistant, { padding: 16 }]}>
            <ActivityIndicator size="small" color="#666" />
          </View>
        )}
      </ScrollView>
      <View style={styles.inputRow}>
        <TextInput
          style={styles.input}
          value={inputText}
          onChangeText={setInputText}
          placeholder="Ask anything..."
          multiline
          editable={!isStreaming}
        />
        <TouchableOpacity
          style={[styles.btn, isStreaming && styles.stopBtn]}
          onPress={isStreaming ? stopStreaming : handleSend}
        >
          <Text style={styles.btnText}>{isStreaming ? '⏹ Stop' : 'Send'}</Text>
        </TouchableOpacity>
      </View>
    </View>
  );
}
 
const styles = StyleSheet.create({
  container: { flex: 1, backgroundColor: '#f8f9fa' },
  list: { flex: 1 },
  listContent: { padding: 16, gap: 12 },
  bubble: { maxWidth: '85%', borderRadius: 16, padding: 12 },
  user: { alignSelf: 'flex-end', backgroundColor: '#007AFF' },
  assistant: {
    alignSelf: 'flex-start', backgroundColor: '#fff',
    shadowColor: '#000', shadowOffset: { width: 0, height: 1 },
    shadowOpacity: 0.1, shadowRadius: 4, elevation: 2,
  },
  badge: {
    backgroundColor: '#e8f4f8', borderRadius: 8,
    paddingHorizontal: 8, paddingVertical: 3, marginBottom: 6,
    alignSelf: 'flex-start',
  },
  badgeText: { fontSize: 11, color: '#0066cc', fontWeight: '600' },
  text: { fontSize: 15, lineHeight: 22 },
  userText: { color: '#fff' },
  aiText: { color: '#1a1a1a' },
  inputRow: {
    flexDirection: 'row', padding: 12, backgroundColor: '#fff',
    borderTopWidth: 1, borderTopColor: '#e0e0e0', gap: 8, alignItems: 'flex-end',
  },
  input: {
    flex: 1, borderWidth: 1, borderColor: '#e0e0e0', borderRadius: 20,
    paddingHorizontal: 16, paddingVertical: 10, maxHeight: 120,
    fontSize: 15, backgroundColor: '#f8f9fa',
  },
  btn: {
    backgroundColor: '#007AFF', borderRadius: 20,
    paddingHorizontal: 16, paddingVertical: 10,
  },
  stopBtn: { backgroundColor: '#FF3B30' },
  btnText: { color: '#fff', fontWeight: '600', fontSize: 15 },
});

Step 3: Agent Type Strategies

Search Agent: Real-Time Information Use Cases

The web_search_preview tool opens up compelling use cases for apps that need fresh data:

News digest apps: AI summarizes the latest articles on topics the user follows
Competitive intelligence tools: Automatically surface App Store reviews and competitor updates
Market research assistants: Pull pricing, trends, and industry news on demand

// Steering the search agent toward specific topics via its system prompt
// (this narrows what it looks for; it does not enforce a domain allowlist)
const restrictedSearchPrompt = `
You are an ASO (App Store Optimization) research specialist.
Use web search ONLY to investigate:
1. App Store category ranking shifts
2. Competitor review trends
3. Latest ASO best practices
 
Do not collect personal or confidential information.
`;
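
A prompt alone only asks the model to stay on topic. If you also want to enforce which sources reach the user, one option is to filter cited URLs on the backend before displaying them. The following is a minimal sketch of that idea; hostnameOf and isAllowedSource are hypothetical helpers, and the allowlist is an example rather than a recommendation.

```typescript
// Extracts the lowercase hostname from an http(s) URL, or null if the string
// does not look like one. A simple pattern is used to keep the sketch dependency-free.
function hostnameOf(url: string): string | null {
  const match = /^https?:\/\/([^/:?#@]+)/i.exec(url);
  return match?.[1]?.toLowerCase() ?? null;
}

// True when the URL's host equals an allowed domain or is a subdomain of it.
function isAllowedSource(url: string, allowedDomains: string[]): boolean {
  const host = hostnameOf(url);
  if (!host) return false;
  return allowedDomains.some(domain => host === domain || host.endsWith(`.${domain}`));
}

// Example allowlist for an ASO-focused agent (illustrative values).
const asoDomains = ['apple.com', 'play.google.com'];

const citedUrls = [
  'https://apps.apple.com/us/charts/iphone',
  'https://play.google.com/store/apps',
  'https://notapple.com/blog',
];

// Keeps only the citations from allowed domains.
const trustedUrls = citedUrls.filter(url => isAllowedSource(url, asoDomains));
```

The subdomain check uses a leading dot, so notapple.com does not pass as apple.com.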

Analyst Agent: Data-Driven Decision Making

The code_interpreter tool makes it possible to build analyst features that would otherwise require a dedicated data infrastructure:

Revenue dashboards: Upload AdMob or Stripe CSV exports and let AI generate charts and insights
Behavioral analytics: Query Firebase Analytics exports in plain English
A/B test evaluation: AI automatically calculates statistical significance

// File upload + analyst agent integration
const analyzeRevenue = async (csvData: string) => {
  // Step 1: Upload file
  const uploadResponse = await fetch('/api/agent/files', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${authToken}`,
    },
    body: JSON.stringify({ content: csvData, filename: 'revenue.csv' }),
  });
  const { fileId } = await uploadResponse.json();
 
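  // Step 2: Request analysis.
  // Note: mentioning the ID in the prompt does not by itself give the model access to the file.
  // The backend also has to attach it to the code interpreter container, e.g.
  // tools: [{ type: 'code_interpreter', container: { type: 'auto', file_ids: [fileId] } }]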
  await sendMessage(
    `Please analyze the revenue CSV I just uploaded (fileId: ${fileId}).
    Identify monthly trends, top revenue sources, anomalies, and provide
    actionable recommendations to grow revenue.`
  );
};
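
For the analyst agent to actually read an uploaded file, the file ID has to be part of the code interpreter tool configuration on the backend. A minimal sketch of building that configuration; buildAnalystTools is a hypothetical helper, and the interface only covers the fields used here.

```typescript
// Shape of the code interpreter tool entry, limited to the fields used in this sketch.
interface CodeInterpreterTool {
  type: 'code_interpreter';
  container: { type: 'auto'; file_ids?: string[] };
}

// Builds the tools array for the analyst agent, attaching any uploaded files
// to the sandbox container. Blank and duplicate IDs are dropped.
function buildAnalystTools(fileIds: string[] = []): CodeInterpreterTool[] {
  const unique = Array.from(new Set(fileIds.filter(id => id.trim().length > 0)));
  return [
    {
      type: 'code_interpreter',
      container: unique.length > 0 ? { type: 'auto', file_ids: unique } : { type: 'auto' },
    },
  ];
}

// Example: the ID returned by the /api/agent/files endpoint (placeholder value).
const analystTools = buildAnalystTools(['file-abc123', 'file-abc123']);
```

This implies the chat endpoint needs a way to receive file IDs, for example an optional fileIds field on the request body, which is not part of the ChatRequest shown earlier.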

Step 4: Cost Management and Rate Limiting

Monitoring Token Consumption

The Responses API charges per token. For any commercial app, cost tracking is non-negotiable:

// src/app/api/agent/usage/route.ts
export async function GET(request: Request) {
  const { env } = await getCloudflareContext();
  const userId = await getUserIdFromToken(request.headers.get('Authorization')!);
 
  const month = new Date().toISOString().slice(0, 7); // e.g. "2026-04"
  const usageKey = `usage:${userId}:${month}`;
  const usageStr = await env.KV.get(usageKey);
  const usage = usageStr
    ? JSON.parse(usageStr)
    : { inputTokens: 0, outputTokens: 0, requests: 0 };
 
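  const plan = await getUserPlan(userId, env);
  // JSON cannot represent Infinity (it serializes to null), so unlimited plans use null explicitly
  const limit: number | null = plan === 'premium' ? null : 500;
  const remaining = limit === null ? null : Math.max(0, limit - usage.requests);
 
  return Response.json({ usage, limit, remaining, plan });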
}
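
The endpoint above only reads usage, so something has to write it after each response. A minimal sketch of that write path, assuming the token counts come from the final response's usage object; KVLike, usageKey, and recordUsage are hypothetical names, and KVLike is a reduced interface intended to be compatible with a Cloudflare KV binding.

```typescript
// Subset of the KV operations used here, so the logic can be tested with a fake.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

interface MonthlyUsage {
  inputTokens: number;
  outputTokens: number;
  requests: number;
}

// Token counts as reported in a response's `usage` object.
interface ResponseUsage {
  input_tokens: number;
  output_tokens: number;
}

// Same key format as the usage endpoint: usage:{userId}:{YYYY-MM}
function usageKey(userId: string, date: Date): string {
  return `usage:${userId}:${date.toISOString().slice(0, 7)}`;
}

// Adds one request's token counts to the user's monthly totals.
// Note: this read-modify-write is not atomic, so concurrent requests can undercount.
async function recordUsage(
  kv: KVLike,
  userId: string,
  usage: ResponseUsage,
  now: Date = new Date(),
): Promise<MonthlyUsage> {
  const key = usageKey(userId, now);
  const raw = await kv.get(key);
  const current: MonthlyUsage = raw
    ? (JSON.parse(raw) as MonthlyUsage)
    : { inputTokens: 0, outputTokens: 0, requests: 0 };
  const next: MonthlyUsage = {
    inputTokens: current.inputTokens + usage.input_tokens,
    outputTokens: current.outputTokens + usage.output_tokens,
    requests: current.requests + 1,
  };
  await kv.put(key, JSON.stringify(next));
  return next;
}
```

If exact counts matter for billing, a store with atomic increments (for example a Durable Object or a database) is a safer home for these counters than KV.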

Per-User Rate Limiting

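// 10 requests per user per minute (approximate: KV is eventually consistent and this
// read-modify-write is not atomic, so short bursts can exceed the limit)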
async function checkRateLimit(userId: string, env: CloudflareEnv): Promise<boolean> {
  const key = `ratelimit:${userId}:${Math.floor(Date.now() / 60000)}`;
  const current = parseInt(await env.KV.get(key) || '0');
  if (current >= 10) return false;
  await env.KV.put(key, String(current + 1), { expirationTtl: 120 });
  return true;
}
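
When the check fails, the handler should tell the client when to retry. A minimal sketch for the same one-minute fixed window, using the standard 429 status and Retry-After header; rateLimitKey, retryAfterSeconds, and rateLimitedResponseInit are hypothetical helpers.

```typescript
const WINDOW_MS = 60000; // One-minute fixed window, matching the key format above

// Same key format as checkRateLimit: ratelimit:{userId}:{windowIndex}
function rateLimitKey(userId: string, nowMs: number): string {
  return `ratelimit:${userId}:${Math.floor(nowMs / WINDOW_MS)}`;
}

// Seconds until the current fixed window rolls over (always at least 1).
function retryAfterSeconds(nowMs: number): number {
  const msLeft = WINDOW_MS - (nowMs % WINDOW_MS);
  return Math.max(1, Math.ceil(msLeft / 1000));
}

// Status and headers for a rate-limited reply; pass to Response.json(body, init).
function rateLimitedResponseInit(nowMs: number): { status: number; headers: Record<string, string> } {
  return {
    status: 429,
    headers: { 'Retry-After': String(retryAfterSeconds(nowMs)) },
  };
}
```

In the route, this would be used as `if (!(await checkRateLimit(userId, env))) return Response.json({ error: 'Too many requests' }, rateLimitedResponseInit(Date.now()));` before the OpenAI call.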

Step 5: Monetizing with Stripe-Gated Premium AI Features

Pairing the Responses API's advanced tools with Stripe subscriptions creates a compelling monetization model that scales with your user base.

Access Control by Plan

async function getAvailableAgentTypes(userId: string, env: CloudflareEnv): Promise<string[]> {
  const plan = await getUserPlan(userId, env);
  switch (plan) {
    case 'premium':
    case 'pro':
      return ['general', 'search', 'analyst']; // Full access
    case 'article':
      return ['general', 'search'];             // Web search included
    default:
      return ['general'];                        // Basic chat only
  }
}
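
The list of allowed agent types still has to be enforced on each request. A minimal sketch of that guard, assuming the allowed list comes from getAvailableAgentTypes; authorizeAgent is a hypothetical helper.

```typescript
type AgentType = 'general' | 'search' | 'analyst';

type AuthorizationResult =
  | { ok: true }
  | { ok: false; status: 403; error: string };

// Decides whether a request may proceed, given the agent types the user's plan allows.
function authorizeAgent(requested: AgentType, allowed: readonly string[]): AuthorizationResult {
  if (allowed.includes(requested)) return { ok: true };
  return {
    ok: false,
    status: 403,
    error: `The "${requested}" agent is not included in your plan`,
  };
}

// Example: a user whose plan allows only basic chat and web search.
const searchDecision = authorizeAgent('search', ['general', 'search']);
const analystDecision = authorizeAgent('analyst', ['general', 'search']);
```

Running this check in the route handler, rather than only hiding buttons in the app, keeps plan limits enforced even if the client is modified.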

Revenue Model Design

Here's a tiered approach that balances OpenAI API costs against subscription revenue:

Free tier: Basic AI chat (general agent, 20 requests/month) — for acquisition and trial
Pro plan (¥580/month): Web search enabled (search agent, 500 requests/month)
Premium plan (¥2,480 lifetime): All features unlimited (analyst agent included)

This structure is designed to cover API costs on the metered tiers while maximizing LTV for committed users, a common model for indie AI SaaS products.
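
Whether a tier really covers its costs is simple arithmetic worth checking before launch. The sketch below uses the Pro plan's ¥580 price and 500-request quota from the list above; the ¥1 cost per request and 4% payment fee are placeholder assumptions, not measured figures, so substitute your own numbers.

```typescript
// All inputs other than the Pro plan price and quota are hypothetical.
interface PlanEconomics {
  monthlyPriceYen: number;
  requestsPerMonth: number;  // Requests actually used in the month
  costPerRequestYen: number; // Assumed blended API cost per request
  paymentFeePercent: number; // Assumed payment processing fee
}

// Revenue left after payment fees (fee rounded to whole yen).
function netRevenueYen(p: PlanEconomics): number {
  return p.monthlyPriceYen - Math.round((p.monthlyPriceYen * p.paymentFeePercent) / 100);
}

// Margin for one subscriber in one month; negative means the plan loses money.
function monthlyMarginYen(p: PlanEconomics): number {
  return netRevenueYen(p) - p.requestsPerMonth * p.costPerRequestYen;
}

// Largest request count that still breaks even at the assumed cost.
function breakEvenRequests(p: PlanEconomics): number {
  if (p.costPerRequestYen <= 0) return Infinity;
  return Math.floor(netRevenueYen(p) / p.costPerRequestYen);
}

// Worst case for the Pro plan: the subscriber uses the full quota.
const proPlanWorstCase: PlanEconomics = {
  monthlyPriceYen: 580,
  requestsPerMonth: 500,
  costPerRequestYen: 1,
  paymentFeePercent: 4,
};

const proMargin = monthlyMarginYen(proPlanWorstCase);     // 557 - 500 = 57
const proBreakEven = breakEvenRequests(proPlanWorstCase); // floor(557 / 1) = 557
```

With these assumed inputs, a fully used Pro quota leaves ¥57 of margin, and doubling the assumed cost to ¥2 per request turns that into a ¥443 loss (557 - 1000). The same arithmetic is worth running for any unlimited tier, where API costs continue after a one-time payment.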

Common Errors and How to Fix Them

Error 1: `previous_response_id` becomes invalid

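Cause: OpenAI retains stored responses for a limited period (30 days by default), after which the ID can no longer be used as previous_response_id. IDs can also be rejected for other reasons, for example after a model change, so treat any rejected ID as a signal to start a fresh thread.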

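// Fix: Add expiration check when reading from KV.
// Note: createdAt is the time of the thread's first message, so this resets any thread
// older than 30 days even if it was used recently.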
const isValidThread = (state: ThreadState): boolean => {
  const thirtyDaysAgo = Date.now() - 30 * 24 * 60 * 60 * 1000;
  return state.createdAt > thirtyDaysAgo;
};
 
const previousResponseId = threadState && isValidThread(threadState)
  ? threadState.responseId
  : undefined; // Start a fresh thread if expired
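
An age check cannot catch every invalid ID, so it can be paired with a one-time retry that drops the ID when the API rejects it. A minimal sketch under the assumption that the caller can recognize such a rejection: sendWithThreadFallback is a hypothetical helper, send stands in for the actual API call, and isStaleIdError is supplied by the caller because the exact error shape is not specified here.

```typescript
// `send` performs the actual request; it is injected so the retry logic
// stays independent of any SDK.
type Send<T> = (previousResponseId: string | undefined) => Promise<T>;

// Tries with the stored ID first. If the call fails with an error the caller
// classifies as a stale ID, retries once without it (starting a fresh thread).
async function sendWithThreadFallback<T>(
  send: Send<T>,
  previousResponseId: string | undefined,
  isStaleIdError: (err: unknown) => boolean,
): Promise<{ result: T; threadReset: boolean }> {
  try {
    return { result: await send(previousResponseId), threadReset: false };
  } catch (err) {
    // Nothing to fall back to, or an unrelated failure: surface the error.
    if (previousResponseId === undefined || !isStaleIdError(err)) throw err;
    return { result: await send(undefined), threadReset: true };
  }
}
```

With streaming, the retry is only safe before any text has been sent to the client, which is normally the case because a rejected ID fails at request time. When threadReset is true, the UI can tell the user that the earlier context was dropped.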

Error 2: Unexpected web search costs

Cause: Each web_search_preview call incurs an additional fee (~$0.03 as of 2026).

// Fix: Track and limit daily search calls
const trackSearchUsage = async (userId: string, env: CloudflareEnv) => {
  const today = new Date().toISOString().slice(0, 10);
  const key = `search_count:${userId}:${today}`;
  const count = parseInt(await env.KV.get(key) || '0');
  if (count >= 50) throw new Error('Daily search limit reached');
  await env.KV.put(key, String(count + 1), { expirationTtl: 86400 });
};

Error 3: Streaming timeout in Cloudflare Workers

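Cause: Workers limit CPU time per request (at the time of writing, 10 ms on the Free plan and 30 seconds by default on paid plans). Time spent waiting on the OpenAI stream is I/O and does not count toward that budget, so streaming by itself rarely exhausts it, but heavy per-chunk processing can. Capping output length still helps, because it bounds connection time and token cost.

// Fix: Keep per-chunk work light and cap output length with max_output_tokens
const stream = await openai.responses.stream({
  model: 'gpt-4o',
  input: message,
  max_output_tokens: 1500, // Bounds response length, connection time, and cost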
  // ...
});
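
One side effect of capping output is that a response can end as incomplete rather than completed. As we understand the API, the final response then carries status 'incomplete' with incomplete_details.reason set to 'max_output_tokens'; verify this against the current API reference. A minimal sketch of surfacing that to the client, where toDoneEvent and the truncated flag are hypothetical additions to the 'done' payload:

```typescript
// Fields of the final response object used in this sketch.
interface FinalResponse {
  id: string;
  status: string;
  incomplete_details?: { reason?: string } | null;
}

interface DoneEvent {
  type: 'done';
  responseId: string;
  truncated: boolean;
}

// Builds the client-facing 'done' payload, flagging answers that were
// cut off by the output token cap.
function toDoneEvent(response: FinalResponse): DoneEvent {
  const truncated =
    response.status === 'incomplete' &&
    response.incomplete_details?.reason === 'max_output_tokens';
  return { type: 'done', responseId: response.id, truncated };
}

const cutOff = toDoneEvent({
  id: 'resp_1',
  status: 'incomplete',
  incomplete_details: { reason: 'max_output_tokens' },
});
```

The route would need to handle the incomplete case alongside the completed one, otherwise the client never receives its 'done' frame for a truncated answer. The UI can then offer a "Continue" action when truncated is true.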

Step 6: TypeScript Type Safety and Best Practices

A production-grade AI agent integration deserves the same engineering rigor as the rest of your app. Here are the patterns that prevent the most common issues.

Defining Strict Response Types

// types/agent.ts
 
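// Simplified view of the OpenAI Responses API response structure.
// For full fidelity, prefer the SDK's own types (e.g. OpenAI.Responses.Response).
export interface ResponsesApiOutput {
  // 'message' items carry the text; '*_call' items record tool invocations
  type:
    | 'message'
    | 'web_search_call'
    | 'file_search_call'
    | 'code_interpreter_call'
    | 'function_call'
    | 'reasoning';
  id: string;
  content?: { type: string; text?: string }[]; // Present on 'message' items
  name?: string;                                // Present on 'function_call' items
  arguments?: string;                           // JSON-encoded string on 'function_call' items
}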
 
export interface AgentResponse {
  id: string;
  model: string;
  output: ResponsesApiOutput[];
  usage: {
    input_tokens: number;
    output_tokens: number;
    total_tokens: number;
  };
  status: 'completed' | 'failed' | 'in_progress' | 'incomplete' | 'cancelled' | 'queued';
}
 
// SSE event types received by the client
export type AgentStreamEvent =
  | { type: 'text_delta'; delta: string }
  | { type: 'tool_uses'; tools: { type: string; name: string }[] }
  | { type: 'done'; responseId: string }
  | { type: 'error'; message: string };
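
Static types do not protect against malformed data at runtime, so the client benefits from validating each SSE payload before using it. A minimal, dependency-free sketch; parseAgentStreamEvent is a hypothetical helper, and the AgentStreamEvent union is repeated so the block is self-contained.

```typescript
type AgentStreamEvent =
  | { type: 'text_delta'; delta: string }
  | { type: 'tool_uses'; tools: { type: string; name: string }[] }
  | { type: 'done'; responseId: string }
  | { type: 'error'; message: string };

// Checks that a value is an array of { type: string; name: string } objects.
function isToolList(value: unknown): value is { type: string; name: string }[] {
  return (
    Array.isArray(value) &&
    value.every(item => {
      if (typeof item !== 'object' || item === null) return false;
      const fields = item as Record<string, unknown>;
      return typeof fields.type === 'string' && typeof fields.name === 'string';
    })
  );
}

// Parses one SSE data payload, returning null for anything that does not
// match a known event shape.
function parseAgentStreamEvent(raw: string): AgentStreamEvent | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== 'object' || value === null) return null;

  const { type: kind, delta, tools, responseId, message } = value as Record<string, unknown>;
  switch (kind) {
    case 'text_delta':
      return typeof delta === 'string' ? { type: 'text_delta', delta } : null;
    case 'tool_uses':
      return isToolList(tools) ? { type: 'tool_uses', tools } : null;
    case 'done':
      return typeof responseId === 'string' ? { type: 'done', responseId } : null;
    case 'error':
      return typeof message === 'string' ? { type: 'error', message } : null;
    default:
      return null;
  }
}
```

In the hook, this would replace the bare JSON.parse call, so a malformed frame is skipped instead of throwing inside the read loop.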

Zod Validation for Request Bodies

// src/app/api/agent/chat/validation.ts
import { z } from 'zod';
 
export const ChatRequestSchema = z.object({
  message: z.string().min(1).max(4000),
  agentType: z.enum(['general', 'search', 'analyst']).default('general'),
  resetThread: z.boolean().optional().default(false),
});
 
// Usage in the route handler
export async function POST(request: Request) {
  const body = await request.json();
  const parsed = ChatRequestSchema.safeParse(body);
 
  if (!parsed.success) {
    return Response.json(
      { error: 'Invalid request', details: parsed.error.flatten() },
      { status: 400 }
    );
  }
 
  const { message, agentType, resetThread } = parsed.data;
  // ... continue with validated data
}

Graceful Degradation Patterns

When the AI service is unavailable, your app should degrade gracefully rather than crashing:

// hooks/useAgentChat.ts — enhanced error handling
const sendMessage = useCallback(async (content: string) => {
  try {
    // ... main implementation
  } catch (error) {
    if (error instanceof Error) {
      if (error.name === 'AbortError') return; // User cancelled — no-op
 
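      // Network error: show offline message.
      // navigator.onLine is undefined in React Native, so this assumes
      // @react-native-community/netinfo is installed and imported as NetInfo.
      if ((await NetInfo.fetch()).isConnected === false) {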
        const offlineMessage: Message = {
          id: `system-${Date.now()}`,
          role: 'assistant',
          content: "You appear to be offline. Please check your connection and try again.",
          timestamp: new Date(),
        };
        setMessages(prev => [...prev, offlineMessage]);
        return;
      }
 
      // Rate limit exceeded
      if (error.message.includes('429') || error.message.includes('rate limit')) {
        onError?.(new Error('You\'ve reached the request limit. Please wait a moment.'));
        return;
      }
 
      // Generic error
      onError?.(error);
    }
  } finally {
    setIsStreaming(false);
  }
}, [isStreaming, agentType, onError]);
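
Matching on error message text is brittle, so a more robust variant classifies failures from the connectivity state and HTTP status. A minimal sketch; classifyFailure and USER_MESSAGES are hypothetical names, and the server and unknown messages are placeholder wording.

```typescript
type FailureKind = 'offline' | 'rate_limited' | 'forbidden' | 'server' | 'unknown';

// Classifies a failed request from connectivity and the HTTP status, if any.
function classifyFailure(isConnected: boolean, httpStatus?: number): FailureKind {
  if (!isConnected) return 'offline';
  if (httpStatus === 429) return 'rate_limited';
  if (httpStatus === 403) return 'forbidden';
  if (httpStatus !== undefined && httpStatus >= 500) return 'server';
  return 'unknown';
}

// User-facing copy for each failure kind.
const USER_MESSAGES: Record<FailureKind, string> = {
  offline: 'You appear to be offline. Please check your connection and try again.',
  rate_limited: "You've reached the request limit. Please wait a moment.",
  forbidden: 'Premium membership required',
  server: 'The AI service is temporarily unavailable. Please try again shortly.',
  unknown: 'Something went wrong. Please try again.',
};

// Convenience wrapper returning the message to display.
function failureMessage(isConnected: boolean, httpStatus?: number): string {
  return USER_MESSAGES[classifyFailure(isConnected, httpStatus)];
}
```

This requires the hook to keep the response status when a request fails, for example by throwing an error object that carries it.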

Step 7: Testing Your Agent Integration

Thorough testing for AI agent integrations requires a combination of unit tests, integration tests, and end-to-end validation.

Unit Testing the Custom Hook

// __tests__/useAgentChat.test.ts
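// Note: @testing-library/react-hooks is deprecated. In current React Native projects,
// renderHook and act come from @testing-library/react-native, and waitFor replaces waitForNextUpdate.
import { renderHook, act } from '@testing-library/react-hooks';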
import { useAgentChat } from '../hooks/useAgentChat';
 
// Mock fetch to simulate SSE streaming
const mockStreamResponse = (events: string[]) => {
  const encoder = new TextEncoder();
  const stream = new ReadableStream({
    start(controller) {
      events.forEach(event => controller.enqueue(encoder.encode(event)));
      controller.close();
    },
  });
  return new Response(stream, {
    headers: { 'Content-Type': 'text/event-stream' },
  });
};
 
describe('useAgentChat', () => {
  beforeEach(() => {
    jest.spyOn(global, 'fetch').mockImplementation(() =>
      Promise.resolve(mockStreamResponse([
        'data: {"type":"text_delta","delta":"Hello"}\n\n',
        'data: {"type":"text_delta","delta":", world"}\n\n',
        'data: {"type":"done","responseId":"resp_123"}\n\n',
      ]))
    );
  });
 
  it('should stream messages correctly', async () => {
    const { result, waitForNextUpdate } = renderHook(() => useAgentChat());
 
    act(() => { result.current.sendMessage('Hi'); });
    await waitForNextUpdate();
 
    expect(result.current.messages).toHaveLength(2); // user + assistant
    expect(result.current.messages[1].content).toBe('Hello, world');
    expect(result.current.isStreaming).toBe(false);
  });
 
  it('should handle abort correctly', async () => {
    const { result } = renderHook(() => useAgentChat());
 
    act(() => {
      result.current.sendMessage('Hello');
      result.current.stopStreaming();
    });
 
    expect(result.current.isStreaming).toBe(false);
  });
});

Integration Testing the Cloudflare Workers Endpoint

// __tests__/api/agent.integration.test.ts
import { unstable_dev } from 'wrangler';
 
describe('Agent Chat API', () => {
  let worker: Awaited<ReturnType<typeof unstable_dev>>;
 
  beforeAll(async () => {
    worker = await unstable_dev('src/worker.ts', {
      experimental: { disableExperimentalWarning: true },
    });
  });
 
  afterAll(async () => { await worker.stop(); });
 
  it('should reject unauthenticated requests', async () => {
    const response = await worker.fetch('/api/agent/chat', {
      method: 'POST',
      body: JSON.stringify({ message: 'test' }),
      headers: { 'Content-Type': 'application/json' },
    });
    expect(response.status).toBe(403);
  });
 
  it('should validate request body', async () => {
    const response = await worker.fetch('/api/agent/chat', {
      method: 'POST',
      body: JSON.stringify({ message: '' }), // Empty message
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer test-token',
      },
    });
    expect(response.status).toBe(400);
  });
});

Load Testing with Artillery

For apps expecting significant traffic, test your Cloudflare Workers endpoint under load before launch:

# artillery-config.yml
config:
  target: "https://your-app.workers.dev"
  phases:
    - duration: 60
      arrivalRate: 10
      name: "Warm up"
    - duration: 120
      arrivalRate: 50
      name: "Sustained load"
 
scenarios:
  - name: "Agent chat"
    flow:
      - post:
          url: "/api/agent/chat"
          headers:
            Authorization: "Bearer {{ $processEnvironment.TEST_TOKEN }}"
          json:
            message: "What are the best practices for React Native performance?"
            agentType: "general"

Run with: npx artillery run artillery-config.yml

Step 8: Production Deployment Checklist

Before shipping your AI agent feature to users, work through this checklist:

Security

OPENAI_API_KEY stored as wrangler secret (never in code or .env files committed to git)
All API endpoints require valid authentication tokens
Input validation via Zod schemas on all endpoints
Rate limiting implemented (both per-user and global)
No sensitive user data logged to Cloudflare Workers logs

Performance

max_output_tokens set appropriately to bound response length, connection time, and cost
KV writes that the response does not depend on are deferred (for example with ctx.waitUntil) rather than awaited on the hot path
SSE connections include proper Cache-Control: no-cache headers
Client-side abort controller implemented to prevent orphaned requests

Cost Control

Monthly token usage tracked per user in KV
Web search call count tracked and capped per user per day
Stripe plan-based access control enforced server-side (not just client-side)
Alerting set up in Cloudflare dashboard for unexpected cost spikes

User Experience

Loading indicator shown while waiting for first token
Streaming cursor animation during response generation
Stop button to cancel long responses
Graceful error messages for network failures, rate limits, and service errors
Thread reset option available to start fresh conversations

Observability

Worker error logging to Cloudflare Logpush or a third-party service like Sentry
Request latency tracked (P50, P95, P99)
Stripe webhook events logged for subscription changes
KV hit/miss rates monitored for thread state health

Summary

The OpenAI Responses API represents a meaningful step forward in the complexity-versus-capability tradeoff for mobile AI development. Combined with Rork Max and Cloudflare Workers, it puts the following capabilities within reach of a single indie developer:

Stateful AI agents that maintain conversation context without complex history management
Web search, code execution, and file retrieval as native, built-in capabilities
A monetization model built around Stripe-gated premium AI features

To deepen your understanding of AI agent patterns in Rork, check out Building Intelligent Assistant Apps with Rork × AI Function Calling — Context Management, Tool Integration, and Conversation Memory and the Rork × Multi-AI Orchestration Guide — Designing Intelligent Apps with Claude, Gemini, and GPT-4o.
