AI Models · 2026-06-24 · Advanced

Receiving On-Device AI Output as Typed Data with Foundation Models Guided Generation

How to receive Foundation Models output as typed Swift structs instead of free text, with working code for Guided Generation and Tool Calling on-device.

When I hand an app feature to on-device AI, the part I have worried about most as an indie developer is reshaping the model's free-text reply into structured data inside the app. Even when I ask the prompt to "return JSON," the model sometimes wraps it in a sentence of explanation, drifts on a key name, or changes shape only when an array is empty. Each time, I add another regular expression or try? to absorb it, and that becomes the most fragile place in the codebase.

The iOS 26 Foundation Models framework ships two mechanisms that remove this problem at the root: Guided Generation, which binds output to a Swift type, and Tool Calling, which lets the model call your own functions. Because Rork Max generates native Swift, you can drop these APIs straight into the generated code. This article walks through moving from free-text parsing to receiving AI output as a type, with code that actually runs.

Why parsing free-text replies is so fragile

With a typical cloud-LLM integration, you tend to write code like this.

// Fragile: take free text and reshape it into JSON yourself
let text = try await callCloudLLM(prompt: "Return 3 recommended meditation themes as a JSON array")
 
// text may include a preamble like "Sure, here you go: [...]"
guard let jsonStart = text.firstIndex(of: "["),
      let data = String(text[jsonStart...]).data(using: .utf8),
      let themes = try? JSONDecoder().decode([String].self, from: data) else {
    // You hit this branch more often than you'd expect
    return fallbackThemes
}

The trouble is that text is prose meant for humans, so its format is never guaranteed. When I added an AI feature to one of my own apps, it was stable during testing, yet after release I got reports that for certain inputs a preamble appeared and parsing failed. The more often you fall back to fallbackThemes, the thinner the value of the AI feature becomes.
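
To make the failure mode concrete, here is a small self-contained sketch of the same bracket-hunting approach. The helper name parseThemes and the sample replies are made up for illustration, and no real model is called.

```swift
import Foundation

// Hypothetical helper mirroring the fragile approach: find the first "[" and
// hope everything from there on is a valid JSON array of strings.
func parseThemes(from text: String) -> [String]? {
    guard let jsonStart = text.firstIndex(of: "["),
          let data = String(text[jsonStart...]).data(using: .utf8) else {
        return nil
    }
    return try? JSONDecoder().decode([String].self, from: data)
}

// Made-up replies standing in for what a model might send back.
let clean = "[\"Breath\", \"Body Scan\", \"Gratitude\"]"
let withPreamble = "Sure, here you go: [\"Breath\", \"Body Scan\", \"Gratitude\"]"
let withBracketedNote = "[Note: short list] [\"Breath\", \"Body Scan\", \"Gratitude\"]"

for reply in [clean, withPreamble, withBracketedNote] {
    if let themes = parseThemes(from: reply) {
        print("parsed \(themes.count) themes")
    } else {
        print("parse failed, falling back")
    }
}
```

The first two replies parse, but the third does not: the slice starts at the bracketed note, which is not valid JSON. Nothing about the helper is wrong in isolation; it simply depends on a format the model never promised.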

Guided Generation removes the "reshape the prose" step entirely. You hand the model a type up front, and it generates only values that conform to that type.

Binding output to a type with @Generable

First, declare the data you want back as a Swift struct and annotate it with @Generable.

import FoundationModels
 
@Generable
struct MeditationTheme {
    @Guide(description: "Session title. Short, under 6 words")
    var title: String
 
    @Guide(description: "What to focus on, in one sentence")
    var focus: String
 
    @Guide(description: "Recommended length in minutes", .range(3...30))
    var durationMinutes: Int
}

A type marked @Generable becomes a blueprint that tells the model "produce output in this shape." Then you just ask the session to generate it.

let session = LanguageModelSession()
 
let response = try await session.respond(
    to: "Suggest one meditation theme for a beginner",
    generating: MeditationTheme.self
)
 
// response.content is a MeditationTheme. There is no parsing step
let theme = response.content
print(theme.title)            // e.g. "Return to the Breath"
print(theme.durationMinutes)  // e.g. 5

This is the heart of it. response.content is not a string; it is a MeditationTheme. No JSONDecoder, no regular expressions. Even if the model tries to produce a value that violates a constraint (say, 40 for durationMinutes), the framework steers generation to respect the type and the @Guide constraints, so by the time it reaches your app it already sits within 3...30.

Here is why this is more trustworthy than regex parsing: hand-rolled parsing is a reactive defense that inspects the string after it is generated, while Guided Generation is a forward constraint that shapes the output during generation. The difference is between fixing something broken and never letting it be built broken.

Telling the model intent and constraints with @Guide

@Guide is not just a description; it is an instruction that controls generation. Beyond a text description, you can specify numeric ranges or selection from an enum.

@Generable
struct TodoItem {
    @Guide(description: "The thing to do, as a noun phrase, not imperative")
    var task: String
 
    // An enum can be @Generable too
    @Guide(description: "Priority")
    var priority: Priority
 
    @Guide(description: "Estimated minutes", .range(5...240))
    var estimatedMinutes: Int
}
 
@Generable
enum Priority {
    case high
    case medium
    case low
}

If you make an enum like Priority @Generable, the model must choose exactly one of high / medium / low. There is no room for an unexpected string like "very high." Having the type system guarantee that your switch is exhaustive is genuinely reassuring in production.

Arrays work just as naturally. Specify generating: [TodoItem].self and you get an array whose every element satisfies the constraints. The old headache of the shape collapsing only when the array is empty disappears here.
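
As a rough sketch of how the enum and the array fit together, the block below repeats the two types so it compiles on its own, then sorts a generated list by priority. The names suggestTodos and sortRank, and the prompt wording, are assumptions made for this example.

```swift
import FoundationModels

@Generable
enum Priority {
    case high
    case medium
    case low
}

@Generable
struct TodoItem {
    @Guide(description: "The thing to do, as a noun phrase, not imperative")
    var task: String

    @Guide(description: "Priority")
    var priority: Priority

    @Guide(description: "Estimated minutes", .range(5...240))
    var estimatedMinutes: Int
}

// Exhaustive switch: there is no default branch, so adding a case to
// Priority later becomes a compile error here instead of a silent gap.
func sortRank(for priority: Priority) -> Int {
    switch priority {
    case .high: return 0
    case .medium: return 1
    case .low: return 2
    }
}

// Hypothetical helper: asks for a typed array and returns it sorted by priority.
func suggestTodos(for goal: String) async throws -> [TodoItem] {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Break this goal into a few to-do items: \(goal)",
        generating: [TodoItem].self
    )
    // response.content is already [TodoItem]; an empty result is just [].
    return response.content.sorted {
        sortRank(for: $0.priority) < sortRank(for: $1.priority)
    }
}
```

The call site never sees a string, so the sort can rely on every element having a valid priority.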

Filling the UI gradually with streamed partial generation

As the struct grows, waiting for full generation makes the UI feel frozen. Foundation Models lets you receive partial generation as a stream while staying typed.

@Generable
struct DailyPlan {
    @Guide(description: "A one-line theme for today")
    var theme: String
    @Guide(description: "A small habit for morning, noon, and night")
    var habits: [String]
    @Guide(description: "A closing word of encouragement")
    var encouragement: String
}
 
let stream = session.streamResponse(
    to: "Build a calm day plan",
    generating: DailyPlan.self
)
 
for try await snapshot in stream {
    // In the shipping iOS 26 SDK each stream element is a snapshot, and
    // snapshot.content is the partially generated DailyPlan with optional fields.
    // (Early betas yielded the partial value directly.)
    // theme fills first, then habits grow one by one
    let partial = snapshot.content
    await MainActor.run {
        self.theme = partial.theme ?? self.theme
        if let habits = partial.habits { self.habits = habits }
    }
}

In the partially generated type that streamResponse yields, properties fill in the order they are declared: theme shows first, habits grows item by item, and encouragement arrives last. Every property is optional until it arrives, so you get a "typing in progress" experience without sacrificing any type safety. I use partial generation only on screens where generation takes a few seconds, and fall back to a plain respond for light, instant generations.
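
One way to wire this into SwiftUI is an observable model that owns the session and copies each partial value into plain properties. This is a sketch under assumptions: DailyPlanModel and DailyPlanView are names invented here, and the loop reads snapshot.content as in the shipping iOS 26 SDK (early betas yielded the partial value directly).

```swift
import SwiftUI
import FoundationModels

@Generable
struct DailyPlan {
    @Guide(description: "A one-line theme for today")
    var theme: String
    @Guide(description: "A small habit for morning, noon, and night")
    var habits: [String]
    @Guide(description: "A closing word of encouragement")
    var encouragement: String
}

// Hypothetical view model: main-actor isolated, so the UI properties can be
// assigned directly inside the streaming loop.
@MainActor
@Observable
final class DailyPlanModel {
    var theme = ""
    var habits: [String] = []
    var encouragement = ""
    var isGenerating = false

    private let session = LanguageModelSession()

    func generate() async {
        isGenerating = true
        defer { isGenerating = false }
        do {
            let stream = session.streamResponse(
                to: "Build a calm day plan",
                generating: DailyPlan.self
            )
            for try await snapshot in stream {
                // Every field is optional until the model has produced it.
                let partial = snapshot.content
                if let theme = partial.theme { self.theme = theme }
                if let habits = partial.habits { self.habits = habits }
                if let encouragement = partial.encouragement {
                    self.encouragement = encouragement
                }
            }
        } catch {
            // Keep whatever already arrived; a manual option can take over.
        }
    }
}

struct DailyPlanView: View {
    @State private var model = DailyPlanModel()

    var body: some View {
        List {
            if !model.theme.isEmpty { Text(model.theme).font(.headline) }
            ForEach(model.habits.indices, id: \.self) { index in
                Text(model.habits[index])
            }
            if !model.encouragement.isEmpty { Text(model.encouragement) }
        }
        .overlay {
            if model.isGenerating && model.theme.isEmpty { ProgressView() }
        }
        .task { await model.generate() }
    }
}
```

Keeping the session inside the model means the view only ever reads typed properties, and a failed stream leaves the last good partial state on screen.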

Tool Calling: letting the model call your app's functions

If Guided Generation constrains the shape of the output, Tool Calling lends the model your app's capabilities. When the model cannot answer from its own knowledge, it can call a function you provide.

Create a type that conforms to the Tool protocol. Its arguments are @Generable too.

import FoundationModels
 
struct FavoriteLookupTool: Tool {
    let name = "lookupFavorites"
    let description = "Returns the tags of wallpapers the user has favorited"
 
    @Generable
    struct Arguments {
        @Guide(description: "Category to filter by; nil means all")
        var category: String?
    }
 
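    // Reads the local data store. No network required.
    // In the shipping iOS 26 SDK, call returns any PromptRepresentable value
    // such as String; the ToolOutput wrapper existed only in early betas.
    func call(arguments: Arguments) async throws -> String {
        let tags = FavoriteStore.shared.tags(in: arguments.category)
        return tags.joined(separator: ", ")
    }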
}

Hand this tool to the session, and the model invokes its call(arguments:) method on its own whenever it decides it needs the data.

let session = LanguageModelSession(tools: [FavoriteLookupTool()])
 
let response = try await session.respond(
    to: "Suggest a new theme that fits my favorites"
)
// The model calls lookupFavorites internally and grounds its suggestion in the result
print(response.content)

The key point is that the tool's body is a local data store lookup. The favorite tags live on the device; no network, no cloud API key. In other words, you can build an assistant that works offline. As an indie developer who checks "does this still work in airplane mode" every release, whether an AI feature can be built offline-first has been a deciding factor for me. The combination of Tool Calling and an on-device model meets that bar.
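
The tool above reads from FavoriteStore, which this article does not show. A minimal in-memory stand-in might look like the following; the class body, the sample tags, and the lowercase normalization are all assumptions for illustration, and a real app would back this with its own persistence.

```swift
import Foundation

// Hypothetical in-memory stand-in for the FavoriteStore used by the tool.
final class FavoriteStore: @unchecked Sendable {
    static let shared = FavoriteStore()

    // The lock guards the dictionary, since tools may be called off the main thread.
    private let lock = NSLock()
    private var tagsByCategory: [String: [String]] = [
        "nature": ["forest", "mist"],
        "abstract": ["gradient", "grain"],
    ]

    /// Returns the tags for one category, or every tag when `category` is nil.
    func tags(in category: String?) -> [String] {
        lock.lock()
        defer { lock.unlock() }
        guard let category else {
            // Sort the keys so the result order is stable between calls.
            return tagsByCategory.keys.sorted().flatMap { tagsByCategory[$0] ?? [] }
        }
        // The model chooses this argument, so treat it as untrusted input.
        let key = category
            .trimmingCharacters(in: .whitespacesAndNewlines)
            .lowercased()
        return tagsByCategory[key] ?? []
    }
}
```

Returning an empty array for an unknown category, rather than throwing, lets the model carry on and answer without favorites instead of failing the whole response.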

For deciding when you do need to switch to cloud inference, the on-device-first, cloud-fallback design is a useful companion.

Offline pitfalls and fallback design

The on-device model is not always available. It may be downloading, the device may be ineligible, or resources may be constrained. Check availability before you attempt generation.

import FoundationModels
 
switch SystemLanguageModel.default.availability {
case .available:
    // Run generation
    break
case .unavailable(let reason):
    // reason includes .deviceNotEligible / .modelNotReady / .appleIntelligenceNotEnabled
    showFallbackUI(reason: reason)
}
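
showFallbackUI is left undefined above. One possible shape for it is a pure function that maps each reason to a message, which keeps the wording testable apart from the UI. The function names and message strings below are assumptions, not part of the framework.

```swift
import FoundationModels

// Hypothetical helper: one user-facing explanation per unavailability reason.
func fallbackMessage(
    for reason: SystemLanguageModel.Availability.UnavailableReason
) -> String {
    switch reason {
    case .deviceNotEligible:
        return "Suggestions are not supported on this device. Pick a theme from the list."
    case .appleIntelligenceNotEnabled:
        return "Turn on Apple Intelligence in Settings to get suggestions."
    case .modelNotReady:
        return "The on-device model is still getting ready. Try again later."
    @unknown default:
        // Covers reasons added in future OS releases.
        return "Suggestions are unavailable right now. Pick a theme from the list."
    }
}

// Returns nil when the model is ready, otherwise the message to show
// next to the manual option.
func unavailabilityMessage() -> String? {
    switch SystemLanguageModel.default.availability {
    case .available:
        return nil
    case .unavailable(let reason):
        return fallbackMessage(for: reason)
    }
}
```

A view can call unavailabilityMessage() once on appear and show the manual list whenever it returns a value.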

What matters here is designing the AI feature as a welcome bonus rather than a dependency. Even on screens that lean on on-device AI, I always keep a manual option for when the model is unavailable. If AI cannot produce theme suggestions, for example, I show a prepared list of standard themes instead. The table below is the line I draw when I build a feature.

Situation	Hand to on-device AI	Build yourself
Suggest / summarize / classify	Typed output via Guided Generation	Validating and formatting the result
Local data lookup	Decide to call via Tool Calling	The data store and its permissions
Model unavailable	—	Always provide a manual option

To guard against generation failing or taking too long, it helps to wrap calls with a timeout and drop to the fallback. Receiving typed output does not erase the possibility that generation never returns.
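
A timeout wrapper can be sketched with plain Swift concurrency by racing the work against a sleeping task. The names withTimeout, valueOrFallback, and GenerationTimedOut are invented for this example, and it assumes the wrapped operation cooperates with task cancellation.

```swift
// Hypothetical error thrown when the timer wins the race.
struct GenerationTimedOut: Error {}

// Races `operation` against a timer and returns whichever finishes first.
func withTimeout<T: Sendable>(
    seconds: Double,
    operation: @escaping @Sendable () async throws -> T
) async throws -> T {
    try await withThrowingTaskGroup(of: T.self) { group in
        // Cancel the loser on the way out, whether the body returns or throws.
        defer { group.cancelAll() }

        group.addTask { try await operation() }
        group.addTask {
            try await Task.sleep(nanoseconds: UInt64(seconds * 1_000_000_000))
            throw GenerationTimedOut()
        }

        // The first child to finish decides the outcome: a value from the
        // operation, or an error from either child.
        guard let first = try await group.next() else {
            throw GenerationTimedOut()
        }
        return first
    }
}

// Convenience: any failure, including a timeout, drops to the fallback value.
func valueOrFallback<T: Sendable>(
    _ fallback: T,
    timeout seconds: Double,
    generate: @escaping @Sendable () async throws -> T
) async -> T {
    do {
        return try await withTimeout(seconds: seconds, operation: generate)
    } catch {
        return fallback
    }
}
```

At a call site this might read `await valueOrFallback(standardThemes, timeout: 10) { try await suggestThemes() }`, where standardThemes and suggestThemes are placeholders for your own list and generation call. The task group waits for the cancelled child before returning, so the timeout only takes effect promptly if the operation stops when cancelled; that behavior is worth confirming on a device.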

What to leave to generated code, and what to finish yourself

Rork Max will scaffold @Generable structs and Tool-conforming types in one pass. In practice, the type definitions and the session.respond calls come out remarkably cleanly. On the other hand, the soundness of @Guide constraints (the accuracy of ranges and descriptions) and the fallback for an unavailable model are parts I would not leave to generation. These shape the experience, so after the AI produces something that "runs for now," a human still has to finish it with an operational eye.

If you want to use images as input, see tagging on-device with Foundation Models image input. If you want to call it from Expo (React Native) through a native module, the Expo-to-Foundation-Models bridge is worth weighing alongside this.

Where to start if you're adding this

If you are adding on-device AI to an existing app, start with the smallest "classify" or "suggest" feature: define one @Generable type and replace a single call site with respond(to:generating:). Removing just one block of free-text parsing changes how stable that feature feels. Once you have a feel for typed generation there, widening into Tool Calling and streaming is a manageable next step. I am still finding the right line in my own apps, and I would be glad to keep exploring it together.
