RORK LABJP
MAX — Rork Max generates native Swift apps for iPhone, iPad, Watch, TV, Vision Pro, and iMessageNATIVE — It reaches AR/LiDAR, Metal 3D, Dynamic Island, Live Activities, HealthKit, and Core MLPUBLISH — Two-click App Store submission sharply cuts the overhead of shipping an appPRICING — Rork Max is 00/month, while the original Rork starts free with paid plans from 5/monthFUNDING — Rork raised .8M from a16z, with over 743k monthly visits and 85% growthTOOL — The original Rork builds native iOS and Android apps from plain English using React Native (Expo)MAX — Rork Max generates native Swift apps for iPhone, iPad, Watch, TV, Vision Pro, and iMessageNATIVE — It reaches AR/LiDAR, Metal 3D, Dynamic Island, Live Activities, HealthKit, and Core MLPUBLISH — Two-click App Store submission sharply cuts the overhead of shipping an appPRICING — Rork Max is 00/month, while the original Rork starts free with paid plans from 5/monthFUNDING — Rork raised .8M from a16z, with over 743k monthly visits and 85% growthTOOL — The original Rork builds native iOS and Android apps from plain English using React Native (Expo)
Articles/App Dev
App Dev/2026-07-01Advanced

Building a Song-Recognition App with ShazamKit in Rork Max's Native Swift

Implementation notes on building a song-recognition app with SHManagedSession in Rork Max's native Swift. Covers the difference from hand-rolling AVAudioEngine, designing the idle / prerecording / matching states, using prerecording to improve initial accuracy, and the boundary design for bridging from Expo — with the pitfalls I actually hit.

ShazamKitRork Max199Music RecognitioniOS93SwiftUI57

Premium Article

Once, in a café I'd wandered into with a friend, a song played that felt familiar but neither of us could name, and we sat there stumped for a while. It would be fun, I thought, to have that "hold up your phone and instantly know" experience as a feature of one of my own small apps. I've built a handful of utility apps as an indie developer, and the feel of "point it and it just knows" carries a pleasure that needs no explanation.

The doorway to that is ShazamKit. It's Apple's music-recognition machinery you can embed in an app, matching a song from the ambient sound or from your app's own playback. It used to require you to manage the recording yourself, but with iOS 17's SHManagedSession you can hand most of that off. And because Rork Max now generates native Swift, this API is something you can wire into an indie app cleanly. Here is what I learned building a small "point it and it knows" app, step by step.

Hand-rolled classic, or the managed session?

ShazamKit offers two broad ways to build. Choosing this first settles the rest of your design.

AspectSHSession (hand-rolled)SHManagedSession (iOS 17+)
Recording managementYou build AVAudioEngine yourselfLeft to the framework
Mic permissionYou request and check itThe session takes care of it
Receiving resultsDelegateAsync result sequence (results())
State visibilityManage it yourselfObserve idle / prerecording / matching
Best forFine control, e.g. matching against playbackThe classic "point and recognize ambient song"

For the classic use — hold up the phone and identify the song playing around you — I reach for SHManagedSession without hesitation. It carries the recording and mic-permission burden for you, so the amount you write visibly shrinks. I hand-rolled AVAudioEngine at first, meaning to learn, but permissions and buffer wrangling ate more time than expected; switching to SHManagedSession left only the essence.

Map the state onto SwiftUI

SHManagedSession conforms to Observable, so SwiftUI picks up its state changes directly. There are three states:

  • idle: waiting, neither recording nor matching
  • prerecording: prepared for matching, prerecording ahead of time
  • matching: actively attempting a match

Reflecting these three straight onto the button's look tells the user "what it's doing right now." Here's a minimal screen that changes its display by state.

import SwiftUI
import ShazamKit
 
@MainActor
final class RecognizerModel: ObservableObject {
    let session = SHManagedSession()
    @Published var title: String?
    @Published var artist: String?
    @Published var isWorking = false
 
    /// Point and match. Results arrive from the results() async sequence.
    func recognize() async {
        isWorking = true
        defer { isWorking = false }
 
        // Attempt one match, stop on the first result
        for await result in session.results {
            switch result {
            case .match(let match):
                if let item = match.mediaItems.first {
                    self.title = item.title
                    self.artist = item.artist
                }
                return                    // Stop once we've got a hit
            case .noMatch:
                self.title = "No match"
                return
            case .error(let error, _):
                self.title = "Error: \(error.localizedDescription)"
                return
            @unknown default:
                return
            }
        }
    }
}
 
struct RecognizeView: View {
    @StateObject private var model = RecognizerModel()
 
    var body: some View {
        VStack(spacing: 24) {
            if let title = model.title {
                Text(title).font(.title2).bold()
                if let artist = model.artist {
                    Text(artist).foregroundStyle(.secondary)
                }
            } else {
                Text(model.isWorking ? "Listening…" : "Tap to recognize a song")
                    .foregroundStyle(.secondary)
            }
 
            Button {
                Task { await model.recognize() }
            } label: {
                Image(systemName: "shazam.logo.fill")
                    .font(.system(size: 64))
            }
            .disabled(model.isWorking)
        }
        .padding()
    }
}

session.results returns an async sequence of match results. Returning once you get a hit is because one identified song is enough for this use. To keep identifying continuously, don't return — keep spinning the sequence and results arrive each time the playing song changes.

Thank you for reading this far.

Continue Reading

What follows includes implementation code, benchmarks, and practical content we hope you'll find useful. This site runs without ads — server and development costs are supported entirely by members like you. If it's been helpful, we'd be truly grateful for your support.

WHAT YOU'LL LEARN
The difference between hand-rolling AVAudioEngine and letting SHManagedSession do the work, and how to choose between them
A complete, working Swift flow: mapping idle / prerecording / matching to SwiftUI and using prerecording to sharpen first-touch accuracy
How to use ShazamKit in the native Swift Rork Max generates, and the boundary design for bridging it from an Expo / React Native native module
Secure payment via Stripe · Cancel anytime

Unlock This Article

Get full access to the rest of this article. Buy once, read anytime. This site is ad-free — your support goes directly toward keeping it running.

or
Unlock all articles with Membership →
Share

Thank You for Reading

Rork Lab is ad-free, supported entirely by members like you. We publish practical guides daily with implementation code, benchmarks, and production-ready patterns. If you've found it useful, we'd love to have you on board.

  • Copy-paste ready implementation code
  • New advanced guides published daily
  • $5/mo or $10 for lifetime access
View Membership →

Related Articles

App Dev2026-06-28
Building a Live Barcode and Text Scanner in Rork Max with VisionKit's DataScanner
Add a live barcode and text scanner to your Rork Max native Swift app using VisionKit's DataScanner. Covers the SwiftUI bridge, availability handling, throttling repeated detections, and on-device verification with working code.
App Dev2026-07-01
Building an AlarmKit Timer in Rork Max's Native Swift — Alerts That Cut Through Silent Mode and Focus
Implementation notes on solving the 'local notifications stay silent in Silent Mode and Focus' problem with iOS 26's AlarmKit. Covers authorization, scheduling a countdown, the Live Activity for the Lock Screen and Dynamic Island, and listing active timers in Swift — plus the boundary design for Rork Max native code and bridging from Expo, with the pitfalls I actually hit.
App Dev2026-06-30
Screening Images On Device Before They Appear — Notes on SensitiveContentAnalysis
Implementation notes on blocking inappropriate images before they render, right on the device, for apps that handle AI-generated or user-submitted photos. Covers calling Apple's SensitiveContentAnalysis framework from Swift and wiring it into Rork Max native code or an Expo native module, with the pitfalls I actually hit.
📚RECOMMENDED BOOKS
Build a Large Language Model (From Scratch)
Sebastian Raschka
LLM Dev
Prompt Engineering for LLMs
Berryman & Ziegler
Prompting
AI Engineering
Chip Huyen
AI Eng
* Contains affiliate links
See all →