Agent Skills: Visual Intelligence

Integrate your app with iOS Visual Intelligence for camera-based search and object recognition. Use when adding visual search capabilities.

ID: rshankras/claude-code-apple-skills/visual-intelligence

Install this agent skill to your local machine:

pnpm dlx add-skill https://github.com/rshankras/claude-code-apple-skills/tree/HEAD/skills/apple-intelligence/visual-intelligence

Skill Files

Browse the full folder contents for visual-intelligence.

skills/apple-intelligence/visual-intelligence/SKILL.md

Skill Metadata

Name
visual-intelligence
Description
Integrate your app with iOS Visual Intelligence for camera-based search and object recognition. Use when adding visual search capabilities.

Visual Intelligence

Integrate your app with iOS Visual Intelligence to let users find app content by pointing their camera at objects.

When to Use

  • User wants camera-based search in their app
  • User asks about visual search integration
  • User wants to surface app content in system searches
  • User needs to handle visual intelligence queries

Overview

Visual Intelligence works in four steps:

  1. The user points the camera at objects (or takes a screenshot)
  2. The system identifies what they're looking at
  3. Your app provides matching content
  4. Results appear in the system UI

Your app implements:

  • IntentValueQuery to receive search requests
  • AppEntity types for searchable content
  • Display representations for results

Quick Start

1. Import Frameworks

import VisualIntelligence
import AppIntents

2. Create App Entity

struct ProductEntity: AppEntity {
    var id: String
    var name: String
    var price: String
    var imageName: String

    // AppEntity also requires a default query;
    // a minimal ProductQuery is sketched after this example
    static var defaultQuery = ProductQuery()

    static var typeDisplayRepresentation: TypeDisplayRepresentation {
        TypeDisplayRepresentation(
            name: LocalizedStringResource("Product"),
            numericFormat: "\(placeholder: .int) products"
        )
    }

    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(
            title: "\(name)",
            subtitle: "\(price)",
            image: .init(named: imageName)
        )
    }

    // Deep link URL
    var appLinkURL: URL? {
        URL(string: "myapp://product/\(id)")
    }
}
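
AppEntity conformance also requires a default query the system uses to resolve entities from their identifiers. A minimal sketch, assuming a hypothetical in-memory ProductStore:

struct ProductQuery: EntityQuery {
    // Resolve entities from their identifiers.
    // ProductStore is a hypothetical in-memory catalog, not a framework type.
    func entities(for identifiers: [ProductEntity.ID]) async throws -> [ProductEntity] {
        ProductStore.shared.allProducts.filter { identifiers.contains($0.id) }
    }
}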

3. Create Intent Value Query

struct ProductIntentValueQuery: IntentValueQuery {
    func values(for input: SemanticContentDescriptor) async throws -> [ProductEntity] {
        // Search using labels
        if !input.labels.isEmpty {
            return await searchProducts(matching: input.labels)
        }

        // Search using image
        if let pixelBuffer = input.pixelBuffer {
            return await searchProducts(from: pixelBuffer)
        }

        return []
    }

    private func searchProducts(matching labels: [String]) async -> [ProductEntity] {
        // Search your database using the provided labels
        // and return matching products. Placeholder stub:
        return []
    }

    private func searchProducts(from pixelBuffer: CVReadOnlyPixelBuffer) async -> [ProductEntity] {
        // Run image recognition on the pixel buffer and return
        // matching products (see the Vision sketch later). Placeholder stub:
        return []
    }
}

SemanticContentDescriptor

The system provides this object with information about what the user is looking at.

Properties

| Property | Type | Description |
|----------|------|-------------|
| labels | [String] | Classification labels from Visual Intelligence |
| pixelBuffer | CVReadOnlyPixelBuffer? | Raw image data |

Usage Patterns

Label-based Search:

func values(for input: SemanticContentDescriptor) async throws -> [ProductEntity] {
    // Labels like "shoe", "sneaker", "Nike" etc.
    let labels = input.labels

    // Search your content using these labels;
    // `products` is your app's catalog, with lowercased tags assumed
    return products.filter { product in
        labels.contains { label in
            product.tags.contains(label.lowercased())
        }
    }
}

Image-based Search:

func values(for input: SemanticContentDescriptor) async throws -> [ProductEntity] {
    guard let pixelBuffer = input.pixelBuffer else {
        return []
    }

    // Convert to CGImage for processing
    let ciImage = CIImage(cvPixelBuffer: pixelBuffer)
    let context = CIContext()

    guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else {
        return []
    }

    // Use your ML model or image-matching logic
    // (`imageSearch` is your app's own matcher, not a framework API)
    return await imageSearch.findMatches(for: cgImage)
}
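
If you don't have an image matcher of your own, one option is to derive labels from the image with Vision's built-in classifier and reuse your label-based search. A minimal sketch; the confidence cutoff is an arbitrary assumption to tune for your data:

import Vision

// Classify the CGImage with Vision, then search by the resulting labels.
func labels(from cgImage: CGImage) throws -> [String] {
    let request = VNClassifyImageRequest()
    let handler = VNImageRequestHandler(cgImage: cgImage)
    try handler.perform([request])
    return (request.results ?? [])
        .filter { $0.confidence > 0.3 }   // arbitrary cutoff
        .map(\.identifier)
}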

Multiple Result Types

Use @UnionValue when your app has different content types.

@UnionValue
enum SearchResult {
    case product(ProductEntity)
    case category(CategoryEntity)
    case store(StoreEntity)
}

struct VisualSearchQuery: IntentValueQuery {
    func values(for input: SemanticContentDescriptor) async throws -> [SearchResult] {
        var results: [SearchResult] = []

        // Search products
        let products = await productSearch(input.labels)
        results.append(contentsOf: products.map { .product($0) })

        // Search categories
        let categories = await categorySearch(input.labels)
        results.append(contentsOf: categories.map { .category($0) })

        return results
    }
}

Display Representations

Create compelling visual representations for search results.

Basic Display

var displayRepresentation: DisplayRepresentation {
    DisplayRepresentation(
        title: "\(name)",
        subtitle: "\(description)",
        image: .init(named: thumbnailName)
    )
}

With System Image

var displayRepresentation: DisplayRepresentation {
    DisplayRepresentation(
        title: "\(name)",
        subtitle: "\(category)",
        image: .init(systemName: "tag.fill")
    )
}

Rich Display

var displayRepresentation: DisplayRepresentation {
    DisplayRepresentation(
        title: LocalizedStringResource("\(name)"),
        subtitle: LocalizedStringResource("\(formatPrice(price))"),
        image: DisplayRepresentation.Image(named: imageName)
    )
}

Deep Linking

Enable users to open specific content from search results.

URL-based Deep Links

struct ProductEntity: AppEntity {
    // ... other properties

    var appLinkURL: URL? {
        URL(string: "myapp://product/\(id)")
    }
}

Handle in App

@main
struct MyApp: App {
    var body: some Scene {
        WindowGroup {
            ContentView()
                .onOpenURL { url in
                    handleDeepLink(url)
                }
        }
    }

    func handleDeepLink(_ url: URL) {
        guard url.scheme == "myapp" else { return }

        switch url.host {
        case "product":
            let id = url.lastPathComponent
            // NavigationState is a hypothetical navigation model (sketched below)
            NavigationState.shared.showProduct(id: id)
        default:
            break
        }
    }
}
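
The deep-link handler above assumes a shared navigation model. NavigationState is an assumption, not a framework type; a minimal sketch:

// Hypothetical navigation model backing handleDeepLink(_:).
@Observable
class NavigationState {
    static let shared = NavigationState()
    var selectedProductID: String?

    func showProduct(id: String) {
        selectedProductID = id
    }
}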

"More Results" Button

Provide access to additional results beyond the initial set.

struct ViewMoreProductsIntent: AppIntent, VisualIntelligenceSearchIntent {
    static var title: LocalizedStringResource = "View More Products"

    @Parameter(title: "Semantic Content")
    var semanticContent: SemanticContentDescriptor

    func perform() async throws -> some IntentResult {
        // Store search context for your app
        SearchContext.shared.currentSearch = semanticContent.labels

        // Return empty result - system will open your app
        return .result()
    }
}
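
SearchContext stands in for whatever state your app uses to seed its full search UI when it opens; a minimal sketch of this hypothetical type:

// Hypothetical holder for the pending visual search terms.
// Not thread-safe; adapt to your app's concurrency model.
final class SearchContext {
    static let shared = SearchContext()
    var currentSearch: [String] = []
}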

Complete Example

import SwiftUI
import AppIntents
import VisualIntelligence

// MARK: - Entities

struct RecipeEntity: AppEntity {
    var id: String
    var name: String
    var cuisine: String
    var prepTime: String
    var imageName: String

    // Note: AppEntity also requires a static defaultQuery
    // (see ProductQuery in the Quick Start for the pattern)

    static var typeDisplayRepresentation: TypeDisplayRepresentation {
        TypeDisplayRepresentation(
            name: LocalizedStringResource("Recipe"),
            numericFormat: "\(placeholder: .int) recipes"
        )
    }

    var displayRepresentation: DisplayRepresentation {
        DisplayRepresentation(
            title: "\(name)",
            subtitle: "\(cuisine) · \(prepTime)",
            image: .init(named: imageName)
        )
    }

    var appLinkURL: URL? {
        URL(string: "recipes://recipe/\(id)")
    }
}

// MARK: - Intent Value Query

struct RecipeVisualSearchQuery: IntentValueQuery {
    @Dependency var recipeStore: RecipeStore

    func values(for input: SemanticContentDescriptor) async throws -> [RecipeEntity] {
        // Use labels to find recipes
        // Labels might include: "pasta", "tomato", "Italian", etc.
        let matchingRecipes = await recipeStore.search(
            ingredients: input.labels,
            limit: 15
        )

        return matchingRecipes.map { recipe in
            RecipeEntity(
                id: recipe.id,
                name: recipe.name,
                cuisine: recipe.cuisine,
                prepTime: recipe.prepTimeFormatted,
                imageName: recipe.thumbnailName
            )
        }
    }
}

// MARK: - More Results Intent

struct ViewMoreRecipesIntent: AppIntent, VisualIntelligenceSearchIntent {
    static var title: LocalizedStringResource = "View More Recipes"

    @Parameter(title: "Semantic Content")
    var semanticContent: SemanticContentDescriptor

    func perform() async throws -> some IntentResult {
        // Save search context
        await MainActor.run {
            RecipeSearchState.shared.searchTerms = semanticContent.labels
        }
        return .result()
    }
}

// MARK: - Recipe Store

@Observable
class RecipeStore {
    private var recipes: [Recipe] = []

    func search(ingredients: [String], limit: Int) async -> [Recipe] {
        recipes
            .filter { recipe in
                ingredients.contains { ingredient in
                    recipe.ingredients.contains { recipeIngredient in
                        recipeIngredient.lowercased().contains(ingredient.lowercased())
                    }
                }
            }
            .prefix(limit)
            .map { $0 }
    }
}
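
The example leans on a Recipe model that isn't shown. A plausible minimal version, to be replaced by your real data layer:

// Minimal model the example assumes.
struct Recipe: Identifiable {
    let id: String
    let name: String
    let cuisine: String
    let prepTimeFormatted: String
    let thumbnailName: String
    let ingredients: [String]
}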

Best Practices

Performance

  • Return results quickly (< 1 second)
  • Limit initial results (10-20 items)
  • Use "More Results" for additional content
  • Cache search indexes for fast lookup (see the sketch after this example)

func values(for input: SemanticContentDescriptor) async throws -> [ProductEntity] {
    // Limit results for quick response
    let results = await search(input.labels)
    return Array(results.prefix(15))
}
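
A hedged sketch of a cached search index; the tags closure is a stand-in for however your model maps a product to its searchable keywords:

actor SearchIndex {
    private var index: [String: [ProductEntity]] = [:]

    // Build once (at launch or on first query) from your catalog.
    func build(from products: [ProductEntity], tags: (ProductEntity) -> [String]) {
        index.removeAll()
        for product in products {
            for tag in tags(product) {
                index[tag.lowercased(), default: []].append(product)
            }
        }
    }

    // Dictionary lookup per label instead of scanning the whole catalog.
    func entities(matching labels: [String]) -> [ProductEntity] {
        labels.flatMap { index[$0.lowercased(), default: []] }
    }
}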

Relevance

  • Prioritize exact matches
  • Consider context (location, time)
  • Filter low-confidence matches

func values(for input: SemanticContentDescriptor) async throws -> [ProductEntity] {
    let results = await search(input.labels)

    // Sort by relevance score
    // (assumes your search results carry a `relevanceScore`)
    return results
        .filter { $0.relevanceScore > 0.5 }
        .sorted { $0.relevanceScore > $1.relevanceScore }
        .prefix(15)
        .map { $0 }
}

Quality Representations

  • Use clear, concise titles
  • Include helpful subtitles
  • Provide relevant thumbnails
  • Localize all text

var displayRepresentation: DisplayRepresentation {
    DisplayRepresentation(
        title: LocalizedStringResource(stringLiteral: name),
        subtitle: LocalizedStringResource(
            stringLiteral: "\(category) · \(formattedPrice)"
        ),
        image: .init(named: thumbnailName)
    )
}

Testing

  1. Build and run on a physical device
  2. Open the Camera or take a screenshot
  3. Activate Visual Intelligence
  4. Point at objects relevant to your app
  5. Verify that results appear
  6. Verify that tapping a result opens your app correctly

Checklist

  • [ ] Import VisualIntelligence and AppIntents
  • [ ] Create AppEntity types for searchable content
  • [ ] Implement IntentValueQuery
  • [ ] Handle both labels and pixelBuffer
  • [ ] Create DisplayRepresentation for each entity
  • [ ] Implement deep linking URLs
  • [ ] Handle URLs in app with onOpenURL
  • [ ] Add "More Results" intent if needed
  • [ ] Test on physical device
  • [ ] Optimize for performance (< 1s response)
  • [ ] Localize display text
