ZeticMLangeModel
Complete API reference for the ZeticMLangeModel class on iOS.
This page reflects ZeticMLange iOS 1.6.0.
The ZeticMLangeModel class is the primary interface for running on-device AI inference on iOS. It handles model downloading, Neural Engine context initialization, and hardware-accelerated execution through a single unified Swift API.
Import
import ZeticMLange

Initializers

ZeticMLangeModel exposes three initializers. Use the default (automatic) initializer unless you need to pin the runtime backend or processor:
- Default (Automatic Selection) — selects the optimal runtime via ModelMode. Available on all tiers.
- Explicit Target — pins the runtime backend. Requires Lite tier or higher.
- Explicit Target + APType — pins both the runtime backend and the application processor. Requires Lite tier or higher.
Default (Automatic Selection)
Creates a new model instance using automatic runtime selection. All parameters after name have defaults, so the shortest call takes only personalKey and name.
public init(
personalKey: String,
name: String,
version: Int? = nil,
modelMode: ModelMode = .RUN_AUTO,
quantType: QuantType? = nil,
cacheHandlingPolicy: ZeticMLangeCacheHandlingPolicy = .REMOVE_OVERLAPPING,
onDownload: ((Float) -> Void)? = nil
) throws

| Parameter | Type | Default | Description |
|---|---|---|---|
personalKey | String | — | Your personal authentication key. See Personal Key. |
name | String | — | Full model identifier in account_name/project_name format (e.g., "Steve/YOLOv11_comparison"). |
version | Int? | nil | Specific model version to load. nil uses the latest. |
modelMode | ModelMode | .RUN_AUTO | Inference strategy used by automatic selection. See Enums. |
quantType | QuantType? | nil | Quantization precision filter (.FP32, .FP16, .INT). When set, only targets matching this precision are considered during automatic selection. nil disables precision-based filtering. See Enums → QuantType. |
cacheHandlingPolicy | ZeticMLangeCacheHandlingPolicy | .REMOVE_OVERLAPPING | Managed artifact cache policy. See Enums. |
onDownload | ((Float) -> Void)? | nil | Optional download progress callback from 0.0 to 1.0. |
Throws: An error if the model cannot be downloaded or the Neural Engine context fails to initialize.
let model = try ZeticMLangeModel(
personalKey: PERSONAL_KEY,
name: "Steve/YOLOv11_comparison"
)

let model = try ZeticMLangeModel(
personalKey: PERSONAL_KEY,
name: MODEL_NAME,
modelMode: .RUN_SPEED,
quantType: .FP16, // only consider FP16 targets during automatic selection
onDownload: { progress in print("downloading: \(progress)") }
)

The initializer performs a network call on first use to download the model binary. The binary is cached locally after the first download.
Explicit Target
Pins the runtime backend (e.g., CoreML, TFLite) instead of relying on ModelMode's automatic pick.
public init(
personalKey: String,
name: String,
version: Int? = nil,
target: Target,
cacheHandlingPolicy: ZeticMLangeCacheHandlingPolicy = .REMOVE_OVERLAPPING,
onDownload: ((Float) -> Void)? = nil
) throws

| Parameter | Type | Default | Description |
|---|---|---|---|
target | Target | — | Runtime target (e.g., .ZETIC_MLANGE_TARGET_COREML, .ZETIC_MLANGE_TARGET_TFLITE_FP32). |
Other parameters match the default initializer above.
let model = try ZeticMLangeModel(
personalKey: PERSONAL_KEY,
name: MODEL_NAME,
target: .ZETIC_MLANGE_TARGET_COREML
)

Requires a Lite tier or higher subscription. Free-tier keys cannot use explicit Target selection.
Explicit Target + APType
Pins both the runtime backend and the application processor.
public init(
personalKey: String,
name: String,
version: Int? = nil,
target: Target,
apType: APType = .NA,
cacheHandlingPolicy: ZeticMLangeCacheHandlingPolicy = .REMOVE_OVERLAPPING,
onDownload: ((Float) -> Void)? = nil
) throws

| Parameter | Type | Default | Description |
|---|---|---|---|
target | Target | — | Runtime target. |
apType | APType | .NA | Application processor: .CPU, .GPU, or .NPU. See Enums. |
Other parameters match the default initializer above.
let model = try ZeticMLangeModel(
personalKey: PERSONAL_KEY,
name: MODEL_NAME,
target: .ZETIC_MLANGE_TARGET_COREML,
apType: .NPU
)

Requires a Lite tier or higher subscription.
Methods
run(inputs:)
Executes inference on the loaded model using the provided input tensors. Each call copies the bytes of every element in inputs into the model's internal input buffer before running.
func run(inputs: [Tensor]) throws -> [Tensor]

| Parameter | Type | Description |
|---|---|---|
inputs | [Tensor] | An array of input tensors matching the model's expected input shapes and data types. |
Returns: [Tensor]: The model's output tensors.
Throws: An error if input shapes do not match the model's expected inputs, or if inference execution fails.
let outputs = try model.run(inputs: inputs)

Zero-copy path is not yet available on iOS. The iOS ZeticMLangeModel does not currently expose model-owned input buffers, so the per-inference copy cannot be avoided today. An equivalent of Android's getInputBuffers() + run() idiom is planned for a future SDK release.
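The tensors returned by run(inputs:) wrap raw byte buffers. As a hedged sketch of the post-processing step, the helper below decodes such a buffer into a [Float], assuming FP32 output in the platform's native byte order; a synthetic Data value stands in for a real tensor's bytes here, since the exact Tensor accessor depends on the Tensor reference.

```swift
import Foundation

// Decode a raw FP32 buffer (such as an output tensor's underlying Data)
// into a Swift [Float] for post-processing. Assumes the buffer holds
// native-endian 32-bit floats and is suitably aligned.
func floatArray(from data: Data) -> [Float] {
    data.withUnsafeBytes { raw in
        Array(raw.bindMemory(to: Float.self))
    }
}

// Synthetic buffer standing in for a model output
let sample: [Float] = [0.1, 0.9, 0.0]
let bytes = sample.withUnsafeBufferPointer { Data(buffer: $0) }
let decoded = floatArray(from: bytes)
```

The same idiom applies to other output precisions by swapping the bound element type.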
getOutputTensors()
Deprecated. This method will be removed in a future SDK release and has no counterpart on Android. Use the return value of run(inputs:) directly instead.
Returns the output tensors produced by the most recent run(inputs:) call.
func getOutputTensors() -> [Tensor]

Returns: [Tensor]: The same tensors that the previous run(inputs:) call returned. Returns an empty array if run(inputs:) has not been called yet.
_ = try model.run(inputs: inputs)
let outputs = model.getOutputTensors()

getOutputDataArray()
Deprecated. This method will be removed in a future SDK release and has no counterpart on Android. Read .data off the tensors returned by run(inputs:) instead.
Returns the raw Data buffers of the outputs produced by the most recent run(inputs:) call, skipping the Tensor wrapping.
func getOutputDataArray() -> [Data]

Returns: [Data]: The underlying Data of each output tensor, in the model's output order. Returns an empty array if run(inputs:) has not been called yet.
_ = try model.run(inputs: inputs)
let rawOutputs: [Data] = model.getOutputDataArray()

Full Working Example
import ZeticMLange
class ViewController: UIViewController {
override func viewDidLoad() {
super.viewDidLoad()
do {
// (1) Load model
// Downloads the optimized binary on first run, then caches locally
let model = try ZeticMLangeModel(personalKey: PERSONAL_KEY, name: "Steve/YOLOv11_comparison")
// (2) Prepare model inputs
// Build one Tensor per model input. Shapes and dtypes must match the
// model's input specification — see the Tensor reference below.
let inputs: [Tensor] = prepareInputs()
// (3) Run Inference
// Executes the fully automated hardware graph.
// No manual delegate configuration or memory syncing required.
let outputs = try model.run(inputs: inputs)
// (4) Process outputs
// outputs is a [Tensor] containing the model's results
for output in outputs {
// Process each output tensor
}
} catch {
print("Melange error: \(error)")
}
}
}

Always ensure your input tensor shapes exactly match what the model expects. A shape mismatch will throw an error. Check the model's input specification on the Melange Dashboard.
Swift Package Manager Setup
Add the Melange package to your Xcode project:
- Open your project in Xcode
- Go to File → Add Package Dependencies
- Enter the repository URL: https://github.com/zetic-ai/ZeticMLangeiOS
- Click Add Package
- Select ZeticMLange and link it to your app target
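If you manage dependencies through a Package.swift manifest instead of the Xcode UI, the equivalent declaration looks roughly like the sketch below. The repository URL comes from the steps above; the version requirement, product name, and package name are assumptions you should verify against the repository.

```swift
// swift-tools-version:5.9
import PackageDescription

let package = Package(
    name: "MyApp",
    platforms: [.iOS(.v15)],
    dependencies: [
        // Repository URL from the steps above; the "from" version
        // requirement here is an assumption, not confirmed by the docs.
        .package(url: "https://github.com/zetic-ai/ZeticMLangeiOS", from: "1.6.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Product name assumed to match the library name
                .product(name: "ZeticMLange", package: "ZeticMLangeiOS")
            ]
        )
    ]
)
```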
See Also
- Tensor (iOS): Input/output tensor type used by run(inputs:)
- Enums and Constants (iOS): ModelMode, APType, ZeticMLangeCacheHandlingPolicy, etc.
- iOS Integration Guide: Step-by-step setup guide
- ZeticMLangeModel (Android): Android equivalent
- Common Errors: Troubleshooting guide