Inference Mode Selection

Choose between CPU, GPU, and NPU inference modes in ZETIC Melange.

Melange provides several inference modes for general (non-LLM) models so that you can trade off speed against accuracy according to your application's requirements.

Available Modes

Default / Auto (RUN_AUTO)

Balances speed and accuracy. This mode automatically selects the fastest configuration whose output quality stays above an SNR of 20 dB. It is the recommended mode for most use cases.

Speed-First (RUN_SPEED)

Selects the configuration with the lowest inference latency. Recommended for real-time applications where response time is the top priority.

Accuracy-First (RUN_ACCURACY)

Selects the configuration with the highest SNR score. Best suited for applications where accuracy matters more than speed.

The optimal mode is automatically determined based on:

  • Speed metrics: Inference time (latency in ms)
  • Accuracy metrics: SNR (Signal-to-Noise Ratio in dB)
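To make the accuracy metric concrete, here is a minimal sketch of an SNR calculation in dB between a reference output and the output of a candidate configuration. It assumes the textbook definition (signal power over the power of the element-wise error); the exact formula Melange uses is not stated on this page, and the function name `snrDb` is hypothetical.

```kotlin
import kotlin.math.log10

/**
 * Signal-to-noise ratio in dB between a reference output and a candidate output.
 * Assumption: SNR = 10 * log10(signal power / noise power), where the noise is
 * the element-wise difference. This is not taken from the Melange SDK.
 */
fun snrDb(reference: FloatArray, candidate: FloatArray): Double {
    require(reference.size == candidate.size) { "Outputs must have the same length" }
    var signalPower = 0.0
    var noisePower = 0.0
    for (i in reference.indices) {
        val ref = reference[i].toDouble()
        val diff = ref - candidate[i].toDouble()
        signalPower += ref * ref
        noisePower += diff * diff
    }
    // Identical outputs have no noise, so the ratio is unbounded.
    if (noisePower == 0.0) return Double.POSITIVE_INFINITY
    return 10.0 * log10(signalPower / noisePower)
}
```

Under this definition, a candidate whose error is one tenth of the signal amplitude sits right at the 20 dB boundary mentioned for auto mode.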

You can override this automatic selection by explicitly specifying a mode.
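The selection rules described for the three modes can be sketched as a small function over benchmarked configurations. This is an illustration only: `Mode`, `Candidate`, and `select` are hypothetical names rather than Melange SDK types, and the fallback used when no configuration clears 20 dB is an assumption.

```kotlin
// Hypothetical types for illustration; not part of the Melange SDK.
enum class Mode { AUTO, SPEED, ACCURACY }

// One benchmarked configuration: its latency and its measured SNR.
data class Candidate(val name: String, val latencyMs: Double, val snrDb: Double)

// Quality floor that auto mode enforces, per the description above.
const val AUTO_SNR_THRESHOLD_DB = 20.0

fun select(candidates: List<Candidate>, mode: Mode): Candidate? = when (mode) {
    // Speed-first: lowest latency, regardless of SNR.
    Mode.SPEED -> candidates.minByOrNull { it.latencyMs }
    // Accuracy-first: highest SNR, regardless of latency.
    Mode.ACCURACY -> candidates.maxByOrNull { it.snrDb }
    // Auto: fastest candidate whose SNR exceeds the threshold.
    Mode.AUTO -> candidates
        .filter { it.snrDb > AUTO_SNR_THRESHOLD_DB }
        .minByOrNull { it.latencyMs }
        // Assumption: if nothing clears the threshold, use the most accurate one.
        ?: candidates.maxByOrNull { it.snrDb }
}
```

With three candidates at 5 ms / 15 dB, 8 ms / 25 dB, and 12 ms / 40 dB, this sketch picks the first for speed, the third for accuracy, and the second for auto.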

API Usage

Kotlin:

// Default: Auto mode
// Speed first, but maintains SNR above 20 dB
val modelDefault = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_AUTO
)

// Speed First Mode
val modelFast = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_SPEED
)

// Accuracy First Mode
val modelAccurate = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_ACCURACY
)

Swift:

// Default: Auto mode
// Speed first, but maintains SNR above 20 dB
let modelDefault = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .runAuto
)

// Speed First Mode
let modelFast = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .runSpeed
)

// Accuracy First Mode
let modelAccurate = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .runAccuracy
)

Choosing the Right Mode

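| Use Case                   | Recommended Mode | Why                               |
|----------------------------|------------------|-----------------------------------|
| Real-time video processing | RUN_SPEED        | Minimize frame processing latency |
| Medical image analysis     | RUN_ACCURACY     | Precision is critical             |
| General mobile app         | RUN_AUTO         | Best balance for most users       |
| Prototype / testing        | RUN_AUTO         | Good default behavior             |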
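If the mode is decided at runtime, the recommendations above could be encoded in a small helper. The following is a sketch under stated assumptions: `ModeChoice` is a local stand-in that mirrors the `ModelMode` constants so the snippet compiles on its own, and `Requirements` and `recommendMode` are hypothetical names.

```kotlin
// Local stand-in for the SDK's ModelMode so this snippet is self-contained.
enum class ModeChoice { RUN_AUTO, RUN_SPEED, RUN_ACCURACY }

// Hypothetical description of what the application prioritizes.
data class Requirements(
    val realTime: Boolean = false,
    val precisionCritical: Boolean = false
)

fun recommendMode(req: Requirements): ModeChoice = when {
    // No stated priority, or both at once: let auto mode balance them.
    req.realTime == req.precisionCritical -> ModeChoice.RUN_AUTO
    // Latency-bound work such as real-time video processing.
    req.realTime -> ModeChoice.RUN_SPEED
    // Precision-bound work such as medical image analysis.
    else -> ModeChoice.RUN_ACCURACY
}
```

In an app, the returned value would be mapped to the corresponding `ModelMode` constant and passed as `modelMode` when constructing the model.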

Next Steps