Inference Mode Selection

Melange provides three inference modes for general (non-LLM) models, so you can trade off speed against accuracy according to your application's requirements.

Available Modes

Default / Auto (`RUN_AUTO`)

Balances speed and accuracy automatically. This mode selects the fastest configuration that still delivers high-quality results (SNR > 20 dB). It is the recommended mode for most use cases.
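To make the 20 dB threshold concrete, here is a minimal sketch of the conventional SNR calculation in decibels. It compares a reference output with the output of an optimized model. The page does not say how Melange measures SNR internally, so the helper `snrDb` and its choice of inputs are assumptions for illustration and are not part of the SDK.

```kotlin
import kotlin.math.log10

// Hypothetical helper, not part of the Melange SDK: SNR in dB between a
// reference output and the output produced by an optimized model.
fun snrDb(reference: FloatArray, actual: FloatArray): Double {
    require(reference.size == actual.size) { "Outputs must have the same length" }
    var signalPower = 0.0
    var noisePower = 0.0
    for (i in reference.indices) {
        val r = reference[i].toDouble()
        val error = r - actual[i].toDouble()
        signalPower += r * r
        noisePower += error * error
    }
    // Identical outputs carry no noise, so the ratio is unbounded.
    if (noisePower == 0.0) return Double.POSITIVE_INFINITY
    return 10.0 * log10(signalPower / noisePower)
}
```

Under this definition, 20 dB means the signal power is 100 times the noise power. For instance, a reference of [10, 0] against an output of [9, 0] has signal power 100 and noise power 1, which sits exactly at that threshold.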

Speed-First (`RUN_SPEED`)

Minimizes inference latency. Recommended for real-time applications where response time is the top priority.

Accuracy-First (`RUN_ACCURACY`)

Selects the configuration with the highest SNR score, giving the most precise results. Best suited for applications where accuracy matters more than speed.

By default, the optimal configuration is determined automatically based on two metrics:

Speed metrics: Inference time (latency in ms)
Accuracy metrics: SNR (Signal-to-Noise Ratio in dB)

You can override this automatic selection by explicitly specifying a mode.
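The selection rules described in this section can be sketched as a small function over candidate configurations. This is only an illustration of the stated behavior, not Melange's actual implementation. The types `Mode` and `Candidate`, the function `select`, and the latency and SNR figures are all hypothetical.

```kotlin
// Hypothetical types for illustration; none of these are part of the Melange SDK.
enum class Mode { AUTO, SPEED, ACCURACY }

data class Candidate(val name: String, val latencyMs: Double, val snrDb: Double)

// Quality floor for auto mode, taken from the "SNR > 20dB" statement in the text.
const val AUTO_SNR_FLOOR_DB = 20.0

fun select(candidates: List<Candidate>, mode: Mode): Candidate? = when (mode) {
    // Speed-first: lowest latency, regardless of SNR.
    Mode.SPEED -> candidates.minByOrNull { it.latencyMs }
    // Accuracy-first: highest SNR, regardless of latency.
    Mode.ACCURACY -> candidates.maxByOrNull { it.snrDb }
    // Auto: fastest candidate among those above the SNR floor.
    Mode.AUTO -> candidates
        .filter { it.snrDb > AUTO_SNR_FLOOR_DB }
        .minByOrNull { it.latencyMs }
}

fun main() {
    // Made-up measurements, purely for demonstration.
    val candidates = listOf(
        Candidate("a", latencyMs = 4.0, snrDb = 15.0),
        Candidate("b", latencyMs = 7.0, snrDb = 32.0),
        Candidate("c", latencyMs = 12.0, snrDb = 48.0),
    )
    println(select(candidates, Mode.AUTO)?.name)     // b
    println(select(candidates, Mode.SPEED)?.name)    // a
    println(select(candidates, Mode.ACCURACY)?.name) // c
}
```

In this sketch, auto mode returns `null` when no candidate clears the floor. How Melange handles that case is not stated on this page.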

API Usage

Kotlin:

// Default: Auto mode
// Selects the fastest configuration that keeps SNR above 20 dB
val modelDefault = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_AUTO
)

// Speed First Mode
val modelFast = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_SPEED
)

// Accuracy First Mode
val modelAccurate = ZeticMLangeModel(
    context = this,
    personalKey = PERSONAL_KEY,
    name = MODEL_NAME,
    modelMode = ModelMode.RUN_ACCURACY
)

Swift:

// Default: Auto mode
// Selects the fastest configuration that keeps SNR above 20 dB
let modelDefault = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .RUN_AUTO
)

// Speed First Mode
let modelFast = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .RUN_SPEED
)

// Accuracy First Mode
let modelAccurate = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: .RUN_ACCURACY
)

Dart:

// Default: Auto mode
// Selects the fastest configuration that keeps SNR above 20 dB
final modelDefault = await ZeticMLangeModel.create(
  personalKey: personalKey,
  name: modelName,
  modelMode: ModelMode.runAuto,
);

// Speed First Mode
final modelFast = await ZeticMLangeModel.create(
  personalKey: personalKey,
  name: modelName,
  modelMode: ModelMode.runSpeed,
);

// Accuracy First Mode
final modelAccurate = await ZeticMLangeModel.create(
  personalKey: personalKey,
  name: modelName,
  modelMode: ModelMode.runAccuracy,
);

Choosing the Right Mode

Use Case	Recommended Mode	Why
Real-time video processing	`RUN_SPEED`	Minimize frame processing latency
Medical image analysis	`RUN_ACCURACY`	Precision is critical
General mobile app	`RUN_AUTO`	Best balance for most users
Prototype / testing	`RUN_AUTO`	Good default behavior
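If an app serves more than one kind of workload, the table can be encoded as a small lookup so that the mode is chosen in one place. The sketch below is an assumption-laden illustration. `Workload` and `recommendedMode` are hypothetical names, and the local `ModelMode` enum is only a stand-in that mirrors the constants shown under API Usage so that the snippet compiles on its own.

```kotlin
// Stand-in for the SDK's ModelMode so this sketch compiles on its own;
// in an app, use the ModelMode shipped with the SDK instead.
enum class ModelMode { RUN_AUTO, RUN_SPEED, RUN_ACCURACY }

// Hypothetical workload categories mirroring the rows of the table.
enum class Workload { REAL_TIME_VIDEO, MEDICAL_IMAGING, GENERAL_APP, PROTOTYPE }

fun recommendedMode(workload: Workload): ModelMode = when (workload) {
    Workload.REAL_TIME_VIDEO -> ModelMode.RUN_SPEED
    Workload.MEDICAL_IMAGING -> ModelMode.RUN_ACCURACY
    Workload.GENERAL_APP, Workload.PROTOTYPE -> ModelMode.RUN_AUTO
}
```

The returned value would then be passed as the `modelMode` argument when constructing the model.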

Next Steps

LLM Inference Modes: Modes specific to LLM models
Performance Optimization: Additional tuning tips
Performance-Adaptive Deployment: How Melange selects optimal binaries
