Enums and Constants
Reference for all enums and constants available in the ZETIC Melange iOS SDK.
ModelMode
Controls the inference strategy for general (non-LLM) models.
```swift
import ZeticMLange
```

| Value | Description |
|---|---|
| RUN_AUTO | Default. Balanced speed and accuracy (SNR > 20 dB). |
| RUN_SPEED | Maximizes inference speed with minimum latency. |
| RUN_ACCURACY | Maximizes precision based on SNR scores. |
Usage
```swift
let model = try ZeticMLangeModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    modelMode: RUN_SPEED
)
```

LLMModelMode
Controls the inference strategy for LLM models.
| Value | Description | Status |
|---|---|---|
| RUN_SPEED | Most aggressive quantization for minimum latency. | Available |
| RUN_AUTO | Balanced speed and accuracy across benchmark datasets. | Paused |
| RUN_ACCURACY | Highest precision quantization. | Paused |
Usage
```swift
let model = try ZeticMLangeLLMModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    version: VERSION,
    modelMode: RUN_SPEED
)
```

LLMDataSetType
Specifies the benchmark dataset used for accuracy evaluation when an LLM model runs in RUN_ACCURACY mode.
| Value | Description |
|---|---|
| MMLU | Massive Multitask Language Understanding |
| TRUTHFULQA | TruthfulQA benchmark |
| CNN_DAILYMAIL | CNN/DailyMail summarization |
| GSM8K | Grade School Math 8K |
Usage
```swift
let model = try ZeticMLangeLLMModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    version: VERSION,
    modelMode: RUN_ACCURACY,
    dataSetType: MMLU
)
```

LLMKVCacheCleanupPolicy
Controls how the LLM engine handles a full KV cache.
| Value | Description |
|---|---|
| CLEAN_UP_ON_FULL | Default. Clears the entire context when the KV cache is full. |
| DO_NOT_CLEAN_UP | Keeps the context without cleanup when the KV cache is full. |
When using DO_NOT_CLEAN_UP, calling run() again without calling cleanUp() may cause unexpected behavior.
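Usage

A minimal sketch of opting out of automatic cleanup. The kvCacheCleanupPolicy parameter name is an assumption for illustration (this page does not show how the policy is passed); RUN_SPEED, DO_NOT_CLEAN_UP, and cleanUp() are as documented above.

```swift
import ZeticMLange

// Assumption: the policy is supplied at initialization via a
// kvCacheCleanupPolicy parameter; check the model class reference.
let model = try ZeticMLangeLLMModel(
    personalKey: PERSONAL_KEY,
    name: MODEL_NAME,
    version: VERSION,
    modelMode: RUN_SPEED,
    kvCacheCleanupPolicy: DO_NOT_CLEAN_UP
)

// With DO_NOT_CLEAN_UP the engine never clears the context for you,
// so clear the KV cache manually before it fills up.
model.cleanUp()
```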
See Also
- ZeticMLangeModel (iOS): Main model class
- Inference Mode Selection: Choosing the right mode
- LLM Inference Modes: LLM-specific modes