Enums and Constants
This page documents the enums and constants used by the Melange Android SDK, including inference mode selectors, benchmark dataset types, and KV cache cleanup policies.
Package
com.zeticai.mlange

ModelMode
Controls inference mode selection for general (non-LLM) models. Used with ZeticMLangeModel.
```kotlin
enum class ModelMode {
    RUN_AUTO,
    RUN_SPEED,
    RUN_ACCURACY
}
```

| Value | Description |
|---|---|
| RUN_AUTO | Automatically balances speed and accuracy. Selects the fastest configuration while maintaining SNR above 20 dB. Recommended for most use cases. |
| RUN_SPEED | Maximizes inference speed with minimum latency. Best for real-time applications where response time is the top priority. |
| RUN_ACCURACY | Delivers the highest precision based on maximum SNR scores. Best for applications where accuracy is more critical than speed. |
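The 20 dB threshold used by RUN_AUTO can be made concrete with a small standalone sketch. This is plain Kotlin, not SDK code; the function names are illustrative:

```kotlin
import kotlin.math.log10

// SNR in decibels from signal and noise power (illustrative helper).
fun snrDb(signalPower: Double, noisePower: Double): Double =
    10.0 * log10(signalPower / noisePower)

// RUN_AUTO keeps a configuration only while its SNR stays above 20 dB.
fun meetsRunAutoThreshold(snrDb: Double): Boolean = snrDb > 20.0
```

A configuration whose quantization introduces noise at 1/1000 of the signal power sits at 30 dB and would be acceptable; one at exactly 20 dB would not.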
```kotlin
val model = ZeticMLangeModel(
    context,
    PERSONAL_KEY,
    MODEL_NAME,
    VERSION,
    ModelMode.RUN_AUTO
)
```

LLMModelMode
Controls inference mode selection for LLM models. Used with ZeticMLangeLLMModel.
```kotlin
enum class LLMModelMode {
    RUN_SPEED,
    RUN_AUTO,
    RUN_ACCURACY
}
```

| Value | Status | Description |
|---|---|---|
| RUN_SPEED | Available | Maximizes inference speed by selecting the most aggressive quantization. Recommended for real-time applications where response latency is the top priority. |
| RUN_AUTO | Paused | Intelligently balances speed and accuracy. Evaluates quantized models across benchmark datasets and selects the fastest model that maintains accuracy within acceptable thresholds (maximum 15% accuracy drop). |
| RUN_ACCURACY | Paused | Delivers the highest precision. Without a dataset specified, prioritizes less quantized (higher bit-width) models. With a dataset specified, selects the quantization with the highest benchmark score. |
RUN_AUTO and RUN_ACCURACY modes are currently paused while the LLM accuracy metric system is being updated. New models are available with RUN_SPEED mode only.
```kotlin
val model = ZeticMLangeLLMModel(
    context,
    PERSONAL_KEY,
    MODEL_NAME,
    null,
    LLMModelMode.RUN_SPEED
)
```

For more details on mode selection behavior, see the LLM Inference Modes guide.
LLMDataSetType
Specifies the benchmark dataset used for accuracy-based model selection in RUN_ACCURACY mode.
```kotlin
enum class LLMDataSetType {
    NONE,
    MMLU,
    TRUTHFULQA,
    CNN_DAILYMAIL,
    GSM8K
}
```

| Value | Description |
|---|---|
| NONE | No dataset specified. In RUN_ACCURACY mode, the engine prioritizes less quantized models by default. |
| MMLU | Massive Multitask Language Understanding. Evaluates broad knowledge and reasoning across 57 subjects. |
| TRUTHFULQA | TruthfulQA benchmark. Measures the model's ability to generate truthful and informative answers. |
| CNN_DAILYMAIL | CNN/DailyMail summarization benchmark. Evaluates text summarization quality. |
| GSM8K | Grade School Math 8K. Measures mathematical reasoning ability on grade-school-level word problems. |
```kotlin
val model = ZeticMLangeLLMModel(
    context,
    PERSONAL_KEY,
    MODEL_NAME,
    null,
    LLMModelMode.RUN_ACCURACY,
    LLMDataSetType.MMLU
)
```

Dataset-specific selection is only meaningful when LLMModelMode is set to RUN_ACCURACY. When using RUN_SPEED, the dataset type is ignored.
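One way to think about choosing a dataset is to match it to the task your app performs. The sketch below is illustrative only, not SDK code: the task labels and the helper name are hypothetical, and the strings simply mirror the LLMDataSetType values above:

```kotlin
// Hypothetical helper: map an app's task category to the benchmark dataset
// name you would pass as LLMDataSetType when using RUN_ACCURACY mode.
fun datasetForTask(task: String): String = when (task) {
    "general-knowledge" -> "MMLU"
    "factual-qa"        -> "TRUTHFULQA"
    "summarization"     -> "CNN_DAILYMAIL"
    "math"              -> "GSM8K"
    else                -> "NONE" // no benchmark: engine falls back to its default
}
```

For example, a summarization app would benchmark candidates on CNN_DAILYMAIL, while an unclassified chat task would pass NONE and let the engine prefer higher bit-width models.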
LLMKVCacheCleanupPolicy
Controls how the KV (Key-Value) cache is managed when it reaches capacity during token generation.
```kotlin
enum class LLMKVCacheCleanupPolicy {
    CLEAN_UP_ON_FULL,
    DO_NOT_CLEAN_UP
}
```

| Value | Description |
|---|---|
| CLEAN_UP_ON_FULL | Clears the entire conversation context when the KV cache is full and continues generation. This is the default behavior. |
| DO_NOT_CLEAN_UP | Maintains the conversation context without cleanup when the KV cache is full. Requires manual cleanup via cleanUp() before starting a new conversation. |
```kotlin
// Default behavior: context is automatically cleared when the cache fills up
val model = ZeticMLangeLLMModel(
    context,
    PERSONAL_KEY,
    MODEL_NAME,
    kvCacheCleanupPolicy = LLMKVCacheCleanupPolicy.CLEAN_UP_ON_FULL
)
```

```kotlin
// Manual cleanup required between conversations
val model = ZeticMLangeLLMModel(
    context,
    PERSONAL_KEY,
    MODEL_NAME,
    kvCacheCleanupPolicy = LLMKVCacheCleanupPolicy.DO_NOT_CLEAN_UP
)

// After a conversation ends, clean up before starting a new one
model.cleanUp()
model.run("New prompt here")
```

When using DO_NOT_CLEAN_UP, calling run() again without first calling cleanUp() may cause unexpected behavior. Always call cleanUp() between conversations.
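The behavioral difference between the two policies can be sketched with a small standalone simulation. This is plain Kotlin, not SDK code: the class, its capacity model, and the method names are invented purely to illustrate what happens when the cache fills:

```kotlin
// Illustrative simulation of the two KV cache cleanup policies.
enum class CleanupPolicy { CLEAN_UP_ON_FULL, DO_NOT_CLEAN_UP }

class SimulatedKvCache(private val capacity: Int, private val policy: CleanupPolicy) {
    private var used = 0

    /** Returns true if the token was cached; false if the cache was full and kept. */
    fun addToken(): Boolean {
        if (used >= capacity) {
            if (policy == CleanupPolicy.CLEAN_UP_ON_FULL) {
                used = 0 // entire context is dropped; generation continues
            } else {
                return false // caller must clean up manually
            }
        }
        used++
        return true
    }

    fun cleanUp() { used = 0 }
}
```

Under CLEAN_UP_ON_FULL the simulated cache silently resets and keeps accepting tokens; under DO_NOT_CLEAN_UP it refuses further tokens until cleanUp() is called, mirroring the manual-cleanup requirement described above.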
See Also
- ZeticMLangeModel (Android): General model API reference
- ZeticMLangeLLMModel (Android): LLM model API reference
- LLM Inference Modes: Detailed mode selection guide
- Inference Mode Selection: General model mode guide