ZETIC.MLange LLM Model
Abstraction layer for LLM implementations using ZETIC.ai's infrastructure
Overview
ZETIC.MLange LLM Model provides an abstraction layer for LLM (Large Language Model) implementations built on ZETIC.ai's infrastructure. It offers a developer-friendly interface for downloading, managing, and running LLM models on mobile devices.
Model Support
Currently tested models include:
- DeepSeek-R1-Distill-Qwen-1.5B
- TinyLlama-1.1B-Chat-v1.0
- EXAONE-Deep-2.4B-GGUF
For more model examples and use cases, visit the Natural Language Processing section on the ZETIC.MLange website.
Backend Abstraction
- Supports multiple LLM backends including LLaMA.cpp
- Handles model initialization and runtime management
- Provides unified interface across different backend implementations
How to Implement LLM
Prepare model
The input for MLange (in the LLM case) is:
- Model: a public Hugging Face model repository ID, e.g., TinyLlama/TinyLlama-1.1B-Chat-v1.0, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, or any other publicly available model with an open license. Most private repositories or models with restricted licenses are not supported at this time.
Prepare On-device Model and Personal Key
Create Model Project
Use Web Dashboard to create Model Project.
Generate Personal Key
Use Web Dashboard to generate Personal Key.
For more details, refer to Generate Personal Key.
Initialize and run your model in mobile app
Please follow the Deploy to Android Studio guide for details.
```kotlin
val model = ZeticMLangeLLMModel(context, tokenKey, modelKey, LLMModelMode.RUN_DEFAULT)
model.run("prompt")
while (true) {
    val token = model.waitForNextToken()
    if (token == "") break
    // Add the token to your AI agent's chat message
}
```

Please follow the Deploy to Xcode guide for details.
```swift
let model = ZeticMLangeLLMModel(tokenKey, modelKey, .RUN_DEFAULT)
model.run("prompt")
while true {
    let token = model.waitForNextToken()
    if token == "" {
        break
    }
    // Add the token to your AI agent's chat message
}
```

API Reference
Initialization
Option 1: Automatic configuration (Recommended)
```
init(personalKey: String, modelKey: String)
```

Automatically downloads and initializes the model with default settings optimized for the device.
Parameters:
- personalKey: Your personal API key
- modelKey: Identifier for the model to download
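On Android, automatic configuration could look like the following Kotlin sketch. The two-argument form plus a leading Android Context parameter is assumed from the earlier Android example; the key strings are placeholders, not real credentials:

```kotlin
// Minimal sketch: automatic configuration (assumed Android overload).
// "your_personal_key" and "your_model_key" are hypothetical placeholders.
val model = ZeticMLangeLLMModel(
    context,              // Android Context, as in the earlier example
    "your_personal_key",  // personalKey
    "your_model_key"      // modelKey
)
```

This overload downloads the model on first use and picks device-appropriate defaults, so it is the simplest starting point before reaching for the custom configuration below.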
Option 2: Custom configuration (Advanced)
```
init(
    personalKey: String,
    modelKey: String,
    modelMode: LLMModelMode,
    dataSetType: LLMDataSetType,
    kvCacheCleanupPolicy: LLMKVCacheCleanupPolicy = LLMKVCacheCleanupPolicy.CLEAN_UP_ON_FULL,
    onProgress: ((Float) -> Unit)? = null
)
```

Downloads and initializes the model with custom configuration for fine-tuned control.
Parameters:
- personalKey: Your personal API key
- modelKey: Identifier for the model to download
- modelMode: (Optional) LLM inference mode for device-appropriate backend selection
- dataSetType: (Optional) Type of dataset to use for the model
- kvCacheCleanupPolicy: (Optional) Policy for handling the KV cache when it is full. Defaults to CLEAN_UP_ON_FULL.
  - CLEAN_UP_ON_FULL: Clears the entire context when the KV cache is full
  - DO_NOT_CLEAN_UP: Keeps the context without cleanup when the KV cache is full. Running run() again without calling cleanUp() may cause unexpected behavior or bugs.
- onProgress: (Optional) Callback function that reports model download progress as a Float value (0.0 to 1.0)
Example:
```kotlin
// Monitor download progress
init(
    personalKey = "your_key",
    modelKey = "model_key",
    modelMode = LLMModelMode.RUN_AUTO,
    dataSetType = LLMDataSetType.DEFAULT,
    onProgress = { progress ->
        println("Download progress: ${(progress * 100).toInt()}%")
    }
)
```

For more information about mode selection, please follow the LLM Inference Mode Select page.
Conversation
Start conversation
```
run(prompt: String)
```

Starts a conversation with the provided prompt.
Get next token
```
waitForNextToken(): String
```

Returns the next generated token. An empty string indicates completion.
Clean the context
```
cleanUp()
```

Cleans up the context of the running model.
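Putting the three calls together, one full conversation turn on Android might look like the following Kotlin sketch. The constructor arguments follow the earlier Android example; only the documented run(), waitForNextToken(), and cleanUp() calls are used, and the prompt text is illustrative:

```kotlin
// Sketch of one full conversation turn using the documented API.
// Constructor arguments are assumed from the earlier Android example.
val model = ZeticMLangeLLMModel(context, personalKey, modelKey, LLMModelMode.RUN_DEFAULT)

model.run("Explain what a KV cache is in one sentence.")
val reply = StringBuilder()
while (true) {
    val token = model.waitForNextToken()
    if (token == "") break  // empty string signals completion
    reply.append(token)
}

// Clear the context before starting an unrelated conversation;
// calling run() again without cleanUp() may cause unexpected behavior.
model.cleanUp()
```

Calling cleanUp() between unrelated conversations keeps the KV cache from filling up and triggering the cleanup policy described above.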
Implement ZETIC.MLange LLM Model in your project
Quick Start Templates
Build a complete chat app with just your PERSONAL_KEY and PROJECT_NAME. Check each repository's README for detailed instructions.