
ZETIC.MLange LLM Model

Abstraction layer for LLM implementations using ZETIC.ai's infrastructure

Overview

ZETIC.MLange LLM Model provides an abstraction layer for LLM (Large Language Model) implementations using ZETIC.ai's infrastructure. It offers a developer-friendly interface for downloading, managing, and running LLM models on mobile devices.

Model Support

Currently tested models include:

  • DeepSeek-R1-Distill-Qwen-1.5B
  • TinyLlama-1.1B-Chat-v1.0
  • EXAONE-Deep-2.4B-GGUF

For more model examples and use cases, visit the Natural Language Processing section on the ZETIC.MLange website.

Backend Abstraction

  • Supports multiple LLM backends including LLaMA.cpp
  • Handles model initialization and runtime management
  • Provides a unified interface across different backend implementations, as sketched below
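
In practice, this means your conversation code stays the same whichever backend the runtime selects. As a minimal Kotlin sketch (using the constructor and modes documented later on this page), only the LLMModelMode hint changes:

// Sketch: identical calls regardless of which backend (e.g., LLaMA.cpp)
// the runtime selects for the requested mode.
val model = ZeticMLangeLLMModel(context, tokenKey, modelKey, LLMModelMode.RUN_AUTO)
model.run("Hello")                     // same API for every backend
val token = model.waitForNextToken()   // same streaming interface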

How to Implement LLM

Prepare model

For the LLM case, MLange takes the following input:

  1. Model: Public Hugging Face model repository ID
    e.g., TinyLlama/TinyLlama-1.1B-Chat-v1.0, deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B, or any other publicly available model with an open license.

    Private repositories and models with restricted licenses are generally not supported at this time.

Prepare On-device Model and Personal Key

Create Model Project

Use the Web Dashboard to create a Model Project.

Generate Personal Key

Use the Web Dashboard to generate a Personal Key.

For more details, refer to Generate Personal Key.

Initialize and run your model in your mobile app

For Android (Kotlin), please follow the Deploy to Android Studio guide for details.

// Initialize the model with your personal key and model key
val model = ZeticMLangeLLMModel(context, tokenKey, modelKey, LLMModelMode.RUN_DEFAULT)

// Start a conversation with the prompt
model.run("prompt")

// Stream tokens until an empty string signals completion
while (true) {
    val token = model.waitForNextToken()

    if (token.isEmpty()) break

    // Add token to your AI agent's chat message
}
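
Note that waitForNextToken() blocks until the next token is available, so in a real app you would typically run this loop off the main thread. A minimal sketch using Kotlin coroutines; the scope and dispatchers here are ordinary app-side choices (an assumption on our part), not part of the ZETIC.MLange API:

// Sketch only: stream tokens on a background dispatcher and publish them
// to the UI on the main thread. CoroutineScope and Dispatchers are
// kotlinx.coroutines constructs, not part of ZETIC.MLange.
CoroutineScope(Dispatchers.IO).launch {
    model.run("prompt")
    while (true) {
        val token = model.waitForNextToken()
        if (token.isEmpty()) break
        withContext(Dispatchers.Main) {
            // Append token to the chat UI
        }
    }
}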

For iOS (Swift), please follow the Deploy to Xcode guide for details.

// Initialize the model with your personal key and model key
let model = ZeticMLangeLLMModel(tokenKey, modelKey, .RUN_DEFAULT)

// Start a conversation with the prompt
model.run("prompt")

// Stream tokens until an empty string signals completion
while true {
    let token = model.waitForNextToken()

    if token.isEmpty {
        break
    }

    // Add token to your AI agent's chat message
}

API Reference

Initialization

Option 1: Automatic configuration (Recommended)

init(personalKey: String, modelKey: String)

Automatically downloads and initializes the model with default settings optimized for the device.

Parameters:

  • personalKey: Your personal API key
  • modelKey: Identifier for the model to download
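
As a minimal Kotlin sketch of Option 1 (assuming, per the earlier Android example, that the Android constructor takes a Context before the documented parameters):

// Sketch: automatic configuration with device-optimized defaults.
// context is the Android Context; the iOS initializer omits it.
val model = ZeticMLangeLLMModel(context, personalKey, modelKey)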

Option 2: Custom configuration (Advanced)

init(
    personalKey: String, 
    modelKey: String, 
    modelMode: LLMModelMode, 
    dataSetType: LLMDataSetType,
    kvCacheCleanupPolicy: LLMKVCacheCleanupPolicy = LLMKVCacheCleanupPolicy.CLEAN_UP_ON_FULL,
    onProgress: ((Float) -> Unit)? = null
)

Downloads and initializes the model with custom configuration for fine-tuned control.

Parameters:

  • personalKey: Your personal API key
  • modelKey: Identifier for the model to download
  • modelMode: (Optional) LLM inference mode for device-appropriate backend selection
  • dataSetType: (Optional) Type of dataset to use for the model
  • kvCacheCleanupPolicy: (Optional) Policy for handling the KV cache when it is full. Defaults to CLEAN_UP_ON_FULL
    • CLEAN_UP_ON_FULL: Clears the entire context when the KV cache is full
    • DO_NOT_CLEAN_UP: Keeps the context without cleanup when the KV cache is full. Calling run() again without first calling cleanUp() may cause unexpected behavior or bugs

  • onProgress: (Optional) Callback function that reports model download progress as a Float value (0.0 to 1.0)

Example:

// Monitor download progress
val model = ZeticMLangeLLMModel(
    personalKey = "your_key",
    modelKey = "model_key",
    modelMode = LLMModelMode.RUN_AUTO,
    dataSetType = LLMDataSetType.DEFAULT,
    onProgress = { progress ->
        println("Download progress: ${(progress * 100).toInt()}%")
    }
)

For more information about mode selection, see the LLM Inference Mode Select page.

Conversation

Start conversation

run(prompt: String)

Starts a conversation with the provided prompt.

Get next token

waitForNextToken(): String

Returns the next generated token. An empty string indicates completion.

Clean the context

cleanUp()

Cleans up the context of the running model.
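
Putting the three calls together, here is a minimal Kotlin sketch of two independent conversations; the ask helper is ours, not part of the API:

// Sketch: run two unrelated conversations with an explicit cleanup in
// between. With DO_NOT_CLEAN_UP, calling run() again without cleanUp()
// may cause unexpected behavior (see the policy notes above).
fun ask(model: ZeticMLangeLLMModel, prompt: String): String {
    val reply = StringBuilder()
    model.run(prompt)
    while (true) {
        val token = model.waitForNextToken()
        if (token.isEmpty()) break
        reply.append(token)
    }
    return reply.toString()
}

val first = ask(model, "Summarize this article.")
model.cleanUp()  // drop the previous context before a new conversation
val second = ask(model, "Write a haiku about mobile inference.")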

Implement ZETIC.MLange LLM Model in your project

Quick Start Templates

Build a complete chat app with just your PERSONAL_KEY and PROJECT_NAME. Check each repository's README for detailed instructions.