Basic Inference
Run your first AI model inference on Android with ZETIC Melange.
This guide shows how to run inference on Android after completing the SDK setup.
Prerequisites
- Melange SDK added to your project (Android Setup)
- A compiled model on the Melange Dashboard
- Your Personal Key and Model Key
Running Inference

Kotlin:
// (1) Load model
// This handles model download (if needed) and NPU context creation
val model = ZeticMLangeModel(CONTEXT, PERSONAL_KEY, MODEL_NAME)
// (2) Prepare model inputs
// Ensure input shapes match your model's requirement (e.g., Float32 arrays)
val inputs: Array<Tensor> = // Prepare your inputs
// (3) Run Inference
// Executes the hardware-accelerated graph. This is a blocking call.
// No manual delegate configuration or memory syncing is required.
val outputs = model.run(inputs)

Java:

// (1) Load model
// This handles model download (if needed) and NPU context creation
ZeticMLangeModel model = new ZeticMLangeModel(CONTEXT, PERSONAL_KEY, MODEL_NAME);
// (2) Prepare model inputs
// Ensure input shapes match your model's requirement (e.g., Float32 arrays)
Tensor[] inputs = ...; // Prepare your inputs
// (3) Run Inference
// Executes the hardware-accelerated graph. This is a blocking call.
Tensor[] outputs = model.run(inputs);

Understanding the Flow
- Model Download: On first use, the SDK downloads the pre-compiled, hardware-optimized model binary from the Melange CDN. This binary is specific to your device's NPU chipset.
- NPU Context Creation: Melange initializes the appropriate hardware accelerator (Qualcomm HTP, MediaTek APU, Samsung DSP) and loads the model into NPU memory using zero-copy memory mapping.
- Inference Execution: Your input tensor is processed through the NPU-accelerated computation graph, and the output tensor is returned. No data leaves the device.
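Once an output tensor comes back, post-processing usually starts by decoding its raw bytes into floats. The sketch below is SDK-independent and uses only java.nio; the little-endian Float32 layout is an assumption — confirm your model's actual output format on the Melange Dashboard.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class TensorCodec {
    // Encode a float array into a direct byte buffer
    // (assumed layout: little-endian Float32, the common case for NPU I/O).
    public static ByteBuffer encode(float[] values) {
        ByteBuffer buf = ByteBuffer.allocateDirect(values.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        buf.asFloatBuffer().put(values);
        return buf;
    }

    // Decode a raw output buffer back into a float array for post-processing.
    // duplicate() leaves the original buffer's position untouched.
    public static float[] decode(ByteBuffer buf) {
        ByteBuffer view = buf.duplicate().order(ByteOrder.LITTLE_ENDIAN);
        view.rewind();
        float[] out = new float[view.remaining() / Float.BYTES];
        view.asFloatBuffer().get(out);
        return out;
    }

    public static void main(String[] args) {
        float[] logits = {0.1f, 2.5f, -1.0f};
        float[] roundTrip = decode(encode(logits));
        System.out.println(roundTrip.length);     // 3
        System.out.println(roundTrip[1] == 2.5f); // true
    }
}
```

The same pattern works in reverse for preparing inputs: pack your preprocessed floats into a byte buffer before handing them to the model.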
Note: Always ensure your input tensor shapes exactly match what the model expects. A shape mismatch will throw a RuntimeException. Check the model's input specification on the Melange Dashboard.
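Because a shape mismatch only surfaces as a RuntimeException at run time, it can be worth validating your data before calling run. Below is a hypothetical helper using plain Java (no SDK types); the expected shape would come from the model's input specification on the Dashboard.

```java
import java.util.Arrays;

public class ShapeCheck {
    // Verify that the prepared data matches the model's expected input shape.
    // expectedShape is taken from the Dashboard's input spec
    // (the {1, 3, 224, 224} below is an illustrative assumption).
    public static void requireShape(int[] expectedShape, int actualElementCount) {
        long expected = 1;
        for (int dim : expectedShape) expected *= dim;
        if (expected != actualElementCount) {
            throw new IllegalArgumentException(
                    "Input has " + actualElementCount + " elements but model expects "
                    + expected + " for shape " + Arrays.toString(expectedShape));
        }
    }

    public static void main(String[] args) {
        int[] expectedShape = {1, 3, 224, 224};       // from the Dashboard (assumed)
        float[] input = new float[1 * 3 * 224 * 224]; // your preprocessed data
        requireShape(expectedShape, input.length);    // passes silently
    }
}
```

Failing fast here with a descriptive message is easier to debug than a RuntimeException from inside the NPU runtime.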
Accessing Raw Output Buffers
For advanced use cases that require zero-copy access to output data:
val outputs = model.run(inputs)
val outputBuffers = model.outputBuffers
// Process raw byte buffers for custom post-processing

Sample Application
Please refer to the ZETIC Melange Apps repository for complete sample applications and more details.
Next Steps
- Advanced Configuration: Inference modes and pipeline usage
- Custom Preprocessing: Implement input preprocessing
- Multi-Model Pipelines: Chain models together