ZeticMLangeLLMModel

This page reflects zetic_mlange 1.9.1.

`ZeticMLangeLLMModel` loads an on-device LLM and supports text generation, token streaming, function calling, and, for LFM-VL models, responses grounded in an input image.

Import

import 'package:zetic_mlange/zetic_mlange.dart';

`create`

static Future<ZeticMLangeLLMModel> create({
  required String personalKey,
  required String name,
  int? version,
  LLMModelMode modelMode = LLMModelMode.runAuto,
  APType? apType,
  LLMQuantType? quantType,
  ModelCacheHandlingPolicy cacheHandlingPolicy =
      ModelCacheHandlingPolicy.removeOverlapping,
  LLMInitOption? initOption,
  LLMKVCacheCleanupPolicy kvCacheCleanupPolicy =
      LLMKVCacheCleanupPolicy.cleanUpOnFull,
  MlangeProgressCallback? onDownload,
})

Parameter	Type	Default	Description
`personalKey`	`String`	-	Personal key for accessing the model.
`name`	`String`	-	Model name in `account_name/project_name` format.
`version`	`int?`	`null`	Model version. `null` loads the latest version.
`modelMode`	`LLMModelMode`	`runAuto`	Backend selection strategy.
`apType`	`APType?`	`null`	Optional processor filter.
`quantType`	`LLMQuantType?`	`null`	Optional quantization filter.
`cacheHandlingPolicy`	`ModelCacheHandlingPolicy`	`removeOverlapping`	Managed artifact cache cleanup policy.
`initOption`	`LLMInitOption?`	`null`	LLM initialization options.
`kvCacheCleanupPolicy`	`LLMKVCacheCleanupPolicy`	`cleanUpOnFull`	Convenience default used when `initOption` is omitted.
`onDownload`	`MlangeProgressCallback?`	`null`	Download progress callback from `0.0` to `1.0`.

final model = await ZeticMLangeLLMModel.create(
  personalKey: personalKey,
  name: 'account_name/project_name',
  initOption: const LLMInitOption(nCtx: 4096),
);
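
The `onDownload` parameter reports progress as a fraction from `0.0` to `1.0`. The helper below is a minimal sketch of turning that fraction into a label for a progress indicator. `formatProgress` is a hypothetical name, not part of zetic_mlange, and it assumes the callback receives a single `double`, since this page does not show the `MlangeProgressCallback` typedef.

```dart
/// Formats a download fraction as a whole-number percentage.
/// Hypothetical helper: not part of zetic_mlange.
/// NaN and out-of-range values are clamped so the label stays within 0-100%.
String formatProgress(double fraction) {
  if (fraction.isNaN) return '0%';
  final clamped = fraction.clamp(0.0, 1.0);
  return '${(clamped * 100).round()}%';
}
```

Under that assumption it could be passed to `create` as `onDownload: (p) => print(formatProgress(p))`.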

Text Generation

`run`

Starts generation for a prompt.

LLMRunResult run(String text)

final result = model.run('Explain on-device AI in one paragraph.');

`waitForNextToken`

Waits for the next generated token.

LLMNextTokenResult waitForNextToken()

while (true) {
  final next = model.waitForNextToken();
  if (next.isFinished) break;
  append(next.token);
}
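
When the full text is needed rather than incremental display, the loop above can be wrapped in a small helper. This is a sketch only: `TokenStep` and `collectTokens` are hypothetical names, the token source is passed in as a callback so the helper does not depend on the package, and the `maxTokens` cap is an added safety measure rather than something the library requires. It uses Dart 3 record types.

```dart
/// One step of a token stream: the token text and whether generation ended.
/// Hypothetical type: mirrors the `token` and `isFinished` fields used above.
typedef TokenStep = ({String token, bool isFinished});

/// Pulls tokens from [next] until it reports completion and returns the
/// concatenated text. [maxTokens] caps the loop so a source that never
/// finishes cannot spin forever.
String collectTokens(TokenStep Function() next, {int maxTokens = 4096}) {
  final buffer = StringBuffer();
  for (var i = 0; i < maxTokens; i++) {
    final step = next();
    if (step.isFinished) break;
    buffer.write(step.token);
  }
  return buffer.toString();
}
```

After calling `model.run(prompt)`, the helper could be driven with a closure such as `() { final r = model.waitForNextToken(); return (token: r.token, isFinished: r.isFinished); }`.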

Vision-Language Response

Use `respond(...)` with an LFM-VL-capable model.

Future<String> respond({
  String systemPrompt = '',
  required String userText,
  required ZeticMLangeLLMImage image,
})

final image = ZeticMLangeLLMImage(
  rgb: rgbBytes,
  width: width,
  height: height,
);

final response = await model.respond(
  systemPrompt: 'Answer briefly.',
  userText: 'What is in this image?',
  image: image,
);
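
Many image sources produce RGBA pixels, while `ZeticMLangeLLMImage` takes an `rgb` buffer. The sketch below drops the alpha channel. It assumes that `rgb` expects tightly packed 8-bit R, G, B bytes in row-major order (three bytes per pixel), which this page does not state, so check the layout against the package documentation. `rgbaToRgb` is a hypothetical helper name.

```dart
import 'dart:typed_data';

/// Converts a tightly packed RGBA buffer to tightly packed RGB.
/// Hypothetical helper: assumes 8 bits per channel and no row padding.
Uint8List rgbaToRgb(Uint8List rgba, int width, int height) {
  final pixels = width * height;
  if (rgba.length != pixels * 4) {
    throw ArgumentError(
        'Expected ${pixels * 4} RGBA bytes, got ${rgba.length}.');
  }
  final rgb = Uint8List(pixels * 3);
  for (var p = 0; p < pixels; p++) {
    rgb[p * 3] = rgba[p * 4]; // R
    rgb[p * 3 + 1] = rgba[p * 4 + 1]; // G
    rgb[p * 3 + 2] = rgba[p * 4 + 2]; // B (alpha at p * 4 + 3 is skipped)
  }
  return rgb;
}
```

The result could then be passed as `rgb: rgbaToRgb(rgbaBytes, width, height)` when constructing the image.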

Function Calling

String? functionCallingSystemPrompt

void registerTool(LLMToolSpec spec, LLMToolExecutor executor)
bool unregisterTool(String name)
void clearTools()
List<LLMToolSpec> registeredTools()

model.registerTool(
  const LLMToolSpec(
    name: 'lookup',
    description: 'Look up local app data.',
    parametersJson: '{"type":"object","properties":{"query":{"type":"string"}}}',
  ),
  (call) => const LLMToolResult(content: '{"result":"Found"}'),
);

model.run('Use lookup to answer the question.');
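
Hand-written schema strings like the `parametersJson` value above are easy to mistype. One option is to build the string with `jsonEncode` from `dart:convert`. The sketch below covers only tools whose parameters are all strings. `stringParamsSchema` and `requiredNames` are hypothetical names, and whether the library honors a `required` list in the schema is an assumption.

```dart
import 'dart:convert';

/// Builds a JSON Schema object string in which every parameter is a string.
/// Hypothetical helper: [requiredNames] is emitted as the schema's
/// `required` list only when it is non-empty.
String stringParamsSchema(
  List<String> names, {
  List<String> requiredNames = const [],
}) {
  return jsonEncode({
    'type': 'object',
    'properties': {
      for (final name in names) name: {'type': 'string'},
    },
    if (requiredNames.isNotEmpty) 'required': requiredNames,
  });
}
```

`stringParamsSchema(['query'])` yields the same string used in the `lookup` example. Because the value is computed at run time, the enclosing `LLMToolSpec(...)` call can no longer be marked `const`.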

Lifecycle

bool get isClosed
void cleanUp()
Future<void> resetSession()
Future<void> close()
@Deprecated('Use close() instead.')
Future<void> deinit()

Call `cleanUp()` or `resetSession()` before starting a fresh conversation. Call `close()` when the model is no longer needed.
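
To make sure `close()` runs even when generation throws, the open, use, and close steps can be wrapped in `try`/`finally`. The generic helper below is a sketch of that pattern. `using` is a hypothetical name, and the helper takes the open and close steps as callbacks so it stays independent of the package.

```dart
/// Opens a resource, runs [body] with it, and always awaits [close],
/// including when [body] throws. Hypothetical helper, not part of
/// zetic_mlange.
Future<T> using<R, T>(
  Future<R> Function() open,
  Future<void> Function(R resource) close,
  Future<T> Function(R resource) body,
) async {
  final resource = await open();
  try {
    return await body(resource);
  } finally {
    await close(resource);
  }
}
```

With a model, `open` could be `() => ZeticMLangeLLMModel.create(personalKey: personalKey, name: 'account_name/project_name')` and `close` could be `(m) => m.close()`.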
