# @built-in-ai/web-llm: Usage

Features and usage examples for @built-in-ai/web-llm with AI SDK v6.

## Basic Text Generation

### Streaming Text
```ts
import { streamText } from "ai";
import { webLLM } from "@built-in-ai/web-llm";

const result = streamText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  prompt: "Invent a new holiday and describe its traditions.",
});

for await (const textPart of result.textStream) {
  console.log(textPart);
}
```

### Non-streaming Text
```ts
import { generateText } from "ai";
import { webLLM } from "@built-in-ai/web-llm";

const result = await generateText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  prompt: "Invent a new holiday and describe its traditions.",
});

console.log(result.text);
```

## Download Progress Tracking

When a WebLLM model is used for the first time, its weights must be downloaded to the browser. Track the download progress to improve the UX:
```ts
import { streamText } from "ai";
import { webLLM } from "@built-in-ai/web-llm";

const model = webLLM("Qwen3-0.6B-q0f16-MLC");

const availability = await model.availability();

if (availability === "unavailable") {
  throw new Error("This browser doesn't support WebLLM");
}

if (availability === "downloadable") {
  await model.createSessionWithProgress((progress) => {
    console.log(`Download: ${progress.text}`);
  });
}

// Model is ready
const result = streamText({
  model,
  messages: [{ role: "user", content: "Hello!" }],
});
```

## Tool Calling

For best tool-calling results, use a reasoning model such as Qwen3. The webLLM model supports tool calling with multi-step execution:
```ts
import { streamText, tool, stepCountIs } from "ai";
import { webLLM } from "@built-in-ai/web-llm";
import { z } from "zod";

const result = streamText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  messages: [
    { role: "user", content: "What's the weather in San Francisco?" },
  ],
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      inputSchema: z.object({
        location: z.string().describe("The location to get the weather for"),
      }),
      execute: async ({ location }) => ({
        location,
        temperature: 72 + Math.floor(Math.random() * 21) - 10,
      }),
    }),
  },
  stopWhen: stepCountIs(5), // allow up to 5 tool-calling steps
});

// Consume the stream so the multi-step tool loop actually runs
for await (const textPart of result.textStream) {
  console.log(textPart);
}
```

It also supports tool execution approval (`needsApproval`).
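For example, a minimal sketch of an approval-gated tool (assuming AI SDK v6's `needsApproval` tool option; the `deleteFile` tool here is hypothetical):

```ts
import { streamText, tool } from "ai";
import { webLLM } from "@built-in-ai/web-llm";
import { z } from "zod";

const result = streamText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  messages: [{ role: "user", content: "Delete my draft file." }],
  tools: {
    // Hypothetical tool: marking it with needsApproval makes the loop
    // emit an approval request instead of executing immediately.
    deleteFile: tool({
      description: "Delete a file by path",
      inputSchema: z.object({ path: z.string() }),
      needsApproval: true, // execute only runs after approval is granted
      execute: async ({ path }) => ({ deleted: path }),
    }),
  },
});
```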
## Tool Calling with Structured Output
```ts
import { Output, ToolLoopAgent, tool } from "ai";
import { webLLM } from "@built-in-ai/web-llm";
import { z } from "zod";

const agent = new ToolLoopAgent({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  tools: {
    weather: tool({
      description: "Get the weather in a location",
      inputSchema: z.object({ city: z.string() }),
      execute: async ({ city }) => {
        // ...
      },
    }),
  },
  output: Output.object({
    schema: z.object({
      summary: z.string(),
      temperature: z.number(),
      recommendation: z.string(),
    }),
  }),
});

const { output } = await agent.generate({
  prompt: "What is the weather in San Francisco and what should I wear?",
});
```

## Structured Output
Generate structured JSON output with schema validation:

### Using generateText
```ts
import { generateText, Output } from "ai";
import { webLLM } from "@built-in-ai/web-llm";
import { z } from "zod";

const { output } = await generateText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  output: Output.object({
    schema: z.object({
      recipe: z.object({
        name: z.string(),
        ingredients: z.array(
          z.object({ name: z.string(), amount: z.string() }),
        ),
        steps: z.array(z.string()),
      }),
    }),
  }),
  prompt: "Generate a lasagna recipe.",
});
```

### Using streamText
```ts
import { streamText, Output } from "ai";
import { webLLM } from "@built-in-ai/web-llm";
import { z } from "zod";

const { partialOutputStream } = streamText({
  model: webLLM("Qwen3-0.6B-q0f16-MLC"),
  output: Output.object({
    schema: z.object({
      recipe: z.object({
        name: z.string(),
        ingredients: z.array(
          z.object({ name: z.string(), amount: z.string() }),
        ),
        steps: z.array(z.string()),
      }),
    }),
  }),
  prompt: "Generate a lasagna recipe.",
});

// Log each partial object as the stream fills in the schema
for await (const partial of partialOutputStream) {
  console.log(partial);
}
```

## Web Worker Usage
For better performance, run models off the main thread:

1. Create `worker.ts`:
```ts
import { WebWorkerMLCEngineHandler } from "@built-in-ai/web-llm";

const handler = new WebWorkerMLCEngineHandler();

self.onmessage = (msg: MessageEvent) => {
  handler.onmessage(msg);
};
```

2. Use the worker:
```ts
import { streamText } from "ai";
import { webLLM } from "@built-in-ai/web-llm";

const model = webLLM("Qwen3-0.6B-q0f16-MLC", {
  worker: new Worker(new URL("./worker.ts", import.meta.url), {
    type: "module",
  }),
});

const result = streamText({
  model,
  messages: [{ role: "user", content: "Hello!" }],
});

for await (const chunk of result.textStream) {
  console.log(chunk);
}
```