# AI Module

SDK reference for AI completions and streaming

The `ai` module provides LLM completions using your platform-configured provider.
## Import

```python
from bifrost import ai
```

## Methods
### ai.complete()

Get a completion from the configured LLM.
```python
async def complete(
    prompt: str | None = None,
    *,
    messages: list[dict[str, str]] | None = None,
    system: str | None = None,
    response_format: type[T] | None = None,
    knowledge: list[str] | None = None,
    max_tokens: int | None = None,
    temperature: float | None = None,
    org_id: str | None = None,
    model: str | None = None,
) -> AIResponse | T
```

#### Parameters
| Parameter | Type | Description |
|---|---|---|
| `prompt` | `str` | Simple prompt string |
| `messages` | `list[dict]` | Chat messages with `role` and `content` |
| `system` | `str` | System prompt (prepended to messages) |
| `response_format` | `type` | Pydantic model for structured output |
| `knowledge` | `list[str]` | Knowledge namespaces for RAG |
| `max_tokens` | `int` | Override default max tokens |
| `temperature` | `float` | Override default temperature (0.0-2.0) |
| `org_id` | `str` | Organization context (auto-set in workflows) |
| `model` | `str` | Override default model |
#### Returns

An `AIResponse`, or an instance of `response_format` if it was provided.
#### Examples

```python
# Simple prompt
response = await ai.complete("Explain Kubernetes")
print(response.content)
```

```python
# With system prompt
response = await ai.complete(
    "What should I do?",
    system="You are a helpful assistant."
)
```

```python
# Message format
response = await ai.complete(
    messages=[
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi there!"},
        {"role": "user", "content": "How are you?"}
    ]
)
```
```python
# Structured output
from pydantic import BaseModel

class Analysis(BaseModel):
    sentiment: str
    score: float

result = await ai.complete(
    "Analyze: Great product!",
    response_format=Analysis
)
print(result.sentiment)  # "positive"
```
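The per-call overrides from the parameters table (`temperature`, `max_tokens`, `model`) can be combined with any of the calls above; a minimal sketch with illustrative values:

```python
# Per-call overrides (illustrative values, not recommended defaults)
response = await ai.complete(
    "Summarize this incident report in two sentences.",
    temperature=0.2,  # lower temperature for a more deterministic answer
    max_tokens=200,   # cap the response length for this call
)
print(response.content)
```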
```python
# With RAG
response = await ai.complete(
    "What is the refund policy?",
    knowledge=["policies", "faq"]
)
```
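Since `knowledge` and `response_format` are independent parameters of the same call, they can presumably be combined; a hypothetical sketch (the `PolicyAnswer` model is invented for illustration and is not part of the SDK):

```python
# Hypothetical: structured output grounded in knowledge namespaces
from pydantic import BaseModel

class PolicyAnswer(BaseModel):  # illustrative model, not part of the SDK
    answer: str
    confident: bool

result = await ai.complete(
    "What is the refund policy?",
    knowledge=["policies"],
    response_format=PolicyAnswer,
)
print(result.answer)
```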
### ai.stream()

Stream tokens as they're generated.
```python
async def stream(
    prompt: str | None = None,
    *,
    messages: list[dict[str, str]] | None = None,
    system: str | None = None,
    knowledge: list[str] | None = None,
    max_tokens: int | None = None,
    temperature: float | None = None,
    org_id: str | None = None,
    model: str | None = None,
) -> AsyncGenerator[AIStreamChunk, None]
```

#### Parameters
Same as `ai.complete()`, except there is no `response_format`.

| Parameter | Type | Description |
|---|---|---|
| `prompt` | `str` | Simple prompt string |
| `messages` | `list[dict]` | Chat messages with `role` and `content` |
| `system` | `str` | System prompt (prepended to messages) |
| `knowledge` | `list[str]` | Knowledge namespaces for RAG |
| `max_tokens` | `int` | Override default max tokens |
| `temperature` | `float` | Override default temperature (0.0-2.0) |
| `org_id` | `str` | Organization context (auto-set in workflows) |
| `model` | `str` | Override default model |
#### Yields

`AIStreamChunk` objects with:

- `content` - Text content of this chunk
- `done` - Whether this is the final chunk
- `input_tokens` - Total input tokens (only on the final chunk)
- `output_tokens` - Total output tokens (only on the final chunk)
#### Example

```python
async for chunk in ai.stream("Write a story"):
    print(chunk.content, end="", flush=True)
    if chunk.done:
        print(f"\nTokens: {chunk.input_tokens + chunk.output_tokens}")
```
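If you need the complete text after streaming finishes, one option is to accumulate the chunks yourself; a minimal sketch that also guards against `input_tokens` and `output_tokens` being `None` before the final chunk:

```python
# Collect the streamed output into a single string
parts: list[str] = []
async for chunk in ai.stream("Write a story"):
    parts.append(chunk.content)
    if chunk.done and chunk.output_tokens is not None:
        print(f"Output tokens: {chunk.output_tokens}")
full_text = "".join(parts)
```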
### ai.get_model_info()

Get information about the configured LLM provider.

```python
async def get_model_info() -> dict[str, Any]
```

#### Returns
A dictionary with provider, model, and configuration details.
#### Example

```python
info = await ai.get_model_info()
print(f"Using {info['provider']}/{info['model']}")
```
## AIResponse

```python
class AIResponse(BaseModel):
    content: str        # Generated text
    input_tokens: int   # Tokens in the prompt
    output_tokens: int  # Tokens in the response
    model: str          # Model used
```
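Besides the generated text, the response carries basic usage metadata; a small sketch that totals the token counts from the fields above:

```python
response = await ai.complete("Explain Kubernetes")
total_tokens = response.input_tokens + response.output_tokens
print(f"{response.model} used {total_tokens} tokens")
```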
## AIStreamChunk

```python
class AIStreamChunk(BaseModel):
    content: str                      # Chunk text
    done: bool                        # Is this the final chunk
    input_tokens: int | None = None   # Only set on the final chunk
    output_tokens: int | None = None  # Only set on the final chunk
```
## See Also

- Using AI in Workflows - Usage guide
- Knowledge Module - RAG integration