Overview

Model Management allows you to configure which AI models power your agents, adjust model parameters, and control costs.

  • Choose Models: Select GPT-4, Claude, Qwen, or custom models
  • Control Costs: See pricing and manage API usage
  • Tune Parameters: Adjust temperature, tokens, and more
  • Per-Agent: Use different models for different agents

Available Models

Tesslate supports models from these providers:
  • OpenAI
  • Anthropic
  • Qwen
  • OpenRouter

GPT-4 Family (OpenAI)
  • GPT-4 Turbo: Best reasoning, most capable
    • Cost: $$$ (Highest)
    • Speed: Moderate
    • Best for: Complex logic, architecture
  • GPT-4: Slightly less capable than Turbo
    • Cost: $$$
    • Speed: Slower
    • Best for: General development
  • GPT-3.5 Turbo: Fast and affordable
    • Cost: $ (Low)
    • Speed: Very fast
    • Best for: Simple UI, quick iterations
Requires: OpenAI API key

Model Configuration

Global Default Models

Set default models for all new agents:

1. Open Settings: go to Profile → Settings → Model Management
2. Stream Agent Default: choose the default model for Stream agents
3. Iterative Agent Default: choose the default model for Iterative agents
4. Save: new agents will use these defaults

Existing custom agents keep their configured models; only newly created agents are affected.

Per-Agent Models

Customize the model for an individual agent:

1. Open Library: go to Library → Agents
2. Edit Agent: click Edit on an open-source agent
3. Select Model: choose from the available models
4. Save: the agent now uses the new model
5. Test: try the agent with the new model

You can only change models for open-source agents you own. Closed-source agents have fixed models.

Model Parameters

Temperature

Controls randomness and creativity. Typical ranges:
  • Low (0.0 - 0.3)
  • Medium (0.4 - 0.7)
  • High (0.8 - 1.0)

Low values are deterministic and focused:
  • More predictable output
  • Consistent code style
  • Less creative
  • Repeatable results

Best for:
  • Production code
  • Bug fixes
  • Refactoring
  • Documentation

Example: 0.2
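
If you call the provider directly (outside Tesslate's UI), setting a low temperature for a deterministic task might look like the sketch below. It uses the official OpenAI Node SDK; the model name and prompt are placeholders, and Tesslate's own agent integration may differ.

```typescript
import OpenAI from "openai";

// Assumes OPENAI_API_KEY is set in the environment.
const client = new OpenAI();

// temperature 0.2: deterministic, focused output -- suited to bug fixes,
// refactoring, and production code as described above.
const completion = await client.chat.completions.create({
  model: "gpt-4-turbo", // placeholder; use your agent's configured model
  temperature: 0.2,
  messages: [
    { role: "system", content: "You are a careful senior engineer." },
    { role: "user", content: "Refactor this function without changing its behavior: ..." },
  ],
});

console.log(completion.choices[0].message.content);
```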

Max Tokens

Limits response length:
  • Stream Agents: 4000-8000 tokens typical
  • Iterative Agents: 2000-4000 per step
  • Higher: More complete responses, higher cost
  • Lower: Faster, cheaper, may truncate
1 token ≈ 4 characters. A component with 100 lines ≈ 400-800 tokens.
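
As a rough sanity check, the 4-characters-per-token rule of thumb can be applied in code. This is only an estimate; for exact counts use the provider's tokenizer (for example, tiktoken for OpenAI models). The budget value below is illustrative.

```typescript
// Rough token estimate using the ~4 characters-per-token rule of thumb.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Example: check whether a generated component is likely to fit a max_tokens budget.
const MAX_TOKENS = 4000; // typical Stream-agent setting per this page
const componentSource = "/* ~100 lines of component code ... */";

const estimated = estimateTokens(componentSource);
if (estimated > MAX_TOKENS) {
  console.warn(`~${estimated} tokens likely exceeds the ${MAX_TOKENS}-token budget`);
}
```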

Other Parameters

Top P: an alternative to temperature (nucleus sampling):
  • 0.1: Very focused
  • 0.5: Balanced
  • 0.9: Diverse
  • 1.0: All possibilities
Default: 1.0

Frequency Penalty: reduces repetition:
  • 0.0: No penalty (default)
  • 0.5: Some reduction
  • 2.0: Maximum reduction
Higher = less repetitive code

Presence Penalty: encourages new topics:
  • 0.0: No penalty (default)
  • 0.5: Moderate encouragement
  • 2.0: Maximum encouragement
Higher = more topic diversity
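
These parameter names (top_p, frequency_penalty, presence_penalty) follow the OpenAI-style API. If you drive a provider directly, a request combining them might look like this sketch; the values and model ID are illustrative, and the usual convention is to adjust either temperature or top_p, not both.

```typescript
import OpenAI from "openai";

const client = new OpenAI(); // assumes OPENAI_API_KEY in the environment

const completion = await client.chat.completions.create({
  model: "gpt-3.5-turbo",   // placeholder model
  top_p: 1.0,               // default: consider the full probability distribution
  frequency_penalty: 0.5,   // mildly discourage repeated tokens
  presence_penalty: 0.0,    // no extra push toward new topics
  max_tokens: 2000,
  messages: [{ role: "user", content: "Generate a pricing card component." }],
});
```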

Cost Management

Understanding Pricing

AI model costs are based on tokens, billed separately for input and output:
  • Input Tokens: what you send
    • Your prompts
    • System messages
    • Context/history
    • Code being edited
  • Output Tokens: what the model generates
Input tokens are usually cheaper than output tokens.
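
To see how input and output tokens combine into a bill, here is a small cost estimator. The rates are placeholders, not real prices; check your provider's pricing page for current per-token rates.

```typescript
// Hypothetical per-million-token rates -- substitute the provider's real prices.
interface ModelRates {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

function estimateCost(inputTokens: number, outputTokens: number, rates: ModelRates): number {
  return (
    (inputTokens / 1_000_000) * rates.inputPerMillion +
    (outputTokens / 1_000_000) * rates.outputPerMillion
  );
}

// Example: 3,000 tokens of prompt/context in, 1,500 tokens of generated code out.
const placeholderRates: ModelRates = { inputPerMillion: 10, outputPerMillion: 30 };
console.log(`~$${estimateCost(3_000, 1_500, placeholderRates).toFixed(3)}`); // ~$0.075
```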

Model Cost Comparison

Model        | Input | Output | Relative Cost
GPT-4 Turbo  | $$$   | $$$    | High
Claude Opus  | $$$   | $$$    | High
GPT-3.5      | $     | $      | Low
Claude Haiku | $     | $      | Low
Qwen 32B     | $     | $      | Very Low
Exact pricing varies - check provider websites for current rates. Tesslate doesn’t mark up model costs.

Reducing Costs

Choose the right model:
  • GPT-3.5/Haiku for UI
  • GPT-4/Opus for complex logic
  • Match model to task complexity
  • Don't use premium models for simple tasks

Write efficient prompts:
  • Be specific and concise
  • Avoid unnecessary context
  • Make clear, direct requests
  • Reduce back-and-forth

Manage context (see the sketch after this list):
  • Clear chat history when changing topics
  • Don't carry unnecessary context
  • Start fresh for new features
  • Reduce token usage

Tune parameters:
  • Lower max_tokens when appropriate
  • Use temperature 0.3 for predictable tasks
  • Disable verbose explanations
  • Request concise responses
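
One way to act on the context-management advice, if you assemble requests yourself, is to keep only the newest turns that fit a token budget. This is a simplified sketch; Tesslate manages agent context internally, so the Message shape and budget here are assumptions for illustration.

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Keep system messages, then add the most recent turns until the budget is used up.
function trimHistory(messages: Message[], budgetTokens: number): Message[] {
  const estimate = (m: Message) => Math.ceil(m.content.length / 4); // rough heuristic
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");

  let used = system.reduce((sum, m) => sum + estimate(m), 0);
  const kept: Message[] = [];
  for (let i = rest.length - 1; i >= 0; i--) { // walk backwards: newest turns first
    const cost = estimate(rest[i]);
    if (used + cost > budgetTokens) break;
    kept.unshift(rest[i]);
    used += cost;
  }
  return [...system, ...kept];
}
```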

Usage Monitoring

Tracking API Usage

1. Check Provider Dashboard:
   • OpenAI: platform.openai.com/usage
   • Anthropic: console.anthropic.com/settings/billing
   • OpenRouter: openrouter.ai/activity
2. Monitor in Tesslate: a built-in usage dashboard is coming soon
3. Set Budgets: configure spending limits in provider dashboards
4. Review Regularly: check usage weekly or monthly

Usage Alerts

Set up alerts in provider dashboards:
  • Email when hitting thresholds
  • Warnings at 50%, 75%, 90%
  • Hard limits to prevent overspending
  • Monthly budget caps

Model Selection Strategy

By Task Type

Simple UI

Use: GPT-3.5, Claude Haiku, Qwen
  • Fast and cheap
  • Good for visual components
  • Tailwind CSS

Complex Logic

Use: GPT-4, Claude Opus
  • Better reasoning
  • Handles complexity
  • Fewer errors

API Integration

Use: GPT-4 Turbo, Claude Sonnet
  • Understands APIs
  • Good error handling
  • Async patterns

Debugging

Use: GPT-4, Claude Opus
  • Analyzes code deeply
  • Finds root causes
  • Suggests fixes
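
If you script agent setup, the task-type recommendations above can be captured in a simple lookup like the one below. The task categories and model identifiers are placeholders, not Tesslate configuration keys.

```typescript
type TaskType = "simple-ui" | "complex-logic" | "api-integration" | "debugging";

// Hypothetical mapping mirroring the recommendations above; swap in the
// model IDs your provider or agent configuration actually exposes.
const MODEL_BY_TASK: Record<TaskType, string> = {
  "simple-ui": "gpt-3.5-turbo",     // fast and cheap
  "complex-logic": "gpt-4-turbo",   // better reasoning, fewer errors
  "api-integration": "gpt-4-turbo", // async patterns, error handling
  "debugging": "claude-3-opus",     // deep code analysis
};

console.log(MODEL_BY_TASK["simple-ui"]); // "gpt-3.5-turbo"
```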

By Project Stage

Stages: Prototyping, Development, Production.

Prototyping favours fast and cheap models for quick iterations:
  • GPT-3.5 Turbo
  • Claude Haiku
  • Qwen models

Switching Models

You can switch an agent's model at any time:

1. Library → Agents: find the agent to update
2. Edit Agent: click Edit (open-source agents only)
3. Change Model: select the new model from the dropdown
4. Save: changes apply immediately
5. Test: try the agent with the new model

Each agent can use a different model. Mix and match based on your needs.

Best Practices

  • Match the model to the task: don't use GPT-4 for simple UI (a waste of money), and don't use GPT-3.5 for complex logic (poor results).
  • Test and compare: the same prompt gives different results on different models. Test to find the best model for your use case and balance cost against quality.
  • Monitor spending: check usage regularly, set budget alerts, optimize expensive operations, and review monthly spending.
  • Stay current: new models are released regularly, performance improves over time, and costs drop as prices fall.

Next Steps