Skip to content

Day 4: Connecting AI models in Silly Tavern

Silly Tavern Day 4 Header

After Day 3, you connect an AI model—the “brain” that turns your card into chat. Without a backend, the character cannot reply.

This article explains what AI models are, how to connect them, and free or low-cost options with 2026-oriented notes for OpenAI, Claude, Gemini, OpenRouter, and more. On day 4, finish model setup and start talking to your character.


What is an AI model? | The “brain” behind the card

An AI model understands language and generates replies. Silly Tavern sends your card and chat to the model, which responds in character.

Analogy

Card = script; model = actor. You need both for a performance.

Two families

1. Cloud (online)

Cloud models run on provider servers—like borrowing books from a central library. Strong and convenient; usually needs internet; often paid or rate-limited.

Examples: OpenAI (ChatGPT), Anthropic (Claude), Google (Gemini), OpenRouter, Groq.

2. Local (offline)

Local models run on your machine—like books on your shelf. No API bill for inference; needs capable hardware.

Examples: Ollama, LM Studio, KoboldAI, Oobabooga.

💡 Tip: Beginners often start with a cloud model for simpler setup and strong quality.


API basics

What is an API?

An API is how Silly Tavern talks to the provider—like a restaurant counter: you order (prompt), kitchen (model) returns food (reply).

What is an API key?

An API key is your credential—without it, the provider rejects requests. Create keys in each vendor’s console.

Chat completion vs text completion

ModeDescriptionTypical use
Chat completionMulti-turn chat formatChatGPT, Claude, Gemini, most modern stacks
Text completionContinues raw textSome older or local setups

Most 2026 setups use chat completion.


Comparing major services (2026-oriented)

ServiceQualityPriceNotesBeginner pick
Google GeminiHighFree tierStrong multilingual★★★★★
OpenAI ChatGPTVery highPaid usageTop-tier general★★★★☆
Anthropic ClaudeVery highPaid usageLong context, careful tone★★★★☆
OpenRouterVariesUsage-basedOne key, many models★★★★☆
GroqHighFree tierVery fast inference★★★☆☆

Why many beginners try Gemini first

  • Generous free tier (with limits)
  • Strong quality
  • Solid multilingual support including Japanese
  • Simple key flow

Google Gemini (free tier)

Step 1: API key

  1. Open Google AI Studio
  2. Sign in with Google
  3. Get API keyCreate API key
  4. Copy and store the key privately

💡 Tip: Treat keys like passwords—never commit them to git or paste in public chats.

Step 2: Silly Tavern

  1. Open Silly Tavern at http://localhost:8000
  2. API Connections
  3. Chat Completion tab
  4. Provider: Google
  5. Paste the key
  6. Pick a model (e.g. current Gemini Pro family name shown in ST)
  7. Connect

Step 3: Test

  1. Select a character from Day 3
  2. Send “Hello”
  3. A reply means success

OpenAI ChatGPT

Step 1: API key

  1. OpenAI Platform
  2. Account → API keysCreate new secret key
  3. Save the key once—many consoles hide it after creation

Step 2: Billing

OpenAI API is usage-based.

  1. Billing → add payment method
  2. Set a usage cap (e.g. ~$5/month) while learning

💡 Tip: Caps prevent surprise bills during experiments.

Step 3: Silly Tavern

  1. API ConnectionsOpenAI
  2. Paste key
  3. Choose model (e.g. GPT-4 family or GPT-3.5-class per your plan)
  4. Connect

Anthropic Claude

Step 1: API key

  1. Anthropic Console
  2. API keysCreate Key
  3. Copy the key

Step 2: Silly Tavern

  1. API ConnectionsClaude
  2. Paste key
  3. Pick a Claude 3.x model offered in your UI
  4. Connect

OpenRouter | Many models, one key

OpenRouter routes many vendor models behind one account—like a mall for LLMs.

Benefits

  • One key for many models
  • Usage-based billing
  • Some free or low-cost models appear over time

Setup

  1. openrouter.ai → account
  2. Create API key
  3. In Silly Tavern, choose OpenRouter and paste the key

OpenRouter - unified model access

Claude on OpenRouter - pricing

Silly Tavern OpenRouter connection


API key hygiene

Do

  • Never share keys publicly
  • Rotate periodically
  • Set budget caps where offered
  • Revoke unused keys

Don’t

  • Commit keys to GitHub
  • Screenshot keys
  • Store plaintext in insecure notes

Common errors

“Invalid API key”

Regenerate or re-paste; check for trailing spaces.

“Rate limit exceeded”

Wait, upgrade plan, or switch model/provider.

“Model not found”

Pick a model string your account actually supports.

“Connection timeout”

Check network, VPN, or corporate proxy settings.


Advanced: proxy and legacy

Proxy

Some networks need a proxy. In Advanced API settings, set Proxy URL if required.

Legacy mode

Rarely needed—enables older API shapes for niche backends.


Free or cheap paths

  1. Gemini free tier
  2. Groq free tier
  3. OpenRouter free/low-cost models when listed
  4. Local models—fully free compute-side (Day 5)

Free tiers have rate and token limits; if you hit them often, consider local models or a small paid budget.

💡 Tip: When quotas bite, Day 5 covers Ollama and local use.


Next steps | Local models

Continue with:

Easier path: MiniTavern can reduce API wiring—see Day 7.


Summary

You learned how to connect AI models in Silly Tavern—Gemini, OpenAI, Claude, OpenRouter—plus security, errors, and free-tier strategy. Next: local models for offline and $0 inference (aside from electricity).



About the author

花

花(Hana)

AI工具評価の専門家。東京・新宿三丁目周辺で活動し、最新のAIアプリケーションやツールを実際に使用してレビューを提供しています。


FAQ

Q1: Can I use it completely free?

Yes, with Gemini or Groq free tiers (limits apply) or local models on your own hardware.

Q2: Which service do you recommend first?

Many beginners start with Gemini for a free tier and solid multilingual quality.

Q3: Where do I get API keys?

Each vendor’s console: Google AI Studio, OpenAI Platform, Anthropic Console, OpenRouter dashboard, etc.

Q4: Typical cost?

Varies. OpenAI is pay-as-you-go (often a few dollars a month for light hobby use). Gemini offers free quotas.

Q5: Can I configure multiple APIs?

Yes—Silly Tavern lets you switch backends per need.

Q6: My key leaked—what now?

Revoke it immediately, issue a new key, and check billing/usage for abuse.

Q7: Offline cloud models?

No—cloud backends need the internet. Use Day 5 for offline.

Q8: Japanese and other languages?

Major cloud models handle Japanese well; Gemini is often cited for multilingual strength.

Q9: Connection still fails

Verify key, model name, billing (if required), and network. Discord communities can help debug.

Q10: Strongest models in 2026?

Top-tier paid APIs (e.g. flagship GPT-4-class and Claude Opus-class) remain among the strongest—check each vendor’s current flagship names.


Published: March 14, 2026
Last updated: March 27, 2026



Last updated: