How to Set Up LM Studio MCP Server (2026)
Use LM Studio as an MCP server for AI-powered tool calling
LM Studio lets you run large language models locally with a polished desktop interface. With MCP (Model Context Protocol) support, you can now connect LM Studio's local models to MCP clients, or use LM Studio itself as an MCP client that connects to external MCP servers. This guide covers both directions: using LM Studio as an MCP server for other clients, and connecting MCP tool servers to LM Studio.
What Is LM Studio?
LM Studio is a desktop application for running open-source language models locally on macOS, Windows, and Linux. It provides a user-friendly interface for downloading models from Hugging Face, configuring inference parameters, and running a local API server. Key features include:
- One-click model download from Hugging Face
- GPU acceleration (NVIDIA, AMD, Apple Silicon)
- OpenAI-compatible local API server
- Built-in chat interface with conversation management
- Model quantization and parameter control
What Is MCP?
MCP (Model Context Protocol) is an open standard by Anthropic that connects AI models to external tools and data sources. In the context of LM Studio, MCP enables two workflows:
- LM Studio as MCP server: Other applications connect to LM Studio's local LLM through the MCP protocol
- LM Studio as MCP client: LM Studio connects to MCP tool servers (file systems, databases, APIs) so your local models can use external tools
Prerequisites
| Requirement | Details |
|---|---|
| LM Studio | Version 0.3.x or later (MCP client support added in 0.3.17) |
| RAM | 8 GB minimum, 16+ GB recommended |
| GPU (optional) | NVIDIA 6+ GB VRAM or Apple Silicon |
| Storage | 10+ GB free for models |
| Node.js | v18+ (for custom MCP servers) |
Part 1: Using LM Studio's Built-in API as an MCP-Compatible Server
LM Studio's local server exposes an OpenAI-compatible API. While this is not a native MCP server, you can bridge it to MCP clients using a proxy.
Step 1: Download and Install LM Studio
Download LM Studio from lmstudio.ai for your platform. Install and launch the application.
Step 2: Download a Model
In LM Studio's search bar, find a model that supports tool calling well:
- Qwen 2.5 7B Instruct -- excellent tool calling support
- Llama 3.1 8B Instruct -- strong general-purpose model
- Mistral 7B Instruct v0.3 -- good balance of speed and quality
- Hermes 3 8B -- specifically tuned for function calling
Click the download button next to your chosen model. Wait for the download to complete.
Step 3: Start the Local Server
- Go to the Local Server tab in LM Studio (the <-> icon)
- Select your downloaded model from the dropdown
- Toggle the server ON
- Note the server URL (default: http://localhost:1234)
The server is now running and accepting OpenAI-compatible API requests.
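A quick way to confirm it is up is to list the models the server exposes (the endpoint follows the OpenAI convention; the names returned depend on which models you have loaded):

```bash
# List the models LM Studio is currently serving
curl http://localhost:1234/v1/models
```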
Step 4: Verify the Server
Test the server with a cURL request:
```bash
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "messages": [
      {"role": "user", "content": "Hello, are you working?"}
    ],
    "temperature": 0.7
  }'
```
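If the server is working, you should get back an OpenAI-style completion object roughly like the following (an illustrative sketch; exact fields and values vary by model and LM Studio version):

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "qwen2.5-7b-instruct",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Yes, I'm running locally and ready to help!"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 14, "completion_tokens": 10, "total_tokens": 24}
}
```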
Step 5: Bridge to MCP with lmstudio-mcp-server
To expose LM Studio as a proper MCP server, use the community lmstudio-mcp-server bridge:
```bash
npm install -g lmstudio-mcp-server
```
Run the bridge:
```bash
lmstudio-mcp-server --port 1234
```
This creates an MCP server that routes tool calls through LM Studio's local model.
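Before pointing a client at the bridge, you can sanity-check it with the official MCP Inspector, which launches the given server command and lets you browse and invoke its tools interactively (the `--port` flag is the bridge option shown above; adjust if your install differs):

```bash
# Launch MCP Inspector against the bridge to browse its tools
npx @modelcontextprotocol/inspector lmstudio-mcp-server --port 1234
```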
Step 6: Connect to Claude Desktop
Add the bridge to Claude Desktop's MCP configuration:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
```json
{
  "mcpServers": {
    "lmstudio": {
      "command": "npx",
      "args": ["lmstudio-mcp-server", "--port", "1234"]
    }
  }
}
```
Restart Claude Desktop. The LM Studio server will appear as an available MCP connection.
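If the connection does not appear, Claude Desktop's MCP logs are the first place to look (the path below is the macOS default; exact filenames can vary by version):

```bash
# Follow Claude Desktop's MCP logs on macOS
tail -f ~/Library/Logs/Claude/mcp*.log
```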
Part 2: Connecting MCP Servers to LM Studio
LM Studio 0.3.17+ supports connecting to external MCP servers, giving your local models the ability to use tools like file access, web search, or database queries.
Step 1: Open LM Studio's mcp.json
- Open LM Studio
- Open the Program tab in the right-hand sidebar of the chat view
- Click Install > Edit mcp.json to open the configuration file
Step 2: Add an MCP Server
You can add MCP servers by specifying the command used to launch them, as entries under the "mcpServers" key. For example, to add a filesystem MCP server:
```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/Documents"
      ]
    }
  }
}
```
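It is worth running the same command by hand first; if it starts and sits waiting on stdin without errors, LM Studio should be able to launch it too:

```bash
# Run the filesystem server manually to confirm it starts cleanly
npx -y @modelcontextprotocol/server-filesystem /Users/yourname/Documents
```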
Step 3: Add Multiple MCP Servers
Here is a configuration with several useful MCP servers (note the SQLite reference server is Python-based and runs via uv's uvx rather than npx):
```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": [
        "-y",
        "@modelcontextprotocol/server-filesystem",
        "/Users/yourname/projects"
      ]
    },
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "ghp_your_github_token"
      }
    },
    "sqlite": {
      "command": "uvx",
      "args": [
        "mcp-server-sqlite",
        "--db-path",
        "/path/to/your/database.db"
      ]
    }
  }
}
```
Step 4: Use Tools in Chat
After adding MCP servers, load a model that supports tool calling (Qwen 2.5, Hermes 3, or Llama 3.3 Instruct). In the chat, the model can now use the connected tools.
Example interactions:
- "List all Python files in my projects directory" (uses filesystem MCP)
- "Show me the open issues on my GitHub repo" (uses GitHub MCP)
- "Query the users table and show me the top 10 by signup date" (uses SQLite MCP)
The model will automatically detect which MCP tool to call based on your request.
Best Models for MCP Tool Calling
Not all models handle tool calling equally well. Here are the best options for MCP use:
| Model | Size | Tool Calling | Speed | Quality |
|---|---|---|---|---|
| Qwen 2.5 7B Instruct | 4.5 GB (Q4) | Excellent | Fast | High |
| Llama 3.1 8B Instruct | 5 GB (Q4) | Very Good | Fast | High |
| Hermes 3 8B | 5 GB (Q4) | Excellent | Fast | High |
| Qwen 2.5 72B Instruct | 42 GB (Q4) | Excellent | Slow | Very High |
| Mistral Small 24B | 14 GB (Q4) | Good | Medium | High |
For most users, Qwen 2.5 7B Instruct offers the best balance of tool-calling reliability and performance.
Configuring LM Studio Server Parameters
Fine-tune your local server for MCP workloads:
```json
{
  "contextLength": 8192,
  "temperature": 0.1,
  "maxTokens": 4096,
  "gpu": {
    "offloadLayers": -1
  }
}
```
Key settings:
- contextLength: Set to 8192 or higher for complex tool-calling chains
- temperature: Use 0.1 or lower for reliable tool calling (higher values cause erratic tool use)
- maxTokens: Set high enough for the model to complete tool call responses
- GPU offload: Set to -1 to offload all layers to GPU for maximum speed
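Note that the snippet above uses LM Studio's own setting names; when you call the server through its OpenAI-compatible API, the equivalent per-request parameters use OpenAI naming and take precedence for that call:

```bash
# Per-request overrides using OpenAI-style parameter names
curl http://localhost:1234/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen2.5-7b-instruct",
    "temperature": 0.1,
    "max_tokens": 4096,
    "messages": [{"role": "user", "content": "List the tools you can call."}]
  }'
```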
Using LM Studio with Python
You can also interact with LM Studio's local server programmatically through its OpenAI-compatible tool-calling API:
```python
from openai import OpenAI

client = OpenAI(
    api_key="lm-studio",
    base_url="http://localhost:1234/v1"
)

# Define tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"}
                },
                "required": ["city"]
            }
        }
    }
]

response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",
    messages=[
        {"role": "user", "content": "What is the weather in Tokyo?"}
    ],
    tools=tools,
    tool_choice="auto"
)

# Check if the model wants to call a tool
message = response.choices[0].message
if message.tool_calls:
    for tool_call in message.tool_calls:
        print(f"Tool: {tool_call.function.name}")
        print(f"Args: {tool_call.function.arguments}")
```
Troubleshooting
"Model does not support tool calling" Make sure you are using an instruct-tuned model that supports function calling. Base models and some older fine-tunes do not support the tool-calling format.
MCP server not showing up in LM Studio Restart LM Studio after adding MCP server configurations. Check that Node.js is installed and the MCP server package installs correctly by running the command manually in your terminal.
Tool calls returning errors Lower the temperature to 0.1 or below. Higher temperatures can cause the model to generate malformed tool call JSON. Also ensure the context length is set high enough (at least 4096).
Slow inference with tool calling Tool calling requires extra tokens for the function definitions and responses. Use a smaller model or increase GPU offloading. Consider using Q4_K_M quantization for the best speed/quality tradeoff.
Port conflicts If port 1234 is already in use, change the LM Studio server port in settings. Update all MCP configurations to match the new port.
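To see what is already bound to the port before changing it (standard OS tools, nothing LM Studio-specific):

```bash
# macOS/Linux: show the process listening on port 1234
lsof -i :1234

# Windows: find the owning process ID
netstat -ano | findstr :1234
```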
LM Studio MCP vs. Other Options
| Feature | LM Studio + MCP | Ollama + MCP | Claude Desktop | Cursor |
|---|---|---|---|---|
| Local models | Yes | Yes | No | No |
| GUI interface | Yes | No (CLI only) | Yes | Yes |
| MCP client | Yes (v0.3.17+) | Via bridge | Yes | Yes |
| MCP server | Via bridge | Via bridge | N/A | N/A |
| Tool calling | Model-dependent | Model-dependent | Built-in | Built-in |
| Cost | Free (local compute) | Free (local compute) | $20/month Pro | $20/month |
| Setup difficulty | Easy | Medium | Easy | Easy |
Wrapping Up
LM Studio's MCP support bridges the gap between local AI models and the growing ecosystem of MCP tools. Whether you want to use local models as an MCP server for other applications or connect external tools to your local LLM, the setup is straightforward. The key to success is choosing a model with strong tool-calling capabilities and keeping the temperature low for reliable function execution.
If your workflow also involves AI media generation like images, video, or talking avatars, check out Hypereal AI for a unified API that handles all of it alongside your LLM workflows.
Try Hypereal AI free -- 35 credits, no credit card required.