1. The "USB-C" Moment for AI
If you have built AI agents before, you know the pain. You write a Python function to fetch data. Then you wrap it in OpenAI's function schema. Then you migrate to Anthropic and have to rewrite the schema. Then you try LangChain and add another abstraction layer.
The Solution: The Model Context Protocol (MCP). Think of it as a universal standard—like USB-C—for connecting AI models to data and tools.
In this guide, we aren't just building a server. We are going Full Stack. We will build:
- The Hands (Server): A Python tool that tells jokes and fetches trivia.
- The Brain (Client): A custom Python script using Google's Gemini 2.5 Flash (Free Tier) to decide which tool to use.
Why This Stack?
FastMCP (Server) + Google GenAI SDK (Client) = $0 Cost & 100% Open Standards.
2. Prerequisites & Setup
You need Python 3.10 or higher. We will use the mcp SDK for the protocol and google-generativeai for the brain.
```shell
# Create a virtual environment
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows

# Install dependencies
pip install "mcp[cli]" google-generativeai
```
Note: You will need a free API key from Google AI Studio.
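Once you have a key, export it as an environment variable. The name `GEMINI_API_KEY` is a convention we follow here; it just has to match whatever the client code reads:

```shell
# macOS / Linux
export GEMINI_API_KEY="your-key-here"

# Windows (PowerShell)
$env:GEMINI_API_KEY = "your-key-here"
```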
3. The Server: Building "FunTools"
We'll use FastMCP, a high-level wrapper that makes creating servers incredibly simple. Instead of a boring calculator, let's build a "Fun & Trivia" server.
Create a file named server.py:
```python
from mcp.server.fastmcp import FastMCP
import random

# 1. Initialize the Server
mcp = FastMCP("FunTools")

# 2. Define a Tool: Joke Generator
@mcp.tool()
def tell_joke(category: str = "programming") -> str:
    """Returns a random joke based on a category (programming, dad, or general)."""
    jokes = {
        "programming": [
            "Why do programmers prefer dark mode? Because light attracts bugs.",
            "There are 10 types of people in the world: those who understand binary, and those who don't."
        ],
        "dad": [
            "I'm afraid for the calendar. Its days are numbered.",
            "Why don't skeletons fight each other? They don't have the guts."
        ],
        "general": [
            "I told my wife she was drawing her eyebrows too high. She looked surprised."
        ]
    }
    return random.choice(jokes.get(category, jokes["general"]))

# 3. Define a Tool: Fun Fact
@mcp.tool()
def get_fun_fact(topic: str) -> str:
    """Returns a fun fact about a specific topic."""
    facts = {
        "space": "One day on Venus is longer than one year on Venus.",
        "animals": "Octopuses have three hearts and blue blood.",
        "history": "Cleopatra lived closer in time to the moon landing than to the Great Pyramid's construction."
    }
    return facts.get(topic.lower(), f"I don't have a fun fact about {topic} yet!")

if __name__ == "__main__":
    # This runs the server over Stdio (Standard Input/Output)
    mcp.run()
```
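Under the hood, FastMCP inspects each function's signature and docstring and derives a JSON Schema that it advertises to clients. Here's a rough, dependency-free sketch of that mapping — not FastMCP's actual code, just the idea:

```python
import inspect
import json

def tell_joke(category: str = "programming") -> str:
    """Returns a random joke based on a category (programming, dad, or general)."""

def signature_to_schema(fn) -> dict:
    """Map a Python signature to a JSON-Schema-like dict, the way an
    MCP server describes a tool's inputs to clients."""
    type_names = {str: "string", int: "integer", float: "number", bool: "boolean"}
    props, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        props[name] = {"type": type_names.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default -> caller must supply it
    return {"type": "object", "properties": props, "required": required}

print(json.dumps(signature_to_schema(tell_joke), indent=2))
```

Because `category` has a default value, it ends up in `properties` but not in `required` — exactly the distinction the client will pass along to the model.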
Pro Tip: You can test this immediately without writing a client! Run `mcp dev server.py` in your terminal to launch the MCP Inspector, a web UI for interacting with your tools.
4. The Client: The "Brain" (Gemini 2.5)
Now for the hard part (that most tutorials skip). We need a client that:
- Connects to our `server.py` subprocess.
- Discovers the tools (`tell_joke`, `get_fun_fact`).
- Passes those tool definitions to Gemini.
- Executes the tool when Gemini asks for it.
Create a file named client.py:
```python
import asyncio
import os

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
import google.generativeai as genai

# CONFIGURATION
API_KEY = os.environ.get("GEMINI_API_KEY")  # Make sure this is set!
genai.configure(api_key=API_KEY)

# 1. Define Server Connection
server_params = StdioServerParameters(
    command="python",
    args=["server.py"],
    env=None,
)

def clean_schema(schema: dict) -> dict:
    """Strip JSON-Schema keys Gemini's Schema proto doesn't accept
    (like 'title' and 'default') and uppercase the type names."""
    cleaned = {}
    for key, value in schema.items():
        if key in ("title", "default"):
            continue
        if key == "type" and isinstance(value, str):
            cleaned[key] = value.upper()  # "string" -> "STRING"
        elif isinstance(value, dict):
            cleaned[key] = clean_schema(value)
        else:
            cleaned[key] = value
    return cleaned

async def run_agent():
    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # 2. Initialize & List Tools
            await session.initialize()
            mcp_tools = await session.list_tools()

            # 3. Convert MCP Tools to Gemini Format
            # This is the "glue" code. (In a real app, you'd map the
            # schemas fully dynamically; clean_schema covers the basics.)
            gemini_tools = [
                # We tell Gemini these functions exist
                genai.protos.Tool(function_declarations=[
                    genai.protos.FunctionDeclaration(
                        name=tool.name,
                        description=tool.description,
                        parameters={
                            "type": "OBJECT",
                            "properties": {
                                name: clean_schema(prop)
                                for name, prop in tool.inputSchema["properties"].items()
                            },
                            "required": tool.inputSchema.get("required", []),
                        },
                    ) for tool in mcp_tools.tools
                ])
            ]

            # 4. Initialize the Model
            model = genai.GenerativeModel(
                model_name="gemini-1.5-flash",  # Or a 2.x model if available in your region
                tools=gemini_tools,
            )

            # Note: Gemini's automatic function calling only works when the
            # Python functions live in the client process. Here the logic
            # lives in the MCP server, so we handle the "Agent Loop"
            # manually for transparency.
            prompt = "Tell me a dad joke and then a fun fact about space."
            print(f"User: {prompt}")

            # Step A: Send prompt to the model
            response = model.generate_content(prompt)

            # Check if the model wants to call a function
            for part in response.parts:
                if fn := part.function_call:
                    print(f"🤖 AI wants to call: {fn.name} with args: {dict(fn.args)}")

                    # Step B: Execute the tool on the MCP server
                    result = await session.call_tool(fn.name, dict(fn.args))
                    print(f"🔧 Tool Output: {result.content[0].text}")

                    # Step C: Feed the result back to the model (simplified for
                    # demo). In a full loop, you'd append this to history and
                    # call generate_content again.
                    print("-" * 20)

if __name__ == "__main__":
    asyncio.run(run_agent())
```
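The comment in Step C glosses over the feedback half of the loop. The general shape of a full agent loop looks like the sketch below, with `call_model` and `call_tool` as stand-ins for the real Gemini call and MCP session (the stub functions are purely illustrative):

```python
# A generic agent loop: keep calling the model until it stops requesting tools.
def run_loop(prompt, call_model, call_tool, max_steps=5):
    history = [("user", prompt)]
    for _ in range(max_steps):
        reply = call_model(history)      # the model sees the full history each turn
        if reply["type"] == "text":
            return reply["text"]         # final answer, loop ends
        # Otherwise the model asked for a tool: run it, append the result,
        # and go around again so the model can use the output.
        result = call_tool(reply["name"], reply["args"])
        history.append(("tool", {"name": reply["name"], "result": result}))
    return "Stopped: too many steps."

# Tiny stubs to show the control flow
def fake_model(history):
    if any(role == "tool" for role, _ in history):
        return {"type": "text", "text": "Here's your joke!"}
    return {"type": "tool_call", "name": "tell_joke", "args": {"category": "dad"}}

def fake_tool(name, args):
    return "I'm afraid for the calendar. Its days are numbered."

print(run_loop("Tell me a dad joke", fake_model, fake_tool))
```

The `max_steps` guard matters in practice: without it, a confused model that keeps requesting tools would spin forever.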
5. Why This Architecture Matters
Notice what just happened. Your Client (Gemini) has no idea how the joke code works. It just knows the interface. Your Server has no idea which AI is calling it.
You have successfully decoupled intelligence from execution. This is the foundation of robust System Design for AI.
Next Steps
- Add a Database: Create a tool that reads/writes to a local SQLite file.
- Dockerize: Wrap `server.py` in a Docker container and run it on AWS Lambda.
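As a taste of the first next step, here's a minimal sketch of two SQLite-backed tools, written as plain functions (in the real server you'd decorate them with `@mcp.tool()` like the others; the table name and file path are my own choices):

```python
import sqlite3

DB_PATH = "notes.db"  # local SQLite file; pick any path you like

def save_note(text: str) -> str:
    """Saves a note to the local database and returns a confirmation."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT)")
        conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))
    return f"Saved note: {text!r}"

def list_notes() -> str:
    """Returns all saved notes, one per line."""
    with sqlite3.connect(DB_PATH) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, text TEXT)")
        rows = conn.execute("SELECT text FROM notes ORDER BY id").fetchall()
    return "\n".join(text for (text,) in rows) or "No notes yet."
```

Because the AI only ever sees the tool's name, docstring, and schema, swapping the joke dictionary for a real database changes nothing on the client side — which is the whole point of the decoupling above.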