Introduction to MCP - the Model Context Protocol
How MCP turns "LLM with tools" from bespoke glue code into a stable, language-agnostic contract you can build a platform on.
By Georgo Onyango | Mar 25, 2026 | 9 min read
Every team building with LLMs eventually hits the same wall: the model is great at reasoning, but useless without a way to do things in your systems. Read a row from the database, send a Slack message, look up a customer record, kick off a deploy. So you start writing glue code - a function the model can call, a JSON schema describing it, a router that takes the model's output and dispatches to the right Go function.
Multiply that by every tool, every model provider, every product. You end up reinventing the same plumbing in slightly different shapes, in three languages, across five repositories.
The Model Context Protocol (MCP) is the boring, useful answer to that mess. It's a small open spec that says: here's how a model client and a tool server talk to each other. Here's how the server advertises what it can do. Here's how the client invokes a tool and gets a result back. That's it. Once you speak MCP on both sides, an MCP server you wrote in Go for one product can be wired into Claude Desktop, Cursor, your own agent, or anything else that speaks MCP - without changing a line.
The mental model
MCP defines three roles:
- Host - the user-facing application (a chat UI, an IDE, your own agent). It owns the LLM calls and the user.
- Client - a connection inside the host that speaks MCP to one specific server. A host typically runs one client per connected server.
- Server - the thing exposing capabilities. A wrapper around your database, your CRM, your build system, whatever. Servers don't know about LLMs; they just expose tools, resources, and prompts.
The wire format is JSON-RPC 2.0 - requests and responses plus optional notifications, the same choice LSP (Language Server Protocol) made for the same reasons: simple, debuggable, supported everywhere. Transport is either stdio (for local servers spawned as subprocesses) or HTTP with optional SSE for streaming. Auth, when needed, sits at the transport layer.
The three primitives
An MCP server exposes capabilities through three primitives. Knowing which one to reach for is most of the design work:
- Tools - model-invoked actions. The LLM decides when to call them. Each tool has a name, a description, and a JSON Schema for its input. Examples: send_email, create_ticket, query. These are the verbs.
- Resources - readable data the host can attach to context. The user (or host) decides what to attach; the model doesn't summon a resource. URIs and MIME types identify them. Examples: file:///tmp/log.txt, postgres://prod/users/schema. Think "files in the model's clipboard."
- Prompts - parameterised templates the user can pick from a menu. "Summarise this PR," "Generate a release note from these commits." Reusable conversation starters, not autonomous capabilities.
Most of the code you'll write is tools. Resources are how you give the model context without eating tool-call budget; prompts are how you turn ad-hoc workflows into one-click commands.
A minimal MCP server in Go
Here's the contract for a single tool - the JSON shape the server returns when the client asks "what can you do?":
{
  "tools": [
    {
      "name": "list_open_jobs",
      "description": "List currently-published job postings.",
      "inputSchema": {
        "type": "object",
        "properties": {
          "department": {
            "type": "string",
            "description": "Optional filter, e.g. 'engineering'."
          },
          "limit": { "type": "integer", "default": 20 }
        }
      }
    }
  ]
}

And the Go side - using an MCP SDK to wrap the JSON-RPC plumbing. This is what a server for our recruitment neuron might look like:
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	"github.com/modelcontextprotocol/go-sdk/mcp"

	pbR "internal.tb.techbridge.build/protobuf/techbridge/tb/recruitment/v1"
)

// jobsClient (the gRPC client), buildFilter, and formatJobs are assumed to
// be defined elsewhere in this package.

type listOpenJobsArgs struct {
	Department string `json:"department,omitempty"`
	Limit      int32  `json:"limit,omitempty"`
}

func main() {
	s := mcp.NewServer("jobs-mcp", "0.1.0")

	// Register a single tool. The name + description go straight to the
	// LLM, so write them like documentation, not like Go identifiers.
	s.AddTool(mcp.Tool{
		Name:        "list_open_jobs",
		Description: "List currently-published job postings, newest first.",
		InputSchema: mcp.SchemaFor(listOpenJobsArgs{}),
	}, func(ctx context.Context, req mcp.CallToolRequest) (mcp.CallToolResult, error) {
		var args listOpenJobsArgs
		if err := json.Unmarshal(req.Arguments, &args); err != nil {
			return mcp.CallToolResult{}, fmt.Errorf("decode args: %w", err)
		}
		// Fall back to the default page size when the limit is missing
		// or out of range.
		if args.Limit <= 0 || args.Limit > 100 {
			args.Limit = 20
		}

		jobs, err := jobsClient.ListJobs(ctx, &pbR.ListJobsRequest{
			Filter:   buildFilter(args.Department),
			PageSize: args.Limit,
		})
		if err != nil {
			return mcp.CallToolResult{}, err
		}

		// The model wants something it can read; return concise summaries,
		// not raw protobufs. The tool description sets expectations of
		// what shape comes back.
		return mcp.TextResult(formatJobs(jobs.GetJobs())), nil
	})

	// Stdio is the simplest transport: the host launches us as a
	// subprocess and pipes JSON-RPC over stdin/stdout. The same binary works
	// over HTTP too; see mcp.HTTPHandler if you want to expose it
	// remotely.
	if err := s.RunStdio(context.Background(), os.Stdin, os.Stdout); err != nil {
		fmt.Fprintf(os.Stderr, "mcp server: %v\n", err)
		os.Exit(1)
	}
}

That's a complete MCP server. The host (Claude Desktop, your agent, whatever) launches this binary, asks tools/list, sees list_open_jobs, and from then on the LLM can invoke it whenever a user asks "what jobs are open in engineering?". You didn't write a router. You didn't define a JSON schema by hand (the SDK generates one from your Go struct). You didn't think about HTTP. The SDK and the protocol handled it.
Where protobufs come in
MCP itself is JSON over JSON-RPC - by design, since it has to interoperate with TypeScript hosts, Python hosts, anything. But inside the server, the calls you wrap are almost always gRPC over protobuf. That's the pattern across our products: an MCP server is a thin adapter from a freeform JSON tool call to a typed gRPC call into a real backend.
Why bother with the adapter at all? Two reasons:
- The boundary moves. Inside your service mesh you want type safety, codegen, contracts. At the LLM boundary you want flexibility, descriptions, and the model's own ability to reason about partial inputs. The adapter is where the conversion happens.
- One server, multiple clients. The same protobuf-defined recruitment service powers the SPA (gRPC-Web), the admin console (gRPC), AND the MCP server above. Each speaks its native dialect; the protobuf is the shared core.
What we use it for
On the products we build, MCP servers are the standard way to expose a neuron's capabilities to "any agent, any host." The recruitment service has a Go MCP server that wraps a small set of useful tools (list jobs, get an application, summarise a candidate). The same protobufs power our internal admin console and the public marketing site, but the MCP wrapper means a recruiter can also point Claude Desktop at it and ask questions during candidate review.
The shape we land on consistently:
- One MCP server per neuron, named after the neuron (jobs-mcp, leads-mcp).
- Tools map 1-to-1 to "useful read or write actions." Not 1-to-1 to gRPC methods - bundle related calls when the model would always need them together.
- Descriptions are the API surface. Spend the time. The model literally reads these.
- Auth via the transport (Bearer token on HTTP, a config file for stdio); the underlying gRPC service still enforces its own IAM.
When MCP is the wrong answer
MCP solves tool exposure. It doesn't solve agent orchestration, multi-agent coordination, or persistent state. If you find yourself wanting one MCP server to "call another MCP server," you've outgrown it - what you actually want is something like A2A, the agent-to-agent protocol, sitting one level up.
Use MCP when the answer is "I want a model to be able to do this thing." Use a richer protocol when the answer is "I want these two systems to collaborate."