
Tan Wang | Software Engineer, Agent Foundations
Over the last year, Pinterest has gone from “MCP sounds interesting” to running a growing ecosystem of Model Context Protocol (MCP) servers, a central registry, and production integrations in our IDEs, internal chat surfaces, and AI agents. This post walks through what we’ve built so far, how we designed it, and where we’re taking MCP next.
What Is MCP and Why Did We Care?
Model Context Protocol (MCP) is an open-source standard that lets large language models talk to tools and data sources over a unified client-server protocol, instead of bespoke, one-off integrations for every model and every tool. At Pinterest, we’re using MCP as the substrate for AI agents that can safely automate engineering tasks, not just answer questions. That includes everything from “read some logs and tell me what’s wrong” to “look into a bug ticket and propose a fix PR.”
The Initial Architecture: Internal MCP + Registry
Hosted, Not Local
Although MCP supports local servers (running on your laptop or personal cloud development box, communicating over stdio), we explicitly optimized for internal cloud-hosted MCP servers, where our internal routing and security logic can best be applied.
Local MCP servers are still possible for experimentation, but the paved path is “write a server, deploy it to our cloud compute environment, list it in the registry.”
Many Small Servers, Not One Giant One
We debated a single monolithic MCP server vs. multiple domain-specific servers. We chose the latter: multiple MCP servers (e.g., Presto, Spark, Airflow) each own a small, coherent set of tools. This lets us apply different access controls per server and avoid crowding the model’s context.
A common piece of feedback we received early on was that spinning up a new MCP server required too much work: deployment pipelines, service configuration, and operational setup before writing any business logic. To address this, we created a unified deployment pipeline that handles infrastructure for all MCP servers: teams define their tools and the platform handles deployment and scaling of their service. This lets domain experts focus on their business logic rather than figuring out deployment mechanics.
The Internal MCP Registry
The MCP registry is the source of truth for which MCP servers are approved and how to connect to them. It serves two audiences. The web UI lets humans discover servers, the owning team, corresponding support channels, and security posture. The web UI also shows each MCP server’s live status and visible tools. The API lets AI clients (e.g., our internal AI chat platform, AI agents on our internal communications platform, IDE integrations) discover and validate servers, and lets internal services ask “Is this user allowed to use server X?” before letting an agent call into it.
This is also the backbone for governance: only servers registered here count as “approved for use in production.”
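To make the registry's two jobs (discovery and access checks) concrete, here is a minimal in-memory sketch in Python. The RegistryEntry fields, the Registry class, the URLs, and the group names are all hypothetical; the post does not describe the registry's actual schema or API, and treating an empty group set as "open to everyone" is an assumption made only for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RegistryEntry:
    """One approved MCP server (hypothetical schema, for illustration only)."""
    name: str
    url: str
    owning_team: str
    allowed_groups: frozenset = field(default_factory=frozenset)


class Registry:
    """In-memory stand-in for a registry API."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        self._entries[entry.name] = entry

    def is_allowed(self, server_name, user_groups):
        """Answer "is this user allowed to use server X?"."""
        entry = self._entries.get(server_name)
        if entry is None:
            return False  # unregistered servers are never approved
        if not entry.allowed_groups:
            return True  # assumption: an empty set means no group restriction
        return bool(entry.allowed_groups & set(user_groups))

    def discover(self, user_groups):
        """Return the names of servers this user may connect to, sorted."""
        return sorted(
            name for name in self._entries
            if self.is_allowed(name, user_groups)
        )


registry = Registry()
registry.register(RegistryEntry(
    "presto", "https://presto-mcp.example.internal", "data-platform",
    frozenset({"ads-eng", "finance"}),
))
registry.register(RegistryEntry(
    "knowledge", "https://knowledge-mcp.example.internal", "agent-foundations",
))
```

A client would call discover() once per session to build its tool list, while a service sitting in front of an agent would call is_allowed() on each connection attempt.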
What We Shipped
A Growing Fleet of MCP Servers
We started by seeding a small set of high-leverage MCP servers that solved real pain points, then let other teams build on top of that.
Representative examples (by usage):
- Presto MCP server: consistently our highest-traffic MCP server. Presto tools let agents (including AI-enabled IDEs) pull Presto-backed data on demand so agents can bring data directly into their workflows instead of context-switching into dashboards.
- Spark MCP server: underpins our AI Spark debugging experience, used to diagnose Spark job failures, summarize logs, and help record structured root-cause analyses, turning noisy operational threads into reusable knowledge.
- Knowledge MCP server: a general-purpose knowledge endpoint (used by our internal AI bot for company knowledge and Q&A, and by other agents, to answer documentation and debugging questions across internal sources), so agents can reach for institutional knowledge with the same ease as calling a tool.
Integrations Into Pinterest Surfaces
We didn’t want MCP to be a science project; it had to show up where engineers already work.
Our internal LLM web chat interface is used by the majority of Pinterest employees daily. The frontend automatically performs OAuth flows where required, and returns a list of usable tools for the current user, scoped to respect security policies. Once connected, our AI chat agent binds MCP tools directly into its agent toolset so invoking MCP feels no different from calling any other tool.
We also have AI bots embedded in our internal chat platform, which also expose MCP tools. Like our LLM web chat interface, they handle authentication and authorization through the registry API. They also support functionality such as restricting certain MCP tools to certain communication channels (for example, Spark MCP tools are only available in Airflow support channels).
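The channel restriction can be pictured as a filter applied to the tool list before it is handed to the bot. The sketch below is an assumption-heavy illustration: the "server.tool" naming scheme, the CHANNEL_RULES table, and the channel and tool names are hypothetical and not taken from Pinterest's implementation.

```python
# Hypothetical rule table: server name -> channels where its tools may be used.
# Servers that do not appear here are treated as unrestricted.
CHANNEL_RULES = {"spark": {"airflow-support"}}


def tools_for_channel(tools, channel):
    """Return the tools a bot may expose in `channel`, preserving order."""
    visible = []
    for tool in tools:
        # Assumes tool names look like "<server>.<tool>".
        server = tool.split(".", 1)[0]
        allowed_channels = CHANNEL_RULES.get(server)
        if allowed_channels is None or channel in allowed_channels:
            visible.append(tool)
    return visible


all_tools = ["spark.diagnose_job", "knowledge.search"]
```

Filtering at listing time means the model never sees a tool it cannot call in that channel, which also keeps its context smaller.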
An overview of the flow from starting to build an MCP server to when it’s consumed by an end user:
Security, Governance, and Policy
Letting AI agents call tools that touch real systems and data raises obvious security questions. We’ve treated MCP as a joint project with Security from day one.
Security Standards and Review
We defined a dedicated MCP Security Standard. Every MCP server that is not a one-off experiment must be tied to an owning team, appear in the internal MCP registry, and go through review, yielding Security, Legal/Privacy, and (where applicable) GenAI review tickets that must be approved before production use. This set of reviews determines the security policies put in place around the MCP server, such as which user groups may access it.
AuthN and AuthZ
At runtime, almost every MCP call is governed by two layers of auth: end-user JWTs and mesh identities.
End-user flow (JWT-based)
- A user interacts with a surface like our web AI chat interface, an IDE plugin, or an AI bot.
- The client performs an OAuth flow against our internal auth stack and sends the resulting JWT when it connects to the MCP registry and the target MCP server.
- Envoy validates the JWT, maps it to X-Forwarded-User, X-Forwarded-Groups, and related headers, and enforces coarse-grained security policies (for example, “AI chat webapp in prod may talk to the Presto MCP server, but not to experimental MCP servers in dev namespaces”).
- Inside the server, tools use a lightweight @authorize_tool(policy='…') decorator to enforce finer-grained rules (for example, only Ads-eng groups can call get_revenue_metrics, even if the server itself is reachable from other orgs).
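The last step of the flow above, per-tool authorization, could be sketched as a Python decorator like the one below. This is not Pinterest's implementation: the POLICIES table, the policy name, the ToolAccessDenied exception, the comma-separated header format, and the convention of passing request headers as the first argument are all assumptions made for illustration. Only the decorator name, the policy keyword, the header names, and the example tool name come from the post.

```python
import functools

# Hypothetical policy table: policy name -> groups allowed to call the tool.
POLICIES = {"ads-revenue-readers": {"ads-eng"}}


class ToolAccessDenied(Exception):
    """Raised when the caller's groups do not satisfy the tool's policy."""


def authorize_tool(policy):
    """Reject the call unless the caller is in a group named by `policy`."""
    allowed = POLICIES[policy]

    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(headers, *args, **kwargs):
            # Assumes the proxy forwards groups as a comma-separated header.
            raw = headers.get("X-Forwarded-Groups", "")
            groups = {g.strip() for g in raw.split(",") if g.strip()}
            if not groups & allowed:
                user = headers.get("X-Forwarded-User", "<unknown>")
                raise ToolAccessDenied(f"{user} may not call {tool.__name__}")
            return tool(headers, *args, **kwargs)
        return wrapper
    return decorator


@authorize_tool(policy="ads-revenue-readers")
def get_revenue_metrics(headers, quarter):
    # Placeholder body; a real tool would query a data system here.
    return {"quarter": quarter, "requested_by": headers["X-Forwarded-User"]}
```

Because the check reads only the forwarded headers, it relies on the proxy having already validated the JWT; the decorator must never be reachable by traffic that bypasses that proxy.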
Note that since some MCP servers can execute queries against sensitive internal data systems (like the Presto MCP server), we implemented business-group-based access gating. Rather than granting access to all authenticated Pinterest employees and contractors, some servers will:
- Extract business group membership from the user’s JWT
- Validate that the user belongs to an authorized group before accepting the connection (the list of approved groups is set during the initial review stage)
- Selectively enable capabilities only for users whose roles require data access
At Pinterest, this means that even though the Presto MCP server is technically reachable from broad surfaces like our LLM web chat interface, only a specific set of approved business groups (for example, Ads, Finance, or specific infra teams) can establish a session and run the higher-privilege tools. Turning on a powerful, data-heavy MCP server in a popular surface therefore doesn’t silently expand who can see sensitive data.
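As a rough illustration of connection-time gating, the sketch below reads group membership out of a JWT payload and compares it against an approved list. It assumes the token's signature was already validated upstream (for example, at the proxy), so it decodes the payload without verifying it. The "groups" claim name, the APPROVED_GROUPS values, and the make_demo_token helper are hypothetical.

```python
import base64
import json

# Hypothetical approved list; per the post, the real one is set during review.
APPROVED_GROUPS = {"ads", "finance"}


def jwt_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature.

    Safe only if signature validation already happened upstream.
    """
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def may_open_session(token: str) -> bool:
    """Allow the connection only for members of an approved business group."""
    groups = set(jwt_claims(token).get("groups", []))  # claim name is assumed
    return bool(groups & APPROVED_GROUPS)


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def make_demo_token(claims: dict) -> str:
    """Build a demo token with a placeholder signature (for testing only)."""
    header = _b64url(json.dumps({"alg": "none"}).encode())
    body = _b64url(json.dumps(claims).encode())
    return f"{header}.{body}.placeholder-signature"
```

Rejecting at session setup, rather than per tool call, means an unauthorized user never receives the tool list for the sensitive server at all.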
Some servers require a valid JWT even for tool discovery. That gives us user-level attribution for every invocation and a clean way to reason about “who did what” when we look at logs.
Service-only flows (SPIFFE-based)
For low-risk, read-only scenarios, we can rely on SPIFFE-based auth (mesh identity only). Our internal service mesh still enforces security policies, but the server authorizes based on the calling service’s mesh identity instead of a human JWT. We reserve this pattern for cases where there’s no end user in the loop and the blast radius is tightly constrained.
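A service-only check might look like the sketch below, which authorizes a caller purely from its SPIFFE ID (a URI of the form spiffe://trust-domain/path). The trust domain and the workload path allowlist are hypothetical, and the sketch assumes the mesh has already authenticated the caller and supplied its verified ID.

```python
from urllib.parse import urlparse

# Hypothetical values; a real deployment would load these from policy config.
TRUST_DOMAIN = "example.internal"
ALLOWED_WORKLOADS = {"/ns/prod/sa/log-summarizer"}


def is_authorized_service(spiffe_id: str) -> bool:
    """Authorize by mesh identity alone: scheme, trust domain, and path."""
    parsed = urlparse(spiffe_id)
    return (
        parsed.scheme == "spiffe"
        and parsed.netloc == TRUST_DOMAIN
        and parsed.path in ALLOWED_WORKLOADS
    )
```

An exact-match allowlist is deliberately strict: it keeps the set of callers small and explicit, which fits the constrained blast radius this pattern is reserved for.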
Contrast with the MCP OAuth Standard
The MCP specification defines an OAuth 2.0 authorization flow where users explicitly authenticate with each MCP server, typically involving consent screens and per-server token management. Our approach is different: users already authenticate against our internal auth stack when they open a surface like the AI chat interface, so we piggyback on that existing session. There is no additional login prompt or consent dialog when a user invokes an MCP tool. Envoy and our policy decorators handle authorization transparently in the background, giving us fine-grained control over who can call which tools without surfacing the complexity of per-server authorization flows to the end user.
Human in the Loop
Because MCP servers enable automated actions, the blast radius is larger than if a human manually wielded these tools. Our agent guidance therefore mandates human-in-the-loop before any sensitive or expensive action: agents propose actions using MCP tools, and humans approve or reject (optionally in batches) before execution. We also use elicitation to confirm dangerous actions. In practice, this looks like our AI agents asking for confirmation before applying a change that would, for example, overwrite data in a table.
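The propose-then-approve pattern can be sketched as a small gate in front of execution. The ProposedAction type, the execute_with_approval function, and the approve callback are hypothetical stand-ins; a real system would route the approval through a UI prompt or MCP elicitation rather than a plain Python callable.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProposedAction:
    """An action an agent wants to take (hypothetical type)."""
    description: str
    run: Callable[[], str]
    sensitive: bool = True  # default to the safe side


def execute_with_approval(actions, approve):
    """Run each action; sensitive ones run only if `approve(action)` is true."""
    results = []
    for action in actions:
        if action.sensitive and not approve(action):
            results.append((action.description, "rejected"))
            continue
        results.append((action.description, action.run()))
    return results


table = {"rows": 3}


def overwrite_table():
    table["rows"] = 0
    return "overwritten"


actions = [
    ProposedAction("count rows", lambda: f"{table['rows']} rows", sensitive=False),
    ProposedAction("overwrite table", overwrite_table),
]
```

Marking actions as sensitive by default means a tool author has to opt out of review explicitly, so a forgotten flag fails closed.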
Observability and Success Metrics
We didn’t want MCP to become a black box. From the start, we designed it to be measured and observable. All MCP servers at Pinterest use a set of library functions that provide logging for inputs/outputs, invocation counts, exception tracing, and other telemetry for impact analysis out of the box. At the ecosystem level, we measure the number of MCP servers and tools registered, the number of invocations across all servers, and the estimated time-savings per invocation provided as metadata by server owners.
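A shared telemetry helper of the kind described above could be as small as a decorator. The sketch below is illustrative only: the instrumented decorator, the logger name, the invocation_counts counter, and the example tool are hypothetical, and a production library would send these signals to a metrics backend rather than an in-process counter.

```python
import functools
import logging
from collections import Counter

logger = logging.getLogger("mcp.telemetry")  # hypothetical logger name
invocation_counts = Counter()  # stand-in for a real metrics backend


def instrumented(tool):
    """Log inputs and outputs, count invocations, and trace exceptions."""
    @functools.wraps(tool)
    def wrapper(*args, **kwargs):
        invocation_counts[tool.__name__] += 1
        logger.info("call %s args=%r kwargs=%r", tool.__name__, args, kwargs)
        try:
            result = tool(*args, **kwargs)
        except Exception:
            # logger.exception records the traceback, then we re-raise.
            logger.exception("tool %s raised", tool.__name__)
            raise
        logger.info("result %s -> %r", tool.__name__, result)
        return result
    return wrapper


@instrumented
def summarize_logs(lines):
    errors = sum("ERROR" in line for line in lines)
    return f"{len(lines)} lines, {errors} errors"
```

Logging raw inputs and outputs is convenient for debugging, but tools that handle sensitive data would likely need redaction before anything is written.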
These roll up into a single north-star metric: time saved. For each tool, owners provide a directional “minutes saved per invocation” estimate (based on lightweight user feedback and comparison to the prior manual workflow). Combined with invocation counts, we get an order-of-magnitude view of impact, which we treat as a directional signal of value. As of January 2025, MCP servers have ramped up to 66,000 invocations per month across 844 monthly active users. Using these estimates, MCP tools are saving on the order of 7,000 hours per month.
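The roll-up is simple arithmetic: multiply each tool's invocation count by its owner-supplied estimate and sum. The sketch below shows that calculation and also derives the average implied by the reported totals. The per-tool figures in example_tools are made up for illustration; only the 66,000 invocations and 7,000 hours come from the post.

```python
def hours_saved(tools):
    """Sum invocations x minutes-saved estimates and convert to hours."""
    total_minutes = sum(
        t["invocations"] * t["minutes_saved_per_invocation"] for t in tools
    )
    return total_minutes / 60


def implied_minutes_per_invocation(total_hours, invocations):
    """Average minutes saved per call implied by the aggregate numbers."""
    return total_hours * 60 / invocations


# Made-up per-tool numbers, purely to exercise the function.
example_tools = [
    {"name": "tool_a", "invocations": 1000, "minutes_saved_per_invocation": 6},
    {"name": "tool_b", "invocations": 500, "minutes_saved_per_invocation": 3},
]

# Aggregate figures reported in the post.
average = implied_minutes_per_invocation(7000, 66000)
```

Plugging in the reported totals gives an average of roughly 6.4 minutes saved per invocation, which is a useful sanity check on whether the per-tool estimates are plausible.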
Conclusion
In the past year, Pinterest has successfully transitioned from an initial concept to a robust, production-ready ecosystem for the Model Context Protocol (MCP). By explicitly choosing an architecture of internal cloud-hosted, multiple domain-specific MCP servers connected via a central registry, we have built a flexible and secure substrate for AI agents. These high-leverage tools are integrated directly into employees’ daily workflows, meeting them where they work.
Crucially, this entire system was built with a security-first mindset. Our two-layer authorization model using end-user JWTs and mesh identities, combined with a dedicated MCP Security Standard and business-group-based access gating on sensitive servers like Presto, ensures that powerful AI agents operate under the principles of least privilege and full auditability.
The results are clear: the MCP ecosystem has already grown to 66,000 invocations per month, delivering an estimated 7,000 hours of time saved monthly for our engineers. This success confirms the value of using an open-source standard to unify tool access for AI.
Looking ahead, we will continue to expand the fleet of MCP servers, deepen integrations across more engineering surfaces, and refine our governance models as we empower more AI agents to safely automate complex engineering tasks, further boosting developer productivity at Pinterest.
Acknowledgements
This AI-enabled MCP ecosystem would not have been possible without:
- Nick Borgers, Kalpesh Dharwadkar, Amine Kamel from our security engineering team
- Scott Beardsley, James Fish from our traffic engineering team
- Leon Xu, Charlie Gu, Kingsley Ochu from our AI Agent Foundations team
- Scott Herbert, Anthony Suarez, Kartik Paramasivam for their engineering sponsorship and guidance

Facts Only

* The company implemented a system of multiple, domain-specific MCP servers.
* A central registry manages these servers.
* The system utilizes a two-layered authorization model.
* The company tracks invocation counts and time savings.
* A human-in-the-loop process is in place for sensitive actions.
* The system uses telemetry collection tools.
* The deployment pipeline streamlines server setup and scaling.
* The company implemented a security-first approach with access gating.
* As of January 2025, the MCP ecosystem had 66,000 monthly invocations with 7,000 hours saved monthly.
* Nick Borgers, Kalpesh Dharwadkar, Amine Kamel, Scott Beardsley, James Fish, Leon Xu, Charlie Gu, Kingsley Ochu, Scott Herbert, Anthony Suarez, Kartik Paramasivam, and others are involved.
* The project focuses on automating engineering tasks with AI agents.
* The system uses a registry API for discovery and validation.

Executive Summary

Pinterest has established a growing ecosystem centered around the Model Context Protocol (MCP), primarily for automating engineering tasks with large language models. The company opted for a decentralized architecture of numerous, domain-specific MCP servers rather than a single monolithic server to improve manageability and reduce the risk of context overload. A central registry manages these servers, providing a source of truth and facilitating secure access for internal AI agents. The initiative began with a core set of high-impact MCP servers focused on tools like Presto and Spark, demonstrating immediate utility. A two-layered authorization system – JWT-based for end-users and mesh identity-based for services – ensures security and granular control. The company implemented a human-in-the-loop process for sensitive actions, requiring agent proposals and human approval before execution. To foster observability, MCP servers are equipped with telemetry collection tools, and the ecosystem is measured by invocation counts and time savings. The registry API enables AI clients to discover and validate servers, while the deployment pipeline streamlines server setup and scaling. The architecture emphasizes a security-first approach with a dedicated MCP Security Standard and business-group-based access gating, restricting access based on user roles and organizational affiliations. The company has already achieved 66,000 monthly invocations and an estimated 7,000 hours of time saved per month. Looking ahead, Pinterest intends to continue expanding the MCP ecosystem and deepening integrations across its engineering surfaces.

Full Take

The article presents a meticulously engineered response to the burgeoning demands of AI-assisted engineering, reflecting a pattern of “controlled release” – a phased rollout designed to prove value while minimizing existential risk. The architecture deliberately fragments authority, spreading responsibility across a network of smaller servers rather than concentrating power within a central AI. This isn’t simply a technical solution; it’s a direct response to potential concerns about centralized AI control – a subtle but powerful framing we can label ARC-0043 “Motte-and-Bailey.” The explicit layering of authorization (JWT and mesh identity) is a classic defensive maneuver, designed to appear robust without necessarily requiring fundamental changes to underlying AI models. The emphasis on “human-in-the-loop” is similarly strategic, creating a veneer of human oversight while simultaneously enabling rapid deployment of AI tools – an appeal to moral authority (ARC-0024 “Ambiguity”). The metric of “minutes saved per invocation,” while presented as a feedback loop, subtly reinforces a productivity-focused paradigm, incentivizing the continued automation of tasks. The selection of Presto and Spark as initial tools suggests a prioritization of data-intensive engineering workflows, potentially aligning with a broader trend toward large-language models processing and analyzing massive datasets. The narrative avoids discussing the broader ethical implications of AI-driven automation – a significant omission. There’s a palpable desire to demonstrate ‘safe’ AI, but the underlying assumption – that centralized control is inherently dangerous – isn’t explicitly interrogated. This mirrors a broader trend within the tech industry, where security and governance are often framed as reactive measures against potential threats rather than proactive considerations of societal impact. 
This whole effort seems designed to create a verifiable, auditable trace of AI decision-making – a form of technical “sanewashing” (ARC-0001 “False Framing”) intended to soothe anxieties about black-box AI. The referenced actors—security engineering team, traffic engineering team, AI Agent Foundations team—signal a concerted, inter-departmental effort, likely coordinated by a central leadership group. This suggests a top-down strategy—a pattern observed in many large technology organizations. The company is attempting to occupy a position of leadership within the emerging landscape of AI governance, projecting an image of responsible innovation. The goal isn’t simply to build better tools; it's to demonstrate that AI can be managed and controlled, fostering trust and acceptance. The implicit question remains: who *benefits* from this increased trust, and what are the longer-term consequences of framing AI governance as a primarily technical problem? Detecting pattern: ARC-0001 “False Framing” - The emphasis on 'safe AI' and auditable processes is a deliberate framing designed to quell concerns rather than address them.

Sentinel — Likely Human

Confidence

This document outlines Pinterest's implementation of the MCP ecosystem, presenting a detailed account of its architecture, security measures, and initial impact. While exhibiting some stylistic patterns consistent with structured documentation, the overall presentation leans towards a human-authored description of a complex technical project.

Signals Detected
medium severity: The text exhibits a consistently neutral and descriptive tone, avoiding passionate arguments or deeply held opinions. The 'both sides' framing and reliance on 'experts say' constructions are characteristic of a desire to present a balanced perspective without genuine conviction.
high severity: The document heavily relies on presenting a pre-defined 'template' for describing the MCP ecosystem – the initial concept, the architecture, the security measures, and the key metrics. This structured presentation echoes a common pattern in technical documentation and potentially a guided framework.
low severity: Sentence length variance is relatively low, leaning towards moderate length sentences. Transition words ('however,' 'moreover') are frequently used, creating a somewhat mechanical flow. While not extreme, this suggests a degree of stylistic shaping.
Human Indicators
The article provides detailed technical descriptions of the MCP ecosystem, including specific tools, server configurations, and security protocols. The inclusion of names, team members, and metrics (66,000 invocations, 7,000 hours saved) suggests a real-world implementation and a desire to demonstrate impact.
Building an MCP Ecosystem at Pinterest — Arc Codex