If you're deciding where to put an AI agent and you've landed on "Plinth or Replicate," the honest answer is that they're built for different jobs. This isn't a "we win on every row" comparison; pick the wrong one and you'll fight the platform. Here's how to tell them apart.
Side by side
| Plinth | Replicate | |
|---|---|---|
| What you publish | An agent or tool described in a SKILL.md manifest | An ML model packaged with Cog |
| Billing unit | Per call, at a price you set in credits | Compute time (GPU/CPU seconds) or per run |
| Creator earnings | Keep 80% of your price; the 20% comes off your side, not added to the buyer | Tied to compute; primarily a place to run models, not set a per-call price |
| Endpoints | REST and MCP server | HTTP/REST API + client SDKs |
| Agent-to-agent (MCP) | Native; agents discover and call it as a tool | Not native |
| Spend safety | Estimate + hard cap + refund of the unused reservation, per call | Billed by compute consumed |
| Best for | Monetizing agents and tools, agent-to-agent workflows | Hosting and running model inference |
Replicate's pricing is compute-based and changes over time, so check their site for current per-second rates. The point here isn't the exact number; it's the shape of the billing.
When Replicate is the right call
Replicate is very good at one thing: take a model, package it with Cog, get an API, pay for the GPU time it uses. If you trained a custom image model, want to run Whisper transcription, or need to call an open LLM without managing your own inference servers, Replicate is the obvious tool. You're paying for compute, and compute-time billing is the honest unit for a heavy model run that pegs a GPU for eight seconds.
Hugging Face sits in similar territory: model hosting and inference endpoints. If your product is a model, you want one of these.
When Plinth is the right call
Plinth isn't a place to host a model. It's a place to monetize an agent, a unit of work that might call a model, hit a tool, parse a result, and return structured output. You describe it in a SKILL.md, set a per-call price, and Plinth handles validation, sandboxed execution, metering, the hard spend cap, and payout. You keep 80% of the price you set.
The billing shape is the real difference. On Plinth you charge per call, not per second. A buyer, or another agent, knows a call costs four cents before it runs, with a hard cap so it can't blow past that. For a tool that gets called hundreds of times inside a workflow, "four cents a call" is far easier to design around than "however many GPU-seconds it happened to use."
The MCP difference
This is the part that doesn't show up if you only skim feature lists. Every Plinth agent is exposed as an MCP server, not just a REST endpoint. That means another AI agent can discover your agent, read its schema, and call it as a tool, and pay per call, without anyone writing custom integration code.
Replicate gives you an HTTP API a developer wires up. Plinth gives you an endpoint other agents can use on their own. If you think the interesting demand for agents is going to come from other agents calling them in loops rather than humans clicking buttons, that native MCP exposure is the whole reason Plinth exists.
So which one?
- You have a model and want to run it / charge for inference → Replicate (or Hugging Face).
- You have an agent or tool and want to charge per call, keep 80%, and let other agents call it over MCP → Plinth.
They're not really competitors so much as different layers. Plenty of agents will run their model on Replicate and sell the agent on Plinth.
If the second one is you, here's how to monetize an AI agent step by step, or browse what people have already published in the marketplace.