Motus is an open-source agent serving project that makes agents more capable, cheaper, and faster. Building agents has never been easier: Motus takes a no-framework approach and provides the infrastructure needed for efficient agent serving. Deploy across self-managed and cloud environments at any scale.
The fastest way to get started is to let your coding agent handle building, serving, and deploying with Motus.
Motus works out of the box with any coding agent (e.g., Claude Code, Codex, or Cursor). Install the plugin and CLI with one command:
```sh
curl -fsSL https://www.lithosai.com/motus/install.sh | sh
```
Then use it directly in your workflow:
```
/motus            # activate Motus skills
build your agent  # start building your agent
/motus serve      # serve locally
/motus deploy     # deploy to the cloud
```
See plugins/motus/README.md for marketplace installs and more details.
Install Motus to serve agents locally and deploy them to Motus Cloud.
Using uv:
```sh
uv add lithosai-motus
```
Or with pip:
```sh
pip install lithosai-motus
```
```sh
# Serve locally
motus serve start myapp:agent --port 8000

# Chat with your local agent
motus serve chat http://localhost:8000 "Hello!"

# Deploy to Motus Cloud
motus deploy --name myapp myapp:agent

# Chat with your deployed agent
motus serve chat https://myapp.lithosai.com "Hello!"
```
Motus is powered by a serving runtime that automatically converts Python code into parallel, resilient workflows. Everything is designed to be simple, intuitive, and customizable.
```python
from motus.agent import ReActAgent
from motus.models import OpenAIChatClient
from motus.runtime import resolve
from motus.tools import tool

@tool  # define a simple tool
async def search(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

# define a ReAct agent
agent = ReActAgent(client=OpenAIChatClient(), model_name="gpt-4o", tools=[search])
print(resolve(agent("Hello World!")))
```
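Conceptually, a ReAct agent alternates between asking the model what to do and dispatching the chosen tool until the model produces a final answer. The sketch below is a toy illustration of that loop in plain Python, not the Motus implementation; the model is stubbed out so the example runs without any API key.

```python
import asyncio

async def search(query: str) -> str:
    """Stand-in tool: pretend to search the web."""
    return f"Results for: {query}"

async def fake_model(messages):
    """Stubbed LLM: first asks to call `search`, then answers with the tool result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "search", "args": {"query": messages[-1]["content"]}}
    return {"answer": messages[-1]["content"]}

async def react_loop(user_input: str, tools: dict) -> str:
    """Minimal ReAct-style loop: model decides, tool runs, result feeds back."""
    messages = [{"role": "user", "content": user_input}]
    for _ in range(5):  # cap the number of reasoning steps
        reply = await fake_model(messages)
        if "answer" in reply:
            return reply["answer"]
        result = await tools[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("loop limit reached")

print(asyncio.run(react_loop("Hello World!", {"search": search})))
# → Results for: Hello World!
```

The real agent adds memory, guardrails, and structured output on top of this skeleton, but the control flow is the same shape.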
Start simple, and explore the agents documentation for more advanced usage.
Example: fetch an article, summarize it, extract hashtags in parallel, then publish:
```python
from motus.runtime import resolve
from motus.runtime.agent_task import agent_task

@agent_task  # wrap functions as tasks in your workflow
async def summarize(article): ...  # just a normal function

@agent_task
async def extract(article): ...  # extract hashtags

@agent_task(retries=3, timeout=10.0)  # augment tasks with retries and timeouts
async def fetch(url): ...

@agent_task
async def publish(summary, hashtags): ...  # publish on LinkedIn

# Your logic becomes your code directly:
article = fetch("https://www.lithosai.com")
summary = summarize(article)   # Motus infers the dependency graph from data flow.
hashtags = extract(article)    # Both depend on `article`, run in parallel.
post = publish(summary, hashtags)  # Waits for both upstream tasks.
print(resolve(post))  # get final result
```
No explicit DAGs, just Python: the `@agent_task` decorator turns ordinary Python functions into asynchronous tasks, and the dependency graph falls out of how their results flow into each other.
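The same data-flow idea can be mimicked with plain asyncio: each task call returns a future immediately, and awaiting a future is what encodes a dependency edge. This is a toy re-implementation for intuition only (the `agent_task` below is not the Motus one, and the URL and task bodies are made up); in real use it lacks scheduling, retries, and resilience.

```python
import asyncio

def agent_task(fn):
    """Toy decorator: calling the task schedules it; the returned Task is a future."""
    def wrapper(*args):
        async def run():
            # Await any upstream futures first: this is the inferred dependency edge.
            resolved = [await a if isinstance(a, asyncio.Task) else a for a in args]
            return await fn(*resolved)
        return asyncio.ensure_future(run())
    return wrapper

@agent_task
async def fetch(url):
    await asyncio.sleep(0.1)
    return f"<article from {url}>"

@agent_task
async def summarize(article):
    return f"summary of {article}"

@agent_task
async def extract(article):
    return f"#hashtags for {article}"

@agent_task
async def publish(summary, hashtags):
    return f"POST: {summary} | {hashtags}"

async def main():
    article = fetch("https://example.com")  # returns a future immediately
    summary = summarize(article)            # depends on article
    hashtags = extract(article)             # also depends on article: runs in parallel
    return await publish(summary, hashtags) # waits for both upstream tasks

print(asyncio.run(main()))
```

Because `summarize` and `extract` both await the same `article` future, neither blocks the other; the event loop runs them concurrently once the fetch completes.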
Motus sits under your agents, providing scheduling, parallelism, caching, resilience, observability, and tracing. Learn more about the Motus runtime.
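Result caching of the kind a serving runtime provides can be pictured as memoization keyed on a task's name and arguments: identical calls reuse the stored result instead of re-executing. The sketch below is an illustration of that idea, not Motus internals; the call counter exists only to make the reuse observable.

```python
import asyncio
import functools

def cached_task(fn):
    """Toy cache: identical (name, args) calls reuse the stored result."""
    cache = {}
    calls = {"count": 0}

    @functools.wraps(fn)
    async def wrapper(*args):
        key = (fn.__name__, args)
        if key not in cache:
            calls["count"] += 1      # task body actually runs
            cache[key] = await fn(*args)
        return cache[key]

    wrapper.calls = calls
    return wrapper

@cached_task
async def fetch(url):
    return f"<body of {url}>"

async def main():
    a = await fetch("https://example.com")
    b = await fetch("https://example.com")  # served from cache, body not re-run
    return a, b, fetch.calls["count"]

print(asyncio.run(main()))  # the body executed exactly once
```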
Run the included examples:
```sh
# Basic ReAct agent — interactive console chat
uv run python examples/agent.py

# Task graph demo — parallelism, dependency tracking, multi-return
uv run python examples/runtime/task_graph_demo.py
```
Learn more from our comprehensive examples.
| Feature | Description |
| --- | --- |
| **Agents** | `ReActAgent` runs the reasoning loop, tool dispatch, and conversation state. Multi-turn memory, structured output via Pydantic, and input/output guardrails. All built in. A working agent in under 10 lines. |
| **Tools** | Write a function, get a tool. Expose class methods with `@tools`, wrap an MCP server with `get_mcp()`, nest another agent with `as_tool()`, or run untrusted code in a Docker sandbox. Everything composes through the same `tools=[...]` interface. Built-in utilities: skills, bash, file ops, glob / grep, todo tracking. |
| **Task-graph runtime** | `@agent_task` turns any function into a node in a dependency graph with automatic parallel execution, multi-return futures, and non-blocking operators. Retries, timeouts, and backoff are declarative on the task and overridable per call site with `.policy()`. |
| **Observability & debugging** | Every LLM call, tool invocation, and task dependency is traced automatically. Interactive HTML viewer, Jaeger export, or cloud dashboard. Enabled with one env var. |
| **Multi-provider models** | Unified client for OpenAI, Anthropic, Gemini, and OpenRouter. Switch providers by changing one line; agent logic stays the same. Local models (Ollama, vLLM, SGLang) work through `base_url`. |
| **Local serving** | `motus serve` exposes any agent as a session-based HTTP API locally. Test the full serving stack before deploying to the cloud. |
| **Memory** | Provided memory solutions: basic (append-only) and compact (auto-summarizes when the token budget runs thin). Session save/restore built in. |
| **Guardrails** | Input and output validation on both agents and individual tools. Declare the parameters you care about: return a dict to modify, raise to block. Structured output guardrails match fields on Pydantic models. |
| **Multi-agent composition** | `agent.as_tool()` wraps any agent as a tool. The supervisor doesn't know whether it's calling a function or another agent; the interface is identical. `fork()` creates independent conversation branches. |
| **MCP integration** | Connect any MCP-compatible server with `get_mcp()`. Local via stdio, remote via HTTP, or inside a Docker container. Filter and rename tools with prefix, blocklist, and guardrails. |
| **Docker sandboxes** | Run untrusted code in isolated containers. Mount volumes, expose ports, execute shell and Python, and attach to any agent as a tool provider. |
| **Prompt caching** | Prompt caching via `CachePolicy`: STATIC (system + tools) or AUTO (+ conversation prefix). Reduce latency and cost on long conversations. |
| **SDK compatibility** | Drop-in for the OpenAI Agents SDK, Anthropic SDK, and Google ADK. Change the import, keep your code. |
| **Human-in-the-loop** | Built-in support for interactive approval, clarification, and feedback during agent execution. Pause the agent, ask for human input, and resume. Works in both local serving and cloud deployment. |
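Declarative retries and timeouts of the kind described above can be approximated with a small decorator in plain Python. This is a sketch of the pattern only, not Motus's `@agent_task` or `.policy()` implementation; the `flaky` task and its failure count are invented for the demonstration.

```python
import asyncio

def with_policy(retries=0, timeout=None):
    """Toy retry/timeout decorator in the spirit of @agent_task(retries=..., timeout=...)."""
    def decorate(fn):
        async def wrapper(*args, **kwargs):
            last_exc = None
            for _attempt in range(retries + 1):
                try:
                    # asyncio.wait_for enforces the per-attempt timeout.
                    return await asyncio.wait_for(fn(*args, **kwargs), timeout)
                except Exception as exc:
                    last_exc = exc
            raise last_exc
        return wrapper
    return decorate

attempts = []

@with_policy(retries=2, timeout=1.0)
async def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(asyncio.run(flaky()))  # fails twice, then succeeds on the third attempt
```

Keeping the policy on the decorator (rather than inside each task body) is what makes it overridable per call site in a system like the one described above.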
See the Contributing Guide to get started, or come say hi on Slack. Let’s build together!
Apache 2.0 — see LICENSE.