AI infrastructure, tools, and open research.
Sparkco is an open-source research project on the post-AGI stack — the runtime containers agents live in, the harnessing (glue code) inside them, and the messaging between them. It's built by the team behind SimpleFunctions, where we're exploring how live prediction-market probabilities can serve as a real-time world state for AI agents. The site is our public log of that work: a live feed of AI and prediction-market signals, plus the setups and tools we recommend for agent builders.
We ship tools as CLIs first, not MCP — 0 tokens to expose, ~100% reliable, pipe-composable.
Parametric memory: replacing the context window with weights.
Today's chat models remember by re-reading the entire conversation on every turn. Compaction loses information, retrieval crowds the window, and a new session starts blank. We're testing whether the facts, preferences, and behavior in a dialogue can be encoded directly into model weights — leaving the context free for what's actually being said now.
Want to collaborate? patrick@simplefunctions.dev
The context window is a finite token sequence, fully recomputed on every turn. Every existing workaround — summarization memory, vector retrieval, KV caching — moves the cost without solving it: long context drifts, compaction discards information, retrieval crowds the same window it pulls from. If conversational state could live in weight deltas instead of tokens, the window would only need to hold the current turn.
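To make the cost of re-reading concrete, here is a minimal arithmetic sketch. It assumes every turn adds a fixed 200 tokens and ignores system prompts and prompt caching; the function name `tokens_processed` and the numbers are illustrative, not measurements.

```python
def tokens_processed(turn_lengths, reread_history=True):
    """Total prompt tokens fed to the model across a whole conversation."""
    total = 0
    history = 0
    for n in turn_lengths:
        # With re-reading, each prompt is the full history plus the new turn.
        # Without it, the prompt holds only the current turn.
        total += (history + n) if reread_history else n
        history += n
    return total


# Hypothetical conversation: 100 turns of 200 tokens each.
turns = [200] * 100
full = tokens_processed(turns)                                # grows quadratically
current_only = tokens_processed(turns, reread_history=False)  # grows linearly
print(full, current_only)
```

Under these assumptions the re-reading total grows with the square of the turn count, while a window that holds only the current turn grows linearly.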
- Test-time training. ByteDance In-Place TTT (ICLR 2026 oral) and Stanford/NVIDIA TTT-E2E update MLP projection weights online during inference, compressing long context into fast weights. All published work targets long-document throughput; nobody has tested whether the fast weights survive once the document is dropped from context.
- Hypernetwork → adapter. Sakana's Doc-to-LoRA (Feb 2026) and P2P (Oct 2025) train a hypernet that emits a LoRA from raw text or a user profile in under a second. Validates "text → weights" as a tractable mapping — but neither was designed for accumulating dialogue history.
- Dialogue-direct fine-tuning. PLUM (Nov 2024) fine-tunes a LoRA on dialogue Q/A pairs and matches RAG at 100 turns. MemLoRA trains memory management itself as a LoRA. IBM's Activated LoRA (Dec 2025) solves multi-LoRA hot-swap without KV recompute — making per-conversation memory modules feasible.
- Knowledge editing. ROME and MEMIT do surgical single-fact edits on weights, but catastrophic forgetting appears past ~1000 edits. Not a candidate at dialogue scale.
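Several of the approaches above store information in a LoRA, which is a low-rank delta added to a frozen weight matrix. The sketch below shows the standard LoRA update W' = W + (alpha / r) * B @ A in dependency-free Python. The tiny matrices and the helper names `matmul` and `apply_lora` are illustrative assumptions, not code from any of the cited projects.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def apply_lora(w, b, a, alpha, rank):
    """Return W + (alpha / rank) * B @ A, leaving the base weights W untouched."""
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Frozen base weights (2x2) and a rank-1 adapter: B is 2x1, A is 1x2.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.0]]
adapted = apply_lora(w, b, a, alpha=2.0, rank=1)
print(adapted)
```

Because the base matrix is never modified, an adapter can in principle be attached, removed, or swapped per conversation, which is the property the per-conversation memory idea relies on.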
These live in disjoint communities — efficient inference, recsys, personalization NLP, on-device, model editing — and have never been compared on the same benchmark. None has been evaluated end-to-end on a real user's multi-hundred-turn history across technical, strategic, philosophical, and personal domains, with the conversation removed from context. Existing benchmarks (RULER, needle-in-haystack, LaMP) are synthetic or shallow.
- TTT fast weights as memory. Ingest a fact-bearing dialogue with In-Place TTT, drop the context, probe. Iterations 1–2 ran on a single A100 with a self-trained checkpoint — full write-up here. Negative: trained fast weights produced perturbation noise, not retrievable encoding, even at small inference-time scales. Joint base+TTT training is the next attack surface.
- Doc-to-LoRA over real dialogues. Same probes, hypernet-generated LoRA instead of TTT. Compare raw-dialogue input against structured-profile input for information retention.
- Modular memory adapters. Decompose dialogue history into facts, preferences, and project context. Train one LoRA per axis; hot-swap with Activated LoRA. Measure single-load vs combined-load interference.
- Capacity and forgetting curves. Stream new facts turn-by-turn; locate the point at which turn N overwrites turn 1. Trace the capacity–fidelity tradeoff.
- A "conversation memory retention" benchmark — three difficulty tiers, six fact dimensions. None currently exists for this scenario.
- First head-to-head comparison of TTT fast weights, Doc-to-LoRA, PLUM-style dialogue-LoRA, and classical summarization memory on the same eval.
- An empirical answer to whether modular per-domain memory adapters can be composed without cross-interference.
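The capacity and forgetting measurement can be sketched as a small probe loop. This is a toy under stated assumptions: `FifoMemory` is a hypothetical stand-in with a hard capacity, where a real run would write facts into TTT fast weights or a LoRA and probe the model with questions. All class and function names here are illustrative.

```python
from collections import OrderedDict


class FifoMemory:
    """Toy memory that holds at most `capacity` facts and evicts the oldest."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def write(self, key, value):
        self.store[key] = value
        self.store.move_to_end(key)
        while len(self.store) > self.capacity:
            self.store.popitem(last=False)  # drop the oldest fact

    def probe(self, key):
        return self.store.get(key)


def first_overwrite_turn(memory, facts):
    """Stream facts turn by turn; return the turn at which turn 1's fact is lost."""
    first_key, first_value = facts[0]
    for turn, (key, value) in enumerate(facts, start=1):
        memory.write(key, value)
        if memory.probe(first_key) != first_value:
            return turn
    return None  # turn 1 survived the whole stream


def retention_curve(memory, facts):
    """Write every fact, then return 1 or 0 per turn: is it still retrievable?"""
    for key, value in facts:
        memory.write(key, value)
    return [int(memory.probe(key) == value) for key, value in facts]


facts = [(f"fact-{i}", f"value-{i}") for i in range(1, 11)]
overwrite_turn = first_overwrite_turn(FifoMemory(capacity=4), facts)
curve = retention_curve(FifoMemory(capacity=4), facts)
print(overwrite_turn, curve)
```

Sweeping `capacity`, or for a real adapter the rank or number of update steps, and plotting the curve would trace a capacity-fidelity tradeoff for whichever backend is plugged in.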
Three layers, and what's already out there.
Containers
Sandboxes, microVMs, durable runtimes — where the agent lives.
- e2b: Code-interpreter sandboxes; the default for general-purpose runs.
- Modal: gVisor + GPU-native; sub-1s starts, scales to 50k+ concurrent.
- Daytona: Open source; ~90–200ms cold start, fastest in class.
- Fly.io Sprites: Stateful microVMs with checkpoint/restore and persistent NVMe.
- Vercel Sandbox: Firecracker + idle-billed; the JS-stack default.
SimpleFunctions sits on top: autonomous daemons, scheduler, and risk gates for prediction-market agents.
Harnessing
Glue code inside the container. Context curation, tool routing, the runtime loop.
- Claude Agent SDK: Anthropic's harness; powers Claude Code itself.
- Inspect AI: Eval-grade harness used by METR, Apollo, and government AISIs.
- LangGraph: LangChain's runtime layer, with durable execution, threads, and HITL.
- Claude Code / Cursor / Aider: Opinionated harnesses-in-product; not sold separately.
SimpleFunctions ships /api/agent/world as ~800-token markdown context, plus a CLI with --json for deterministic harness mode.
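A harness can consume any CLI that prints JSON with a few lines of glue. The sketch below is generic: it runs a command, insists on a zero exit code, and parses stdout. The helper name `run_json_cli` is hypothetical, and the demo command is a Python one-liner standing in for a real CLI call, since the exact subcommands of the SimpleFunctions CLI are not shown here.

```python
import json
import subprocess
import sys


def run_json_cli(argv, timeout=30):
    """Run a CLI, require exit code 0, and parse its stdout as JSON."""
    result = subprocess.run(
        argv,
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # decode bytes to str
        timeout=timeout,      # fail fast if the tool hangs
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(result.stdout)


# Stand-in command so the sketch runs without installing any particular tool.
demo = [sys.executable, "-c", "import json; print(json.dumps({'ok': True, 'markets': 3}))"]
state = run_json_cli(demo)
print(state["markets"])
```

In a real harness the `demo` list would be replaced by the tool's own invocation with its JSON flag. Failures surface as exceptions (`CalledProcessError`, `TimeoutExpired`, `JSONDecodeError`) rather than as free text the model has to interpret.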
Messaging
Between containers. Discovery, identity, stateful tasks — not tool-calling.
- A2A: Google's Agent2Agent (Linux Foundation, 2025), the emerging consensus.
- ANP: Peer-to-peer agent network over HTTPS + DIDs for identity.
- Letta: Shared memory blocks + thread-based message passing.
- AutoGen GroupChat: In-process orchestration; supervisor / round-robin patterns.
SimpleFunctions Chatbus: agents DM and broadcast in real time — the messaging substrate for trading agents.
What we ship publicly.
Harness & agents
- harness: Dual pi-agent runtime. Two agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol.
- Memento: Context-integrity stress testing for Claude. Adversarial harness tampers with memory between sessions and watches whether the agent notices.
- claude-arena: AI vs AI vs AI. Autonomous Claude agents battle in a live CTF arena with trading.
- claude-trading: Autonomous Claude agents trade against each other on a live exchange, maker vs takers.
SimpleFunctions
Curated lists
- awesome-cli-agentic-tools: CLI tools for AI agents, covering prediction markets, agent frameworks, coding agents, browser agents, and developer CLIs.
- awesome-prediction-markets: APIs, datasets, and resources for developers and AI agents.
- prediction-markets-reading: 256 articles on Kalshi, Polymarket, market microstructure, calibration, and trading strategies.
Terminal tools
- kalshi-orderbook-viewer: Depth charts for prediction markets, in your terminal.
- kalshi-price-monitor: Alerts on significant Kalshi/Polymarket price changes.
- polymarket-sports-mm: Sports market maker; pre-game and live quoting tuned to the quadratic reward function.
- polymarket-ticker-resolver: Resolve any Polymarket ID format (numeric, conditionId, CLOB token, slug). Zero deps.
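As a rough illustration of what telling these ID formats apart involves, here is a heuristic classifier. It is not the resolver's actual logic. The patterns are assumptions: a conditionId is taken to be `0x` plus 64 hex characters, a CLOB token ID a very long decimal string, a numeric ID a short decimal string, and a slug lowercase words joined by hyphens. The 20-digit cutoff is arbitrary.

```python
import re

# Assumed shapes for each format; these patterns are heuristics, not a spec.
CONDITION_ID = re.compile(r"0x[0-9a-fA-F]{64}")
DECIMAL = re.compile(r"[0-9]+")
SLUG = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def classify_polymarket_id(raw):
    """Guess which ID format a string is in; returns a label, never raises."""
    s = raw.strip()
    if CONDITION_ID.fullmatch(s):
        return "conditionId"
    if DECIMAL.fullmatch(s):
        # Arbitrary cutoff: very long decimals are treated as CLOB token IDs.
        return "clob_token" if len(s) > 20 else "numeric"
    if SLUG.fullmatch(s):
        return "slug"
    return "unknown"


print(classify_polymarket_id("will-btc-hit-100k"))
```

A real resolver would then confirm the guess against the API, since a shape check alone cannot tell whether an ID exists.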
Signals & probability
- prediction-market-edge-detector: Detect mispricings across 30,000+ markets.
- prediction-market-regime: Real-time crisis / risk-off / risk-on / complacent classifier.
- prediction-market-uncertainty: Uncertainty index from 30,000+ markets; one number, 0–100.
- causal-tree-decomposition: Standalone causal-tree probability engine; thesis → weighted confidence. Zero deps.
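One simple way to collapse many market probabilities into a single 0–100 number is mean binary entropy. The sketch below illustrates that idea only. It is an assumption, not the method used by prediction-market-uncertainty: it weights every market equally and treats each as an independent yes/no contract.

```python
import math


def binary_entropy(p):
    """Shannon entropy in bits of a yes/no event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a settled market carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def uncertainty_index(probabilities):
    """Mean binary entropy across markets, scaled to the range 0 to 100."""
    if not probabilities:
        raise ValueError("need at least one market probability")
    mean_entropy = sum(binary_entropy(p) for p in probabilities) / len(probabilities)
    return 100.0 * mean_entropy


# Hypothetical inputs: one coin-flip market and one settled market.
print(uncertainty_index([0.5, 1.0]))
```

Markets priced near 50% push the index toward 100, and markets priced near 0% or 100% pull it toward 0. A production index would likely also weight by volume or liquidity.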
World-state plumbing
SDK adapters
- crewai-prediction-markets: CrewAI tools.
- langchain-prediction-markets: LangChain tools.
- openai-agents-prediction-markets: OpenAI Agents SDK tools.
- vercel-ai-prediction-markets: Vercel AI SDK tools.
- create-prediction-market-agent: Scaffold a project. Works with LangChain, CrewAI, OpenAI Agents SDK, or vanilla TypeScript.
- prediction-market-mcp-example: Minimal MCP server example.
Live feed
Mixed stream from prediction markets, theses, new listings, and the blog.
Solana Up or Down - April 30, 6:45AM-7:00AM ET
Dogecoin Up or Down - April 30, 6:45AM-7:00AM ET
Dogecoin Up or Down - April 30, 6:45AM-6:50AM ET
Hyperliquid Up or Down - April 30, 6:45AM-7:00AM ET
BNB Up or Down - April 30, 6:45AM-7:00AM ET
BNB Up or Down - April 30, 6:45AM-6:50AM ET
DOGE cut federal workforce aggressively. Now those roles are needed for wartime — State Department,
The most important development is the sharp spike in Strait of Hormuz vessel traffic measures (50→96 and 50→77), indicating escalating maritime tensions that increase the likelihood of a protracted Iran conflict bleeding into the midterm cy
Stagflation traps the Fed in an impossible triangle. Powell stays until Warsh confirmation. Trump in
Thesis confidence remains steady at 5% as the core conflict (Trump vs. Powell) intensifies via a 14.5% jump in 'firing' probability, while the 'Warsh confirmation' leg remains the primary bottleneck to the trade's realization.
The United States will launch a ground invasion of Iran. After 5 weeks of airstrikes, the US faces t
Thesis confidence drops as multiple mediation channels (Oman, Pakistan) report breakthroughs, directly contradicting the 'no diplomatic off-ramp' core assumption. Market prices for oil and shipping transit have aggressively corrected, sugge
Putin profits from Iran war oil prices. Russian military budget fully funded. Ukraine peace talks st
The thesis confidence faced a minor downward revision as oil futures markets showed a trend toward stabilizing or retreating from high-end upside bets, contradicting the expectation of an extreme price spike supporting Russia's war budget.
Oil above $100 drives electricity costs up. Data center operating costs surge. AI companies delay or
Recent market signals show a strong retreat in energy price expectations, specifically regarding WTI oil and natural gas benchmarks, which weakens the thesis that electricity costs will surge to the point of impacting data center expansion.
Hormuz blockade disrupts fertilizer supply chains. Fertilizer prices spike, US farm costs surge, foo
Data confirms that while shipping transit disruptions are locked as a stable negative factor, the secondary transmission to US rural voter behavior and electoral outcomes is weaker than initially modeled, leading to a confidence downgrade.
The Hormuz Strait is America's final battle — not because it will lose militarily, but because the c
The thesis confidence has decreased slightly as evidence for a catastrophic, single-event fiscal/economic shock weakens, with market trends favoring more temperate diplomatic and energy pricing scenarios.
Automated Prediction Market Trading: CLI Agents on Kalshi
A practical guide for developers and traders on using CLI-based agents to automate order placement on Kalshi prediction markets. Covers thesis-driven trading logic, real tickers, and the agentic runtime behind production-grade automation.
Prediction Market Terminal Dashboard: Bloomberg-Style Monitoring for Kalshi Traders
A practical guide to building a professional-grade terminal dashboard for monitoring Kalshi prediction markets in real time. Covers CLI tooling, agentic scanning, position tracking, and thesis-driven trade execution.
Prediction Market Edge Detection: How to Find Mispriced Contracts on Kalshi
A systematic approach to finding mispriced prediction market contracts using causal models, orderbook analysis, and executable edge calculations.
Thesis-Driven Prediction Market Trading: Why Causal Models Beat Signal Chasing
Signal-based bots react to noise. Thesis-driven agents understand why prices should move. Here's how causal models change prediction market trading.
AI Agents for Prediction Markets: How SimpleFunctions Connects Claude to Kalshi
How to connect your AI agent to prediction market data using the SimpleFunctions MCP server: get context, inject signals, and trade on Kalshi.
How to Build a Prediction Market Trading Bot with SimpleFunctions CLI
Build a prediction market bot that scans for edges, monitors thesis confidence, and executes trades on Kalshi — all from the terminal.
Best Prediction Market CLI Tools in 2026: Scan Kalshi and Polymarket from Terminal
A practical comparison of CLI tools for prediction market trading in 2026, covering SimpleFunctions, raw Kalshi API, and Polymarket integrations.
What we'd install on a fresh machine
Three of ours, five from the community we trust.
- npm i -g @spfunctions/cli
- @spfunctions/harness (Sparkco): Dual-agent runtime. Two pi-agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol. $1/day to run. Install: npm i -g @spfunctions/harness
Browse 69+ CLI tools
Taste-curated. Filter by category, sorted by Sparkco-first then stars.
- npm i -g @spfunctions/cli
- git clone https://github.com/spfunctions/polymarket-sports-mm
- @spfunctions/prediction-market-mcp (Sparkco): MCP server with 4 tools. Works with Claude, Cursor, VS Code. Install: npx @spfunctions/prediction-market-mcp
- pip install simplefunctions-ai
- git clone https://github.com/spfunctions/prediction-market-mcp-example
- git clone https://github.com/spfunctions/kalshi-price-monitor
- git clone https://github.com/spfunctions/prediction-market-context
- git clone https://github.com/spfunctions/causal-tree-decomposition
- create-prediction-market-agent (Sparkco): Scaffold agent projects: LangChain, CrewAI, OpenAI, vanilla TS. Install: npx create-prediction-market-agent
- uses: spfunctions/world-state-action@v1
- npm i langchain-prediction-markets
- npm i openai-agents-prediction-markets
- npm i vercel-ai-prediction-markets
- pip install crewai-prediction-markets
- npm i agent-world-awareness
- git clone https://github.com/spfunctions/prediction-market-edge-detector
- npm i -g @spfunctions/harness
- @spfunctions/bi (Sparkco): Agent-friendly BI CLI. Query CSV/JSON/Parquet with SQL via DuckDB. 4 commands: head, schema, query, convert. Install: npm i -g @spfunctions/bi
- code --install-extension saoudrizwan.claude-dev
- pip install openai-agents
- go install github.com/xo/usql@latest
- brew install stripe/stripe-cli/stripe
- go install github.com/cube2222/octosql/cmd/octosql@latest
- npx @anthropic/playwright-mcp
- git clone https://github.com/nweii/prediction-market-analysis
- pip install sqlite-utils
- brew install supabase/tap/supabase
- git clone https://github.com/Polymarket/agents
- git clone https://github.com/elizaOS/kalshi-ai-trading-bot
- git clone https://github.com/berlinbra/polymarket-mcp-server
- git clone https://github.com/polybot-nexus/polybot
- git clone https://github.com/PredictOS/predictos
- pip install dr-manhattan
- git clone https://github.com/CloddsBot/cloddsbot
- git clone https://github.com/polymarket-pipeline/pipeline
- git clone https://github.com/gnosis/prediction-market-agent
- git clone https://github.com/kalshi-trading/bot-cli
- pip install kalshi-python
- pip install prediction-market-agent-tooling
Latest from the blog
Insights on AI agents, prediction markets, and developer tools.
Automated Prediction Market Trading: CLI Agents on Kalshi
A practical guide for developers and traders on using CLI-based agents to automate order placement on Kalshi prediction markets. Covers thesis-driven trading logic, real tickers, and the agentic runtime behind production-grade automation.
Prediction Market Terminal Dashboard: Bloomberg-Style Monitoring for Kalshi Traders
A practical guide to building a professional-grade terminal dashboard for monitoring Kalshi prediction markets in real time. Covers CLI tooling, agentic scanning, position tracking, and thesis-driven trade execution.
Prediction Market Edge Detection: How to Find Mispriced Contracts on Kalshi
A systematic approach to finding mispriced prediction market contracts using causal models, orderbook analysis, and executable edge calculations.
Thesis-Driven Prediction Market Trading: Why Causal Models Beat Signal Chasing
Signal-based bots react to noise. Thesis-driven agents understand why prices should move. Here's how causal models change prediction market trading.
AI Agents for Prediction Markets: How SimpleFunctions Connects Claude to Kalshi
How to connect your AI agent to prediction market data using the SimpleFunctions MCP server: get context, inject signals, and trade on Kalshi.
How to Build a Prediction Market Trading Bot with SimpleFunctions CLI
Build a prediction market bot that scans for edges, monitors thesis confidence, and executes trades on Kalshi — all from the terminal.