AI infrastructure, tools, and open research.
Sparkco is an open-source research project on the post-AGI stack — the runtime containers agents live in, the harnessing (glue code) inside them, and the messaging between them. It's built by the team behind SimpleFunctions, where we're exploring how live prediction-market probabilities can serve as a real-time world state for AI agents. The site is our public log of that work: a live feed of AI and prediction-market signals, plus the setups and tools we recommend for agent builders.
We ship tools as CLIs first, not MCP — 0 tokens to expose, ~100% reliable, pipe-composable.
Parametric memory: replacing the context window with weights.
Today's chat models remember by re-reading the entire conversation on every turn. Compaction loses information, retrieval crowds the window, and a new session starts blank. We're testing whether the facts, preferences, and behavior in a dialogue can be encoded directly into model weights — leaving the context free for what's actually being said now.
Want to collaborate? patrick@simplefunctions.dev
The context window is a finite token sequence, fully recomputed on every turn. Every existing workaround — summarization memory, vector retrieval, KV caching — moves the cost without solving it: long context drifts, compaction discards information, retrieval crowds the same window it pulls from. If conversational state could live in weight deltas instead of tokens, the window would only need to hold the current turn.
- Test-time training. ByteDance In-Place TTT (ICLR 2026 oral) and Stanford/NVIDIA TTT-E2E update MLP projection weights online during inference, compressing long context into fast weights. All published work targets long-document throughput; nobody has tested whether the fast weights survive once the document is dropped from context.
- Hypernetwork → adapter. Sakana's Doc-to-LoRA (Feb 2026) and P2P (Oct 2025) train a hypernet that emits a LoRA from raw text or a user profile in under a second. Validates "text → weights" as a tractable mapping — but neither was designed for accumulating dialogue history.
- Dialogue-direct fine-tuning. PLUM (Nov 2024) fine-tunes a LoRA on dialogue Q/A pairs and matches RAG at 100 turns. MemLoRA trains memory management itself as a LoRA. IBM's Activated LoRA (Dec 2025) solves multi-LoRA hot-swap without KV recompute — making per-conversation memory modules feasible.
- Knowledge editing. ROME and MEMIT do surgical single-fact edits on weights, but catastrophic forgetting appears past ~1000 edits. Not a candidate at dialogue scale.
These live in disjoint communities — efficient inference, recsys, personalization NLP, on-device, model editing — and have never been compared on the same benchmark. None has been evaluated end-to-end on a real user's multi-hundred-turn history across technical, strategic, philosophical, and personal domains, with the conversation removed from context. Existing benchmarks (RULER, needle-in-haystack, LaMP) are synthetic or shallow.
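The adapter-based lines of work above share one mechanic: memory is stored as a low-rank update added to a frozen weight matrix. A minimal sketch of that mechanic in plain Python, assuming the standard LoRA form W' = W + B·A. The matrices, sizes, and helper names are illustrative and not taken from any of the cited systems.

```python
# Toy illustration of a LoRA-style weight delta: W' = W + B @ A.
# All values and helper names are illustrative assumptions.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def add(x, y):
    """Element-wise sum of two equally shaped matrices."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]

def apply_adapters(base, adapters):
    """Return base plus the sum of each adapter's low-rank delta B @ A."""
    merged = [row[:] for row in base]
    for b, a in adapters:
        merged = add(merged, matmul(b, a))
    return merged

# Frozen 3x3 base weight (identity for readability).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Two rank-1 adapters: B is 3x1, A is 1x3.
facts = ([[1.0], [0.0], [0.0]], [[0.0, 0.5, 0.0]])         # writes into row 0
preferences = ([[0.0], [0.0], [1.0]], [[0.25, 0.0, 0.0]])  # writes into row 2

single = apply_adapters(W, [facts])
combined = apply_adapters(W, [facts, preferences])

# Storage: a rank-r adapter holds r * (d_out + d_in) numbers instead of d_out * d_in.
rank, d_out, d_in = 1, 3, 3
adapter_params = rank * (d_out + d_in)

# Loading the second adapter leaves the first adapter's row untouched here,
# only because these two toy deltas touch disjoint rows. Real adapters overlap.
row0_unchanged = single[0] == combined[0]
```

In this toy case the two deltas are disjoint by construction, so composition is free of interference. Whether trained per-axis adapters behave anything like this is exactly the open question.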
- TTT fast weights as memory. Ingest a fact-bearing dialogue with In-Place TTT, drop the context, probe. Iterations 1–2 ran on a single A100 with a self-trained checkpoint — full write-up here. Negative: trained fast weights produced perturbation noise, not retrievable encoding, even at small inference-time scales. Joint base+TTT training is the next attack surface.
- Doc-to-LoRA over real dialogues. Same probes, hypernet-generated LoRA instead of TTT. Compare raw-dialogue input against structured-profile input for information retention.
- Modular memory adapters. Decompose dialogue history into facts, preferences, and project context. Train one LoRA per axis; hot-swap with Activated LoRA. Measure single-load vs combined-load interference.
- Capacity and forgetting curves. Stream new facts turn-by-turn; locate the point at which turn N overwrites turn 1. Trace the capacity–fidelity tradeoff.
- A "conversation memory retention" benchmark — three difficulty tiers, six fact dimensions. None currently exists for this scenario.
- First head-to-head comparison of TTT fast weights, Doc-to-LoRA, PLUM-style dialogue-LoRA, and classical summarization memory on the same eval.
- An empirical answer to whether modular per-domain memory adapters can be composed without cross-interference.
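The capacity-and-forgetting experiment can be framed as a small harness: write one fact per turn, re-probe the first fact after every write, and record the turn at which recall fails. The sketch below uses a deliberately crude stand-in memory (a fixed-size buffer that evicts the oldest entry). `ToyMemory`, `forgetting_curve`, and the capacity value are hypothetical, and a weight-based memory would likely degrade gradually rather than at a hard cliff.

```python
from collections import OrderedDict

class ToyMemory:
    """Stand-in memory with a hard capacity; evicts the oldest fact when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def write(self, key, value):
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # drop the oldest fact

    def probe(self, key):
        return self.store.get(key)

def forgetting_curve(memory, facts):
    """Write facts in order; after each write, check whether fact 1 is still recalled.

    Returns (curve, first_failure_turn). curve[i] is True if the first fact was
    recalled after turn i + 1. first_failure_turn is 1-indexed, or None if the
    first fact never dropped out.
    """
    first_key, first_value = facts[0]
    curve = []
    first_failure = None
    for turn, (key, value) in enumerate(facts, start=1):
        memory.write(key, value)
        recalled = memory.probe(first_key) == first_value
        curve.append(recalled)
        if not recalled and first_failure is None:
            first_failure = turn
    return curve, first_failure

facts = [(f"fact-{i}", f"value-{i}") for i in range(1, 7)]
curve, overwritten_at = forgetting_curve(ToyMemory(capacity=4), facts)
print(curve, overwritten_at)
```

Any memory mechanism that exposes a write and a probe step can be swapped in for `ToyMemory`, which is what makes the same curve comparable across methods.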
Three layers, and what's already out there.
Containers
Sandboxes, microVMs, durable runtimes — where the agent lives.
- e2b: Code-interpreter sandboxes; the default for general-purpose runs.
- Modal: gVisor + GPU-native; sub-1s starts, scales to 50k+ concurrent.
- Daytona: Open source; ~90–200ms cold start, fastest in class.
- Fly.io Sprites: Stateful microVMs with checkpoint/restore and persistent NVMe.
- Vercel Sandbox: Firecracker + idle-billed; the JS-stack default.
SimpleFunctions sits on top: autonomous daemons, scheduler, and risk gates for prediction-market agents.
Harnessing
Glue code inside the container. Context curation, tool routing, the runtime loop.
- Claude Agent SDK: Anthropic's harness; powers Claude Code itself.
- Inspect AI: Eval-grade harness used by METR, Apollo, and government AISIs.
- LangGraph: LangChain's runtime layer, with durable execution, threads, HITL.
- Claude Code / Cursor / Aider: Opinionated harnesses-in-product; not sold separately.
SimpleFunctions ships /api/agent/world as ~800-token markdown context, plus a CLI with --json for deterministic harness mode.
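One way a harness can consume a CLI in deterministic mode is to run it as a subprocess and parse its stdout as JSON. The sketch below shows that pattern only. The SimpleFunctions CLI's command names and output schema are not given here, so a stand-in command that prints JSON is used, and `run_json_tool` and the `markets` field are hypothetical.

```python
import json
import subprocess
import sys

def run_json_tool(argv, timeout=30):
    """Run a CLI command and return its stdout parsed as JSON.

    Raises subprocess.CalledProcessError on a non-zero exit code and
    json.JSONDecodeError if stdout is not valid JSON.
    """
    result = subprocess.run(
        argv,
        capture_output=True,
        text=True,
        check=True,
        timeout=timeout,
    )
    return json.loads(result.stdout)

# Stand-in for a real tool invoked with --json: a Python one-liner that
# prints a JSON object. Replace with the actual CLI and its arguments.
stand_in = [
    sys.executable,
    "-c",
    'import json; print(json.dumps({"markets": 2, "ok": True}))',
]

state = run_json_tool(stand_in)
print(state["markets"])
```

Failing loudly on a non-zero exit or malformed output keeps the harness loop deterministic: the agent either gets structured state or a clear error, never half-parsed text.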
Messaging
Between containers. Discovery, identity, stateful tasks — not tool-calling.
- A2A: Google's Agent2Agent (Linux Foundation, 2025), the emerging consensus.
- ANP: Peer-to-peer agent network over HTTPS + DIDs for identity.
- Letta: Shared memory blocks + thread-based message passing.
- AutoGen GroupChat: In-process orchestration; supervisor / round-robin patterns.
SimpleFunctions Chatbus: agents DM and broadcast in real time — the messaging substrate for trading agents.
What we ship publicly.
Harness & agents
- harness: Dual pi-agent runtime. Two agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol.
- Memento: Context-integrity stress testing for Claude. Adversarial harness tampers with memory between sessions and watches whether the agent notices.
- claude-arena: AI vs AI vs AI. Autonomous Claude agents battle in a live CTF arena with trading.
- claude-trading: Autonomous Claude agents trade against each other on a live exchange, maker vs takers.
SimpleFunctions
Curated lists
- awesome-cli-agentic-tools: CLI tools for AI agents, covering prediction markets, agent frameworks, coding agents, browser agents, developer CLIs.
- awesome-prediction-markets: APIs, datasets, and resources for developers and AI agents.
- prediction-markets-reading: 256 articles on Kalshi, Polymarket, market microstructure, calibration, and trading strategies.
Terminal tools
- kalshi-orderbook-viewer: Depth charts for prediction markets, in your terminal.
- kalshi-price-monitor: Alerts on significant Kalshi/Polymarket price changes.
- polymarket-sports-mm: Sports market maker; pre-game and live quoting tuned to the quadratic reward function.
- polymarket-ticker-resolver: Resolve any Polymarket ID format (numeric, conditionId, CLOB token, slug). Zero deps.
Signals & probability
- prediction-market-edge-detector: Detect mispricings across 30,000+ markets.
- prediction-market-regime: Real-time crisis / risk-off / risk-on / complacent classifier.
- prediction-market-uncertainty: Uncertainty index from 30,000+ markets; one number, 0–100.
- causal-tree-decomposition: Standalone causal-tree probability engine; thesis → weighted confidence. Zero deps.
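To make "thesis → weighted confidence" concrete, here is a toy version of the idea. How the causal-tree-decomposition engine actually combines nodes is not described here; the sketch assumes the simplest possible rule, a weight-normalized average of child confidences applied recursively. `Node`, `confidence`, and all weights and probabilities are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    """One claim in a thesis tree. Leaves carry a probability; branches aggregate."""
    label: str
    weight: float = 1.0
    probability: Optional[float] = None  # set on leaves only
    children: List["Node"] = field(default_factory=list)

def confidence(node: Node) -> float:
    """Weight-normalized average of child confidences, computed recursively."""
    if not node.children:
        if node.probability is None:
            raise ValueError(f"leaf {node.label!r} has no probability")
        return node.probability
    total_weight = sum(child.weight for child in node.children)
    if total_weight <= 0:
        raise ValueError(f"node {node.label!r} has non-positive total child weight")
    weighted = sum(child.weight * confidence(child) for child in node.children)
    return weighted / total_weight

# Assumed example: one heavily weighted driver and one driver split into sub-claims.
thesis = Node("thesis", children=[
    Node("driver A", weight=3.0, probability=0.75),
    Node("driver B", weight=1.0, children=[
        Node("sub-claim B1", weight=1.0, probability=0.25),
        Node("sub-claim B2", weight=1.0, probability=0.75),
    ]),
])

score = confidence(thesis)
print(score)
```

In a setup like this, leaf probabilities could be fed from live market prices, so the root confidence moves whenever an underlying contract reprices.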
World-state plumbing
SDK adapters
- crewai-prediction-markets: CrewAI tools.
- langchain-prediction-markets: LangChain tools.
- openai-agents-prediction-markets: OpenAI Agents SDK tools.
- vercel-ai-prediction-markets: Vercel AI SDK tools.
- create-prediction-market-agent: Scaffold a project. Works with LangChain, CrewAI, OpenAI Agents SDK, or vanilla TypeScript.
- prediction-market-mcp-example: Minimal MCP server example.
Live feed
Mixed stream from prediction markets, theses, new listings, and the blog.
US freezes Russian assets, sanctions Iran, bombs Iran — each action tells the world the dollar syste
The thesis remains robust as market data continues to show elevated interest in precious metals and structural shifts in payment rails. Recent price volatility in Bitcoin and various bullion contracts reflects tactical noise rather than a c
Stagflation traps the Fed in an impossible triangle. Powell stays until Warsh confirmation. Trump in
Thesis confidence declined slightly as market participants aggressively priced out immediate inflationary spikes in July CPI and reduced recession probabilities. While the core political struggle between Trump and Powell remains the primary
Ethereum Up or Down - May 25, 2:35PM-2:40PM ET
Ethereum above ___ on May 24, 4PM ET? Strikes: 2,025; 2,040; 2,055; 2,070; 2,085
Buy Hormuz normalization: 54c deal odds vs 28c traffic lag
The July 31 US-Iran peace deal contract jumped 15c to 54c while Strait of Hormuz normal-traffic by end-June sits at just 28c — a 26c contagion gap. R2 is already shifting regime from neutral to taker, confirming momentum. The diplomatic-mee
Sell oil above $100: bearish crude contracts now at 63c
WTI crude falling to $80 by end of June is priced at 63c on R8, which is transitioning from taker to neutral — a regime exhaustion signal. Meanwhile the $95 May contract surged 33c to 72c on Iran détente news, confirming the directional shi
Contrarian: Gold to $4,650 still live at 56c amid dollar weakness
R6 shows gold at $4,650 in May 2026 priced at 56c, shifting to taker regime with score 0.625 — momentum is accelerating. Iran détente reduces safe-haven demand but Fed leadership instability (Powell exit at 95c) and dollar uncertainty susta
12c cross-venue arb: buy Kalshi Bitcoin reserve at 19c
X1 on Kalshi prices the US National Bitcoin Reserve before Jan 1, 2027 at 19c while X2 on Polymarket prices the identical contract at 31c — a 12c gap at 0.95 confidence. This is the highest-confidence cross-venue arb in the dataset, well ab
Satoshi coins stay dormant: sell movement risk at 8c
R3 shows Satoshi-moves-Bitcoin-in-2026 at 8c, transitioning from maker to neutral — liquidity is withdrawing, suggesting the 8c price is about to drift. MicroStrategy sell probabilities have been whipsawed (crashed 14c to 21c then surged 11
1477% IY on Congress veto override: carry trade at 10c
L1 offers 1477% implied yield at 10c for a Congress veto override before 2027 — the highest IY in the legislative dataset by a factor of 10x. Powell's forced exit at 95c probability signals executive dominance, making a successful congressi
Insurrection Act invocation at 657% IY: policy tail hedge
L5 prices Trump invoking the Insurrection Act at 20c with 657% implied yield — a fat-tailed policy risk that is systematically underpriced given the Powell exit signal and executive power consolidation narrative. L6 at 58c (27% IY) represen
58c contagion lag: Democratic FL-23 lags CA-17 trigger by 78c
C1 and C3 show Democratic House race triggers in CA-17 and CA-44 moving +78c delta while FL-23 Democratic contract sits at 82c — a -58c contagion gap. The uniform gap across 8 contract pairs (C1 through C8) at identical -58c magnitude signa
The United States will launch a ground invasion of Iran. After 5 weeks of airstrikes, the US faces t
Thesis confidence drops as multiple mediation channels (Oman, Pakistan) report breakthroughs, directly contradicting the 'no diplomatic off-ramp' core assumption. Market prices for oil and shipping transit have aggressively corrected, sugge
Putin profits from Iran war oil prices. Russian military budget fully funded. Ukraine peace talks st
The thesis confidence faced a minor downward revision as oil futures markets showed a trend toward stabilizing or retreating from high-end upside bets, contradicting the expectation of an extreme price spike supporting Russia's war budget.
Oil above $100 drives electricity costs up. Data center operating costs surge. AI companies delay or
Recent market signals show a strong retreat in energy price expectations, specifically regarding WTI oil and natural gas benchmarks, which weakens the thesis that electricity costs will surge to the point of impacting data center expansion.
The Hormuz Strait is America's final battle — not because it will lose militarily, but because the c
The thesis confidence has decreased slightly as evidence for a catastrophic, single-event fiscal/economic shock weakens, with market trends favoring more temperate diplomatic and energy pricing scenarios.
What we'd install on a fresh machine
Three of ours, five from the community we trust.
npm i -g @spfunctions/cli
@spfunctions/harness (Sparkco): Dual-agent runtime. Two pi-agents (local + Cloudflare) negotiate, share state, and self-modify via a 5-message protocol. $1/day to run.
npm i -g @spfunctions/harness
Browse 69+ CLI tools
Taste-curated. Filter by category, sorted by Sparkco-first then stars.
git clone https://github.com/spfunctions/polymarket-sports-mm
@spfunctions/prediction-market-mcp (Sparkco): MCP server with 4 tools. Works with Claude, Cursor, VS Code.
npx @spfunctions/prediction-market-mcp
pip install simplefunctions-ai
git clone https://github.com/spfunctions/prediction-market-mcp-example
git clone https://github.com/spfunctions/kalshi-price-monitor
git clone https://github.com/spfunctions/prediction-market-context
git clone https://github.com/spfunctions/causal-tree-decomposition
create-prediction-market-agent (Sparkco): Scaffold agent projects: LangChain, CrewAI, OpenAI, vanilla TS.
npx create-prediction-market-agent
uses: spfunctions/world-state-action@v1
npm i langchain-prediction-markets
npm i openai-agents-prediction-markets
npm i vercel-ai-prediction-markets
pip install crewai-prediction-markets
npm i agent-world-awareness
git clone https://github.com/spfunctions/prediction-market-edge-detector
@spfunctions/bi (Sparkco): Agent-friendly BI CLI. Query CSV/JSON/Parquet with SQL via DuckDB. 4 commands: head, schema, query, convert.
npm i -g @spfunctions/bi
code --install-extension saoudrizwan.claude-dev
pip install openai-agents
go install github.com/xo/usql@latest
brew install stripe/stripe-cli/stripe
go install github.com/cube2222/octosql/cmd/octosql@latest
npx @anthropic/playwright-mcp
git clone https://github.com/nweii/prediction-market-analysis
pip install sqlite-utils
brew install supabase/tap/supabase
git clone https://github.com/Polymarket/agents
git clone https://github.com/elizaOS/kalshi-ai-trading-bot
git clone https://github.com/berlinbra/polymarket-mcp-server
git clone https://github.com/polybot-nexus/polybot
git clone https://github.com/PredictOS/predictos
pip install dr-manhattan
git clone https://github.com/CloddsBot/cloddsbot
git clone https://github.com/polymarket-pipeline/pipeline
git clone https://github.com/gnosis/prediction-market-agent
git clone https://github.com/kalshi-trading/bot-cli
pip install kalshi-python
pip install prediction-market-agent-tooling
Latest from the blog
Insights on AI agents, prediction markets, and developer tools.
Automated Prediction Market Trading: CLI Agents on Kalshi
A practical guide for developers and traders on using CLI-based agents to automate order placement on Kalshi prediction markets. Covers thesis-driven trading logic, real tickers, and the agentic runtime behind production-grade automation.
Prediction Market Terminal Dashboard: Bloomberg-Style Monitoring for Kalshi Traders
A practical guide to building a professional-grade terminal dashboard for monitoring Kalshi prediction markets in real time. Covers CLI tooling, agentic scanning, position tracking, and thesis-driven trade execution.
Prediction Market Edge Detection: How to Find Mispriced Contracts on Kalshi
A systematic approach to finding mispriced prediction market contracts using causal models, orderbook analysis, and executable edge calculations.
Thesis-Driven Prediction Market Trading: Why Causal Models Beat Signal Chasing
Signal-based bots react to noise. Thesis-driven agents understand why prices should move. Here's how causal models change prediction market trading.
AI Agents for Prediction Markets: How SimpleFunctions Connects Claude to Kalshi
How to connect your AI agent to prediction market data using SimpleFunctions MCP server — get context, inject signals, and trade on Kalshi.
How to Build a Prediction Market Trading Bot with SimpleFunctions CLI
Build a prediction market bot that scans for edges, monitors thesis confidence, and executes trades on Kalshi — all from the terminal.