
bin+lib agentic-vision-mcp

MCP server for AgenticVision — universal LLM access to persistent visual memory

16 releases

0.3.0 Feb 27, 2026
0.2.5 Feb 26, 2026
0.1.8 Feb 25, 2026



MIT license

365KB
9K SLoC

AgenticVision-MCP



What it does

AgenticVision-MCP exposes the AgenticVision engine over the Model Context Protocol (JSON-RPC 2.0 over stdio). Any MCP-compatible LLM gains persistent visual memory — capture screenshots, embed with CLIP ViT-B/32, compare, recall.

Install

cargo install agentic-vision-mcp

Configure Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["--vision", "~/.vision.avis", "serve"]
    }
  }
}
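If the config file already lists other MCP servers, pasting the snippet above over it would drop them. The following is a minimal sketch of merging the entry instead. The helper names `add_vision_server` and `update_config_file` are illustrative and not part of this crate; the sketch only assumes the file is plain JSON with an `mcpServers` object, as in the snippet above.

```python
import json
from pathlib import Path


def add_vision_server(config: dict, vision_file: str = "~/.vision.avis") -> dict:
    """Return a copy of `config` with the "vision" server entry added or replaced."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["vision"] = {
        "command": "agentic-vision-mcp",
        "args": ["--vision", vision_file, "serve"],
    }
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read, merge, and rewrite a client config file (created if missing)."""
    text = path.read_text() if path.exists() else ""
    existing = json.loads(text) if text.strip() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_vision_server(existing), indent=2) + "\n")


if __name__ == "__main__":
    # Demonstrate on an in-memory config rather than touching a real file.
    example = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    print(json.dumps(add_vision_server(example), indent=2))
```

The same merge applies to any client config that uses the `mcpServers` key; only the file path differs.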

Configure Claude Code

Add to ~/.claude/mcp.json:

{
  "mcpServers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["--vision", "~/.vision.avis", "serve"]
    }
  }
}

Configure VS Code / Cursor

Add to .vscode/settings.json:

{
  "mcp.servers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["--vision", "${workspaceFolder}/.vision/project.avis", "serve"]
    }
  }
}

Configure Windsurf

Add to ~/.codeium/windsurf/mcp_config.json:

{
  "mcpServers": {
    "vision": {
      "command": "agentic-vision-mcp",
      "args": ["--vision", "~/.vision.avis", "serve"]
    }
  }
}

Do not use /tmp for vision files — macOS and Linux clear this directory periodically. Use ~/.vision.avis for persistent storage.
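A setup script can catch a volatile location before it is written into a client config. This is a rough sketch, not something the server does itself: `is_volatile` and `VOLATILE_ROOTS` are hypothetical names, and the list of roots is an assumption to adjust for your platform. The check is purely lexical, so it neither expands `~` nor resolves symlinks.

```python
from pathlib import PurePosixPath

# Assumption: directories treated as volatile. /private/tmp is where /tmp
# points on macOS; extend this tuple as needed.
VOLATILE_ROOTS = ("/tmp", "/private/tmp")


def is_volatile(path: str, roots=VOLATILE_ROOTS) -> bool:
    """Return True if `path` lies inside one of the volatile roots."""
    parts = PurePosixPath(path).parts
    for root in roots:
        root_parts = PurePosixPath(root).parts
        # Compare whole path components so "/tmpfoo" is not mistaken for "/tmp".
        if parts[: len(root_parts)] == root_parts:
            return True
    return False


if __name__ == "__main__":
    for candidate in ("/tmp/session.avis", "/home/user/.vision.avis"):
        print(candidate, "volatile" if is_volatile(candidate) else "ok")
```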

MCP Surface Area

Category  | Count | Examples
Tools     | 10    | vision_capture, vision_similar, vision_diff, vision_compare, vision_query, vision_ocr, vision_track, vision_link, session_start, session_end
Resources | 6     | avis://capture/{id}, avis://session/{id}, avis://timeline/{start}/{end}, avis://similar/{id}, avis://stats, avis://recent
Prompts   | 4     | observe, compare, track, describe

How it works

  1. Capture: vision_capture accepts images from files, base64 data, screenshots, or the system clipboard. It embeds each image with CLIP ViT-B/32 and stores it in the .avis binary format. Screenshots support optional region capture on macOS and Linux.
  2. Query: vision_query retrieves captures by time, description, or recency. vision_similar finds visually similar captures by cosine similarity.
  3. Compare: vision_compare sets up a side-by-side LLM analysis. vision_diff performs pixel-level differencing with 8×8 grid region detection.
  4. Link: vision_link connects captures to AgenticMemory cognitive graph nodes.
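The similarity lookup in the Query step can be illustrated with a small sketch. This is not the crate's implementation; it only shows the general technique of ranking stored embeddings by cosine similarity to a query embedding. CLIP ViT-B/32 produces 512-dimensional embeddings, but the toy vectors below use three dimensions for readability, and the function names and capture IDs are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(query, captures, k=5):
    """Return up to k (capture_id, score) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query, emb)) for cid, emb in captures.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Hypothetical capture IDs mapped to toy embeddings.
    captures = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0], 3: [1.0, 1.0, 0.0]}
    for capture_id, score in top_k_similar([1.0, 0.0, 0.0], captures, k=2):
        print(capture_id, round(score, 4))
```

A linear scan like this is adequate for small stores; whether the engine uses an index for larger ones is not stated here.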

CLI Commands

# Start server (stdio) — defaults to ~/.vision.avis
agentic-vision-mcp serve

# Start server with custom vision file and model
agentic-vision-mcp --vision /path/to/file.avis --model /path/to/clip.onnx serve

# Validate a vision file
agentic-vision-mcp --vision ~/.vision.avis validate

# Print server info as JSON
agentic-vision-mcp info
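When launching the binary from a script, note that the examples above place `--vision` and `--model` before the subcommand. The sketch below builds the argument list in that order and reads the `info` output. `build_argv` and `read_info` are hypothetical helper names, and the shape of the JSON printed by `info` is not documented here, so the result is returned as an opaque dictionary.

```python
import json
import subprocess


def build_argv(subcommand="serve", vision=None, model=None):
    """Build the command line, placing global flags before the subcommand."""
    argv = ["agentic-vision-mcp"]
    if vision is not None:
        argv += ["--vision", vision]
    if model is not None:
        argv += ["--model", model]
    argv.append(subcommand)
    return argv


def read_info():
    """Run `agentic-vision-mcp info` and parse its JSON output.

    Requires the binary to be installed and on PATH.
    """
    result = subprocess.run(
        build_argv("info"), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


if __name__ == "__main__":
    print(build_argv("serve", vision="/path/to/file.avis", model="/path/to/clip.onnx"))
```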

Performance

Operation                 | Time
MCP tool round-trip       | 7.2 ms
Image capture             | 47 ms
Similarity search (top-5) | 1–2 ms
Visual diff               | <1 ms

Development

This crate is part of the AgenticVision Cargo workspace.

# Run MCP server tests (from workspace root)
cargo test -p agentic-vision-mcp

# Run all workspace tests
cargo test --workspace

# Clippy + format
cargo clippy --workspace
cargo fmt --all

# Build release
cargo build --release

Protocol

This server implements MCP (Model Context Protocol) spec version 2024-11-05 over JSON-RPC 2.0. Transport: stdio (newline-delimited JSON over stdin/stdout).
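For anyone writing a custom client, the framing can be sketched as follows: each JSON-RPC message is serialized to a single line and terminated with a newline. The method names (`initialize`, `notifications/initialized`, `tools/list`, `resources/read`) come from the MCP specification, and `avis://stats` is one of the resources listed above. The client name and version are placeholders, and the helper names are illustrative only.

```python
import json


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the output stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def request(req_id: int, method: str, params=None) -> dict:
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": req_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


def parse_lines(data: bytes) -> list:
    """Split a newline-delimited byte stream back into message objects."""
    return [json.loads(line) for line in data.decode("utf-8").splitlines() if line.strip()]


# A typical opening exchange from the client side.
messages = [
    request(1, "initialize", {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.0.0"},  # placeholder
    }),
    {"jsonrpc": "2.0", "method": "notifications/initialized"},  # notification: no id
    request(2, "tools/list"),
    request(3, "resources/read", {"uri": "avis://stats"}),
]

if __name__ == "__main__":
    for message in messages:
        print(frame(message).decode("utf-8"), end="")
```

In practice these bytes would be written to the server's stdin, with responses read line by line from its stdout.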

License

MIT

Dependencies

~20–35MB
~461K SLoC