When an agent screenshots the app through the MCP and then interacts with the app by clicking at coordinates, it reads those coordinates off the image in native pixel coordinates. Since the MCP does everything else in logical coordinates, it helps if the image is also at logical resolution, so we always downscale it to a scale factor of 1.0. I've added this here to avoid having to decode and re-encode the image in the MCP. Unfortunately it only downscales for now, since adding a way to upscale the image just for the screenshot would add a lot of complexity and might be invasive coming from a plugin.

I've also changed the submit call to take a closure, to make it easier to use another transport channel (this makes the implementation for Rerun's MCP nicer).

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Lucas Meurer <lucas@rerun.io>
egui_inspection
Inspection for egui apps.
egui_inspection defines a wire protocol and an [egui::Plugin] (InspectionPlugin) that
serves it. An external inspector, such as the egui_mcp MCP server, connects and can:
- read the app's AccessKit tree (GetTree),
- inject input events (HandleEvents: clicks, typing, scrolling, …),
- capture a screenshot on request (Screenshot),
- resize the window (Resize).
The protocol is strictly request → response, which maps cleanly onto both a TCP socket and a unary RPC (so the same machinery can be tunnelled over another transport).
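To illustrate why a strict request → response shape is easy to move between transports, here is a minimal sketch. Only the four request names come from the list above; the payloads, the Response type, the submit function, and fake_app are hypothetical stand-ins for illustration and are not the crate's actual API.

```rust
// Hypothetical message types. Only the four request names are taken from
// the README; every payload and the response shape are assumptions.
#[derive(Debug, Clone, PartialEq)]
enum Request {
    GetTree,
    HandleEvents(Vec<String>),
    Screenshot,
    Resize { width: u32, height: u32 },
}

#[derive(Debug, Clone, PartialEq)]
enum Response {
    Tree(String),
    Ok,
    Screenshot { width: u32, height: u32 },
    Error(String),
}

/// Sends one request and returns exactly one response.
/// `transport` is any closure that carries the request to the app and
/// brings the reply back: a TCP round trip, a unary RPC call, or a test stub.
fn submit(request: Request, transport: impl FnOnce(Request) -> Response) -> Response {
    transport(request)
}

/// Stand-in for the app side, so the sketch runs without a real egui app.
fn fake_app(request: Request) -> Response {
    match request {
        Request::GetTree => Response::Tree("root".to_owned()),
        Request::HandleEvents(_) => Response::Ok,
        Request::Screenshot => Response::Screenshot { width: 800, height: 600 },
        Request::Resize { width, height } if width == 0 || height == 0 => {
            Response::Error("empty window size".to_owned())
        }
        Request::Resize { .. } => Response::Ok,
    }
}
```

Because every exchange is one request followed by one response, swapping the transport only means passing a different closure; nothing else in the caller has to change.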
Screenshots need a visible window. Reading the tree and injecting input work even while the app is in the background, but capturing a screenshot requires a rendered frame, which the OS won't produce for a fully occluded or minimized window (notably on macOS, where the GPU surface isn't available). Bring the window to the foreground to capture it; otherwise the Screenshot request times out.
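A consumer may want to turn that timeout into an actionable message. The sketch below assumes the client receives encoded screenshot bytes over a std channel; the function name wait_for_screenshot and the error strings are hypothetical, and the real plugin may deliver frames differently.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Waits for screenshot bytes on `frames`, giving up after `timeout`.
/// Assumption: frames arrive as raw byte buffers on a std mpsc channel.
fn wait_for_screenshot(frames: &Receiver<Vec<u8>>, timeout: Duration) -> Result<Vec<u8>, String> {
    match frames.recv_timeout(timeout) {
        Ok(bytes) => Ok(bytes),
        // No frame was rendered in time, e.g. the window is minimized
        // or fully occluded.
        Err(RecvTimeoutError::Timeout) => Err(
            "screenshot timed out; bring the app window to the foreground and retry".to_owned(),
        ),
        // The sending side went away, so no frame will ever arrive.
        Err(RecvTimeoutError::Disconnected) => Err("app disconnected".to_owned()),
    }
}
```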
What it's for
egui_inspection is the shared foundation for tools that observe or drive an egui app from
the outside. Anything that speaks the protocol (over TCP, or another transport) can be a
consumer:
- egui_mcp: an MCP server that exposes the app to AI agents and other tooling, which can query the widget tree, click / type / scroll, and take screenshots.
- An egui inspector GUI (planned): a visual debugger that connects to a running app to browse its widget tree and drive it interactively.
- Test inspection & frame streaming (planned): attach to egui_kittest tests, and stream frames for live mirroring of an app's window.
Enabling it in an eframe app
Enable eframe's inspection feature, then set the EGUI_INSPECTION env var at runtime. It's
either truthy, falsy, or a bind address:
EGUI_INSPECTION=1 cargo run --features inspection # binds 127.0.0.1:5719
EGUI_INSPECTION=0.0.0.0:5719 cargo run --features inspection # reachable across devices
When the variable is unset or falsy (0 / false), inspection is completely off
(production-safe).
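The three kinds of value can be modelled as follows. This is a sketch of the rules described above, not eframe's actual parser: the names InspectionConfig and parse_inspection_var are hypothetical, and treating "true" as truthy and an empty string as falsy are assumptions that go beyond what this README states.

```rust
use std::net::SocketAddr;

/// Default bind address used for truthy values, as shown in the commands above.
const DEFAULT_ADDR: &str = "127.0.0.1:5719";

#[derive(Debug, PartialEq)]
enum InspectionConfig {
    /// Variable unset or falsy: inspection is completely off.
    Off,
    /// Serve the protocol on this address.
    Bind(SocketAddr),
}

/// Interprets the value of EGUI_INSPECTION (`None` means unset).
fn parse_inspection_var(value: Option<&str>) -> Result<InspectionConfig, String> {
    let raw = match value {
        None => return Ok(InspectionConfig::Off),
        Some(v) => v.trim(),
    };
    match raw.to_ascii_lowercase().as_str() {
        // Assumption: an empty string counts as falsy, like "0" and "false".
        "" | "0" | "false" => Ok(InspectionConfig::Off),
        // Assumption: "true" is accepted as truthy alongside "1".
        "1" | "true" => DEFAULT_ADDR
            .parse::<SocketAddr>()
            .map(InspectionConfig::Bind)
            .map_err(|e| e.to_string()),
        // Anything else must be an explicit bind address such as 0.0.0.0:5719.
        other => other
            .parse::<SocketAddr>()
            .map(InspectionConfig::Bind)
            .map_err(|e| format!("invalid EGUI_INSPECTION value {:?}: {}", raw, e)),
    }
}
```

In an app this would be fed from std::env::var("EGUI_INSPECTION").ok().as_deref().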
⚠️ Binding a non-loopback address exposes full control of the app — and its screenshots — to anyone who can reach the port, with no authentication. A warning is logged when you do so. Prefer loopback + an SSH tunnel for remote debugging.
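The check behind such a warning can be approximated with the standard library alone. The function exposure_warning below is a hypothetical helper, not part of egui_inspection, and the message text is invented for illustration.

```rust
/// Returns a warning message when `addr` is reachable from other machines,
/// or `None` when it is a loopback address (127.0.0.0/8 or ::1).
/// Hypothetical helper: the crate's own check and wording may differ.
fn exposure_warning(addr: &std::net::SocketAddr) -> Option<String> {
    if addr.ip().is_loopback() {
        None
    } else {
        Some(format!(
            "inspection is bound to {}: anyone who can reach this port can control the app and read its screenshots",
            addr
        ))
    }
}
```

Note that 0.0.0.0 is the unspecified address rather than a loopback one, so it triggers the warning just like a concrete LAN address would.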
Using the plugin directly
# let ctx = egui::Context::default();
ctx.add_plugin(egui_inspection::InspectionPlugin::new(Some("my app".to_owned())));
egui_inspection::serve(&ctx, "127.0.0.1:5719").unwrap();