Case Study
An end-to-end interaction pattern for an AI agent interface, covering the chat environment, in-chat elements, artifacts, and the agent's working layer. Built so people can see what the agent is doing, stay in control, and trust the output.
Overview · The Problem
AI agents can now take real actions and return real work, not just chat replies.
But that power creates a new design problem. When an agent interprets a request, makes decisions, and produces files on its own, the person on the other side is left out of the loop.
The Brief
I was brought in for a focused two-week engagement to design the agent chat experience for Impetus, an enterprise operations platform. The brief wasn't "make a chatbot." It was to define a coherent interaction pattern for working with an agent: requesting work, watching it reason, reviewing outputs, and steering it when it's wrong.
This case study maps that pattern across four layers, each one making the agent more legible, more steerable, and more trustworthy.
An agent that works invisibly asks you to trust blindly.
Four layers of the pattern
A · Agent Workflows
Chat Environment
The structure you work inside
B · Decision Visibility
Chat Elements
How each message communicates state and logic
C · Human-in-the-Loop
Artifacts
The work the agent produces, reviewable and extractable
D · Trust Patterns
Virtual Machine
The agent's working layer, made visible
Research · Competitive Analysis
Before designing anything, I audited how seven existing products handle the same problems: Shopify, Anthropic, Hatchcanvas, Manus, ChatGPT, Slack, and Bard. The question wasn't which UI looked best; it was which patterns made the agent's work legible and the output trustworthy. Six dimensions, seven products.
Shopify · Hatchcanvas · Anthropic · ChatGPT · Manus · Slack · Bard, reference captures across six interaction dimensions
The table surfaced the gap: most products handle chat listing and basic artifacts, but none showed the agent's live execution in a way a non-technical user could follow. The agentic mode dimension, the one that mattered most, was mostly blank. Closing that gap became the brief for the Virtual Machine layer.
The foundation is the workspace itself. A predictable structure is the first form of trust: people relax when the space behaves consistently.
Default structure
The default state sets the contract: a focused composer, a clear model selector, and an honest disclaimer that the agent can make mistakes. The empty state invites a request without overwhelming; everything else stays out of the way until needed.
Expand actions & chat scroll
The side panel collapses to an icon strip by default, keeping the chat canvas uncluttered. Expanding it reveals three quick actions at the top (New Chat, Search Chat, Saved Memories), followed by a scrollable chat history grouped into Starred and Chats. The panel slides open inline without a layout jump, so the workspace adjusts without losing context.
Side panel element states
Trust at the system level is built from rigor at the component level. I documented every interactive element across Normal, Hover, Pressed, and Selected so behavior is consistent and legible everywhere. The states matrix isn't decoration; it's the spec that keeps the agent's workspace feeling reliable.
Inside the conversation, each message has to do more than display text: it has to communicate state. I built a small system of chat elements so every message carries that information legibly.
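One way to keep a states matrix like this enforceable is to type it so that a missing state fails at compile time. The sketch below is only an illustration of that idea: the element (`chatListItem`), the token names, and the precedence order in `resolveState` are assumptions, not values taken from the Impetus design files.

```typescript
// The four documented states from the side panel spec.
type InteractionState = "normal" | "hover" | "pressed" | "selected";

// Hypothetical style tokens; real names would come from the design system.
interface StateStyle {
  background: string;
  text: string;
}

// Every element must define all four states, or the compiler rejects the spec.
type StateMatrix = Record<InteractionState, StateStyle>;

// Hypothetical element: a chat history row in the side panel.
const chatListItem: StateMatrix = {
  normal: { background: "surface", text: "text-primary" },
  hover: { background: "surface-hover", text: "text-primary" },
  pressed: { background: "surface-pressed", text: "text-primary" },
  selected: { background: "surface-selected", text: "text-accent" },
};

interface InteractionFlags {
  hovered: boolean;
  pressed: boolean;
  selected: boolean;
}

// Assumed precedence when several flags are true: pressed > selected > hover > normal.
function resolveState(flags: InteractionFlags): InteractionState {
  if (flags.pressed) return "pressed";
  if (flags.selected) return "selected";
  if (flags.hovered) return "hover";
  return "normal";
}

// Look up the style an element should show for the current flags.
function styleFor(matrix: StateMatrix, flags: InteractionFlags): StateStyle {
  return matrix[resolveState(flags)];
}
```

Deleting any one of the four keys from `chatListItem` would be a type error, which is roughly the guarantee the documented matrix gives designers on paper.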
Text styles (in context + spec)
I defined distinct text treatments for the three voices in any agent conversation: the user's prompt, the agent's thinking, and the agent's response. Separating these visually is what lets a person scan a long exchange and instantly know who is "speaking." The "Thought for Xs" treatment makes the agent's reasoning a first-class, collapsible part of the message, visible when you want to verify, out of the way when you don't.
This is the decision-visibility layer. The thinking block reveals, in plain language, what the agent understood, the steps it planned, and the assumptions it made, so a wrong interpretation gets caught before the user trusts the output.
Tags & agent action elements
Agents don't just talk; they cite sources, reference files, and run actions. I designed a set of in-chat tags for this: action badges (what the agent is doing), source/link tags (clickable provenance), and artifact/file tags (the files under review or in use). Inside the thinking block, these combine into a readable trace: action → details → sources → output → status (Executing or Done). Provenance you can click is provenance you can trust.
The agent's real value is the work it produces. Artifacts are how that work becomes something a person can review, verify, edit, extract, and hand off. They are the human-in-the-loop core of the system. An answer locked inside a chat bubble can't be trusted or used; an artifact can.
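The trace order described above (action → details → sources → output → status) maps naturally onto a small data shape. The following is a minimal sketch under assumptions: the `TraceStep` field names, the helper functions, and the sample file names are hypothetical and chosen only to mirror the elements named in the text.

```typescript
// Status values named in the trace: a step is either running or finished.
type StepStatus = "executing" | "done";

// Hypothetical shape for one entry in the thinking-block trace.
interface TraceStep {
  action: string;     // action badge: what the agent is doing
  details: string;    // short plain-language description
  sources: string[];  // clickable provenance (links or file names)
  output?: string;    // artifact or file produced, if any
  status: StepStatus;
}

// Render one step in the order action → details → sources → output → status.
function formatStep(step: TraceStep): string {
  const parts: string[] = [step.action, step.details];
  if (step.sources.length > 0) {
    parts.push("sources: " + step.sources.join(", "));
  }
  if (step.output !== undefined) {
    parts.push("output: " + step.output);
  }
  parts.push(step.status === "done" ? "Done" : "Executing");
  return parts.join(" → ");
}

// The trace counts as finished only when no step is still executing.
function isTraceComplete(steps: TraceStep[]): boolean {
  return steps.every((s) => s.status === "done");
}

// Illustrative step; the file names are made up for the example.
const example: TraceStep = {
  action: "Read file",
  details: "Scanning the quarterly report",
  sources: ["report.pdf"],
  output: "summary.md",
  status: "done",
};

// "Read file → Scanning the quarterly report → sources: report.pdf → output: summary.md → Done"
const exampleLine: string = formatStep(example);
```

Keeping sources as a list on every step is what would let each tag in the UI stay clickable, rather than flattening provenance into the message text.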
Artifacts listing (opened)
Every file the agent generates collects in a dedicated Artifacts panel: typed, named, and individually downloadable, with "Download All" for the whole set. The user can collapse it to keep focus, or open it to see everything the agent has produced in one place. Output becomes inventory, not a scroll-back hunt.
Artifacts & folders sent by the user
The loop runs both ways. Users send files and folders into the conversation as inputs, with the same tag language used for the agent's outputs. One consistent file element, whether it's something you gave the agent or something it made, so the exchange reads as a single shared workspace.
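The "one consistent file element" idea can be sketched as a single type with an origin field, so inputs and outputs share the same shape and the same tag rendering. Everything named here (`Artifact`, the `kind` labels, the helper functions) is an assumption for illustration rather than the product's actual data model.

```typescript
// Who put the file into the conversation.
type ArtifactOrigin = "user" | "agent";

// Hypothetical shape: one file element for both inputs and outputs.
interface Artifact {
  id: string;
  name: string;
  kind: "document" | "code" | "data" | "folder"; // assumed type labels
  origin: ArtifactOrigin;
}

// The Artifacts panel lists only what the agent produced.
function agentOutputs(items: Artifact[]): Artifact[] {
  return items.filter((a) => a.origin === "agent");
}

// "Download All" resolves to the names of every agent-produced file.
function downloadAllNames(items: Artifact[]): string[] {
  return agentOutputs(items).map((a) => a.name);
}

// The same tag label is used regardless of who supplied the file.
function tagLabel(a: Artifact): string {
  return a.kind + ": " + a.name;
}
```

Because `tagLabel` never branches on `origin`, a file the user sent and a file the agent made render identically, which is the property the design relies on.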
Opened artifact: document view
Opening an artifact reveals a full preview beside the conversation: a formatted document the user can read, verify against their request, and publish or download. This is the review checkpoint: the agent's claim and the actual deliverable sit side by side for confirmation.
Opened artifact: code view
For structured output, the same artifact switches to a code/data view showing the actual query definitions and schema the agent used. Toggling between human-readable preview and raw structure is the strongest trust move in the system: the user can verify exactly what the agent did before relying on it.
Artifact component patterns
Like the rest of the system, artifacts are a documented component set: list-item states (Normal / Hover / Pressed / Selected), artifact-as-response vs. artifact-as-input variants, and the preview-and-breakdown anatomy. This is what makes the pattern reusable beyond this one product.
The deepest layer makes the agent's work itself visible. Most chat interfaces hide execution behind a spinner: the user waits, unsure what's happening or whether to trust the result. I designed a "computer" layer that does the opposite: it shows the agent working, names the file or source it's touching, and lets the user replay the whole run. Visible work is trustable work.
Working banner (collapsed, in chat)
While the agent runs, a compact banner sits inline above the composer ("Fynd is using xyz.gpt file") with a thumbnail peek of what it's operating on. Above it, the live action trace updates in real time: each action names itself, shows its steps, cites its sources, and reports status (Executing → Done). The user never wonders "is it stuck?" because the work is narrated as it happens, without leaving the conversation.
Working view (opened, split screen)
Expanding the banner opens the Fynd Computer: a dedicated panel beside the chat that shows the agent's environment, the action it's running, the source or file it's acting on, and a playback bar to step through the run. Pairing the conversation with a live view of the machine turns an opaque process into something a person can watch, pause, and verify. This is the human-in-the-loop principle pushed all the way down to execution.
The pattern: collapsed ↔ opened
Like every other layer, the VM is a documented, reusable pattern with two states: collapsed (the in-chat banner, for staying in flow) and opened (the full computer view, for close inspection). This is progressive disclosure again: transparency on demand, never forced. The same playback controls in both states let the user replay what the agent did, making the agent's work an auditable record rather than a one-time event.
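The two-state pattern with shared playback can be sketched as one playback state that both views read from, differing only in how much detail they render. This is an illustration under assumptions: `RunEvent`, `PlaybackState`, and the rendered strings are hypothetical and do not describe how the Fynd Computer is actually built.

```typescript
// Hypothetical recorded event from one agent run.
interface RunEvent {
  label: string;  // the action name shown in the trace
  target: string; // the file or source the agent touched
}

// One playback state shared by the collapsed banner and the opened view.
interface PlaybackState {
  events: RunEvent[];
  index: number; // position of the playhead
}

type ViewMode = "collapsed" | "opened";

function startPlayback(events: RunEvent[]): PlaybackState {
  return { events: events, index: 0 };
}

// Move the playhead, clamped to the recorded range.
function seek(state: PlaybackState, index: number): PlaybackState {
  const last = Math.max(state.events.length - 1, 0);
  const clamped = Math.min(Math.max(index, 0), last);
  return { events: state.events, index: clamped };
}

function stepForward(state: PlaybackState): PlaybackState {
  return seek(state, state.index + 1);
}

function stepBack(state: PlaybackState): PlaybackState {
  return seek(state, state.index - 1);
}

// Both views read the same state; only the amount of detail differs.
function render(state: PlaybackState, mode: ViewMode): string {
  const event = state.events[state.index];
  if (event === undefined) return "No activity recorded";
  const position = (state.index + 1) + "/" + state.events.length;
  if (mode === "collapsed") return event.label + " (" + position + ")";
  return event.label + " on " + event.target + " (" + position + ")";
}
```

Because the recorded events are never mutated, stepping back and forward replays the same run each time, which is what makes the record auditable in this sketch.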
This is the trust-through-transparency centerpiece of the system. An agent you can watch and replay is an agent you can trust with real work.
The System
Put together, the four layers form one loop: the user makes a request, sees the agent's interpretation and reasoning, watches the work run, reviews the artifact it produces, and either extracts it or steers the agent to try again. Visibility and control at every step: that's the pattern.
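The loop can be written down as a small state machine, which makes the two exits from review explicit. The stage and event names below are assumptions, and sending "steer" back to the request stage is one possible reading of "try again", not a documented rule.

```typescript
// Stages of the loop, in the order the text describes them.
type Stage = "request" | "interpretation" | "execution" | "review" | "extracted";

// What can happen at each step; "extract" and "steer" apply only at review.
type LoopEvent = "proceed" | "extract" | "steer";

function nextStage(stage: Stage, event: LoopEvent): Stage {
  if (stage === "review") {
    if (event === "extract") return "extracted";
    if (event === "steer") return "request"; // assumed re-entry point
    return stage;
  }
  if (event !== "proceed") return stage; // ignore extract/steer outside review
  if (stage === "request") return "interpretation";
  if (stage === "interpretation") return "execution";
  if (stage === "execution") return "review";
  return stage; // "extracted" is terminal
}

// Replay a sequence of events from the start of the loop.
function runLoop(events: LoopEvent[]): Stage {
  let stage: Stage = "request";
  for (const event of events) {
    stage = nextStage(stage, event);
  }
  return stage;
}
```

For example, `runLoop(["proceed", "proceed", "proceed", "steer"])` ends back at `"request"`, while replacing the last event with `"extract"` ends at `"extracted"`.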
Reflection
The hardest decision wasn't visual; it was how much of the agent's mind to show. Too little and the user can't trust it; too much and the interface drowns. The answer was progressive disclosure: collapsed by default, expandable on demand, with provenance always one click away.
Designing for a non-deterministic system also reframed what "done" means. A traditional UI is correct or broken; an agent interface has to be honest about uncertainty and always leave the human a way to intervene. That principle (legible, steerable, honest) is what I'd carry into any agent product.