Last week I wrote about Flux Input—a command palette that understands navigation intent. Today I want to explore the architectural decision that makes it possible, and why it matters beyond code organization.

Control Topology

Every AI browser or AI-enhanced extension I’ve seen adds a sidebar. You click an icon, a panel slides out, you have a conversation. The sidebar becomes a peer to the browser—two surfaces, two interaction modes, two mental models.

This creates a topology:

Human → Browser
      → Sidebar (AI)

Two parallel channels. The AI lives alongside your browsing, not inside it.

I wanted a different topology:

Human → Drift (control room) → Browser / AI / World

One channel. Drift is the orchestration layer—the brain that connects human intent to everything else. Browser navigation, tab management, AI queries, settings—all flow through the same interface.

This isn’t anti-sidebar. I might add a sidebar later for extended conversations or detailed AI output. But it would be something Drift dispatches to, not a peer surface. The control room stays central.

Why Orchestration Matters

The difference between “sidebar as peer” and “sidebar under orchestration” seems subtle, but it changes who drives.

In a peer topology, the AI can lead. It can ask follow-up questions, suggest tangents, shape the interaction. You context-switch between browsing and conversing.

In an orchestration topology, you lead. You issue commands through Drift, the system executes. AI becomes a capability Drift can invoke, not an entity you consult separately. When you type /goto that article about SQLite, you’re not asking the AI for help—you’re telling Drift what you want, and Drift uses whatever it needs (history search, AI, direct navigation) to fulfill it.

This is the product philosophy: human-first, keyboard-native, central control. The architecture needs to support it.

The Intent Abstraction

The naive implementation treats AI as special. Regular commands go through normal code paths. AI commands go through different code paths with loading states, streaming responses, error handling. You end up with two systems bolted together.

I started there. It was a mess.

The insight was treating AI as just another handler in a unified pipeline:

User Input → IntentParser → FluxIntent → IntentRouter → IntentHandler → Result

Every input—whether it resolves to direct navigation, search, or an AI call—flows through the same path. The system doesn’t know or care whether AI is involved until the handler executes.

An intent has three parts:

struct FluxIntent {
  let verb: Verb        // what to do: .goto, .ask, .search
  let context: Context? // where to look: .sel, .page, .history
  let freeText: String  // additional input
}

This maps to the syntax: /goto @history that article about SQLite. Verb is goto. Context is history. Free text is the query. The parser extracts structure. The router finds the handler. The handler executes.
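
As a rough sketch of that parsing step (illustrative only, not Drift's actual parser, and it only knows the verb and context cases named in the comments above), the structure extraction might look like this:

// Illustrative sketch of a minimal parser for the /verb @context free-text shape.
// Assumes the Verb and Context cases from the struct above are nested enums;
// quoting, fuzzier input, and error reporting are ignored here.
enum IntentParser {
  static func parse(_ input: String) -> FluxIntent? {
    var tokens = input.split(separator: " ").map(String.init)

    // First token must be a /verb.
    guard let first = tokens.first, first.hasPrefix("/") else { return nil }
    let verb: FluxIntent.Verb
    switch String(first.dropFirst()) {
    case "goto":   verb = .goto
    case "ask":    verb = .ask
    case "search": verb = .search
    default:       return nil
    }
    tokens.removeFirst()

    // An optional @context token comes next.
    var context: FluxIntent.Context?
    if let next = tokens.first, next.hasPrefix("@") {
      switch String(next.dropFirst()) {
      case "sel":     context = .sel
      case "page":    context = .page
      case "history": context = .history
      default:        break // unknown scope: leave it in the free text
      }
      if context != nil { tokens.removeFirst() }
    }

    // Everything left is free text.
    return FluxIntent(verb: verb, context: context, freeText: tokens.joined(separator: " "))
  }
}

// "/goto @history that article about SQLite"
// -> FluxIntent(verb: .goto, context: .history, freeText: "that article about SQLite")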

What makes this work for AI integration:

Uniform interface. The GotoHandler for AI-assisted navigation has the same signature as SearchHandler for direct search. The view layer doesn’t branch on “is this an AI command.” It routes and waits.

Scoped context. The @sel, @page, @history tokens explicitly declare what data the AI can access. This isn’t just syntax sugar—it’s a contract. When you type /ask @sel explain this, you’re authorizing access to selected text, nothing more.

Streaming as implementation detail. Handlers receive callbacks for streaming responses. AI handlers use them. Non-AI handlers don’t. The protocol accommodates both without special-casing.

protocol IntentHandler {
  var supportedVerb: FluxIntent.Verb { get }   // the verb this handler owns
  func execute(_ intent: FluxIntent, context: IntentExecutionContext) async
  func cancel()
}

struct IntentExecutionContext {
  let browserModel: BrowserModel
  let aiService: AIService?                    // AI may not be configured
  let selectedText: String?                    // the data @sel authorizes
  var onStreamChunk: ((String) -> Void)?       // AI handlers stream through this
  var onComplete: ((Result<IntentResult, Error>) -> Void)?
}

Note that aiService is optional. Handlers check whether it's configured and fall back gracefully if not. AI is a capability that may or may not be present, not a mode the system switches into.
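
To show what a handler looks like under this contract, here's a hedged sketch of GotoHandler. Everything it calls beyond the protocol (searchHistory, pickDestination, the .navigate result case, IntentError) is a stand-in I've invented for the example, not Drift's real API; what matters is the shape: direct navigation, history fallback, and AI all live inside one handler behind one signature.

// Sketch only. searchHistory(_:), pickDestination(for:among:onChunk:),
// IntentResult.navigate, and IntentError are hypothetical stand-ins.
import Foundation

enum IntentError: Error { case noMatch }

final class GotoHandler: IntentHandler {
  let supportedVerb: FluxIntent.Verb = .goto

  func execute(_ intent: FluxIntent, context: IntentExecutionContext) async {
    // If the free text is already a URL, navigate directly. No AI involved.
    if let url = URL(string: intent.freeText), url.scheme != nil {
      context.onComplete?(.success(.navigate(url)))
      return
    }

    // Scoped context: only search history because the user said @history.
    let candidates = intent.context == .history
      ? context.browserModel.searchHistory(intent.freeText)
      : []

    // AI is a capability, not a mode: use it if configured, degrade if not.
    guard let ai = context.aiService else {
      if let best = candidates.first {
        context.onComplete?(.success(.navigate(best.url)))
      } else {
        context.onComplete?(.failure(IntentError.noMatch))
      }
      return
    }

    do {
      // Streaming is an implementation detail: this handler forwards chunks;
      // a non-AI handler simply never touches onStreamChunk.
      let destination = try await ai.pickDestination(
        for: intent.freeText,
        among: candidates,
        onChunk: { context.onStreamChunk?($0) }
      )
      context.onComplete?(.success(.navigate(destination)))
    } catch {
      context.onComplete?(.failure(error))
    }
  }

  func cancel() {
    // A real handler would cancel any in-flight AI request here.
  }
}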

When someone uses Drift, they’re not “talking to AI” or “using the browser.” They’re expressing intent. The system figures out what’s needed—direct navigation, history search, AI query, or some combination. The user stays in one place, in control.
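
The routing that makes this possible is deliberately boring. A minimal stand-in for IntentRouter (again, a sketch rather than Drift's actual code) is just a verb-keyed lookup; nothing in it knows which handlers happen to call an AI:

// Sketch: handlers register by verb; dispatch is the same call for every intent.
struct IntentRouter {
  private let handlers: [FluxIntent.Verb: IntentHandler]

  init(_ handlers: [IntentHandler]) {
    self.handlers = Dictionary(uniqueKeysWithValues: handlers.map { ($0.supportedVerb, $0) })
  }

  func dispatch(_ intent: FluxIntent, context: IntentExecutionContext) async {
    // No "is this an AI command?" branch. The handler decides what it needs.
    await handlers[intent.verb]?.execute(intent, context: context)
  }
}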

I’m aware of the limitations and tradeoffs; complex queries and latency perception are two big ones. Addressing them while fitting AI into the intent model and keeping Drift keyboard-native will take time, but writing the architecture down now gives me a rough idea of how it should evolve in 2.0.

The Shape of AI-Native Software

I don’t think sidebars are wrong. For some products, conversation is the right metaphor. Flakes might eventually have a sidebar for extended AI interactions—but it would open from Drift, display results Drift requested, close when done. The orchestration layer stays central.

The question I keep returning to: what should the control topology of AI-native software look like? The industry defaulted to peer surfaces—browser here, AI there, user bouncing between them. That’s one answer.

The Intent architecture is my attempt at a different answer. One control surface. One interaction model. AI as capability, not conversation partner. The user expresses intent, the system orchestrates whatever’s needed to fulfill it.

I believe AGI is still far off. But even today’s AI capabilities can enhance software if designed thoughtfully. I’m finding a way to do that in Drift, and I think the Intent architecture is key.