IMPULSE #5: Stop Treating Modes as Features

I recently revisited a piece on multimodal UX that describes how everyday experiences are drifting from “screen + tap” toward rich blends of voice, touch, vision, and motion: think smart homes, in‑car systems, and AR environments. The article defines multimodal UX not as “more ways to interact,” but as the design of a single, cohesive experience that feels intuitive no matter which mode the user leans on in that moment.

What struck me is how often we still design modes as separate features: “We added voice,” “we added gesture,” “we added chat.” The article’s examples (smart home voice plus wall panels, automotive dashboards mixing touch and voice) show that users don’t think that way. They just reach for whatever feels fastest, safest, or most natural. If the system treats each mode as a silo, the user ends up doing the coordination work: remembering which mode does what, where, and when.

For UX, the real design problem isn’t “how do we support more input types?” It’s “how do we choreograph them so users don’t have to think about the choreography at all?” That maps directly onto my thesis questions about conversational design tools. When a designer switches from typing “make this bigger” to dragging a handle, that’s also multimodal UX—just in a professional tool context. The challenge is to make that transition feel like one continuous action, not a context switch that breaks flow.

If multimodal UX in consumer products is about blending voice, touch, and vision into one mental model, then multimodal UX in design tools should be about blending conversation and direct manipulation into one sense of control.

Relevant link: https://www.ux-bulletin.com/multimodal-ux-design/
