ASL Flow
Built for Wispr Flow
Making voice dictation accessible through sign language, using only a webcam.
What Wispr does today
Wispr Flow's pipeline begins at the microphone. It has no input layer for users who communicate through sign, which means one of the most powerful productivity tools built in years is inaccessible to an entire population by default.
Today
Requires spoken audio
With ASL Flow
Works with sign language
What this demo adds
Wispr Flow turns speech into polished text across every app, but it requires a voice. For the 30+ million deaf and hard-of-hearing Americans, that's a wall. ASL Flow is a concept integration that removes it. A computer vision pipeline reads American Sign Language from a standard webcam, converts it to synthesized audio, and passes that audio into Wispr Flow. Every text field on your computer becomes accessible through sign language, with no specialized hardware.
This is a concept demo, not a finished product. The hand recognition model works with ASL fingerspelling only (individual static letters; J and Z require motion and are not supported), not full ASL vocabulary or phrases. Real ASL is a complete language; this prototype demonstrates the technical pipeline at the letter level. Accuracy varies with lighting and hand position. Hold each sign steady for ~1 second for best results.
Why it's better
Personal Calibration
Unlike fixed gesture libraries, ASL Flow learns your hands. Sign each letter once during setup and the model stores your personal landmark vectors on-device. Your calibration improves the more you use it, and it never leaves your machine.
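One way to picture personal calibration is nearest-template matching: store one normalized landmark vector per letter for the user, then label a new pose with the closest stored letter. The sketch below is an illustration under assumptions, not ASL Flow's actual model. The flat 2D landmark layout, the wrist-first ordering, the Calibration class, and the 0.5 distance threshold are all hypothetical, and it keeps a single sample per letter rather than improving with use.

```typescript
// A hand pose as a flat vector of 2D landmarks: [x0, y0, x1, y1, ...].
// The layout and every name in this block are illustrative assumptions.
type LandmarkVector = number[];

interface Template {
  letter: string;
  vector: LandmarkVector;
}

// Move the first landmark (assumed to be the wrist) to the origin and scale
// so the farthest landmark sits at distance 1. This makes matching tolerant
// of where the hand is in the frame and how large it appears.
function normalize(v: LandmarkVector): LandmarkVector {
  if (v.length < 2) return v.slice();
  const ox = v[0];
  const oy = v[1];
  const shifted = v.map((c, i) => c - (i % 2 === 0 ? ox : oy));
  let maxDist = 0;
  for (let i = 0; i + 1 < shifted.length; i += 2) {
    const d = Math.sqrt(shifted[i] * shifted[i] + shifted[i + 1] * shifted[i + 1]);
    if (d > maxDist) maxDist = d;
  }
  return maxDist === 0 ? shifted : shifted.map((c) => c / maxDist);
}

// Euclidean distance between two vectors of equal length.
function distance(a: LandmarkVector, b: LandmarkVector): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}

class Calibration {
  private templates: Template[] = [];

  // Store (or replace) the user's own sample for a letter, in memory only.
  record(letter: string, sample: LandmarkVector): void {
    this.templates = this.templates.filter((t) => t.letter !== letter);
    this.templates.push({ letter, vector: normalize(sample) });
  }

  // Return the closest calibrated letter, or null if nothing is close enough.
  classify(sample: LandmarkVector, maxDistance = 0.5): string | null {
    const probe = normalize(sample);
    let best: string | null = null;
    let bestDist = Infinity;
    for (const t of this.templates) {
      if (t.vector.length !== probe.length) continue;
      const d = distance(probe, t.vector);
      if (d < bestDist) {
        bestDist = d;
        best = t.letter;
      }
    }
    return bestDist <= maxDistance ? best : null;
  }
}
```

Because poses are normalized before comparison, a hand held closer to the camera or off to one side should still match the template recorded during setup, while a pose far from every template returns null instead of a wrong letter.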
Audio Bridge
The key architectural insight: Wispr Flow doesn't need to know sign language. ASL Flow converts sign to synthesized speech upstream, so Wispr receives a clean audio stream, identical to a human voice. Zero changes to Wispr's existing architecture required.
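The bridge idea can be expressed as a narrow contract: the downstream consumer is typed to accept audio and nothing else, so no sign-language data ever crosses the boundary. The following is a minimal sketch under assumptions. AudioChunk, Synthesizer, AudioConsumer, and createAudioBridge are hypothetical names, and the placeholder synthesizer emits a short tone per character so the example runs without a real text-to-speech engine; it does not produce intelligible speech and says nothing about how Wispr Flow actually ingests audio.

```typescript
// A chunk of mono PCM audio. The shape is an assumption for illustration.
interface AudioChunk {
  sampleRate: number;
  samples: Float32Array;
}

// What the downstream dictation tool sees: audio only, no sign data.
type AudioConsumer = (chunk: AudioChunk) => void;

// Any text-to-speech engine could sit behind this hypothetical type.
type Synthesizer = (text: string) => AudioChunk;

// Placeholder synthesizer: 50 ms of a quiet 440 Hz tone per character.
// It stands in for a real TTS engine and is NOT intelligible speech.
const placeholderSynth: Synthesizer = (text) => {
  const sampleRate = 16000;
  const samplesPerChar = Math.floor(sampleRate * 0.05);
  const samples = new Float32Array(text.length * samplesPerChar);
  for (let i = 0; i < samples.length; i++) {
    samples[i] = 0.1 * Math.sin((2 * Math.PI * 440 * i) / sampleRate);
  }
  return { sampleRate, samples };
};

// The bridge: recognized text goes in, audio comes out the other side.
function createAudioBridge(
  synth: Synthesizer,
  consumer: AudioConsumer
): (recognizedText: string) => void {
  return (recognizedText: string): void => {
    if (recognizedText.length === 0) return; // nothing signed, nothing sent
    consumer(synth(recognizedText));
  };
}
```

Swapping placeholderSynth for a real synthesizer would not change the consumer's side of the contract, which is the point of the design: the recognition stack can evolve without the dictation tool noticing.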
Intelligent Input Layer
Word prediction, custom shortcuts, and sentence-level reading make ASL Flow more than a translator: it's a full input system. Power users can define shorthand signs for frequently used phrases, reducing signing burden for repetitive communication by an estimated 40-60%.
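As a rough illustration of word prediction and shorthand expansion, the sketch below completes a fingerspelled prefix from a small word list and expands a finished token if it matches a user-defined shortcut. The vocabulary, the shortcut table, and the function names predict and expand are all hypothetical; a real system would likely rank candidates with a language model rather than a fixed list.

```typescript
// Hypothetical shorthand table: a short fingerspelled code maps to a phrase.
const shortcuts: Record<string, string> = {
  OMW: "On my way",
  TY: "Thank you",
};

// Tiny illustrative vocabulary, assumed to be ordered most-frequent first.
const vocabulary: string[] = ["THE", "THANK", "THANKS", "THAT", "HELLO", "HELP"];

// Suggest up to `limit` words that start with the letters signed so far.
function predict(prefix: string, limit = 3): string[] {
  if (prefix.length === 0) return [];
  const p = prefix.toUpperCase();
  return vocabulary
    .filter((w) => w.indexOf(p) === 0 && w !== p)
    .slice(0, limit);
}

// Expand a finished token if it is a known shortcut; otherwise keep it as is.
function expand(token: string): string {
  const key = token.toUpperCase();
  return Object.prototype.hasOwnProperty.call(shortcuts, key)
    ? shortcuts[key]
    : token;
}
```

With this toy table, signing T and Y stands in for the eight letters of "Thank you", which is the kind of saving shorthand signs are meant to provide.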
Why this is different
Most ASL tools stop at recognition. ASL Flow is a full input stack: every layer is designed to make sign language a first-class input method for any app on your computer.
Try it
Hold ASL letter signs in front of your camera. The model reads each letter, speaks it aloud, and builds text in the output field.
Tip: Hold each sign steady for ~1 second. Works best with good lighting and a plain background. Supports letters A–Z, excluding J and Z, which require motion.
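The hold-steady tip suggests a simple stability filter: accept a letter only after the per-frame prediction has stayed the same for about a second, and emit it once per hold. The sketch below is one plausible way to do that, not the demo's confirmed logic. The SteadyHoldFilter class and its 1000 ms default are assumptions. A static, per-frame scheme like this also shows why motion-based letters such as J and Z fall outside it.

```typescript
// Commits a letter only after the classifier has reported the same letter
// continuously for `holdMs` milliseconds. Names and defaults are illustrative.
class SteadyHoldFilter {
  private readonly holdMs: number;
  private candidate: string | null = null;
  private since = 0;
  private committed = false;

  constructor(holdMs = 1000) {
    this.holdMs = holdMs;
  }

  // Feed one per-frame prediction (null means no hand or no confident match).
  // Returns the letter at the moment it is committed, otherwise null.
  update(prediction: string | null, timestampMs: number): string | null {
    if (prediction !== this.candidate) {
      // The sign changed: restart the timer for the new candidate.
      this.candidate = prediction;
      this.since = timestampMs;
      this.committed = false;
      return null;
    }
    if (
      prediction !== null &&
      !this.committed &&
      timestampMs - this.since >= this.holdMs
    ) {
      this.committed = true; // emit once per hold, not once per frame
      return prediction;
    }
    return null;
  }
}
```

In this sketch, typing a double letter means releasing the sign briefly and holding it again, since a continuous hold commits only once.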
In production, the CV model would run entirely on-device via TensorFlow.js, so no video data would leave your machine. Wispr Flow would receive only the synthesized audio stream, identical to standard microphone input.