Wallspace Captions

Live transcription-to-visuals node for Daydream Scope — receives caption text from WallSpace, A.EYE.ECHO, OSC, or WebSocket sources, overlays styled text on video, and forwards transcription as prompts to drive real-time AI visual generation.

Created 03.16.26

scope-wallspace-captions

Co-authored by Jack Morgan and Ciborg / Matt

Features

  • 4 text input methods: Scope prompt field, manual text, OSC (UDP), WebSocket
  • Pre + Post pipelines: Pre bakes text into frames so AI stylises it; Post overlays clean captions after AI generation
  • Advanced caption placement: XY coordinate positioning (percentage-based), preset positions (top/center/bottom), text alignment
  • Full styling control: Font size, colour (RGB), opacity, text outline with colour/width, background box with colour/opacity/padding/corner radius
  • Caption Event System: Parses text into structured events (WORD, SENTENCE_START/END, QUESTION, EXCLAMATION, PAUSE, EMPHASIS, SPEAKER_CHANGE) that drive visual behaviours
  • Event-reactive effects: Per-word flash, punctuation colour reactions, pause fade, emphasis highlighting
  • Prompt forwarding: Transcription text forwarded as prompts with style prefix, template formatting, and rate limiting

Use Cases

  • Accessibility: Live captions for deaf/hard-of-hearing audiences at live events (A.EYE.ECHO integration)
  • VJ performance: Spoken word → AI-generated reactive visuals in real-time
  • Live events: Audience speech drives projected visuals
  • Art installations: Text-reactive generative art

Installation

From GitHub

```shell
uv pip install "scope-wallspace-captions @ git+https://github.com/jackmo650/scope-wallspace-captions"
```

Local development

```shell
git clone https://github.com/jackmo650/scope-wallspace-captions.git
cd scope-wallspace-captions
pip install -e .
```

Via Scope

Settings → Nodes → Browse → select the scope-wallspace-captions directory.

Pipelines

| Pipeline | ID | Usage | Description |
|---|---|---|---|
| WS Captions (Pre) | wallspace-captions-pre | Preprocessor | Text baked into frames before AI model — AI sees and stylises the text |
| WS Captions (Post) | wallspace-captions-post | Postprocessor | Clean text overlay after AI generation — readable captions on top |
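The difference between the two pipelines is only where the overlay sits relative to the model. A toy sketch of the ordering (the function names and string tagging here are illustrative stand-ins, not Scope APIs):

```python
# Toy stand-ins that tag a frame description so the ordering is visible.
def draw_caption(frame: str, text: str) -> str:
    return f"{frame}+caption({text})"

def run_ai_model(frame: str) -> str:
    return f"ai({frame})"

def process(frame: str, text: str) -> str:
    # Pre: the model sees (and restyles) the baked-in text.
    # Post: a clean, readable overlay goes on after generation.
    return draw_caption(run_ai_model(draw_caption(frame, text)), text)

print(process("frame0", "hi"))  # ai(frame0+caption(hi))+caption(hi)
```

Running both at once gives AI-stylised text in the generated image plus a legible caption on top.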

Text Input Methods

| Source | Setting | How it works |
|---|---|---|
| Scope Prompt | text_source=prompt | Text from Scope's built-in prompt field (default) |
| Manual | text_source=manual | Type into the transcription_text field |
| OSC | text_source=osc | UDP listener on configurable port (default 9000), address /caption/text |
| WebSocket | text_source=websocket | WebSocket server on configurable port (default 9100), accepts plain text or JSON |

OSC Format

Matches the WallSpace/A.EYE.ECHO OSC bridge:

  • /caption/text [string] — caption text
  • /caption/clear [] — clear buffer
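OSC messages are plain UDP datagrams, so a sender needs no special runtime. A minimal stdlib-only sketch of the two messages above (in practice you would usually use python-osc instead of hand-encoding; the port is the documented default):

```python
import socket

def osc_string(s: str) -> bytes:
    """OSC strings are null-terminated and padded to a 4-byte boundary."""
    b = s.encode("utf-8") + b"\x00"
    return b + b"\x00" * ((-len(b)) % 4)

def caption_packet(text: str) -> bytes:
    # /caption/text with one string argument (type tag ",s")
    return osc_string("/caption/text") + osc_string(",s") + osc_string(text)

def clear_packet() -> bytes:
    # /caption/clear with no arguments (empty type tag ",")
    return osc_string("/caption/clear") + osc_string(",")

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.sendto(caption_packet("Hello world"), ("127.0.0.1", 9000))  # default osc_port
```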

WebSocket Format

Accepts plain text strings or JSON:

{"text": "Hello world", "speaker": "Jack"}
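Since both shapes are accepted, a receiver has to normalise them. A stdlib sketch of that normalisation, using the field names from the JSON example above (the function itself is illustrative, not the node's code):

```python
import json

def parse_caption(message: str) -> dict:
    """Accept either a bare string or the JSON object form."""
    try:
        data = json.loads(message)
    except json.JSONDecodeError:
        return {"text": message, "speaker": None}  # plain text, no speaker tag
    if isinstance(data, dict) and "text" in data:
        return {"text": data["text"], "speaker": data.get("speaker")}
    # JSON scalars (e.g. a quoted string) are treated as plain text
    return {"text": str(data), "speaker": None}
```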

Caption Event System

Raw transcription text is parsed into structured events:

| Event | Trigger | Visual Effect |
|---|---|---|
| SENTENCE_START | New sentence begins | Flash/pulse |
| SENTENCE_END | Period/question/exclamation at end | Fade transition |
| WORD | Each word extracted | Per-word animation (flash) |
| QUESTION | ? detected | Blue colour shift |
| EXCLAMATION | ! detected | Red intensity spike |
| PAUSE | Gap > threshold | Opacity fade |
| EMPHASIS | ALL CAPS word (3+ chars) | Yellow highlight |
| SPEAKER_CHANGE | Different speaker tag | Style switch |

Events are also forwarded in the output dict as {"events": [...]} for downstream nodes.
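The node's real parser also tracks timing (for PAUSE) and sentence state (for SENTENCE_START); a simplified stdlib sketch of the punctuation-driven events, with an illustrative signature:

```python
import re

def parse_events(text: str, prev_speaker=None, speaker=None):
    """Emit (event, payload) tuples for one chunk of transcription text."""
    events = []
    if speaker is not None and speaker != prev_speaker:
        events.append(("SPEAKER_CHANGE", speaker))
    for word in re.findall(r"[\w']+[?!.]?", text):
        bare = word.rstrip("?!.")
        events.append(("WORD", bare))
        if len(bare) >= 3 and bare.isupper():      # ALL CAPS, 3+ chars
            events.append(("EMPHASIS", bare))
        if word.endswith("?"):
            events.append(("QUESTION", word))
        elif word.endswith("!"):
            events.append(("EXCLAMATION", word))
        if word[-1] in "?!.":
            events.append(("SENTENCE_END", word))
    return events
```

For example, `parse_events("WAIT what?")` yields WORD and EMPHASIS events for "WAIT", then WORD, QUESTION, and SENTENCE_END events for "what?".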

Parameters

Load-time (require pipeline reload)

| Parameter | Default | Description |
|---|---|---|
| text_source | prompt | Input method: prompt, manual, osc, websocket |
| osc_port | 9000 | OSC UDP port |
| osc_address | /caption/text | OSC address |
| ws_port | 9100 | WebSocket TCP port |
| font_path | (empty) | Path to .ttf/.otf font |

Runtime — Caption Placement

| Parameter | Default | Description |
|---|---|---|
| overlay_enabled | true | Toggle overlay |
| position_preset | bottom | bottom, top, center, custom |
| pos_x | 50 | X position (% of width) |
| pos_y | 90 | Y position (% of height) |
| text_align | center | left, center, right |
| max_width | 90 | Max text width (%) |
| max_lines | 3 | Visible lines |
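One plausible mapping from these percentage coordinates and alignment to a pixel anchor (an assumption for illustration, not the node's actual code):

```python
def anchor_position(frame_w, frame_h, text_w,
                    pos_x=50, pos_y=90, text_align="center"):
    """Map percentage-based pos_x/pos_y to a pixel anchor for the text block."""
    x = frame_w * pos_x / 100
    y = frame_h * pos_y / 100
    if text_align == "center":
        x -= text_w / 2   # centre the block on the anchor
    elif text_align == "right":
        x -= text_w       # right edge at the anchor
    return int(x), int(y)

# Defaults (pos_x=50, pos_y=90) put a 400px-wide caption low-centre on 1080p:
print(anchor_position(1920, 1080, 400))  # (760, 972)
```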

Runtime — Text Style

| Parameter | Default | Description |
|---|---|---|
| font_size | 48 | Font size (px) |
| text_color_r/g/b | 255/255/255 | Text RGB |
| text_opacity | 100 | Text opacity (%) |
| outline_enabled | true | Text outline |
| outline_width | 2 | Outline thickness (px) |
| outline_color_r/g/b | 0/0/0 | Outline RGB |
| bg_enabled | true | Background box |
| bg_color_r/g/b | 0/0/0 | Background RGB |
| bg_opacity | 50 | Background opacity (%) |
| bg_padding | 12 | Box padding (px) |
| bg_corner_radius | 8 | Corner radius (px) |

Runtime — Prompt Forwarding

| Parameter | Default | Description |
|---|---|---|
| prompt_enabled | true | Forward text as prompt |
| style_prefix | (empty) | Style prefix (e.g. "cinematic neon") |
| prompt_template | {style} {text} | Prompt format template |
| prompt_weight | 1.0 | Prompt weight |
| update_interval | 2.0 | Rate limit (seconds) |
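A sketch of how template formatting and rate limiting could combine, using the defaults from the table (the class and its API are illustrative, not the node's code):

```python
import time

class PromptForwarder:
    """Format transcription into prompts, at most one per update_interval."""
    def __init__(self, style_prefix="", template="{style} {text}", interval=2.0):
        self.style_prefix = style_prefix
        self.template = template
        self.interval = interval
        self._last_sent = float("-inf")  # so the first prompt always goes out

    def forward(self, text: str):
        """Return a formatted prompt, or None while rate-limited."""
        now = time.monotonic()
        if now - self._last_sent < self.interval:
            return None
        self._last_sent = now
        return self.template.format(style=self.style_prefix, text=text).strip()

fwd = PromptForwarder(style_prefix="cinematic neon")
print(fwd.forward("a storm rolls in"))  # cinematic neon a storm rolls in
print(fwd.forward("too soon"))          # None (inside the 2 s window)
```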

Runtime — Events

| Parameter | Default | Description |
|---|---|---|
| events_enabled | true | Enable event parsing |
| pause_threshold | 2.0 | Silence duration for PAUSE (sec) |
| event_intensity | 0.5 | Effect intensity multiplier |
| word_flash_enabled | false | Flash newest word |
| punctuation_react | true | React to ? ! . |
| event_color_shift | false | Colour shift by event type |

Requirements

  • Python 3.12+
  • PyTorch (ships with Scope)
  • python-osc
  • websockets
  • Pillow

License

MIT

Development Workflow

This project follows a human-in-the-loop development process:

  1. All requests start as GitHub Issues — bugs, features, tasks, and experiments are logged using the provided issue templates.
  2. Issues are reviewed and triaged — the maintainer reviews each issue, adjusts scope, and assigns priority.
  3. Only approved issues move forward — no implementation begins until an issue is explicitly labeled approved.
  4. Implementation happens on explicit instruction — coding agents and contributors only work on approved, assigned work.
  5. Pull requests reference an approved issue — every PR must link back to the issue it addresses.