Moving Beyond Dry Audio to Spatially Intelligent Soundscapes
My primary objective for this update was to bridge a critical perceptual gap in the system: while the previous iterations successfully mapped visual information to sonic elements with precise panning and temporal placement, the resulting audio mix remained perceptually “dry” and disconnected from the image’s implied acoustic environment. This update introduces adaptive reverberation, not as a cosmetic effect, but as a semantically grounded spatialization layer that transforms discrete sound objects into a coherent, immersive acoustic scene.
System Architecture
The existing interactive DAW interface, with its per-track volume controls, sound replacement engine, and user feedback mechanisms, was extended with a comprehensive spatial audio processing module. This module interprets the reverb parameters derived from image analysis (room detection, size estimation, material damping, and spatial width) and provides interactive control over their application.
Global Parameter State & Data Flow Integration
A crucial architectural challenge was maintaining separation between the raw audio mix (user-adjustable volume levels) and the reverb-processed version. I implemented a dual-state system with:
- `current_mix_raw`: The continuously updated sum of all audio tracks with current volume slider adjustments.
- `current_mix_with_reverb`: A cached, processed version with reverberation applied, recalculated only when reverb parameters change or volume sliders are adjusted with reverb enabled.

This separation preserves processing efficiency while maintaining real-time responsiveness. The system automatically pulls reverb parameters (`room_size`, `damping`, `wet_level`, `width`) from the image analysis block when available, providing image-informed defaults while allowing full manual override.
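A minimal sketch of this dual-state bookkeeping follows. The two mix names, the four parameter names, and the `reverb_enabled` flag come from the system itself; the default values and the handler name are illustrative, and `apply_reverb()` is sketched in the next section:

```python
from typing import Optional
from pydub import AudioSegment

# Dual-state globals: the dry mix tracks every slider move; the wet mix is a cache.
current_mix_raw: Optional[AudioSegment] = None
current_mix_with_reverb: Optional[AudioSegment] = None
reverb_enabled = False

# Image-informed defaults (placeholder values); the UI sliders can override these.
reverb_params = {"room_size": 0.6, "damping": 0.4, "wet_level": 0.33, "width": 1.0}

def on_volume_change(new_raw_mix: AudioSegment) -> None:
    """Volume changes always update the raw mix; the expensive reverb pass
    runs only when reverb is currently enabled."""
    global current_mix_raw, current_mix_with_reverb
    current_mix_raw = new_raw_mix
    if reverb_enabled:
        current_mix_with_reverb = apply_reverb(current_mix_raw, **reverb_params)
```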
Pedalboard-Based Reverb Engine
I integrated the pedalboard audio processing library to implement professional-grade reverberation. The engine operates through a transparent conversion chain:
- Format Conversion: `AudioSegment` objects (from pydub) are converted to NumPy arrays normalized to the [-1, 1] range
- Pedalboard Processing: A `Reverb` effect instance applies parameters with real-time adjustable controls
- Format Restoration: Processed audio is converted back to `AudioSegment` while preserving sample rate and channel configuration
The implementation supports both mono and stereo processing chains, maintaining compatibility with the existing panning system.
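A minimal sketch of that chain, assuming 16-bit samples (pydub's default). The `apply_reverb()` name is mine, and deriving `dry_level` from `wet_level` is a design choice rather than something pedalboard dictates:

```python
import numpy as np
from pedalboard import Reverb
from pydub import AudioSegment

def apply_reverb(segment: AudioSegment, room_size: float, damping: float,
                 wet_level: float, width: float) -> AudioSegment:
    """Dry AudioSegment in, reverberated AudioSegment out (16-bit assumed)."""
    # Format conversion: interleaved int16 samples -> float32 array of
    # shape (num_samples, num_channels), normalized to [-1, 1].
    samples = np.array(segment.get_array_of_samples(), dtype=np.float32)
    samples = samples.reshape((-1, segment.channels)) / 32768.0

    # Pedalboard processing: one Reverb instance per call; deriving
    # dry_level from wet_level is a design choice, not a pedalboard rule.
    effect = Reverb(room_size=room_size, damping=damping,
                    wet_level=wet_level, dry_level=1.0 - wet_level, width=width)
    processed = effect(samples, segment.frame_rate)

    # Format restoration: rescale, clip to the int16 range, re-interleave,
    # and preserve the original sample rate and channel count.
    ints = np.clip(processed * 32768.0, -32768, 32767).astype(np.int16)
    return AudioSegment(ints.tobytes(), frame_rate=segment.frame_rate,
                        sample_width=2, channels=segment.channels)
```

Because the same function handles mono and stereo by reshaping on the segment's channel count, it slots in after the existing panning stage without special cases.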
Interactive Reverb Control Interface
A dedicated control panel was added to the DAW interface, featuring:
- Parameter Sliders: Four continuous controls for room size, damping, wet/dry mix, and stereo width, pre-populated with image-derived values when available
- Toggle System: Three distinct interaction modes (see the handler sketch after this list):
- “🔄 Apply Reverb”: Manual application with current settings
- “🔇 Remove Reverb”: Return to dry mix
- “Reverb ON/OFF Toggle”: Single-click switching between states
- Contextual Feedback: Display of image-based room detection status (indoor/outdoor)
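The three toggle modes reduce to small handlers over the shared dual state; a sketch, with illustrative handler names:

```python
def on_apply_reverb():
    """🔄 Apply Reverb: (re)process the current raw mix with the slider values."""
    global current_mix_with_reverb, reverb_enabled
    current_mix_with_reverb = apply_reverb(current_mix_raw, **reverb_params)
    reverb_enabled = True

def on_remove_reverb():
    """🔇 Remove Reverb: fall back to the dry mix; the wet cache is kept."""
    global reverb_enabled
    reverb_enabled = False

def on_toggle_reverb():
    """ON/OFF toggle: single-click switching, reusing the cache when possible."""
    global reverb_enabled
    reverb_enabled = not reverb_enabled
    if reverb_enabled and current_mix_with_reverb is None:
        on_apply_reverb()
```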
Seamless Playback Integration
The playback system was redesigned to dynamically switch between dry and wet mixes:
- Intelligent Routing: The `play_mix()` function automatically selects `current_mix_with_reverb` or `current_mix_raw` based on the `reverb_enabled` flag (see the sketch below)
- State-Aware Processing: When volume sliders are adjusted with reverb enabled, the system automatically reapplies reverberation to the updated mix, maintaining perceptual consistency
- Export Differentiation: Final mixes are exported with `_with_reverb` or `_raw` suffixes, providing clear version control
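A sketch of the routing and export logic; `pydub.playback` stands in for whatever playback backend the interface actually uses, and `export_mix()` is an illustrative name:

```python
from pydub.playback import play

def play_mix():
    """Route playback through the wet or dry mix based on the toggle state."""
    play(current_mix_with_reverb if reverb_enabled else current_mix_raw)

def export_mix(basename: str):
    """Write the final mix with a suffix recording which version was rendered."""
    suffix = "_with_reverb" if reverb_enabled else "_raw"
    mix = current_mix_with_reverb if reverb_enabled else current_mix_raw
    mix.export(f"{basename}{suffix}.wav", format="wav")
```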
Design Philosophy: Transparency Over Automation
This phase reinforced a critical design principle: spatial effects should enhance rather than obscure the user’s creative decisions. Several automation approaches were considered and rejected:
- Automatic Reverb Application: While the system could automatically apply image-derived reverb, I preserved manual activation to maintain user agency
- Dynamic Parameter Adjustment: Real-time modification of reverb parameters during playback was technically feasible but introduced perceptual confusion
- Per-Track Reverb: Individual reverberation for each sound object would create acoustic chaos rather than coherent space
I implemented reverb as a master bus effect, applied consistently to the entire mix after individual track processing. This approach creates a unified acoustic space that respects the visual scene’s implied environment while preserving the clarity of individual sound elements.
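Concretely, the master-bus ordering means gain and pan are applied per track, the tracks are summed once, and reverb runs over that sum; a sketch, assuming temporally pre-placed tracks and illustrative container names:

```python
def render_master_bus(tracks, gains_db, pans):
    """Per-track processing first, then a single reverb pass over the summed mix."""
    processed = [t.apply_gain(g).pan(p) for t, g, p in zip(tracks, gains_db, pans)]
    mix = processed[0]
    for track in processed[1:]:
        mix = mix.overlay(track)  # sum onto the master bus
    return mix, apply_reverb(mix, **reverb_params)  # (raw, wet) pair
```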
Technical Challenges & Solutions
State Synchronization
The most significant challenge was maintaining synchronization between the constantly updating volume-adjusted mix and the computationally expensive reverb processing. The solution was a conditional caching system: reverb is only recalculated when parameters change or when volume adjustments occur with reverb active.
Format Compatibility
Bridging the pydub-based mixing system with pedalboard’s NumPy-based processing required careful attention to sample format conversion, channel configuration, and normalization. The implementation maintains bit-perfect round-trip conversion.
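The bit-perfect claim is straightforward to verify: every int16 value is exactly representable in float32, and scaling by a power of two is exact, so a no-op pass through the conversion returns the original bytes. A sketch of such a check (the function name is mine):

```python
import numpy as np
from pydub import AudioSegment

def check_round_trip(segment: AudioSegment) -> bool:
    """No-op conversion: int16 -> float32 in [-1, 1] -> int16, byte-identical."""
    samples = np.array(segment.get_array_of_samples(), dtype=np.float32) / 32768.0
    restored = (samples * 32768.0).astype(np.int16)  # exact for all int16 values
    return restored.tobytes() == segment.raw_data
```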