by Alina Volkova - 10. March 2025

Explore III: Embodied Resonance – Refining the Project Vision

Primary Intention:

The project’s core goal is to create an embodied, immersive experience where the performer’s movements and physiological signals interact with dynamic soundscapes, reflecting states of stress, panic, and resolution. This endeavor seeks to explore the intersection of the body, trauma, and sound as a medium of expression and understanding.

Tasks Fulfilled by the Project:

Expressive Performance: Convey the visceral experience of stress and trauma through movement and sound.
Interactive Soundscapes: Use real-time biofeedback to dynamically alter sound parameters, enhancing the audience’s sensory engagement.
Therapeutic Exploration: Demonstrate the potential of somatic expression and sound for trauma exploration and healing.

Main Goals:

Develop a cohesive interaction between biofeedback, sound design, and movement.
Design an immersive auditory space using ambisonics.
Create an emotionally impactful narrative through choreography and sound dynamics.

Steps for Project Implementation

Identifying Subtasks:

Movement and Choreography Exploration:
- Research and refine body movements that mirror states of stress and release.
- Develop movement scores aligned with sound triggers.
Biofeedback and Technology Integration:
- Select and test wearable sensors for movement and physiological signals (e.g., heart rate monitors, EMG sensors).
- Map sensor data to sound parameters using tools like Max/MSP or Pure Data.
Sound Design and Ambisonics:
- Create a palette of sound textures representing emotional states.
- Test and refine 3D spatial audio setups.
Rehearsal and Iteration:
- Practice interaction between movement and sound.
- Adjust mappings and refine performance flow.
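
As a rough illustration of the sensor-to-sound mapping step, the sketch below smooths a heart-rate reading and scales it to two sound parameters. The 60 to 160 bpm working range, the parameter names `cutoff_hz` and `gain`, and the smoothing factor are all assumptions chosen for the example. In practice the mapping would live in Max/MSP or Pure Data and be calibrated per performer.

```python
def scale(value, in_lo, in_hi, out_lo, out_hi):
    """Linearly map value from [in_lo, in_hi] to [out_lo, out_hi], clamped."""
    if in_hi == in_lo:
        raise ValueError("input range must not be empty")
    t = (value - in_lo) / (in_hi - in_lo)
    t = max(0.0, min(1.0, t))
    return out_lo + t * (out_hi - out_lo)


class Smoother:
    """One-pole low-pass filter to tame jittery sensor readings."""

    def __init__(self, alpha=0.2):
        self.alpha = alpha  # 0..1, higher values follow the input faster
        self.state = None

    def update(self, sample):
        if self.state is None:
            self.state = float(sample)
        else:
            self.state += self.alpha * (sample - self.state)
        return self.state


def map_heart_rate(bpm, smoother):
    """Turn a heart-rate reading into illustrative sound parameters."""
    smooth_bpm = smoother.update(bpm)
    return {
        # Assumed working range of 60-160 bpm; calibrate per performer.
        "cutoff_hz": scale(smooth_bpm, 60, 160, 200.0, 4000.0),
        "gain": scale(smooth_bpm, 60, 160, 0.2, 1.0),
    }
```

Calling `map_heart_rate` once per incoming reading with the same `Smoother` instance keeps sudden sensor spikes from producing abrupt jumps in the sound.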

Determining the Sequence:

Begin with movement research and initial choreography.
Set up and test biofeedback systems.
Integrate sound design with real-time data mappings.
Conduct iterative rehearsals and refine dynamics.

Description of Subtasks

Required Information and Conditions:

Knowledge of movement techniques representing trauma.
Understanding biofeedback sensors and data processing.
Familiarity with ambisonic sound design principles.

Methods:

Employ somatic techniques and physical theater practices for movement.
Use biofeedback-driven sound generation software for real-time interaction.
Apply iterative testing and rehearsal methods for refinement.

Existing Knowledge and Skills:

Dance and performance experience.
Basic knowledge of sensor technologies and sound design tools.
Understanding of trauma’s physical manifestations through literature.

Additional Resources:

Sensors and biofeedback devices.
Ambisonic Toolkit and spatial audio software.
Research materials on trauma and biofeedback in art.

Timeline Overview

Current Semester – “Explore” Phase:

Research movement responses to stress and trauma.
Test sensors and sound mapping tools.
Document all findings to create the exposé and prepare for the oral presentation.

Second Semester – “Experiment” Phase:

Prototype interactions between movement, biofeedback, and sound.
Evaluate the feasibility and emotional resonance of the prototypes.
Incorporate feedback and iterate designs.

Third Semester – “Product” Phase:

Combine prototypes into a cohesive performance.
Optimize the interplay between sound and movement.
Conclude with final documentation and a presentation of the complete performance.

Questions for Exploration

What additional biofeedback sensors and sound techniques can enhance the performance?
How can movement scores effectively translate the emotional states into physical expressions?
What feedback mechanisms will refine the audience’s immersive experience?

by Alina Volkova - 10. March 2025

Explore II: Embodied Resonance – First draft

A live performance where the body’s movement and physiological responses interact with real-time, 3D soundscapes, creating an auditory and sensory experience that embodies the physical and emotional states associated with trauma, stress, or panic.

Core Elements

Live Movement and Performance:
- Physical Expression: Expressive body movements are used to convey states of stress, panic, and tension. Movements could be choreographed or improvised, incorporating controlled gestures, sudden shifts, and spasmodic motions that mirror the body’s natural reactions to trauma.
- Sensor Integration: The performer will be equipped with wearable sensors (e.g., accelerometers, heart rate monitors, muscle tension sensors) to capture real-time data that triggers sound changes.
Sound Design and Biofeedback:
- Real-time Data to Sound Mapping: The data from the sensors can be mapped to sound parameters such as volume, pitch, and spatial positioning.
- Spatial Audio (Ambisonics): Build a 3D sound environment in which the sound moves with the performer, simulating the feeling of being surrounded by or caught in an experience of panic.
- Sound Layers and Textures: Layer sounds that range from chaotic, dissonant clusters to more open, calming tones, symbolizing shifts between heightened panic and brief moments of relief.
Interactive Performance Dynamics:
- Feedback Loops: The performer’s movements could influence sound parameters, and changes in sound could, in turn, affect how the performer responds (e.g., sudden loud or abrupt sounds causing physical shifts).
- Immersive Auditory Space: Spatial audio setup will immerse the audience, making them feel as though they are within the performance’s sonic realm or inside the performer’s body.
Choreography and Movement Techniques:
- Imitating Panic and Stress:
  - Breath Control: Rapid, shallow breathing or uneven breathing patterns to simulate panic.
  - Body Tension and Release: Show how different areas of the body can tense up and release in response to imagined threats.
  - Sudden, Erratic Movements: Imitate fight-or-flight reactions through jerky, uncoordinated gestures.
- Movement Scores: Create a set of movement phrases that can be triggered by specific sound cues, with each phrase representing a different level of intensity or emotional state.

Implementation Steps:

Initial Research and Movement Exploration:
- Spend time exploring how the body naturally responds to stress through dance or physical theatre techniques.
- Record and analyze your body’s response to various stimuli to understand how to replicate these in a performance context.
Tech Setup and Testing:
- Choose sensors capable of tracking movement and vital signs, such as wearable accelerometers and heart rate monitors.
- Connect the sensors to real-time audio processing software (e.g., Max/MSP, Pure Data) to create dynamic sound generation based on data input.
- Experiment with one biofeedback sensor (e.g., heartbeat or EMG) and connect it to sound manipulation software.
- Test simple ambisonic setups to understand spatial audio placement.
Sound Design:
- Use ambisonics to experiment with how sounds can be positioned and moved in 3D space.
- Create a palette of sound elements that represent different stress levels, such as soft background noise, mechanical sounds, distorted human voices, and deep bass thuds.
Rehearsals and Iteration:
- Conduct rehearsals where you practice the movement and sound interaction, making adjustments to the data-to-sound mappings to achieve the desired response.
- Test with different inputs to refine the sonic representation of the body’s signals.
- Refine the performance flow by timing the intensity of movements and sound shifts to ensure coherence and emotional impact.

Resources

Body and Trauma

The Body Keeps the Score by Bessel van der Kolk
Waking the Tiger: Healing Trauma by Peter Levine

Sound Design and Technology

Sound Design: The Expressive Power of Music, Voice and Sound Effects in Cinema by David Sonnenschein
Immersive Sound: The Art and Science of Binaural and Multi-Channel Audio edited by Agnieszka Roginska and Paul Geluso

Tools and Tutorials

Ambisonic Toolkit (ATK)
Cycling ’74 Max/MSP Tutorials

Artistic and Conceptual References

Janet Cardiff – Known for immersive sound installations, especially her 40-Part Motet.
Meredith Monk – Combines movement and sound to explore human experience.
Christine Sun Kim – Explores sound and silence through the lens of the body and perception.

Academic Research in Sound and Perception

Music, Cognition, and Computerized Sound: An Introduction to Psychoacoustics by Perry Cook

by Alina Volkova - 10. March 2025

Explore I: Body and Sound – Looking for the Idea

My Background and Interests

My journey into sound and technology started with my experiments in movement-based sound design. One of my first projects used ultrasonic sensors and Arduino technology to transform body movement into music. I was fascinated by the idea of turning motion into sound, mapping gestures into an interactive sonic experience. This led me to explore other ways of integrating physical action with sound manipulation, such as using MIDI controllers and custom-built sensors.

I see sound as more than just music—it’s a form of expression, communication, and interaction. My interest in sound design is rooted in its ability to create immersive experiences, whether through spatial sound, interactivity, or emotional storytelling. I love experimenting with unconventional ways of generating and manipulating sound, pushing beyond traditional composition to explore new territories.

Right now, I’m particularly interested in how sound connects to the body. How can movement or internal processes be used as an instrument? How do physical states influence the way we experience sound? These are the questions that drive my current explorations.

Idea Draft for a Future Project

At first, I was focused on transforming movement into sound. My early idea was to explore sensors that could read touch, direction, and motion, allowing me to control different sound layers by moving my body. I imagined a 3D sound composition where gestures could manipulate textures, rhythms, and effects in real time. I even considered integrating voice elements, so that I could shape effects with both movement and singing.

Over time, my focus shifted. Instead of external movement, I started thinking about internal body processes—breath, heartbeat, muscle tension. What if sound could react to what happens inside the body rather than just external gestures? This led to the idea of biofeedback-driven sound, where physiological data becomes a source of real-time sonic transformation.

The concept is still in development, but the main idea remains the same: exploring the relationship between the body and sound in a way that is immersive, interactive, and emotionally driven. Whether through movement or internal signals, I want to create a performance where sound is a direct extension of the body’s state, turning invisible experiences into something that can be heard and felt.

Moving Forward

This project is still evolving. It might become a performance, an installation, or something entirely different. Right now, I’m in the phase of exploring what’s possible. Sound and the body are deeply connected, and I want to keep pushing that connection in new and unexpected ways.

by David Adlberger - 10. March 2025

Prototyping I: Image Extender – Image sonification tool for immersive perception of sounds from images and new creation possibilities

Shift in the project’s intention due to the time plan:

To keep the project feasible, I am narrowing the topic: its main purpose is now the artistic approach. The tool will still combine direct image-to-audio translation with a more abstract translation via sonification. The main use cases are generating unique audio samples for creative applications, such as sound design for interactive installations, brand audio identities, or soundscapes that match an image. It should also work as a versatile instrument for experimental media artists and as a display tool for image information.

Further research into ways of sonifying image data, together with the development of the sonification language itself, should make the translation and display purpose clearer over the coming weeks.

Testing of Google Gemini API for AI Object and Image Recognition:

The first tests of the Google Gemini API went well. There are dedicated models for object recognition and for image recognition, which can be combined to analyze pictures in terms of objects and, in part, scenery. These models (SSD, EfficientNet,…) produce similar but not always identical results. One option is to let the user select the model, so that if one fails a different model can be tried and may give better results. Scenery recognition itself tends to be a problem; trying out different APIs may help.

The AI model returns a tag for each recognized object or piece of image content, together with a probability expressed as a percentage.

The next step toward translating these tags directly into realistic sound representations is to test whether the freesound.org API can be used to search automatically for the recognized object tags and load matching audio files. These search calls also need to filter by the license (copyright) type of the sounds, and a selection rule or algorithm needs to be created.
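
One possible shape for such a selection rule is sketched below. It keeps only tags above a confidence threshold, drops candidate sounds with an unwanted license, and then picks one at random. The record fields (`tag`, `score`, `name`, `license`), the license labels, and the 0.5 threshold are hypothetical placeholders for illustration. They do not reflect the actual response format of the recognition model or of the freesound.org API.

```python
import random

# Assumption: only these licence labels are acceptable for the project.
ALLOWED_LICENSES = {"Creative Commons 0", "Attribution"}


def confident_tags(detections, min_score=0.5):
    """Keep tags whose probability reaches min_score, best first."""
    kept = [d for d in detections if d["score"] >= min_score]
    kept.sort(key=lambda d: d["score"], reverse=True)
    return [d["tag"] for d in kept]


def choose_sound(candidates, rng=random):
    """Pick one licence-compatible candidate at random, or None."""
    usable = [c for c in candidates if c["license"] in ALLOWED_LICENSES]
    return rng.choice(usable) if usable else None


# Hypothetical example data standing in for real API responses.
detections = [
    {"tag": "car", "score": 0.91},
    {"tag": "dog", "score": 0.32},
    {"tag": "tree", "score": 0.64},
]
candidates = [
    {"name": "engine_idle.wav", "license": "Attribution NonCommercial"},
    {"name": "car_pass_by.wav", "license": "Creative Commons 0"},
]

print(confident_tags(detections))
print(choose_sound(candidates))
```

Passing a seeded `random.Random` instance as `rng` would make the choice reproducible, which could help when comparing results across test runs.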

object recognition: efficient float 16 model (Photo by Jason Oh on unsplash)

Research on sonification of images / video material and different approaches:

The world of image sonification is rich with diverse techniques, each offering unique ways to transform visual data into auditory experiences. One of the most straightforward methods is raster scanning, introduced by Yeo and Berger. This technique maps the brightness values of grayscale image pixels directly to audio samples, creating a one-to-one correspondence between visual and auditory data. By scanning an image line by line, from top to bottom, the system generates a sound that reflects the texture and patterns of the image. For example, a smooth gradient might produce a steady tone, while a highly textured image could result in a more complex, evolving soundscape. The process is fully reversible, allowing for both image sonification and sound visualization, making it a versatile tool for artists and researchers alike. This method is particularly effective for sonifying image textures and exploring the auditory representation of visual filters, such as “patchwork” or “grain” effects (Yeo and Berger, 2006).

Principle raster scanning (Yeo and Berger, 2006)
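
The raster scanning idea can be sketched in a few lines of Python. This is a simplified reading of the principle described above, not Yeo and Berger’s implementation. The image is assumed to be a list of rows with brightness values from 0 to 255, and the function names and the 44.1 kHz sample rate are choices made for this example.

```python
import struct
import wave


def image_to_samples(image):
    """Scan rows top to bottom and map brightness 0..255 to -1.0..1.0."""
    return [pixel / 127.5 - 1.0 for row in image for pixel in row]


def samples_to_image(samples, width):
    """Inverse mapping: rebuild rows of brightness values from samples."""
    pixels = [round((s + 1.0) * 127.5) for s in samples]
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


def write_wav(path, samples, sample_rate=44100):
    """Write mono 16-bit PCM audio using only the standard library."""
    frames = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(frames)


# A tiny 2x3 test image; a real image would supply many more pixels.
image = [[0, 128, 255], [64, 32, 16]]
samples = image_to_samples(image)
assert samples_to_image(samples, width=3) == image  # the mapping is reversible
```

The round trip works because each pixel corresponds to exactly one sample. Only the image width has to be stored separately to rebuild the rows.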

In contrast, Audible Panorama (Huang et al. 2019) automates sound mapping for 360° panorama images used in virtual reality (VR). It detects objects using computer vision, estimates their depth, and assigns spatialized audio from a database. For example, a car might trigger engine sounds, while a person generates footsteps, creating an immersive auditory experience that enhances VR realism. A user study confirmed that spatial audio significantly improves the sense of presence. It contains an interesting concept: choosing a random audio file from a sound library to avoid producing similar or identical results. It also discusses post-processing the audio, which would be relevant for the Image Extender project as well.

principle audible panorama (Huang et al. 2019)

Another approach, HindSight (Schoop, Smith, and Hartmann 2018), focuses on real-time object detection and sonification in 360° video. Using a head-mounted camera and neural networks, it detects objects like cars and pedestrians, then sonifies their position and danger level through bone conduction headphones. Beeps increase in tempo and pan to indicate proximity and direction, providing real-time safety alerts for cyclists.
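
To make the proximity and direction mapping more concrete, the sketch below converts a distance into a pause between beeps and an angle into left and right gains using equal-power panning. It only illustrates the general idea. The distance range, the interval limits, and the restriction to angles between -90° and +90° are assumptions for this example and are not taken from the HindSight paper.

```python
import math


def beep_interval(distance_m, near_m=2.0, far_m=30.0,
                  fastest_s=0.1, slowest_s=1.0):
    """Closer objects give shorter pauses between beeps."""
    t = (distance_m - near_m) / (far_m - near_m)
    t = max(0.0, min(1.0, t))  # clamp to the assumed distance range
    return fastest_s + t * (slowest_s - fastest_s)


def stereo_gains(azimuth_deg):
    """Equal-power pan: -90 is hard left, 0 centre, +90 hard right."""
    clamped = max(-90.0, min(90.0, azimuth_deg))
    angle = (clamped + 90.0) / 180.0 * (math.pi / 2)
    return math.cos(angle), math.sin(angle)  # (left gain, right gain)


# Hypothetical detections as (distance in metres, azimuth in degrees).
for distance, azimuth in [(25.0, -60.0), (10.0, 0.0), (3.0, 45.0)]:
    left, right = stereo_gains(azimuth)
    print(distance, round(beep_interval(distance), 2),
          round(left, 2), round(right, 2))
```

Equal-power panning keeps the perceived loudness roughly constant as a sound moves between the two channels, which is why it is used here instead of a plain linear crossfade.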

Finally, Sonic Panoramas (Kabisch, Kuester, and Penny 2005) takes an interactive approach, allowing users to navigate landscape images while generating sound based on their position. Edge detection extracts features like mountains or forests, mapping them to dynamic soundscapes. For instance, a mountain ridge might produce a resonant tone, while a forest creates layered, chaotic sounds, blending visual and auditory art. It also mentions different approaches to sonification itself, for example mapping at the micro level (timbre, pitch, and melody) and the macro level (rhythm and form).

principle sonic panoramas (Kabisch, Kuester, and Penny 2005)
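
A very reduced version of an edge-based, micro-level mapping could look like the sketch below. It measures how strongly brightness changes down each image column and maps that value to a pitch. This is an illustration of the general idea only and not the algorithm used in Sonic Panoramas. The simple difference-based edge measure and the 110 to 880 Hz range are assumptions.

```python
def column_edge_strength(image):
    """Sum absolute brightness jumps between vertically adjacent pixels."""
    height, width = len(image), len(image[0])
    return [
        sum(abs(image[y + 1][x] - image[y][x]) for y in range(height - 1))
        for x in range(width)
    ]


def edges_to_frequencies(strengths, low_hz=110.0, high_hz=880.0):
    """Map each column's edge strength to a frequency on a log scale."""
    peak = max(strengths) or 1  # avoid dividing by zero on flat images
    return [low_hz * (high_hz / low_hz) ** (s / peak) for s in strengths]


# Tiny grayscale test image: a bright shape rising towards the right.
image = [
    [0, 0, 0],
    [0, 255, 0],
    [0, 255, 255],
]
strengths = column_edge_strength(image)
print(strengths)
print(edges_to_frequencies(strengths))
```

Reading the resulting frequencies from left to right would turn the horizontal position in the image into time, while a macro-level mapping could additionally derive rhythm or form from larger image regions.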

Each of these methods (raster scanning, Audible Panorama, HindSight, and Sonic Panoramas) demonstrates the versatility of sonification as a tool for transforming visual data into sound. I will keep these different approaches in mind while developing my own sonification language or mapping method. They also point to further research: checking some of the references used in these works should deepen my understanding of sonification and extend the possibilities.

References

Huang, Haikun, Michael Solah, Dingzeyu Li, and Lap-Fai Yu. 2019. “Audible Panorama: Automatic Spatial Audio Generation for Panorama Imagery.” In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 1–11. Glasgow, Scotland: ACM. https://doi.org/10.1145/3290605.3300851.

Kabisch, Eric, Falko Kuester, and Simon Penny. 2005. “Sonic Panoramas: Experiments with Interactive Landscape Image Sonification.” In Proceedings of the 2005 International Conference on Artificial Reality and Telexistence (ICAT), 156–163. Christchurch, New Zealand: HIT Lab NZ.

Schoop, Eldon, James Smith, and Bjoern Hartmann. 2018. “HindSight: Enhancing Spatial Awareness by Sonifying Detected Objects in Real-Time 360-Degree Video.” In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, 1–12. Montreal, QC, Canada: ACM. https://doi.org/10.1145/3173574.3173717.

Yeo, Woon Seung, and Jonathan Berger. 2006. “Application of Raster Scanning Method to Image Sonification, Sound Visualization, Sound Analysis and Synthesis.” In Proceedings of the 9th International Conference on Digital Audio Effects (DAFx-06), 311–316. Montreal, Canada: DAFx.

by Verena Schneider - 10. March 2025

BLOG POST 1: THE SPARK OF AN IDEA (WEEK 1 – 16.10)

Title: “The Sonic Wave: Where Surfing Meets Sound and Technology”

This week marked the beginning of an exciting journey. The idea of merging surfing, sound, and technology has been on my mind for some time, and now it’s time to bring it to life. The concept is simple yet profound: embed sensors into a surfboard to capture the motion, speed, and vibrations of the board as it rides the waves. Then, transform this data into sound and visuals, creating an immersive experience that highlights the rhythm and beauty of surfing.

RESEARCH AND INSPIRATION:

I started by exploring existing projects that combine sports and technology. The Surflogic GPS Tracker and TRACE were particularly inspiring. These tools track surfers’ performance metrics like speed and wave count, but they don’t delve into the artistic side of things. I want to go beyond performance tracking and explore how surfing can be experienced as a multisensory art form.

CHALLENGES AND QUESTIONS:

How do I integrate sensors into a surfboard without affecting its performance?
What kind of sensors will give me the most accurate and meaningful data?
How can I translate raw data into something that resonates emotionally with an audience?

NEXT STEPS:

Research sensor technology (accelerometers, gyroscopes, hydrophones).
Reach out to surfboard shapers and tech experts for advice.
Start sketching out a prototype design.

This project feels like a perfect blend of my passions—surfing, technology, and art. I’m eager to see where this journey takes me.

by David Adlberger - 5. March 2025

Explore II: Image Extender – Image sonification tool for immersive perception of sounds from images and new creation possibilities

The Image Extender project bridges accessibility and creativity, offering an innovative way to perceive visual data through sound. With its dual-purpose approach, the tool has the potential to redefine auditory experiences for diverse audiences, pushing the boundaries of technology and human perception.

The project is designed as a dual-purpose tool for immersive perception and creative sound design. By leveraging AI-based image recognition and sonification algorithms, the tool will transform visual data into auditory experiences. This innovative approach is intended for:

1. Visually Impaired Individuals
2. Artists and Designers

The tool will focus on translating colors, textures, shapes, and spatial arrangements into structured soundscapes, ensuring clarity and creativity for diverse users.

Core Functionality: Translating image data into sound using sonification frameworks and AI algorithms.
Target Audiences: Visually impaired users and creative professionals.
Platforms: Initially desktop applications with planned mobile deployment for on-the-go accessibility.
User Experience: A customizable interface to balance complexity, accessibility, and creativity.

Working Hypotheses and Requirements

Hypotheses:
1. Cross-modal sonification enhances understanding and creativity in visual-to-auditory transformations.
2. Intuitive soundscapes improve accessibility for visually impaired users compared to traditional methods.
Requirements:
- Develop an intuitive sonification framework adaptable to various images.
- Integrate customizable settings to prevent sensory overload.
- Ensure compatibility across platforms (desktop and mobile).

Subtasks

1. Project Planning & Structure

Define Scope and Goals: Clarify key deliverables and objectives for both visually impaired users and artists/designers.
Research Methods: Identify research approaches (e.g., user interviews, surveys, literature review).
Project Timeline and Milestones: Establish a phased timeline including prototyping, testing, and final implementation.
Identify Dependencies: List libraries, frameworks, and tools needed (Python, Pure Data, Max/MSP, OSC, etc.).

2. Research & Data Collection

Sonification Techniques: Research existing sonification methods and metaphors for cross-modal (sight-to-sound) mapping, as well as other approaches that could blend into the overall sonification strategy.
Image Recognition Algorithms: Investigate AI image recognition models and libraries (e.g., OpenCV, TensorFlow, PyTorch).
Psychoacoustics & Perceptual Mapping: Review how different sound frequencies, intensities, and spatialization affect perception.
Existing Tools & References: Study tools like Melobytes, VOSIS, and BeMyEyes to understand features, limitations, and user feedback.

object detection with a Python YOLO library

3. Concept Development & Prototyping

Develop Sonification Mapping Framework: Define rules for mapping visual elements (color, shape, texture) to sound parameters (pitch, timbre, rhythm).
Simple Prototype: Create a basic prototype that integrates:
- AI content recognition (Python + image processing libraries).
- Sound generation (Pure Data or Max/MSP).
- Communication via OSC (e.g., using Wekinator).
Create or collect Sample Soundscapes: Generate initial soundscapes for different types of images (e.g., landscapes, portraits, abstract visuals).
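
A first version of such a mapping rule might look like the following sketch, which converts a single colour into three sound parameters. The specific choices (hue to pitch over two octaves starting at C3, saturation to timbral brightness, value to amplitude) are assumptions made for illustration and not the project’s final sonification language.

```python
import colorsys


def color_to_sound(r, g, b):
    """Map an 8-bit RGB colour to illustrative synthesis parameters."""
    hue, saturation, value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    midi_note = 48 + round(hue * 24)  # hue spans two octaves from C3
    return {
        "frequency_hz": 440.0 * 2 ** ((midi_note - 69) / 12),
        "brightness": saturation,  # e.g. filter openness (timbre)
        "amplitude": value,        # darker colours play quieter
    }


# Example colours: pure red, a muted blue, and black.
for rgb in [(255, 0, 0), (60, 80, 120), (0, 0, 0)]:
    print(rgb, color_to_sound(*rgb))
```

The resulting dictionary could then be sent to Pure Data or Max/MSP, where the actual sound generation would take place.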

example of Pure Data with rem library (image to sound in Pure Data by Artiom Constantinov)

4. User Experience Design

UI/UX Design for Desktop:
- Design an intuitive interface for uploading images and adjusting sonification parameters.
- Mock up controls for adjusting sound complexity, intensity, and spatialization.
Accessibility Features:
- Ensure screen reader compatibility.
- Develop customizable presets for different levels of user experience (basic vs. advanced).
Mobile Optimization Plan:
- Plan for responsive design and functionality for smartphones.

5. Testing & Feedback Collection

Create Testing Scenarios:
- Develop a set of diverse images (varying in content, color, and complexity).
Usability Testing with Visually Impaired Users:
- Gather feedback on the clarity, intuitiveness, and sensory experience of the sonifications.
- Identify areas of overstimulation or confusion.
Feedback from Artists/Designers:
- Assess the creative flexibility and utility of the tool for sound design.
Iterate Based on Feedback:
- Refine sonification mappings and interface based on user input.

6. Implementation of Standalone Application

Develop Core Application:
- Integrate image recognition with sonification engine.
- Implement adjustable parameters for sound generation.
Error Handling & Performance Optimization:
- Ensure efficient processing for high-resolution images.
- Handle edge cases for unexpected or low-quality inputs.
Cross-Platform Compatibility:
- Ensure compatibility with Windows, macOS, and plan for future mobile deployment.

7. Finalization & Deployment

Finalize Feature Set:
- Balance between accessibility and creative flexibility.
- Ensure the sonification language is both consistent and adaptable.
Documentation & Tutorials:
- Create user guides for visually impaired users and artists.
- Provide tutorials for customizing sonification settings.
Deployment:
- Package as a standalone desktop application.
- Plan for mobile release (potentially a future phase).

Technological Basis Subtasks:

Programming: Develop core image recognition and processing modules in Python.
Sonification Engine: Create audio synthesis patches in Pure Data/Max/MSP.
Integration: Implement OSC communication between Python and the sound engine.
UI Development: Design and code the user interface for accessibility and usability.
Testing Automation: Create scripts for automating image-sonification tests.
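
For the integration step, the sketch below shows how a minimal OSC message could be built and sent from Python using only the standard library. It follows the OSC 1.0 message layout (padded address string, padded type tag string, big-endian arguments) and supports float arguments only. The address `/pitch`, the host, and port 9000 are placeholder assumptions. A dedicated OSC library would likely be the more practical choice in the real project.

```python
import socket
import struct


def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *values):
    """Build an OSC message whose arguments are all 32-bit floats."""
    type_tags = "," + "f" * len(values)
    payload = b"".join(struct.pack(">f", v) for v in values)
    return osc_string(address) + osc_string(type_tags) + payload


def send_osc(address, *values, host="127.0.0.1", port=9000):
    """Send one message over UDP; host and port are placeholder values."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(osc_message(address, *values), (host, port))


# Inspect the raw bytes of a message without sending anything.
print(osc_message("/pitch", 440.0))
```

On the receiving side, the message could be picked up, for example, with `udpreceive` in Max/MSP or with `netreceive` and `oscparse` in Pure Data, listening on the same port.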

Possible academic foundations for further research and work:

Chatterjee, Oindrila, and Shantanu Chakrabartty. “Using Growth Transform Dynamical Systems for Spatio-Temporal Data Sonification.” arXiv preprint, 2021.

Chion, Michel. Audio-Vision. New York: Columbia University Press, 1994.

Görne, Tobias. Sound Design. Munich: Hanser, 2017.

Hermann, Thomas, Andy Hunt, and John G. Neuhoff, eds. The Sonification Handbook. Berlin: Logos Publishing House, 2011.

Schick, Adolf. Schallwirkung aus psychologischer Sicht. Stuttgart: Klett-Cotta, 1979.

Sigal, Erich. “Akustik: Schall und seine Eigenschaften.” Accessed January 21, 2025. mu-sig.de.

Spence, Charles. “Crossmodal Correspondences: A Tutorial Review.” Attention, Perception, & Psychophysics, 2011.

Ziemer, Tim. Psychoacoustic Music Sound Field Synthesis. Cham: Springer International Publishing, 2020.

Ziemer, Tim, Nuttawut Nuchprayoon, and Holger Schultheis. “Psychoacoustic Sonification as User Interface for Human-Machine Interaction.” International Journal of Informatics Society, 2020.

Ziemer, Tim, and Holger Schultheis. “Three Orthogonal Dimensions for Psychoacoustic Sonification.” Acta Acustica United with Acustica, 2020.

by David Adlberger - 5. March 2025

Explore I: Image Extender – Image sonification tool for immersive perception of sounds from images and new creation possibilities

The project would be a program that uses either AI content recognition or a specific sonification algorithm that draws on auditory equivalents of visual perception (cross-modal metaphors).

examples of cross-modal metaphors (Görne, 2017, p. 53)

This approach could serve two main audiences:

1. Visually Impaired Individuals:
The tool would provide an alternative to traditional audio descriptions, aiming instead to deliver a sonic experience that evokes the ambiance, spatial depth, or mood of an image. Instead of giving direct descriptive feedback, it would use non-verbal soundscapes to create an “impression” of the scene, engaging the listener’s perception intuitively. A strict sonification language might therefore be a good approach, perhaps even better than just playing the sounds depicted in the images. A mixture of both is also possible.

2. Artists and Designers:
The tool could generate unique audio samples for creative applications, such as sound design for interactive installations, brand audio identities, or cinematic soundscapes. By enabling the synthesis of sound based on visual data, the tool could become a versatile instrument for experimental media artists.

Purpose

The core purpose would combine the two purposes above: a tool that both assists perception and supports creative work in the same suite.

The dual purpose of accessibility and creativity is central to the project’s design philosophy, but balancing these objectives poses a challenge. While the tool should serve as a robust aid for visually impaired users, it also needs to function as a practical and flexible sound design instrument.

The final product can then be used by people who benefit from an added way of perceiving images and screens, and by artists or designers as a creative tool.

Primary Goal

A primary goal is to establish a sonification language that is intuitive, consistent, and adaptable to a variety of images and scenes. This “language” would ideally be flexible enough for creative expression yet structured enough to provide clarity for visually impaired users. Using a dynamic, adaptable set of rules tied to image data, the tool would be able to translate colors, textures, shapes, and contrasts into specific sounds.

To make the tool accessible and enjoyable, careful attention needs to be paid to the balance of sound complexity. Testing with visually impaired individuals will be essential for calibrating the audio to avoid overwhelming or confusing sensory experiences. Adjustable parameters could allow users to tailor sound intensity, frequency, and spatialization, giving them control while preserving the underlying sonification framework. It’s important to focus on a realistic and achievable goal first.

planning on the methods (structure)
research and data collection
simple prototyping of key concept
testing phases
implementation in a standalone application
ui design and mobile optimization

The prototype will evolve in stages, with usability testing playing a key role in refining functionality. Early feedback from visually impaired testers will be invaluable in shaping how soundscapes are structured and controlled. Incorporating adjustable settings will likely be necessary to allow users to customize their experience and avoid potential overstimulation. However, this customization could complicate the design if the aim is to develop a consistent sonification language. Testing will help to balance these needs.

Initial development will target desktop environments, with plans to expand to smartphones. A mobile-friendly interface would allow users to access sonification on the go, making it easier to engage with images and scenes from any device.

In general, it could lead to a different perception of sound in connection with images or visuals.

Needed components

Technological Basis:

Programming Language & IDE:
The primary development of the image recognition could be done in Python, which offers strong libraries for image processing, machine learning, and integration with sound engines. Wekinator could also be a good starting point for communication via OSC.

Sonification Tools:
Pure Data or Max/MSP are ideal choices for creating the audio processing and synthesis framework, as they enable fine-tuned audio manipulation. These platforms can map visual data inputs (like color or shape) to sound parameters (such as pitch, timbre, or rhythm).

Testing Resources:
A set of test images and videos will be required to refine the tool’s translations across various visual scenarios.

Existing Inspirations and References:

– Melobytes: Software that converts images to music, highlighting the potential for creative auditory representations of visuals.

– VOSIS: A synthesizer that filters visual data based on grayscale values, demonstrating how sound synthesis can be based on visual texture.

– image-sonification.vercel.app: A platform that creates audio loops from RGB values, showing how color data can be translated into sound.

– BeMyEyes: An app that provides auditory descriptions for visually impaired users, emphasizing the importance of accessibility in technology design.

Academic Foundations:

Literature on sonification, psychoacoustics, and synthesis will support the development of the program. These fields will help inform how sound can effectively communicate complex information without overwhelming the listener.

References / Source

Görne, Tobias. Sound Design. Munich: Hanser, 2017.