Title: Objektbasierte Musikproduktion – Entwicklung eines kombinierten Workflows für Dolby Atmos Music und 360 Reality Audio auf Basis einer bereits bestehenden Stereo-Mischung (Object-Based Music Production – Development of a Combined Workflow for Dolby Atmos Music and 360 Reality Audio Based on an Existing Stereo Mix)
Author: Daniela Rieger
Year of publication: 2020
Degree program: Master's program in Audiovisual Media
University: Hochschule der Medien Stuttgart (HdM)
Supervision (practice partner Fraunhofer IIS, Erlangen): Dr. Ulli Scuda, M.Eng. Philipp Eibl
Daniela Rieger's master's thesis was written in the Audiovisual Media program at Hochschule der Medien Stuttgart and deals with a topic very close to my currently planned master's thesis: object-based music production in the formats Dolby Atmos Music and 360 Reality Audio. The aim of the thesis is to develop a combined workflow that builds on an existing stereo production and works for both formats. This is essentially a good foundation for how I can transfer the stereo productions I already have into 3D audio.
The thesis is clearly and comprehensibly structured. An introduction is followed by theoretical foundations of object-based audio, a technical description of the two systems, the development of the workflow, and the practical implementation based on a real song (Kentia Danca by RIAD & J.K.Rollin’). The practical component thus consists of a concrete production in both formats, which is documented in detail in the thesis.
Level of Design
The thesis impresses with its high technical standard. Rieger shows a very good understanding of complex production processes and manages to translate them into a structured, practice-oriented workflow. The presentation is detailed, with many figures and examples, which makes the technical implementation easy to follow. What falls somewhat short is an aesthetic evaluation of the result from a design perspective. The author does describe the sonic differences between the two formats, but does not carry out a systematic listening investigation or comparative evaluation (listening tests?).
Degree of Innovation
The topic was highly current and relevant at the time of publication (2020). Object-based music formats were on the rise back then, and a combined workflow had hardly been documented before. The novelty therefore lies primarily in the practice-oriented combination of both systems, not in a theoretical innovation.
Independence
The thesis shows a high degree of independence. Rieger familiarized herself intensively with both systems and built the workflows on her own. It is particularly positive that she developed her own solutions to technical difficulties and was in direct contact with Dolby to clarify questions of detail.
Outline and Structure
The structure is logical and easy to follow. The chapters are clearly delineated and lead step by step from theory to practice. Figures, tables, and screenshots support the structure and aid understanding. Some theoretical sections are quite text-heavy and could have been somewhat shorter. Methodological decisions (e.g., the choice of parameters for the exports) could also have been summarized more clearly in places.
Degree of Communication
Rieger's writing style is factual, precise, and technically accurate. She explains complex matters understandably and supports her statements with illustrative examples. The numerous screenshots and diagrams that accompany her descriptions are particularly helpful. In places the text is quite dense and assumes technical prior knowledge, which is appropriate for a specialist audience. For readers outside our degree program, a short glossary or a summary at the end of each chapter would have been helpful.
Scope of the Thesis
The scope of the thesis is very fitting. It covers all relevant aspects of the topic and goes into depth both theoretically and practically. The effort behind the practical implementation is evident, and the scope is appropriate and well balanced for a master's thesis.
Orthography, Care, and Accuracy
Formally, the thesis is very clean. Language, spelling, and layout are largely error-free. Quotations and references are correctly formatted, and the indexes are complete. Only occasionally are there longer nested sentences that could be simplified for even better readability.
Literature
The bibliography is extensive and contains both academic and practice-oriented sources. In addition to journal articles and AES publications, Rieger also uses current documentation from the manufacturers Dolby and Sony, which is essential for the topic.
Assessment of the Practical Work
The practical work is the central hands-on part of the thesis. It consists of the production of a song in Dolby Atmos Music and 360 Reality Audio and is described in detail. Rieger walks through every step, from the session structure and plugin configuration to the exports and loudness measurements. The implementation is technically convincing and close to practice. Theory and practice interlock meaningfully, and the thesis clearly shows which differences and challenges exist between the two formats. Although the finished productions themselves (i.e., the audio files) are not directly accessible (which is a real pity), the documentation is so detailed that the process remains fully traceable. Overall, the quality of the practical work clearly meets the requirements of a master's thesis at our university: it is technically clean, innovative, and shows a clear gain in insight.
My Personal Overall Assessment
With this thesis, Daniela Rieger has delivered a very successful, practice-oriented master's thesis. She combines theoretical knowledge with practical implementation at a high level and delivers a workflow that is also relevant for other producers and audio engineers. Particularly positive are the technical precision and the clear structure. The sonic and aesthetic evaluation of the result could still be expanded, for example through a small listening study or a reflection on the perceived spatiality.
Overall, the thesis convinces through its care, depth, and closeness to practice. I would place it in the upper grade range – between 1 and 2.
Recommendation / Inspiration for My 3D Audio Master's Project
It could be even more exciting to put a stronger focus on auditory perception – for example through small comparison tests or feedback rounds with listeners (as has already been done in other master's theses I have read in this area).
For the production of ‘Stand By’, I chose to record and edit everything in Cubase 12, as it’s my main DAW and I’m highly familiar with its workflow, shortcuts, and overall layout. The entire project contains nearly 150 tracks, all recorded & edited in Cubase.
When it came to mixing in 3D audio, I decided to begin my spatial audio journey using Ambisonics and Reaper rather than Dolby Atmos. This decision was largely influenced by the IEM Plugin Suite, which offers powerful and intuitive tools for Ambisonics mixing — making the entry into 3D audio more approachable and flexible.
I chose to work with fifth-order Ambisonics for this project to achieve a more accurate and immersive rendering of diffuseness, spaciousness, and spatial depth. While first-order Ambisonics might seem sufficient due to the even nature of diffuse sound fields, in practice its low spatial resolution leads to high directional correlation during playback, which significantly impairs the perception of these spatial qualities. Higher-order Ambisonics, in contrast, improves the mapping of uncorrelated signals and preserves spatial impressions much more effectively. Psychoacoustic research has shown that an Ambisonic order of three or higher is required to perceptually preserve decorrelation between neighboring loudspeakers, which is crucial for rendering depth and diffuseness. Fifth-order Ambisonics further enhances this, particularly outside the sweet spot, providing a more consistent spatial experience across a larger listening area. As demonstrated in the IEM CUBE, a fifth-order system allows nearly the entire horizontal listening plane—in this case, a 12 × 10 m concert space—to become a valid and perceptually plausible playback zone. [1]
Thus, fifth-order Ambisonics is not only a practical choice for immersive production in larger spaces, but it also strikes an effective balance between spatial resolution, technological complexity, and perceptual benefit [2].
I also had the opportunity to experience this myself during a small listening test we conducted with Matthias Frank. We listened to first-, third-, and fifth-order Ambisonics in a blind comparison and were asked to rate certain spatial parameters like spatial depth or localization. The first order was quite easy to identify due to its limited spatial resolution. However, distinguishing between third- and fifth-order Ambisonics proved to be much more challenging, as the differences were often subtle and less immediately perceptible.
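Choosing the order also fixes the channel count, and with it the CPU and routing cost of every Ambisonics bus. Here is a minimal sketch (my own illustration, not from the cited literature) of how quickly that count grows:

```python
# Channel count for full-sphere (3D) Ambisonics of order N is (N + 1)^2.
def ambisonic_channels(order: int) -> int:
    """Number of spherical-harmonic channels a 3D Ambisonics signal carries."""
    return (order + 1) ** 2

for order in (1, 3, 5):
    print(f"Order {order}: {ambisonic_channels(order)} channels")
# Order 1: 4 channels
# Order 3: 16 channels
# Order 5: 36 channels
```

Every fifth-order bus in the session therefore carries 36 channels, which makes economical routing essential.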
After that, I started setting up the routing, which was one of the most underestimated parts of this project. Similar to a traditional stereo production, I created a structure of groups and subgroups, but adapted it for Ambisonics. For example, in the drum section, encoding happens at the main drum group via the IEM MultiEncoder. All individual channels are routed into that group, allowing me to process them with conventional stereo plugins before spatializing them — which saves CPU resources and maintains flexibility in the early mixing stages.
Within the drum routing, I created subgroups for kick, snare, overheads, and the Droom, allowing for finer control and processing. When dealing with coherent signals, such as double-tracked guitars, I first routed both signals (panned hard L & hard R) into a stereo group to conserve CPU power by processing them together. This group is then routed into a master guitar group that handles the Ambisonics encoding. Since the L and R signals remain separated, they can still be treated independently in the encoder — so I can still place them individually in the 3D field, even though they were previously grouped.
I followed the same approach with vocals, organizing them into groups before routing them into the MultiEncoder. For specific adlibs, I used the GranularEncoder to create glitchy, scattered spatial effects.
To add a sense of depth and immersion to the vocals, I used a little bit of the FDN Reverb for diffuse reverb and the Room Encoder for some early reflections – all plugins are from the IEM Suite.
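To summarize the signal flow, here is a hypothetical sketch of the bus structure (the tree and all names are my own shorthand, not actual Reaper or IEM identifiers): stereo processing happens inside the subgroups, while Ambisonics encoding happens exactly once, at each parent group.

```python
# Hypothetical routing tree for the session described above.
# Leaves are recorded tracks; parent keys name the group and, where
# relevant, the plugin that performs the Ambisonics encoding.
routing = {
    "DRUMS (IEM MultiEncoder, 5th order)": {
        "Kick (stereo subgroup)": ["kick_in", "kick_out"],
        "Snare (stereo subgroup)": ["snare_top_dyn", "snare_top_cond", "snare_bottom"],
        "Overheads (stereo subgroup)": ["oh_l", "oh_r"],
        "Droom (stereo subgroup)": ["droom_l", "droom_r"],
    },
    "GUITARS (IEM MultiEncoder)": {
        "Doubles (stereo subgroup, panned hard L/R)": ["gtr_double_1", "gtr_double_2"],
    },
    "VOCALS (IEM MultiEncoder)": {
        "Lead (stereo subgroup)": ["lead_vox"],
        "Adlibs (IEM GranularEncoder)": ["adlib_1", "adlib_2"],
    },
}

def print_tree(node, indent=0):
    """Print the routing dictionary as an indented tree."""
    for name, children in node.items():
        print("  " * indent + name)
        if isinstance(children, dict):
            print_tree(children, indent + 1)
        else:
            for track in children:
                print("  " * (indent + 1) + track)

print_tree(routing)
```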
Finding this optimal signal flow took quite a bit of time and experimentation. It was a major learning process to understand how to best structure a large session for Ambisonics, and I’m still refining my approach. I’ve already begun mixing in the production studio at IEM, and although there’s certainly still room for improvement, I’m genuinely happy with the current state of the mix. This being my first attempt at a spatial audio mix, I see it as a solid starting point — and I’m excited to continue learning and evolving through hands-on experience.
[1] Franz Zotter and Matthias Frank, Ambisonics: A Practical 3D Audio Theory for Recording, Studio Production, Sound Reinforcement, and Virtual Reality, Vol. 19, Springer Topics in Signal Processing (Cham: Springer International Publishing, 2019), 18–20, https://doi.org/10.1007/978-3-030-17207-7.
In “Stand By”, sound design plays a critical role in reinforcing the song’s emotional core — the psychological entrapment of a toxic relationship, which parallels the patterns of addiction.
To support the emotional arc of Stand By, spatial elements were deliberately positioned behind and around the listener to enhance feelings of tension, disorientation, and emotional overload (more about that below). This approach aligns with findings by Stefanowska and Zieliński (2024), who highlight that rear-positioned and difficult-to-localize sound sources can intensify emotional responses—particularly those associated with discomfort, fear, or psychological distress.[1]
By embracing these psychoacoustic principles, the sound design doesn’t merely illustrate the lyrical content, but actively immerses the listener in the protagonist’s emotional state.
But the key principle that guided my general approach to spatial mixing came from Lasse Nipkow, who emphasized the importance of listener expectation in immersive audio. As he puts it: „Die Leute sind es gewöhnt, dass die Musik vorne spielt, also lasst sie da auch spielen, und packt Wichtiges wie Schlagzeug und Stimme in die vorderen Lautsprecher.[2]“ Translated: “People are used to music playing from the front—so let it play there, and place important elements like drums and vocals in the front loudspeakers.” This mindset shaped my core mixing philosophy. Instead of treating 3D audio as an opportunity to scatter key musical components arbitrarily throughout the sound field, I chose to respect the listener’s intuitive focus. Drums, lead vocals, and harmonic anchors were mostly placed in the front hemisphere to preserve clarity and narrative drive, while the rest of the spatial field—especially the sides, rear, and height—became a playground for emotional and textural enhancement. This balance allowed me to stay immersive without losing musical coherence.
The track begins intentionally narrow and intimate, with the vocal placed front and center and only minimal ad-libs distributed in the surrounding space – like fleeting thoughts echoing in the periphery. The guitar sits slightly off-center, while a second guitar plays the riff in octaves, positioned at low volume on the other side of the room. Subtle rim hits on the rack tom foreshadow the emotional unravelling to come, creeping in like the early signs of danger.
As the second verse enters, the space opens drastically. The full drum kit kicks in with a deep floor tom, and a palm-muted guitar part, tracked four times, creates a dense rhythmic bed. Meanwhile, a haunting “Uhh” choir swirls around the listener. This ghostly texture mirrors the psychological fog of emotional abuse — disembodied voices, indifferent and cold, hovering around you. It captures the emotional numbness and disorientation of dependency: the sense of being surrounded, yet entirely alone.
In the chorus, additional guitar layers are spread wide across the field, amplifying the pressure. Key lyrical lines are doubled with backing vocals:
I’m running in circles ‘Forced to stay’ I want to leave this place ‘But I can’t get away’ It’s frustrating ‘And suffocating’ Promised paradise is a lie — so ‘I’ stay on stand by
After each chorus, the song narrows again, mimicking the push-and-pull dynamic of emotional manipulation — moments of clarity crushed by renewed confusion. At the line “You made me crazy when you…”, only the lead vocal and one side of the choir remain — before the wall of sound suddenly returns in the next verse. This verse escalates with ‘open’ guitar chords (as opposed to the palm-muted ones before), and the drummer expands the groove with the addition of the rack tom.
To emphasize the transition into the second chorus, guitar dead notes are layered with the snare hits in the fill — eight tracks in total, radiating outward. The final vocal line “Bursting away” is spatially fractured and scattered in all directions, as if the voice itself is breaking apart under the weight of emotional overload.
Then, after the second chorus, comes the confrontation: four cycles of build-up, followed by four of breakdown. During the build-up, a series of toxic phrases — “After everything I’ve done for you”, “Don’t push me”, “You’re nothing without me” — is placed chaotically into the space. Each one is distorted and given its own position. Some are passed through granular synthesis (via the IEM GranularEncoder), transforming them into chaotic, stuttering fragments that glitch and scatter unpredictably. It places the listener inside the chaos of an abusive dynamic, where reason disintegrates and confusion dominates.
The tension is heightened by a gradual high-cut filter on the guitars, which opens up more and more as the breakdown approaches. A burning fuse — a literal sound effect — moves around the listener, traveling over their head just before the drop, suggesting both tension and inevitability.
At the start of the breakdown, a sub-drop slams in, marking the collapse. The four breakdown cycles remain true to traditional rock instrumentation but are widened into immersive 3D space.
Then, the moment of illusion arrives. We transition into the stairwell reverb section — a metaphor for the seductive promise of escape. Instead of distributing the stairwell recording (captured with five microphones) across the room, I placed the microphones behind the listener, emphasizing the contrast with the confined front-space. The mix collapses forward again, symbolized by sliding guitars that pan from back to front and the return of the fuse sound, automated to rush toward the listener. It’s the false hope of recovery — crushed by relapse.
The final chorus hits harder. The bass becomes more expressive, adding fills to push the groove forward. The word “suffocation” is no longer static — it’s sung alternately on the left and right, while the lead vocal itself begins to drift toward the backing voices, suggesting emotional fragmentation.
The line “Promised paradise is a lie” is repeated three times in the final chorus. And after that comes the final lyric of the song: “And I stand on stand by”. A solitary voice. Nothing more. Just like addiction, the emotional trap is isolating. You’re still there. Still connected. But unable to move.
[1] Antonina Stefanowska and Sławomir K. Zieliński, “Spatial sound and emotions: A literature survey on the relationship between spatially rendered audio and listeners’ affective responses”, International Journal of Electronics and Telecommunications, June 25, 2024, 297, https://doi.org/10.24425/ijet.2024.149544.
For the bass, we used a Fender Jazz Bass, recorded directly through my Line 6 Helix modeller. We chose an amp simulation that included impulse responses (IRs) replicating the mic’d sound of a cabinet captured with an Audix D6 (typically a kick drum mic) and a Shure SM57. This unusual combination provided exactly what I was looking for: deep, punchy lows from the D6 and more defined highs from the SM57 — a perfect balance for our mix.
With the electric guitars, we kept the 3D audio production in mind throughout the entire process. That’s why we recorded multiple layers to allow for spatial variation during mixing. I played the guitar parts using both a Gibson Les Paul Standard and a custom-built Telecaster — again routed through the Line 6 Helix, which offered us a broad palette of amp and cab simulations with consistent quality.
Vocal Recordings
Vocals were recorded using my Neumann TLM102. We tested several microphones, including the AKG C414 as well as its spiritual successor, the Austrian Audio OC818 (developed by former AKG engineers). In the end, the TLM102 simply fit Lukas’s voice the best. For certain shouts and accents, we recorded additional takes to give us more layering options in the mix.
Backing vocals and harmonies were performed by Clemens (our bassist), Lukas (our lead vocalist), and myself. We used a variety of techniques — including thirds above and below the lead vocal — and occasionally doubled the lead in octaves to add emotional weight or build intensity in specific sections.
We’ve already written eight songs for the album. Every time we move toward recording a new track, we sit down as a band and evaluate which song we want to take out of the pre-production phase and develop further. The last one we recorded was a song called ‘Stand By’. Since we had only recently written it, the energy and momentum around the track were still fresh — we were all highly motivated to fully produce it and spend more time engaging with its emotional and sonic layers. Not all of the songs we’ve written will make it onto the final album, and the writing process is still ongoing. We’re planning to write additional songs over the summer, including sessions with external professional songwriters to expand our creative input and further explore the theme from new perspectives.
Concept of the song
‘Stand By’ is a raw and emotional track that dives deep into the suffocating reality of being trapped in a toxic relationship—a dynamic that mirrors the psychological and emotional patterns often found in addiction. The song paints a vivid picture of circular thinking and emotional dependency: the feeling of giving everything and receiving harm in return, the confusion of being hurt by someone who once promised love, and the inner battle of wanting to leave but being psychologically unable to do so.
The metaphor of being ‘on stand by’ captures a state of paralysis—still connected, still present, but unable to act or move forward. In the context of our concept album on addiction and dependency, this song stands as a powerful metaphor for emotional entrapment. Just like with substance or behavioural addictions, the individual becomes stuck in a loop: knowing something is damaging but feeling incapable of breaking away.
First drafts of the lyrics
These are the final lyrics of the song ‘Stand by’ – FLAVOR AMP:
[Verse 1] You and me felt like a fairytale, I believed, you gave it away (You) started laughing while I started to bleed Tried so hard fulfilling your needs
[Verse 2] When you suffered pain you made me feel the same Even if you know you’re wrong you had to maintain It’s such a shame For you, I always take the blame
[Chorus 1] I’m running in circles Forced to stay I want to leave this place But I can’t get away It’s frustrating And suffocating Promised paradise is a lie So I stay on stand by
[Verse 3]
It made me crazy when you started to play You never cared what I had to say I’m not allowed to complain My mind is bursting away
[Chorus 2]
I’m running in circles Forced to stay I want to leave this place But I can’t get away It’s frustrating And suffocating Promised paradise is a lie So I stay on stand by
[Build up]
[Breakdown]
[Chorus 3]
I’m running in circles Forced to stay I want to leave this place And I can’t get away It’s frustrating But suffocating
[Post Chorus]
Promised paradise is a lie
Promised paradise is a lie So I stay on stand by
With the core structure of the song now in place, we moved on to recording the drums — a key step in shaping the track’s sonic identity.
My individual project deals with the creation of a 3D audio concept album in the rock genre, on the topics of addiction and dependency. Naturally, we are confronted with the problem of high production costs. In order to realize the project to the extent planned, it was therefore important to create a recording and mixing environment ourselves – to reduce costs and at the same time improve the quality of the project. Since the beginning of 2024, we have been able to take over the former warehouse premises of our drummer’s parents (they run a floristry business). Here we have just under 80 m² at our disposal, which we have converted into a small office, a rehearsal/recording room, and a mixing room on our own initiative. We took care of the building work (putting up walls, pouring screed, electricity, etc.) over the course of 2024. We were already able to use the office together with our intern during the planning of our festival Bock auf Rock in summer 2024 – however, the control and rehearsal rooms could only be used to a limited extent due to the lack of room-acoustic treatment.
Empty room (first setup)
In February 2025, the time had finally come: a suitable recording and mixing environment had to be created for the start of the project. Our drummer is studying civil engineering, so we had the necessary knowledge to implement this project ourselves. The aim was to build a recording and control room with the best possible acoustics (in terms of room dimensions and budget). We wanted to create a very good stereo monitoring situation in the control room. It was important for us to have the opportunity to pre-produce as much as possible, as time in the 3D audio-compatible studios at the IEM is limited. We finally started our construction project in mid-February. We had already organized insulation wool and other building materials via Willhaben over the last few months. In total, we used around 15 m³ of insulation wool (mostly Ursa DF39 Dual).
We spent a long time researching the cheapest option for the absorber frames and what would fit our purpose (Will it crack when you screw it together? How will the cut edges look? …). We tested a lot of different wood types.
In the end, we opted for MDF boards – these are much cheaper than solid wood and still offer the necessary stability. We bought the boards (3 m x 2.10 m x 16 mm; about 45 m² in total) through a local carpenter. Before we started the installation, we planned as precisely as possible where and how many absorbers should be placed and how many we would need in total. We mainly worked with the Trikustik room mode calculator and the Porous Absorber Calculator.
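For anyone who wants to sanity-check this planning step without the online tools: the heart of a room mode calculator is a one-line formula, since the axial modes of a dimension of length L lie at f = n·c/(2L). A minimal sketch with placeholder dimensions (not our actual room measurements):

```python
# Axial room modes: f(n) = n * c / (2 * L) for each room dimension L.
# The dimensions below are placeholders, not our actual room.
C = 343.0  # speed of sound in m/s at ~20 °C

def axial_modes(length_m: float, max_freq_hz: float = 300.0) -> list:
    """Axial mode frequencies (Hz) below max_freq_hz for one room dimension."""
    modes = []
    n = 1
    while (f := n * C / (2 * length_m)) <= max_freq_hz:
        modes.append(round(f, 1))
        n += 1
    return modes

for name, dim in {"length": 6.0, "width": 4.5, "height": 2.8}.items():
    print(f"{name} ({dim} m): {axial_modes(dim)} Hz")
```

Clusters and gaps in these lists are exactly what the absorber placement then has to address.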
We built the absorbers according to the course plan of a Berlin-based acoustician called Jesco (Acoustics Insider). He offers an online course on the professional treatment of home studios: Jesco is the founder of Acoustics Insider, where he teaches practical acoustic treatment techniques for audio professionals — without the voodoo. With over 12 years of experience mixing records and treating studios in Berlin, he knows how to turn almost any room into a reliable creative space. His approach has helped him reduce his average mix time to just 4 hours and earn a platinum record for mixing Ofenbach’s “Be Mine.”
His program was the perfect basis for us, because it’s all about achieving the best possible result with the available resources – that’s the reality of small bands, young audio engineers, and students. It really helped us with speaker placement, with building bass traps and absorbers, and with deciding where to put them. And as you can see from the results – it worked great! Now it got serious: the construction of the 70 (!) absorbers could begin. In total, we built two different “absorber categories”, which we divided into further models:
large room absorbers (110 cm x 110 cm x 30 cm) – not visible (e.g. behind other absorbers or diffusers). We need them to catch the low frequencies: on the front wall we have 50 cm of material with 10 cm of air behind it, and in the corners about 70 cm (see the sketch after this list).
normal absorbers (standard size: 100 cm x 62.5 cm x 20 cm). There were special dimensions in special places to control the absorption more specifically (150 cm x 62.5 cm x 20 cm; 100 cm x 62.5 cm x 30 cm; 125 cm x 62.5 cm x 20 cm). Diffusion slats will also be screwed onto 7 of the normal absorbers (covered with black fabric) – we will do this in the summer, when we have time.
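The depths listed above follow the usual rule of thumb for porous absorbers: absorption stays effective down to roughly the frequency whose quarter wavelength equals the total depth (material plus air gap). Here is a sketch of that estimate (a deliberate simplification: tools like the Porous Absorber Calculator also model flow resistivity and angle of incidence):

```python
# Quarter-wavelength rule of thumb: a porous absorber of total depth d
# (material + air gap) absorbs well down to about f = c / (4 * d).
C = 343.0  # speed of sound in m/s

def lower_cutoff_hz(total_depth_m: float) -> float:
    """Approximate lowest frequency the absorber still treats effectively."""
    return C / (4 * total_depth_m)

print(f"20 cm normal absorber:  ~{lower_cutoff_hz(0.20):.0f} Hz")
print(f"50 cm + 10 cm air gap:  ~{lower_cutoff_hz(0.60):.0f} Hz")
print(f"70 cm corner bass trap: ~{lower_cutoff_hz(0.70):.0f} Hz")
```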
Everything was planned, the wooden panels ordered – we were ready to go. Of course, not everything went according to plan, and things got off to a difficult start:
The MDF boards were incredibly heavy and flexible – transporting the huge boards (approx. 6 m² each) was a real HORROR (I really hurt my knee while carrying them…).
Once we had picked up the boards, we started cutting them to size. Unfortunately, the weather was anything but helpful: it was snowing, and we spent most of the three days cutting the panels outside in temperatures of around 3 °C.
In addition to the MDF boards, we also used waste wood (hopefully without woodworm…) that we were allowed to collect from a good friend’s farm. We used this waste wood to build the inner frames of the absorbers and the large absorbers (1 m x 1 m x 30 cm – 14 pieces), which were not visible later anyway.
At the same time, I was already working on the rear wall of the recording room. A large 1D diffuser was to be installed here.
Due to the enormous price of wood, I organized the required wood myself: I was able to cut and then plane boards in a good friend’s wood workshop – it took me a whole day, but the result was great and we saved a lot of money in the process.
While we were cutting the wood, we also made a drilling template for the holes in the wooden panels. This allowed us to quickly prepare all the panels and make a countersink for the screw heads.
At the same time, we prepared the cutting of the insulation wool. For health reasons, we immediately wrapped the cut insulation wool in foil (painter’s foil, 0.2 mm thick). Fortunately, we could do this indoors – we had simply cleared out the future control room. We also got help from my parents and our friends – without this support, the whole thing would have taken much longer.
After everything was cut and prepared, we took a weekend break as we had gigs in Vienna and Brno (Czech Republic) ahead of us. After that we went straight back to work.
Although university had started again, we used every free minute to continue working on our studio. We started by screwing the frames together. After the outer frames came the inner frames – unfortunately, this work was much more time-consuming than expected. Nevertheless, we decided to carry on consistently, as we wanted to maintain a uniform construction method. After building the inner frames, we took care of covering them with fabric. We searched for a long time for a suitable fabric that was acoustically transparent and didn’t exceed our budget. In the end, we opted for stage molton (160 g/m²). We bought a total of almost 60 m² – the cutting alone took the two of us a full 12 hours.
The molton was then stapled onto the inner frames and carefully stretched. We screwed these inner frames to the outer frames, inserted the insulating wool and attached two struts at the back.
At the same time, we took care of the large room absorbers – these were only partially covered with fabric (not at all in the control room and over a large area in the recording room), as they are not visible anyway.
Once the absorbers were finished, we got straight down to installing them. We attached them with screw hooks, chains or normal screws (via the rear struts of the absorbers).
There were also numerous other tasks to complete, the scope of which I don’t want to describe in detail here. In addition, some unexpected problems arose during the installation. The main challenges of the entire project were:
Installation: In some places, installation was much more difficult than expected. Individual and creative solutions were required.
Door in the corner of the control room (the blue door you can see in the first pictures) = a missing absorber in that corner. We bought an old sliding door from an elderly lady via Willhaben, rebuilt it, covered it with plasterboard, and finally installed it. We also built four additional ‘mobile’ absorbers that we can use for this purpose (or for drum recording, for example).
Cutting to size (we’d rather not talk about the working conditions, times, and weather…) – the boards were extremely unwieldy and almost impossible to transport, even with the van. They bent a lot and were very difficult to carry.
Simultaneous use of the premises: The longer the fit-out dragged on, the longer we were unable to use the space for rehearsals. In addition, the entire entrance was blocked with absorbers – this also had to be resolved as quickly as possible.
Despite everything, we are really very satisfied with the end result – precise acoustic measurements of the rooms will follow in the near future.
I have already carried out a first provisional measurement of the control room with my measuring microphone (Superlux ECM999) – here are a few screenshots.
The results are – measured against our do-it-yourself approach – outstanding. The room simply sounds fantastic. I am 100% happy with the results we achieved.
I want to be fully transparent about this project. While part of the studio construction was funded through our band’s shared budget, we also invested a significant amount of our own private money.
Considering the results we achieved, the overall costs were remarkably low—this was undoubtedly due to our detailed and lengthy planning, our strong motivation and perseverance, and some incredible second-hand bargains we found along the way.
Below, you’ll find a breakdown of the costs specifically related to the acoustic treatment:
One important point to mention is that from mid-February to the beginning of April, we invested around 25 full workdays as a team of two — not including the additional help we received from other band members and friends — into the acoustic construction alone.
Looking back on it objectively, it’s clear that we pushed ourselves well beyond our limits with this project. It was definitely too much. We often worked more than 12 hours straight to get everything done within such a short timeframe. But in the end, we think it was worth the price.
But we are not 100% finished yet: the diffusers in the control room and the ceiling treatment are still unfinished. (In some pictures you can see our temporary fix: we packed leftover pieces of acoustic foam into a cargo net from a car trailer and strapped it to the ceiling above the drums. It definitely helped to reduce some reflections from the ceiling.) We’re still discussing how best to tackle these elements and haven’t fully decided on the final approach yet.
At the end of April, I had the opportunity to attend and participate in Lasse Nipkow’s 3D audio seminar. The seminar was held at the ORF Funkhaus in Vienna, and many important guests from the industry were invited. At the end of the last day, all speakers were asked to talk briefly about the future of 3D audio. These were the most important findings of this discussion:
“Who’s Gonna Pay for This?” – Dietz Tinhof
Dietz Tinhof tackled the uncomfortable question of financing 3D audio head-on. He stressed that creators and innovators in the field rarely see financial returns for their work, while platforms and labels profit. “We’re at the forefront of a development where others will reap the rewards, not us,” he said, pointing out the lack of rights or credits for audio engineers compared to other creative roles like cinematographers. He called for collective action to demand recognition and fair compensation, arguing that immersive audio’s artistic and technical value should translate into tangible benefits for its creators. “Ton wächst nicht auf Bäumen [sound doesn’t grow on trees]—it’s our labor, our ideas. We can’t keep giving it away for free.”
Lasse Nipkow proposed in this context that we should focus on 3D audio in luxury settings (e.g., spas, luxury hotels).
Tom Ammermann continued by emphasizing the need for better binaural mixes, noting significant room for improvement. He highlighted the growing role of 3D audio in live installations and households, urging producers to prioritize quality to shift perceptions from “it wasn’t bad” to genuine enthusiasm. He envisioned 3D audio becoming “the new stereo” if the industry collectively pushes for higher standards.
Michael A. Bühlmann added that while technical formats like mono, stereo, or 3D are packaging, the artistic vision must remain uncompromised. Roger Baltensperger stressed the importance of mastering workflows and quality control, advocating for the same rigor applied to stereo to unlock 3D’s full potential.
Sebastian Oeynhausen (Pan Acoustics) thanked the community for its welcoming atmosphere and noted the divergence between home and industrial applications, urging manufacturers to develop specialized hardware. He also praised tools like Graves 3D for animating audio objects in DAWs.
Katharina Pollack, representing the scientific angle, underscored the importance of foundational research and artistic-technical synergy. She predicted a binaural-dominated future, citing widespread headphone use and innovative applications like Dreamwaves’ navigation systems for the visually impaired.
Karlheinz Brandenburg reflected on 25 years of binaural and speaker-based audio, celebrating its resurgence but cautioning that home-listening standards (e.g., proper headphones for spatial audio) are still evolving. He dismissed the idea that standard headphones or YouTube could deliver true spatial experiences.
“Good Content Survives Mono Underwater”
Florian Camerer blended humor and skepticism, toasting to “mono beer, stereo schnitzel, and immersive fever dreams.” But his real focus was broadcast’s inertia. While public broadcasters like ORF led the 5.1 revolution, immersive audio remains stuck in limbo. “Everyone’s waiting for someone else to jump first—the BBC, the French, the Germans.” He criticized recycled debates over basics like center channels and LFE, calling it “déjà vu from the 5.1 era.” Yet he ended on optimism: immersive audio, unlike 5.1, might survive because of its artistic potential.
Benedikt Ernst, the youngest in the room, brought a hopeful counterpoint. With “youthful recklessness,” he argued that engaging more creatives could unlock both artistic and economic potential. “If we get artists on board—not just as passive recipients but as active participants—the content will improve, and the money might follow.” He acknowledged the uphill battle but emphasized the need to bridge the gap between technical possibilities and creative buy-in.
Lenni Damann grounded the discussion in reality, citing Spotify’s influence as a make-or-break factor. “Labels ask: Why invest in 3D if our artists have 3 million monthly listeners on Spotify but only 95,000 on Apple Music?” He hinted at industry rumors about Spotify’s potential spatial-audio rollout, which could tip the scales. “If the biggest platform pushes it, suddenly the ‘why’ becomes obvious.”
Closing Words: “The Battle for Quality and Perception”
Dietz Tinhof circled back to dual challenges: production and perception. On one side, clients treat 3D as a marketing afterthought, demanding “stems-based pricing” that sacrifices quality. On the other, end-users hear compressed, downgraded versions of meticulously crafted mixes. “We’re stuck between clients who say we’re not making money and listeners who say we don’t hear the difference.” His rallying cry? Fight for immersive audio as its own art form—not just “stereo with extra steps.” Tom Ammermann and Michael Bühlmann echoed this, sharing stories of artists who dismissed 3D until they experienced it firsthand. The takeaway: Education, advocacy, and unflinching quality are the keys to 3D’s future.
Lasse Nipkow closed with a nod to collaboration, inviting attendees to the next Tonmeistertagung. The room’s consensus was clear: 3D audio’s potential is undeniable, but realizing it demands creativity, persistence, and a fair share for those building it.
In late April 2025, I had the exciting opportunity to attend the 3D Audio Seminar by Lasse Nipkow, held in cooperation with the VDT (Verband Deutscher Tonmeister) under the motto “Goosebumps can be planned!”. The seminar took place on April 29–30, 2025, at the ORF RadioKulturhaus in Vienna and brought together audio professionals, creatives, and technical experts with one clear goal: to create impressive 3D audio content for audiences.
The event was not only aimed at sound designers, studios, and educational institutions, but also at planners and representatives from concert halls, museums, hotels, and other service sectors. Its core mission was to bridge the gap between the technical and creative aspects of 3D audio, offering a deep dive into both psychoacoustic principles and practical implementation.
The program covered a wide spectrum:
Psychoacoustic Foundations – understanding how humans perceive sound emotionally and using this knowledge to shape immersive experiences.
Technology and Practice – showcasing tools and workflows for producing and presenting high-quality 3D audio.
3D Listening Experiences – offering real-world examples and demonstrations in a finely tuned acoustic environment to highlight the full potential of spatial sound.
An exhibition area run by the event’s partner companies also accompanied the seminar, offering product showcases and networking opportunities during breaks and the evening reception.
Day 1: Setup and Technical Exploration
Although I initially registered as a participant, Lasse reached out beforehand and asked if I would be interested in joining the setup crew for the event. I immediately agreed—this was a chance I couldn’t pass up.
I arrived in Vienna on Sunday morning, two days before the official seminar started, and began helping with the installation of the system alongside the team from Pan Acoustics, a German company specializing in professional audio solutions. The setup included multiple speakers, mostly connected via PoE++.
Throughout the day, I had several opportunities for in-depth conversations with Lasse Nipkow himself. These discussions were incredibly insightful and gave me a deeper understanding of the nuances and real-world challenges involved in creating immersive audio content. He also let us try some chocolate he brought from Switzerland – it was delicious!
A key part of the system design was the placement of the two subwoofers: one at the front of the room and one at the side, positioned so that the combination of both would ensure even bass distribution across the listening area—crucial for supporting the 3D spatial illusion without overwhelming certain areas of the room.
Day 2: Measurement and Calibration Issues
Monday was dedicated to measuring and calibrating the system, but several issues became apparent during this process. In my opinion, the subwoofers were simply too small for the size of the room, resulting in a general lack of low-end energy. The bass was not only uneven in some areas—it was overall too weak to support the immersive sound field effectively.
The goal of calibrating the system on this second setup day was to create a neutral listening environment so that all presenters could play their demo material under consistent conditions. However, the system was so poorly calibrated that this goal wasn’t achieved. Most of the ceiling-mounted (height) speakers were barely audible, and the overall balance between the different channels lacked cohesion.
It also seemed likely that some mistakes were made during the measurement process itself—perhaps certain channels were misrouted or mislabeled, which could explain the unusual levels and inconsistent imaging.
As a result, each presenter ended up adjusting individual channel levels to suit their own material and preferences. This led to considerable inconsistencies in playback across presentations—some demos felt immersive and dynamic, while others sounded flat or unbalanced. It was a clear example of how crucial proper system tuning is when aiming for high-quality 3D audio experiences.
First seminar day – April 29
We met at 8 AM, had a coffee, and took the opportunity to chat with various manufacturers before the first lectures began at 10 AM.
The first day began with a deep dive into the psychological and technical fundamentals of spatial hearing, led by Lasse Nipkow himself. He demonstrated how immersive sound can create emotional reactions and detailed the principles of auditory perception in spatial contexts—explaining how our ears and brains collaborate to locate and interpret sound in a three-dimensional environment.
Then, Daniela Rieger introduced Dialog+, a Fraunhofer solution for making dialogue more intelligible using AI-assisted processing. This technology addresses a well-known problem in broadcasting: the difficulty many viewers have in understanding speech due to background noise and music. MPEG-H Dialog+ creates an alternative “Clear Speech” version by lowering background sounds and music in existing audio mixes. This version is available as an additional track in on-demand content, such as in the ARD Mediathek.
Dialog+ utilizes cutting-edge Deep Neural Networks (DNNs) to separate dialogue from other audio components. The system processes audio from real broadcasts, isolating the dialogue to create clearer, more accessible sound. It allows for personalization of the dialogue track, making it easier for viewers to understand speech in a variety of contexts, from documentaries to sports events.
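To make the idea tangible, here is a heavily simplified sketch of the final personalization step as I understood it. This is not Fraunhofer’s actual pipeline: the attenuation value is an arbitrary placeholder, and the DNN separation itself (the hard part) is assumed to have already happened.

```python
import numpy as np

def clear_speech_mix(dialogue: np.ndarray, background: np.ndarray,
                     background_attenuation_db: float = 12.0) -> np.ndarray:
    """Remix separated stems with the background lowered.

    A DNN has already split the broadcast mix into `dialogue` and
    `background`; the "Clear Speech" version is then simply a remix with
    the background attenuated by a chosen amount (placeholder value here).
    """
    gain = 10.0 ** (-background_attenuation_db / 20.0)
    return dialogue + gain * background

# Toy example with noise standing in for real stems:
sr = 48_000
dialogue = 0.1 * np.random.randn(sr)
background = 0.1 * np.random.randn(sr)
clear = clear_speech_mix(dialogue, background)
```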
Later in the day, Piotr Majdak and Katharina Pollack from the Austrian Academy of Sciences presented a session on how we perceive sound in space, explaining concepts such as HRTFs, ITDs/ILDs, and the role of early reflections in spatial hearing. Their session bridged the gap between scientific research and practical system design.
At 14:00 after lunch, Karlheinz Brandenburg, co-inventor of MP3 and founder of Brandenburg Labs, took the stage to discuss immersive headphone playback—an astonishing approach that makes the experience of listening over headphones almost indistinguishable from loudspeaker playback. His quote, “When I listen to sounds, our brain is a good pattern recognizer,” set the stage for a fascinating discussion on how our brains constantly compare the sounds we hear with stored expectations.
He presented various concepts:
Belly-voice effect (ventriloquist illusion)
McGurk effect (audio-visual fusion)
Room divergence/convergence (interaction of sound with space)
Learning and training (e.g., listening with different ears)
He argued that a plausible audio illusion requires a match between:
Expectations: What we anticipate hearing in the current environment.
Perceived sound: The actual auditory experience.
Several factors influence this, including:
Anatomy: The shape of the head and outer ear (HRTF).
Spatial cues: Reflections and room acoustics.
Visual cues: Sight can influence hearing.
Personal experience: Our brain’s prior knowledge.
Individualized HRTF: Recent studies have shown that personalized HRTF (tailored to an individual’s ear and head geometry) is not strictly necessary for realistic spatial audio. The brain can adapt to generic HRTF filters over time, though having a personalized measurement can enhance spatial accuracy, especially in headphone-based setups.
Brandenburg discussed how our brain’s ability to match sound patterns creates the illusion of immersive, spatial sound, and how visual and other sensory cues can enhance or disrupt that illusion.
One of the most practically engaging presentations followed: Tom Ammermann introduced his innovative Spatial Audio Designer Processor. This system, designed for professional use, allows real-time object-based mixing and supports a wide range of formats, from 5.1 and Dolby Atmos to custom 64-channel setups. Tom demonstrated how his system can be used in various contexts, from postproduction to live events, providing a highly flexible tool for immersive audio.
The day concluded with an evening listening session, where Tom, Lenni Damann, and Bene Ernst shared their own immersive productions. One of the highlights was the “Roomenizer” from Pinguin Ingenieurbüro, which let listeners experience real-time acoustic environments like cathedrals and concert halls, showing the power of spatial sound to enhance storytelling.
Day 2 – April 30
Florian Camerer presented his 9-channel microphone setup as a solution to the challenges of location-based sound recording for immersive audio. This setup addresses the limitations of traditional mono or stereo recordings by enabling more accurate capture of 3D sound.
Camerer’s system was designed to improve localization and spatial depth, utilizing microphone placement and wind shielding to ensure high-quality recordings in outdoor environments. His approach is particularly suited for capturing natural soundscapes in formats like AURO-3D and offers a more immersive listening experience by providing true spatial representation of the environment.
Later, Roger Baltensperger and Dietz Tinhof explored immersive music production, focusing on how spatial design can enhance emotional impact. Dietz Tinhof spoke openly about the current challenges in the production and perception of immersive audio. Two key issues give him, in his words, “a stomachache”: First, immersive content is often created for marketing purposes only or to benefit from higher payouts on platforms like Apple Music—not because of genuine artistic interest.
He recalled a conversation with Apple Music where they said there was a “quality problem.” His response was: “It’s not a quality problem, it’s a comprehension problem.” In his view, there is still a lack of understanding about what immersive audio can and should be. Too often, it’s still treated as an add-on to stereo, rather than its own creative medium.
He also criticized the widespread practice of charging based on the number of stems in a mix. This leads to worse results, he said, because if a label can choose to pay less, they will—forcing engineers to cut corners: “You get what you pay for.”
Tinhof passionately argued that immersive audio deserves to be seen as an independent art form. At the moment, though, the ecosystem is broken: Labels say they make no money, listeners don’t perceive the difference, and producers are stuck in the middle, trying their best without the proper recognition or infrastructure.
The final listening block included wellness-focused soundscapes and meditative music mixes, showing how spatial audio can be used for relaxation and therapeutic purposes.
Spotlight: Lenni Damann & Bene Ernst
The final session by Lenni Damann and Bene Ernst was a true highlight for me. They focused on the creative use of space in music production, emphasizing that 3D audio should serve the music, not the technology itself. Their works, including immersive mixes for artists like AMISTAT or Alexander Pielsticker, demonstrated how subtle movements and depth can transform simple compositions into emotionally immersive experiences.
Lenni and Bene’s philosophy is that “3D only makes sense if it serves the music.” This was evident in their work, where space became an emotional dimension, not just a technical tool. Their use of reverb zones, depth layering, and precise spatial movement turned a solo piano piece into a deeply immersive experience. They showcased how spatial dynamics can amplify the emotional power of music, making every sound more significant.
For AMISTAT, they worked on “Seasons,” a project where 3D audio wasn’t just used for technical innovation but to enhance the storytelling and emotions of the music. Their approach highlighted the power of “Spatial Dynamics” in music production—showing that the size of the mix should follow the story being told, not the other way around.
For Alexander Pielsticker, their immersive mixes of minimalist pieces, including solo piano works, were designed with “3D in mind.” They utilized modern grand piano recordings and extreme effects, allowing listeners to feel as though they were sitting on the piano bench alongside the artist.
Exhibition Area & Manufacturer Highlights
Throughout both days, the exhibition area was a hotspot of inspiration. Leading manufacturers like Neumann, Sennheiser, Brandenburg Labs, and others showcased their latest products, from spatial microphones and monitoring solutions to immersive production tools and head-tracking headphone systems. Being able to test these tools hands-on and engage with developers and engineers provided valuable insights into how these technologies can be integrated into real-world workflows.
Final Thoughts
One of the most important takeaways from Lasse Nipkow’s seminar was the reminder that 3D audio is not simply “surround sound with height channels.” Instead, it creates a true volumetric sound field—one that blends natural spatiality with precise localization. Lasse emphasized how this approach unlocks an entirely different level of immersive experience.
A particularly striking moment was his demonstration of the difference between real sources—sounds coming directly from a speaker—and phantom sources that exist between loudspeakers. Real sources offer sharper localization and a stronger presence, while phantom sources are more flexible in movement but often sound more diffuse.
Another key concept was the separation of localization and spatial envelopment. Accurate imaging relies on direct sound, whereas a convincing sense of space emerges from decorrelated signals—similar content distributed across multiple channels. This principle is at the heart of 3D audio’s immersive quality.
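A common way to produce such decorrelated signals is to run the same material through filters with flat magnitude but randomized phase: the timbre survives, yet the copies no longer correlate. A minimal sketch of the principle (my own illustration, not a tool shown at the seminar):

```python
import numpy as np

def random_phase_decorrelator(x: np.ndarray, n_taps: int = 1024,
                              seed: int = 0) -> np.ndarray:
    """Decorrelate a mono signal with a random-phase allpass FIR filter."""
    rng = np.random.default_rng(seed)
    half = n_taps // 2
    # Unit magnitude everywhere, random phase, Hermitian-symmetric so the
    # impulse response is real. DC and Nyquist bins stay real-valued.
    phases = rng.uniform(-np.pi, np.pi, half - 1)
    spectrum = np.concatenate(([1.0], np.exp(1j * phases), [1.0],
                               np.exp(-1j * phases[::-1])))
    ir = np.fft.ifft(spectrum).real
    return np.convolve(x, ir)[: len(x)]

# One differently seeded decorrelator per loudspeaker channel yields
# mutually decorrelated copies of the same source material.
mono = np.random.randn(48_000)
channels = [random_phase_decorrelator(mono, seed=s) for s in range(4)]
```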
To illustrate these ideas, Lasse presented multi-channel organ recordings made in the Hofkirche Lucerne. Different organ registers were spatially distributed and individually mic’d—some directed from above, some from behind, and others straight ahead. This spatial strategy, combined with uncorrelated ambient material, resulted in a sonic image that felt rich, complete, and true to the complexity of the instrument.
Finally, Lasse underlined the urgent need for more education and training in the field of 3D audio—not only for sound engineers, but also for musicians and producers. It’s not just about technology, he said, but about developing a sensitivity to psychoacoustics and spatial composition. When these two elements—precise imaging and immersive space—come together, that’s when the magic happens.
I am currently studying Sound Design in my master’s program. As part of my final project, I am producing a concept album about addiction and dependency together with my band Flavor Amp. One part of the project is creating 3D audio versions of our songs from the concept album.
This session marked a very special milestone: it was the very first recording session in our newly built studio.
Although the construction is not 100% finished yet (more on that below), we decided to already start working creatively in the space.
Studio Situation
Since we haven’t found a final solution for treating the ceiling above the drum set yet, we quickly improvised: we packed leftover pieces of acoustic foam into a cargo net from a car trailer and strapped it to the ceiling. It’s definitely a temporary fix, but it helped to reduce some reflections from the ceiling.
The control room is also still a work in progress — the diffusers above the black absorbers haven’t been installed yet. We plan to add them in summer.
Song: Stand by – Flavor Amp
‘Stand By’ is our raw, emotional track about being trapped in a toxic relationship; the full background is described in the ‘Concept of the song’ section above.
(Current) Lyrics of the song – see ‘First drafts of the lyrics’ above.
Recording Setup
We recorded all 17 channels with my Midas M32-LIVE.
This was our patch plan:
CH1: Kick in (Audix D6)
CH2: Kick out (sE Electronics V-Kick)
CH3: Snare top (sE Electronics SE8)
CH4: Snare top (sE Electronics V7X)
CH5: Snare bottom (sE Electronics V-Beat)
CH6: Hi-hat (Shure SM7B)
CH7: Tom 1 (sE Electronics V-Beat)
CH8: Tom 2 (sE Electronics V-Beat)
CH9: OH hi-hat side (AKG C414)
CH10: OH ride side (AKG C414)
CH11: Ride (Neumann KM184)
CH12: Splashes (Neumann KM184)
CH13: Equal-distance mic (Shure SM57)
CH14: Mono room (Neumann TLM102)
CH15: Droom L (Neumann KM184) (A/B)
CH16: Droom R (Neumann KM184) (A/B)
CH17: Hall (outside the room) (sE Electronics SE8)
Before the session, I had a talk with Matthias Frank, who gave me some valuable input regarding microphone placement and recording techniques. At the end of the day, he advised me to close-mic as many individual components of the drum kit as possible, in order to have maximum flexibility during the mixing process — especially important for a complex 3D audio production.
We also worked with the overdubbing method to gain more control during the mixing and spatialization process. For example, during certain song parts, our drummer intentionally left out some cymbal hits while recording the main drum performance. We then recorded those cymbal accents separately, allowing us to freely position them in the 3D audio field later on.
Following this advice, we set up a wide range of microphones across the kit:
Kick Drum: Mic’d with two microphones — an Audix D6 inside and an sE Electronics V-Kick on the outside. Although I normally prefer a large-diaphragm condenser for the outside mic, using two dynamics turned out to be a great combination (I had no condenser mic left).
Snare Drum: We used three microphones. On top, a typical dynamic mic (V7X) alongside a small-diaphragm condenser (SE8) — the condenser captured more brightness and detail, but also more bleed, so I’ll decide during mixing which one fits best. On the snare bottom, we used the sE Electronics V-Beat.
Hi-Hat: Mic’d using a Shure SM7B.
Toms: We used the sE Electronics V-Beat on both toms.
Overheads: For the overheads, we used AKG C414s — a classic choice known for their clarity and detail.
Cymbals: The ride and the splashes were individually close-mic’d.
Room Miking: Inspired by German engineer Moses Schneider’s techniques, I experimented with the “Droom” (Dream Room) method. This involves two small-diaphragm condenser microphones (cardioid pattern in our case) placed in an A/B stereo setup but directed away from the drums, to capture a very natural and wide room sound. Although hypercardioid microphones are recommended for this method, the cardioids we used worked surprisingly well. Additionally, we set up a mono room microphone to capture the sound of the whole drum kit in our small room.
The Droom
Equal-Distance-Mic: We also used the so-called Equal-Distance-Mic. It’s a microphone placed centrally in the kit, heavily compressed to add punch and energy to the overall sound.
Equal distance mic
Stairwell reverb: Additionally, we placed five small-diaphragm condenser microphones in the stairwell outside the live room to capture a natural, distant reverb that adds spatial depth and emotional weight to the production. This setup was used specifically for a key transition in the song (2 bars) — moving from the breakdown into the final chorus — to sonically express the feeling of being trapped and relentlessly pursued by one’s surroundings.
Capturing the sound of the hall
At that point in the arrangement, the stereo panorama briefly expands, evoking a fleeting sense of escape, only to contract moments later into a confined, focused sound image — symbolizing the inability to truly break free. To reinforce this theme, I’m also considering adding a rotating movement to the sound elements in this section, echoing the chorus line: “I’m running in circles — I can’t stay.” This motion could enhance the sense of disorientation and emotional entrapment, both musically and conceptually.
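To audition that rotation before committing to it, the automation can be generated as simple (time, azimuth) pairs: one full circle over the two-bar transition. The sketch below uses a placeholder tempo and is not a Reaper or IEM API; the pairs would be drawn onto the encoder’s azimuth automation lane:

```python
# Generate (time, azimuth) automation points for one full rotation
# across a two-bar transition. Tempo and resolution are placeholders.
def rotation_automation(bpm: float = 120.0, bars: int = 2,
                        beats_per_bar: int = 4, points_per_beat: int = 4):
    """Yield (time_in_seconds, azimuth_in_degrees) pairs for one rotation."""
    total_beats = bars * beats_per_bar
    duration_s = total_beats * 60.0 / bpm
    n_points = total_beats * points_per_beat
    for i in range(n_points + 1):
        t = duration_s * i / n_points
        azimuth = 360.0 * i / n_points  # 0 deg = front, counter-clockwise
        yield round(t, 3), round(azimuth, 1)

for t, az in rotation_automation():
    print(f"{t:6.3f} s -> {az:5.1f} deg")
```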
Conclusion
The session was an important first step for both the project and the studio. Despite the room still being a work-in-progress, the recordings already sound very promising, and I’m excited to take the next steps in the production.
More updates on the studio construction and upcoming recording sessions will follow soon!
I was extremely tired while filming this, as we had been recording drums late into the night. So please excuse the slightly scattered way of speaking — it was a long but exciting session.
When I arrived at the presentation of Plane of Emergence at IRCAM, the setup looked surprisingly simple at first. On the floor, inside a black marked rectangle, were two small cube-like devices, standing quietly next to each other. A big screen behind them showed a live camera view of the scene. On the projection, I noticed a line connecting the two cubes, showing exactly how far apart they were. This was made possible by a motion-tracking camera mounted above, constantly measuring their positions.
The artist explained that these devices were not normal speakers or instruments, but autonomous machines. They were able to listen, react, and transform musical patterns based on how close or far they were from each other. There was no conductor or composer telling them what to play — everything emerged from their interaction alone.
While listening, I could feel how the soundscape was always shifting. Sometimes you could recognize small repetitive patterns, like a rhythm or a melody fragment. But just when you thought something stable was forming, it suddenly dissolved into something new. The artist described this as a balance between “territorialization” — when the devices settle into stable patterns — and “deterritorialization” — when they break free and surprise you with unexpected variations. It felt like watching two creatures communicating and constantly changing their language.
The idea behind it is inspired by the philosopher Deleuze and his concept of the plane of immanence — a space where things don’t follow strict rules but constantly create themselves from within. I liked that you could really hear this concept; it wasn’t just theory.
Technically, the system is based on a previous project called Spatially Distributed Instruments, in which the machines not only send sounds but also “listen” to each other without noticeable delay. The sound you hear is not pre-composed; it is created in real time from their relationship in space.
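As a thought experiment (explicitly not the artist’s actual algorithm, which wasn’t disclosed in detail), the core mechanic can be caricatured in a few lines: the tracked distance steers a single ‘stability’ value that decides whether a pattern repeats, i.e. territorialization, or mutates, i.e. deterritorialization:

```python
import random

def next_pattern(pattern: list, distance_m: float,
                 stable_below_m: float = 1.0) -> list:
    """Repeat the pattern when the devices are close; mutate it when far apart."""
    stability = max(0.0, 1.0 - distance_m / stable_below_m)
    return [note if random.random() < stability else random.randint(48, 72)
            for note in pattern]

pattern = [60, 62, 64, 67]  # a small MIDI-note motif
for distance in (0.2, 0.8, 1.5):
    pattern = next_pattern(pattern, distance)
    print(f"distance {distance} m -> {pattern}")
```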
Unfortunately, as the artist mentioned, only two of the planned interaction methods were working that day. But even with these limitations, it was fascinating to see (and hear) how rich and alive the system already was.
For me, it was less like watching a performance and more like observing a small ecosystem made of sound and technology.