03.10.: Conclusion and Outlook

Without wanting to pat myself on the back too much, I think a lot has really happened over these 17 blog posts: from a completely new topic to a clear idea, with the literature at hand, basically just waiting to be devoured and written up by me.

By now I have settled in well in Cyprus, and I would say I have eased into the Erasmus experience quite nicely, if you can put it that way. Now, though, it is time to really get going, so that I don't end up spending the whole beautiful summer here in Cyprus in front of my laptop, but instead put in the hard work now. With the “ASC Bible” and “Painting with Light”, I have already devoured and summarized, over the course of these blog posts, two of the six books I want to use as core literature for my master's thesis. The other four are already lying next to me on my desk, waiting for a day on which the local Erasmus organization doesn't shamelessly lure me to its frat parties with free beer. That day will come, I firmly believe it!

In addition, with the final submission of the exposé in mind, I have given more detailed thought to my approach to the film analysis. Even though the first step will be summarizing the literature, I believe a good exposé above all needs foresight; besides, that makes the actual work much easier, at least in my view. For this I have already bought two e-books (having physical books shipped to my Cypriot apartment somehow seemed too cumbersome): the fifth edition of “Film- und Fernsehanalyse” by Knut Hickethier and the thirteenth edition of “Film Art: An Introduction” by Bordwell, Thompson, and Smith. According to my research, these are the German and the international standard work on film analysis, respectively. They naturally cover far more than I need, but nothing stops me from using only the chapters “Zur Analyse des Visuellen” and “The Shot: Mise-en-Scène”, respectively, so as not to exceed the scope of the thesis and to really focus on what I want to examine: the analysis of lighting.

Together, those chapters already amount to more than 100 pages, from which it should be possible to build a solid, academically grounded analysis matrix that I can then work through film by film.

With that (or so I believe right now), I have a pretty good roadmap for the coming weeks and months, which I now basically just have to execute. This pleases me particularly because I'm not really the plan-ahead type, and as I write these lines I'm a little surprised at myself.

In summary, this blog series has genuinely helped me get to where I am now: it forced me to engage early enough with the search for literature, and with the literature itself, so that I'm already at a good point from which I can work calmly. Provided the Cypriot weather lets me 😉

With that, I wish everyone reading these lines a pleasant and hopefully stress-free thesis-writing summer!

Blümel out.

Impulse 8: O'Sullivan's Breakdown of “Oppenheimer”

In a few of the impulses for this blog series I have already analyzed films and tried to work out how and why the filmmakers made certain decisions. Even then it was clear to me that if I want to do similar analyses in my master's thesis (and I do), I need an approach that is at least more structured (ideally even academically standardized), so that my results are consistent across films and comprehensible to outside readers. This suspicion was confirmed at the latest in my conversation with Ursula Lagger, which is why, for the last impulse of this blog series, I wanted to take a quick look at how the Wandering DP structures his film analyses (or breakdowns). To that end, I watched his Patreon breakdown of the film Oppenheimer.

Approach

O'Sullivan picks out roughly 10 to 15 scenes from the film in advance and discusses them in (in this case) 30 minutes. He focuses above all on three things: blocking, i.e., where the characters stand in the space and how they relate to each other; camera position, i.e., which side of the line of action we are on and why; and lighting, i.e., where the light comes from, what quality it has, and why.

He does this without a real guideline, however; it is not standardized in the way a master's thesis would be, working through the same catalog shot by shot. Instead, he concentrates on what he considers most important in a given shot, for example why we are on the right side here rather than the left, and then explains that choice with all its advantages and disadvantages.

I think I can take a fair amount of this into my master's thesis. What he pays attention to is exactly what I want to concentrate on in my analyses: not an analysis of the plot, the characters, or the like, but an analysis of all the factors that determine how the image ultimately looks, not what happens in it. For that, however, I need a stricter approach and clear rules in order to bring my work up to an academic standard.

Conclusion – Reflections on Immersive Music Production

This project set out to explore how immersive audio formats can be used as an integral part of music production rather than as an additional or purely technical layer. Over the course of the project, it became clear that working in 3D audio fundamentally affects compositional, arrangement-related, and production decisions. Spatial considerations do not emerge only at the mixing stage, but influence songwriting, recording strategies, and performance choices from an early point onward.

A central insight of the project is that spatial width and motion are most effective when used deliberately and in contrast. Excessive or constant spatial expansion can reduce musical impact, whereas controlled changes in spatial density and focus can significantly enhance the perceived energy of specific song sections. In this context, immersive audio proved particularly valuable for shaping structural contrasts, clarifying arrangements, and reducing perceptual masking through spatial distribution rather than aggressive spectral processing.

From a technical perspective, the comparative use of Ambisonics and Dolby Atmos workflows provided valuable insights into different production philosophies. Ambisonics offered a flexible and performance-efficient environment for exploratory spatial work, while Dolby Atmos proved especially practical for structured production workflows and distribution on current streaming platforms. Neither approach emerged as universally superior; instead, their strengths depended on artistic intent, playback context, and production requirements.

Overall, the project demonstrates that immersive audio can serve as a meaningful compositional and narrative tool in contemporary music production—provided that spatial decisions remain grounded in musical intention and listener perception. Rather than treating 3D audio as a novelty, this work argues for its thoughtful integration as an expressive dimension that supports, rather than overshadows, the music itself.

Acknowledgements

I would like to sincerely thank Alois Sontacchi for his continuous support throughout this project. Our discussions were consistently insightful and inspiring, not only in relation to this work, but also beyond its immediate scope. A special thanks also goes to Benjamin Pohler, who was always available for short (or longer) conversations and quick exchanges of ideas.

Workflow Comparison: Ambisonics vs. Dolby Atmos

Based on practical experience gained throughout the project, both workflows revealed distinct strengths and limitations that influenced artistic decisions, technical handling, and playback outcomes.

One noticeable difference concerned vertical spatial resolution. In the Ambisonics workflow, access to a continuous vertical sound field allowed for more flexible and coherent vertical movements. In contrast, the Dolby Atmos setup used in this project did not include a top center speaker. This limitation became particularly apparent in sections where vertical motion played a structural or emotional role, such as moments where sound elements were intended to move upwards. During playback in the Cube, the difference was emphasized further, as the upper loudspeaker layer consists of five speakers that could not all be addressed using the chosen Dolby Atmos configuration.

Despite this limitation, the Dolby Atmos workflow proved to be highly efficient and reliable. The integration of the Dolby Atmos Renderer directly into Cubase and Nuendo allowed for seamless monitoring across different loudspeaker layouts, as well as quick evaluation of stereo downmixes and binaural renders. This level of integration significantly simplified workflow management and made it easy to check translation across formats within a familiar DAW environment.

In comparison, working with Ambisonics in Reaper was considerably more performance-efficient. Even with large sessions consisting of 120 to 150 tracks, CPU usage remained comparatively low. The IEM Plugin Suite offered a powerful and intuitive toolset for spatial encoding and decoding, reverberation, and sound design, enabling many creative possibilities with minimal system load. This made Ambisonics particularly suitable for exploratory work and complex spatial experimentation.

Another key difference lay in signal organization and processing philosophy. The Ambisonics workflow encouraged early grouping and encoding strategies. The Dolby Atmos workflow, on the other hand, offered greater flexibility for multichannel summing and corrective processing at the subgroup level, particularly through the use of multichannel-capable plugins. While both approaches were effective, they led to different working habits and influenced how spatial and tonal decisions were made during mixing.

From a distribution perspective, the Dolby Atmos workflow proved to be more practical. At the time of writing, immersive music releases on major streaming platforms require delivery in the ADM format. Working directly within a Dolby Atmos environment allows for a straightforward ADM export that aligns with current industry standards for music distribution. This made the Dolby-based workflow particularly suitable for release-oriented productions, whereas Ambisonics workflows typically require additional conversion steps before meeting platform-specific delivery requirements.

Overall, neither workflow proved universally superior. Instead, each approach offered specific advantages depending on artistic intent, technical requirements, and playback context. The comparative use of both workflows throughout the project contributed significantly to a deeper understanding of immersive music production practices.

Practical Limitations and Session Transfer Issues

Although not directly related to the spatial workflows themselves, practical challenges arose during the transfer of sessions to the production studio system. Due to compatibility issues between different versions of the FabFilter plugins (notably Pro-Q 3 and Pro-Q 4), session interchange became unexpectedly time-consuming.

Sessions created with older plugin versions could not be opened using newer versions, and vice versa. Attempts to work around this limitation, such as using user presets, were unsuccessful, requiring all equalization settings to be recreated manually. This significantly increased preparation time and highlighted an often-overlooked aspect of production workflows: plugin version compatibility across different systems.

EAR Production Suite Experiments

As part of the ongoing series on spatial mixing approaches in practice, this post focuses on experimental tests conducted with the EAR Production Suite (EPS). These experiments were carried out at a late stage of the project and aimed to explore alternative ADM-based playback and conversion workflows.

These tests took place during the weekend prior to the final presentation, which significantly limited the available time for extended troubleshooting and deeper investigation.

The EAR Production Suite is a set of VST plugins developed by BBC R&D and IRT under the EBU, designed to enable immersive audio production using the Audio Definition Model (ADM). It allows for importing, exporting, and monitoring ADM content for various loudspeaker configurations based on ITU-R BS.2051, using the ITU ADM Renderer. The suite is primarily optimized for Reaper and serves as a reference implementation for ADM-based workflows[1].

Using the EAR Production Suite, I tested alternative playback and conversion approaches, including rendering ADM content into Ambisonics formats. However, during these tests, unexpected behavior occurred, such as excessive spatial spread and routing inconsistencies. Resolving these issues would have required more extensive investigation and testing.

Due to limited working time in the Cube and the need for a fail-safe playback solution, I ultimately decided against further experimentation with the EAR Production Suite in this context. Instead, the fully channel-based rendering approach, as mentioned before, was chosen for all listening examples used in the presentation.


[1] “EAR Production Suite,” accessed February 6, 2026, https://ear-production-suite.ebu.io/.

Dolby Atmos – Workflow Comparison and Technical Reflection

Continuing the series on spatial mixing approaches in practice, this post focuses on the Dolby Atmos workflow I used for Alter Me and Caught In Dreams, and on the practical steps taken to prepare ADM exports and playback in the IEM Cube.

For the Dolby Atmos productions, I decided to work in Cubase and Nuendo, as the Dolby Atmos Renderer is already fully integrated into both environments. This allowed for a streamlined workflow without the need for external rendering tools[1].

After completing the stereo mixes of Alter Me and Caught In Dreams to an advanced stage, the sessions were converted into Dolby Atmos projects. Cubase provides an automated conversion process in which all existing tracks are initially routed into a standard bed configuration.

For my workflow, I used the standard bed primarily for reverberation; I also used an Ambisonics bus with the Room Encoder and the FDN Reverb as a reverb send. Since the standard bed in Dolby Atmos is limited to a maximum configuration of 7.1.2, I deliberately avoided placing direct sound sources in this bed. Instead, I created a so-called object bed: 11 objects placed at the exact positions of the loudspeakers in the production studio, which in my case meant the 7.1.4 configuration of the IEM production studio.

Routing signals into this object bed allowed me to address individual loudspeakers, provided that the loudspeaker positions were correctly defined. While this spatial correspondence was largely accurate in the production studio, minor deviations remained due to differences between the virtual speaker layout and the physical setup (for example, top speakers mounted at a higher elevation).
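To make the object-bed idea concrete, here is a minimal sketch of how the eleven speaker-locked objects could be written down as coordinates in a normalized room cube. The axis conventions and exact values are my assumptions for illustration only; in the actual session the positions were set in the object panner to match the studio layout.

```python
# Sketch: hypothetical positions for an "object bed" matching a 7.1.4 layout,
# expressed in a normalized room cube. Assumed conventions (illustration only):
# x in [-1, 1] left -> right, y in [-1, 1] back -> front, z in [0, 1] floor -> ceiling.

SPEAKER_OBJECTS_7_1_4 = {
    # main layer (z = 0)
    "L":   (-1.0,  1.0, 0.0),
    "R":   ( 1.0,  1.0, 0.0),
    "C":   ( 0.0,  1.0, 0.0),
    "Lss": (-1.0,  0.0, 0.0),  # left side surround
    "Rss": ( 1.0,  0.0, 0.0),  # right side surround
    "Lrs": (-1.0, -1.0, 0.0),  # left rear surround
    "Rrs": ( 1.0, -1.0, 0.0),  # right rear surround
    # height layer (z = 1); the LFE channel is deliberately not an object
    "Ltf": (-1.0,  0.5, 1.0),  # left top front
    "Rtf": ( 1.0,  0.5, 1.0),  # right top front
    "Ltr": (-1.0, -0.5, 1.0),  # left top rear
    "Rtr": ( 1.0, -0.5, 1.0),  # right top rear
}

# 7 main-layer speakers + 4 height speakers = 11 objects, LFE excluded
assert len(SPEAKER_OBJECTS_7_1_4) == 11
```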

Subgroup structure and processing

In addition to the object-based routing, I made extensive use of subgroups. Instrument groups such as drums, guitars, and vocals were routed into dedicated multichannel buses. For example, the drum signals were routed into a 7.1.4 drum bus, allowing for internal panning decisions as well as group-based processing.

Within these subgroup buses, summing and tonal shaping were carried out using multichannel-capable plugins, primarily from the FabFilter suite. Compared to the Ambisonics workflow, this approach provided greater flexibility for summing and corrective processing at the group level, while the overall structural logic of the routing remained similar.

Signals involving pronounced movement or spatial automation were routed directly to objects. In cases where a sound source only changed position briefly within a song, the track was often routed into the object bed and automated using the track’s multipanner rather than being continuously treated as a Dolby Atmos object.

LFE handling

The Low Frequency Effects (LFE) channel was deliberately not used in this workflow. Although the LFE channel is part of the standard Dolby Atmos workflow, it is often omitted in music production. Excluding it also kept the separation between the standard bed and the object bed clear, since any signal intended to address the LFE channel must be routed through the bed. This decision helped maintain a clean and predictable routing structure.

Export and playback preparation for IEM CUBE

At the end of the production process, an ADM file was rendered directly from Cubase. For playback preparation in the Cube, several approaches were tested with the goal of ensuring a stable and reliable setup for the final presentation of this project.

The ADM file was imported into Nuendo and up-rendered to a 9.1.6 configuration. At the time of production, I was not aware that the Nuendo version installed in the production studio also supported a 9.1.6 setup; in retrospect, creating the object bed directly in 9.1.6 would have been the more precise solution.

The up-rendered 9.1.6 mix was then exported as a channel-based 16-channel WAV file. This file was routed manually and directly to the corresponding loudspeakers in the Cube, ensuring full control over playback and eliminating potential uncertainties related to rendering or decoding behavior.
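Before patching such a file channel by channel onto the Cube's loudspeakers, a quick automated check can rule out export mistakes. Below is a minimal sketch in Python using NumPy and soundfile; the file name is a placeholder, and the actual channel order depends on the Nuendo export settings.

```python
# Sketch: sanity-checking a channel-based 16-channel WAV export (9.1.6)
# before manually routing each channel to its loudspeaker in the Cube.
import numpy as np
import soundfile as sf

data, sample_rate = sf.read("mix_9_1_6.wav")  # data shape: (frames, channels)
assert data.ndim == 2 and data.shape[1] == 16, f"expected 16 channels, got shape {data.shape}"

# Per-channel peak levels in dBFS help spot silent or clipped channels.
peaks = np.max(np.abs(data), axis=0)
for ch, peak in enumerate(peaks, start=1):
    level_db = 20 * np.log10(peak) if peak > 0 else float("-inf")
    print(f"channel {ch:2d}: peak {level_db:7.2f} dBFS")
```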


References:

[1] “Getting Started in Dolby Atmos with Steinberg Cubase and Nuendo,” accessed February 8, 2026, https://professionalsupport.dolby.com/s/article/Getting-Started-in-Dolby-Atmos-with-Steinberg-Cubase-and-Nuendo?language=en_US.

Ambisonics – Workflow Comparison and Technical Reflection

Ambisonics Workflow

When it came to mixing in 3D audio, I decided to begin my first immersive mixing experiments using Ambisonics in Reaper rather than Dolby Atmos. This decision was mainly influenced by the IEM Plugin Suite, which provides intuitive and flexible tools for Ambisonics mixing and made the initial entry into 3D audio more accessible.

I chose to work with fifth-order Ambisonics for this project to achieve a more accurate and immersive rendering of diffuseness, spaciousness, and spatial depth. While first-order Ambisonics might seem sufficient due to the even nature of diffuse sound fields, in practice its low spatial resolution leads to high directional correlation during playback, which significantly impairs the perception of these spatial qualities. Higher-order Ambisonics, in contrast, improves the mapping of uncorrelated signals and preserves spatial impressions much more effectively. Psychoacoustic research has shown that an Ambisonic order of three or higher is required to perceptually preserve decorrelation between neighboring loudspeakers, which is crucial for rendering depth and diffuseness. Fifth-order Ambisonics further enhances this, particularly outside the sweet spot, providing a more consistent spatial experience across a larger listening area. As demonstrated in the IEM CUBE, a fifth-order system allows nearly the entire horizontal listening plane, in this case a 12 × 10 m concert space, to become a valid and perceptually plausible playback zone [1].

Thus, fifth-order Ambisonics is not only a practical choice for immersive production in larger spaces, but it also strikes an effective balance between spatial resolution, technological complexity, and perceptual benefit [2].
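One practical consequence of the order choice is channel count: a full-sphere Ambisonic signal of order N carries (N + 1)² channels. A quick sanity check in plain Python:

```python
# Channels required for a full-sphere Ambisonics signal of order N
def ambisonic_channels(order: int) -> int:
    return (order + 1) ** 2

for order in (1, 3, 5):
    print(f"order {order}: {ambisonic_channels(order):2d} channels")
# order 1:  4 channels
# order 3: 16 channels
# order 5: 36 channels
```

This is also why the routing strategy described below matters: every fifth-order bus is 36 channels wide, so keeping processing in mono or stereo before the encoder keeps plugin instances comparatively cheap.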

I also had the opportunity to experience this myself during a small listening test we conducted with Matthias Frank. We listened to first-, third-, and fifth-order Ambisonics in a blind comparison and were asked to rate certain spatial parameters like spatial depth or localization. The first order was quite easy to identify due to its limited spatial resolution. However, distinguishing between third- and fifth-order Ambisonics proved to be much more challenging, as the differences were often subtle and less immediately perceptible.

After that, I started setting up the routing, which was one of the most underestimated parts of this project. Similar to a traditional stereo production, I created a structure of groups and subgroups, but adapted it for Ambisonics. For example, in the drum section, encoding happens at the main drum group via the IEM Multi Encoder. All individual channels are routed into that group, allowing me to process them using conventional stereo plugins before spatializing them, saving CPU resources while maintaining flexibility in the early mixing stages.

Within the drum routing, I created subgroups for kick, snare, overheads and the “Droom”, allowing for finer control and processing. When dealing with coherent signals, such as double-tracked guitars, I first routed both signals (panned hard L & hard R) into a stereo group to conserve CPU power by processing them together. This group is then routed into a master guitar group that handles Ambisonics encoding. Since the L and R signals remain separated, they can still be treated independently in the encoder and placed individually in the 3D field.
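To make the encoding step itself less abstract, here is a minimal sketch of what an Ambisonics encoder does, shown at first order for brevity; the IEM Multi Encoder performs the same projection onto spherical harmonics, but up to fifth order and for many sources at once. The function name, conventions (ACN channel order, SN3D normalization), and test signal are my own illustrative choices.

```python
# Sketch: encoding a mono source into first-order Ambisonics (ACN/SN3D).
# Azimuth is measured counterclockwise from the front, so +90° is hard left.
import numpy as np

def encode_foa(mono: np.ndarray, azimuth_deg: float, elevation_deg: float) -> np.ndarray:
    """Return a (frames, 4) array with channels W, Y, Z, X (ACN order)."""
    az, el = np.radians(azimuth_deg), np.radians(elevation_deg)
    gains = np.array([
        1.0,                      # W: omnidirectional component
        np.sin(az) * np.cos(el),  # Y: left-right
        np.sin(el),               # Z: up-down
        np.cos(az) * np.cos(el),  # X: front-back
    ])
    return mono[:, np.newaxis] * gains[np.newaxis, :]

# Example: one of the double-tracked guitars, placed hard left at ear height
guitar_left = np.random.randn(48000)  # placeholder signal, 1 s at 48 kHz
foa = encode_foa(guitar_left, azimuth_deg=90.0, elevation_deg=0.0)
```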

I followed the same approach with vocals, organizing them into groups before routing them into the Multi Encoder. For specific adlibs, I used the Granular Encoder to create glitchy, scattered spatial effects.

To add a sense of depth and immersion to the vocals, I used a small amount of FDN Reverb for diffuse reverberation and the Room Encoder for early reflections — all plugins from the IEM Suite.

Finding this optimal signal flow took considerable time and experimentation. It was a major learning process to understand how to best structure a large session for Ambisonics.


References

[1] Franz Zotter and Matthias Frank, Ambisonics: A Practical 3D Audio Theory for Recording, Studio Production, Sound Reinforcement, and Virtual Reality, Springer Topics in Signal Processing (Springer International Publishing, 2019), 19:18–20, https://doi.org/10.1007/978-3-030-17207-7.

[2] Zotter and Frank, Ambisonics, 19:18–20.

Workflow Comparison and Technical Reflection

As part of the ongoing series on spatial mixing approaches in practice, this post shifts the focus from artistic decisions to a technical reflection on the workflows used throughout the project. The following sections outline how different immersive production approaches influenced working methods, creative flexibility, and playback outcomes.

Workflow Overview

This chapter outlines the different production and mixing workflows used throughout the project. While all recordings were carried out using the same studio environment and similar recording setups, two distinct immersive audio workflows were applied during the course of the project.

The first workflow is based on Ambisonics and reflects my initial approach to immersive music production. This workflow was primarily explored during the production of Standby and served as an entry point into working beyond stereo formats.

As the project progressed, a second workflow based on Dolby Atmos was introduced and applied to the subsequent tracks Alter Me and Caught In Dreams. This shift allowed for a comparative evaluation of both approaches in terms of practical handling, artistic possibilities, and production implications.

All projects had about 120–150 individual tracks. Recording was carried out using Cubase and Reaper, depending on the session requirements. Ambisonics mixing was performed in Reaper, while Dolby Atmos productions were realized using Cubase 15 and Nuendo 13. The following blog entries describe both workflows separately, focusing on their respective structures and characteristics.

Motion and Vertical Movement as Structural Tools – Spatial Mixing Approaches in Practice

Continuing the series on spatial mixing approaches in practice, this post focuses on two spatial strategies applied in Caught In Dreams that intentionally challenge listener perception. Both examples explore motion and verticality as expressive devices and examine their role as structural and narrative tools within immersive music production.

Motion as Creative Risk

An experimental spatial decision was made during a two-bar drum fill preceding the second chorus. In this section, the drum signal is rotated around the listener. This moment coincides with the lyric “turning nights into nightmares” and was intended to briefly destabilize the listening perspective.

This decision was approached deliberately as a creative risk. While the movement can be perceived as engaging and expressive, it also raises questions regarding distraction and musical focus. The example was included to provoke reflection on how much spatial motion is appropriate within groove-based music and where the boundary between expressive effect and overuse may lie.
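For anyone wanting to reproduce the effect, the rotation is essentially an azimuth automation spanning the length of the fill. Here is a minimal sketch that generates such automation values; the tempo and point density are placeholders, not the actual session values.

```python
# Sketch: azimuth automation for one full rotation around the listener
# over a two-bar drum fill. BPM and resolution are placeholder assumptions.
BPM = 120
BEATS = 8            # two bars of 4/4
POINTS_PER_BEAT = 4  # automation density

duration_s = BEATS * 60.0 / BPM
n_points = BEATS * POINTS_PER_BEAT

for i in range(n_points + 1):
    t = i / n_points * duration_s
    azimuth_deg = (i / n_points) * 360.0 % 360.0  # one full turn
    print(f"t = {t:5.2f} s -> azimuth {azimuth_deg:6.1f} deg")
```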

Vertical Movement as Formal Break

A further spatial strategy occurs during a short bridge following the second chorus. This section represents a moment of realization, expressed in the lyrics “I woke up and realized it was just a dream.” At this point, multiple elements—including ride cymbals, guitars, and vocals—are shifted upward in the vertical dimension.

This vertical movement functions as a formal break rather than a continuous effect. After this section, the mix collapses back toward a more frontal and dry presentation, reintroducing a mono-oriented guitar similar to the intro. The contrast emphasizes the narrative shift and prepares the listener for the final section of the song.

The spatial strategies discussed above were realized using two different immersive audio workflows. The following blog posts provide a comparative reflection on these workflows and their implications for music production and playback.

Reduced Masking Through Spatial Placement – Spatial Mixing Approaches in Practice

Caught In Dreams

As part of the ongoing series on spatial mixing approaches in practice, this post shifts the focus from Alter Me to the second track discussed in detail: Caught In Dreams. The following sections outline the song’s emotional context and a key spatial mixing strategy applied during its production.

Song Context and Emotional Arc

Caught In Dreams addresses the realization that certain dreams and ideals can become dangerous illusions. The song reflects a gradual loss of grounding driven by the desire for more, leading to a feeling of being trapped within one’s own expectations. While the track maintains a dreamy and indie-inspired character, it also aims to confront the listener with the consequences of losing balance and perspective.

Reduced Masking Through Spatial Placement

A central advantage of immersive mixing in Caught In Dreams lies in the increased spatial capacity compared to stereo production. By distributing sound sources across multiple loudspeakers rather than concentrating them within a left–right panorama, significantly more space is available. This spatial separation reduces the need for aggressive EQing and helps to minimize masking between competing elements.

As a result, overlapping frequency ranges—for example in the low-mid region—become less problematic, as spatial separation supports perceptual differentiation between sources.

The use of a dedicated center speaker further contributes to this effect. Unlike a phantom center, which relies on equal energy from the left and right channels, a discrete center channel allows the lead vocal to be placed alone in one speaker. This reinforces intelligibility and reduces interference with other centrally positioned elements.
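The difference is easy to state in signal terms. Below is a minimal sketch, assuming a constant-power pan law; the signal and variable names are illustrative only.

```python
# Sketch: phantom center vs. discrete center for a lead vocal.
import numpy as np

vocal = np.random.randn(48000)  # placeholder mono vocal signal

# Phantom center: the same signal at equal level in L and R,
# attenuated by ~3 dB per channel under a constant-power pan law.
gain = 1.0 / np.sqrt(2.0)
phantom_L, phantom_R = vocal * gain, vocal * gain

# Discrete center: the vocal occupies the center speaker alone,
# leaving L and R free for other centrally positioned elements.
center = vocal
```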

A direct comparison between the stereo vocal mix and the immersive version demonstrates that the 3D mix achieves a more open vocal sound with reduced masking, not primarily through equalization, but through spatial distribution. This example highlights how immersive audio can create mix clarity by reallocating elements in space rather than by removing frequency content.