Storytelling Affordances
A theoretical framework for creating spatial stories
Abstract
The term “spatial storytelling” describes any narrative that uses audio and visual information to place the audience inside the scene.
Each spatial storytelling medium can be charted along an axis of audience agency, from passive observer to active participant. Each medium along this spectrum has its own set of affordances. The most compelling mediums for storytelling (spatial or otherwise) seem to be those that allow narrative transportation, in which the audience loses their sense of self and becomes immersed in the story. Finding a sweet spot of affordances is vital to creating an experience that is truly immersive.
This paper charts spatial storytelling mediums according to the affordances each gives to its audience. It then discusses several examples of spatial stories and the success or failure of each affordance. Next, it describes the process of shooting and creating a spatial story in 3 Degrees of Freedom (DoF) 180-degree Virtual Reality (VR), along with its unique challenges. Finally, it outlines a narrative “grammar,” or set of guidelines, for effective spatial storytelling.
Review of Affordances in Apple Vision Pro's Submerged
Spatial Medium Affordance Spectrum
Different mediums grant the creator one set of affordances and the audience another. For example, in 2-dimensional cinema, the camera lens affords the director the ability to focus the entire frame on one subject and to hide, blur, or frame out everything else. But this comes with a corresponding loss of affordances for the audience, who may now look only where the director directs them. The chart below is concerned exclusively with the affordances granted to the audience, as these ultimately determine narrative transportation.
A correlation is obvious in the chart: narrative transportation decreases as audience affordances increase. Stories are drowning in affordances.
Prototype Spatial Story: Launder
To apply the lessons from the analysis above, we created a 180-degree 3D VR short film called Launder.
Premise
One roommate is cajoled by the other to get their laundry from the creepy basement. They become convinced that they’re being followed and barely make it back to the apartment in one piece.
Successful Affordances
Launder built upon the successful affordances of Faceless Lady and Submerged, such as cuts, framing the action in the stereo 3D sweet spot, empathetic camera angles that showed the audience only views that the characters themselves could see, and following genre conventions.
Door cuts. In addition, Launder addressed specific affordance failures in Submerged, like bringing the audience through doors before they were closed to create a congruent sense of space.
Leading shots. To solve Submerged’s shot-reverse-shot issues, the audience was placed in a scary environment only when necessary to increase narrative immersion, and only as the characters themselves entered that environment.
Unsuccessful Affordances
Static camera. Gear limitations required the camera to be on a tripod instead of moving, which led to a less dynamic sense of 3D space and created a sense of “stuckness” in the audience, leaving them less included in the scenes than if the camera had been free to move slowly and intentionally.
Non-spatial audio. Many spatial cues come from audio, but spatial audio was beyond the scope of this test. This led to a disconnect between the rich 3D 180-degree world seen visually and the “flat” stereo sound, which does not place the characters’ voices on their bodies, undermining the sense of reality and narrative transportation.
Vertical displacement. Filming 180-degree stereo is fraught with difficulties. Two cameras with two separate lenses, and often two separate sensors, must record wide-angle, high-resolution images 30 to 60 times per second, and must do so at the exact same instant, or else the audience’s left and right eyes will see images slightly offset in time, which is problematic when the subject is moving. The cameras must also aim directly ahead in an orthographic plane (Vienne et al., 2016; Bourke, tk, https://paulbourke.net/stereographics/stereorender/), or else the two planes perceived by the left and right eyes will misalign along the vertical axis. Correcting this in post-production is time-consuming and tedious, despite the availability of numerous tools such as Mistika VR.
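The cost of imperfect synchronization can be estimated with simple geometry. The Python sketch below is illustrative only and was not part of the Launder workflow. It assumes two free-running cameras whose nearest frames can differ by up to half a frame interval, and a subject moving across the frame at a fixed distance. The function names and the example values (a subject walking at 1.5 m/s, 2 m from the camera) are hypothetical.

```python
import math


def worst_case_offset_s(fps: float) -> float:
    """Largest timing gap between the nearest frames of two
    unsynchronized cameras running at the same frame rate.
    Assumption: pairing nearest frames bounds the gap at half a frame."""
    return 0.5 / fps


def sync_error_degrees(speed_m_s: float, distance_m: float, offset_s: float) -> float:
    """Apparent angular shift between the two eyes' images of a subject
    moving across the frame, caused only by the timing offset."""
    shift_m = speed_m_s * offset_s  # how far the subject moved between exposures
    return math.degrees(math.atan2(shift_m, distance_m))


if __name__ == "__main__":
    # Hypothetical subject: walking pace, 2 m from the camera.
    for fps in (30, 60):
        err = sync_error_degrees(1.5, 2.0, worst_case_offset_s(fps))
        print(f"{fps} fps: up to {err:.2f} degrees of false disparity")
```

Under these assumptions the error halves when the frame rate doubles, and it grows as the subject moves faster or comes closer to the camera.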
Spatial Storytelling Affordances for Creators
Below is a list of spatial storytelling affordances that creators can use to increase narrative transportation in their viewers. The list is non-exhaustive and will grow as the mediums evolve.
Successful affordances
Cuts
Cut between shots as often as every 4 seconds, or hold a shot for as long as several minutes
Longer cuts afford the audience more time for narrative transportation into the scene
Shorter cuts afford the audience more intuitive understanding of the space
Avoid shot-reverse-shot cutting between characters in dialogue
Resolution
Shoot in 8K 180-degree stereo at 60 frames per second, or as close as the gear allows
Narrative transportation increases with the fidelity of the world portrayed
Level camera angle
Orient camera directly level at the horizon
Stereo Sweet Spot
Frame all action in the central 45° of the camera’s forward direction
Stereo effect is most comfortable in this range, creating a “frameless frame”
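As a planning aid, a blocking diagram can be checked against this guideline. The following Python sketch is one possible reading, not a production tool: it interprets “central 45°” as a cone extending 22.5° to either side of the camera’s forward axis (an assumption, since the guideline could also be read as 45° per side), and it uses a hypothetical camera-space convention of x to the right, y up, and z forward, in metres.

```python
import math

# Assumption: "central 45 degrees" means 22.5 degrees on either side of forward.
SWEET_SPOT_HALF_ANGLE_DEG = 22.5


def off_axis_angle_deg(x: float, y: float, z: float) -> float:
    """Angle between the camera's forward (+z) axis and the point (x, y, z)."""
    return math.degrees(math.atan2(math.hypot(x, y), z))


def in_sweet_spot(x: float, y: float, z: float,
                  half_angle_deg: float = SWEET_SPOT_HALF_ANGLE_DEG) -> bool:
    """True if a point in camera space falls inside the assumed sweet-spot cone."""
    return off_axis_angle_deg(x, y, z) <= half_angle_deg


if __name__ == "__main__":
    # Hypothetical actor marks, in metres relative to the camera.
    marks = {"center": (0.0, 0.0, 2.0), "near edge": (0.5, 0.0, 2.0), "too wide": (1.0, 0.0, 2.0)}
    for name, pos in marks.items():
        print(name, in_sweet_spot(*pos))
```

Points behind the camera yield an off-axis angle greater than 90° and are therefore always reported as outside the sweet spot.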
Camera movement
Move the camera laterally at a constant velocity
A rule of thumb is the Train Platform Effect: if you were a passenger in a vehicle that started moving at this speed, would you know without looking out the window?
Do not rotate the camera on any axis.
Backaway reveal
Intentionally withhold information about what exists behind the camera
Either direct all action in front of the audience, or gradually reveal what was beyond the frame to increase tension
Empathetic perspective
Show the audience only what the character themselves can see
Dramatic Irony perspective
Show the audience what the characters cannot see
6 DoF Movement
Provide six degrees of freedom to allow the audience to gain more depth cues from the scene than mere stereo imaging
AI depth maps or game-engine generated images can provide 6 DoF
Acting/Dialogue
Ensure all dialogue sounds like real people speaking naturally in real time.
Direct acting to be natural, which falls somewhere between cinematic (facial subtleties) and theatrical (bodily exaggerations)
Episodes
Break all stories into chunks of twenty minutes or less to limit eye and neck strain
Genre
Follow genre conventions to allow the audience to focus on storyline and character
Affordances to avoid:
Interaction
Picking up items, completing tasks, choosing dialogue, and moving through space all increase world immersion at the expense of narrative transportation
Leading shots
Moving the audience backwards must be done carefully to avoid placing them in a setting whose emotional tone differs from the character’s
Uncomfortable stereo imaging
Minimize vertical and horizontal displacement
Close-ups
Honor conventions around personal space, and keep characters as far from the camera as they would comfortably stand from a stranger in the real world
Non-level camera angles
Aiming the camera anywhere but forward makes the world feel tipped, rather than making the audience feel like they are looking down.
Panning
Rotating the camera at all around the vertical axis induces dizziness immediately.