This post keeps track of my progress in the Embodied Interaction course at Aalto University.
Seeking a Foundation for Context-Aware Computing
- Ubiquitous Computing (Weiser)
- Tangible Bits (Ishii)
"[...] opportunities to tie computational and physical activities together in such a way that the computer “withdraws” into the activity, so that users engage directly with the tasks at hand and the distinction between “interface” and “action” is reduced." - "the idea that the world is the interface"
- "presence and participation in the world, real-time and real-space, here and now"
- "participative status, the presence, and occurrentness of a phenomenon in the
- embodiment is about establishing meaning.
Bret Victor – A Brief Rant on the Future of Interaction Design
Responses and Follow-Ups to the Text Above
An imaginary drum: each surface around the player has its own unique timbre, and the sound can be manipulated in real time.
Inspo: "Ear to the ground" - John Sanborn
The presentation of the project can be found here.
Ideas for the Final Project
Inspired by Infinity, I would like to create a generative monster for each interactant. Interactants see these randomly generated characters on the projected surface as if seeing themselves. The randomization mainly combines 3D assets to create a unique character each time a new onlooker comes into the picture.
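The combinatorial idea above can be sketched in a few lines. The asset categories and part names here are placeholders I made up for illustration; in practice they would be references to actual 3D model files.

```python
import random

# Hypothetical asset library; in the installation each entry would point to a 3D asset.
ASSET_LIBRARY = {
    "body":  ["blob", "cube_stack", "spiral"],
    "eyes":  ["single", "triple", "stalked"],
    "limbs": ["tentacles", "stubby_legs", "wings"],
    "skin":  ["scales", "fur", "chrome"],
}

def generate_monster(seed=None):
    """Combine one randomly chosen asset from each category into a unique character."""
    rng = random.Random(seed)
    return {part: rng.choice(options) for part, options in ASSET_LIBRARY.items()}

monster = generate_monster()
```

Seeding the generator per onlooker (e.g., from a detection timestamp) would make each character reproducible for the duration of that person's interaction.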
3D Theater Costumes
Again, character generation for each onlooker. The interactant will see the character on the projected surface as if seeing their own reflection in a mirror, yet the reflection will be constructed from geometric shapes, as in Archipenko's work. In a second step, these colored geometric shapes can be read as graphic notation to generate sound: the body is transformed into a generative digital sculpture, and the sculpture is transformed into sound as a second level of transformation. I really like Oskar Schlemmer's costume designs for the Triadic Ballet; these 3D digital sculptures made of geometric shapes can be interpreted as such costumes.
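One way the "shapes as graphic notation" step could work is a fixed mapping from shape attributes to note parameters. The specific mapping below (hue to pitch, size to duration, vertical position to loudness) is my own assumption, not something decided in the project:

```python
def shape_to_note(hue_deg, size, y_norm, base_midi=48, midi_range=36):
    """Map one shape's attributes to a (midi_pitch, duration_s, velocity) triple.

    hue_deg: color hue in degrees; size: shape size in [0, 1];
    y_norm: vertical position in [0, 1], 0 = top of the projection.
    """
    pitch = base_midi + round((hue_deg % 360) / 360 * midi_range)
    duration = 0.25 + size * 2.0            # bigger shapes sustain longer
    velocity = int(40 + (1 - y_norm) * 87)  # shapes higher on screen play louder
    return pitch, duration, velocity
```

Iterating over the sculpture's shapes left to right would then yield a playable sequence, which a synth backend could render in real time.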
The simulacrum is never what hides the truth - it is the truth that hides the fact that there is none.
The simulacrum is true.
(Baudrillard, J. (1981). Simulacra and simulation. The University of Michigan Press.)
AI-driven world-reconstruction techniques have advanced to the point where a precise representation of physical space is possible. Questioning the potential dimensions of hyperreality, the interactive work Simulacrum proposes to rearticulate the realm between the simulation and the observed space. The artist proposes to deconstruct the truth and reconstruct it in the field of the absurd, in the manner of a pataphysician. Walking the tightrope between fact and fantasy like an aerialist, she breaks the conventional function of mirrors by transforming the world into an experience where nothing seems to obey the laws of nature.
This interactive installation will consist of three full-length, mirror-sized screens (55-65 inches). Accelerating research into learning-based view synthesis has made it possible to achieve state-of-the-art results in synthesizing novel views of complex scenes from a sparse set of input images. These AI-driven view-synthesis methods will be used to reconstruct the space in a digital environment. The reconstructed 3D space will mirror the environment where the screens are installed, so the screens will function as actual mirrors when showing the interior; the same, however, does not apply to the reflection of the onlookers.
AI-driven View Synthesis Methods
• NeuMan: Neural Human Radiance Field from a Single Video
• Instant-NGP: Instant Neural Graphics Primitives
• Mega-NeRF & Mega-NeRF Dynamic Viewer
• Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
• Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields
• RawNeRF: NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images
• SNeRG: Baking Neural Radiance Fields for Real-Time View Synthesis
Example output of a view-synthesis method: https://www.youtube.com/watch?v=RyEVh1Orv2Y
The precision of the reconstructed 3D view is crucial for emphasizing the concept of simulation. That being the case, would it be better to use 3D scanners (such as laser scanners) or AI techniques?
Inspired by the surrealist short film The Flat (Jan Švankmajer, 1968), the first mirror will invert the interactant's reflection. The interactant won't see their reflection in the way one would expect; rather, the view from the back-side camera will be shown on top of the reconstructed space, so the viewer encounters not their face but their own back. The reflection will get closer as the interactant moves away from the screen, and vice versa.
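The inverted approach/retreat behaviour can be sketched as a simple monotonic mapping from viewer distance to silhouette scale. The distance bounds and scale range below are assumptions for illustration, not measured values from the installation:

```python
def inverted_reflection_scale(viewer_distance_m, min_d=0.5, max_d=4.0):
    """Invert normal mirror behaviour: the farther the viewer stands from
    the screen, the larger (closer-looking) their back appears.

    Returns a scale factor for the rendered silhouette; distances are
    clamped to the [min_d, max_d] tracking range (values assumed).
    """
    d = min(max(viewer_distance_m, min_d), max_d)
    t = (d - min_d) / (max_d - min_d)  # 0 at nearest point, 1 at farthest
    return 0.3 + t * 1.2               # small when close, large when far
```

Any strictly increasing function of distance would produce the same uncanny effect; a linear ramp is just the simplest choice.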
1. For this purpose, the background environment will be created in a digital 3D setting by executing the above-mentioned AI-driven methods.
2. In the second step, the user's gaze-tracking data extracted from the front-side camera will be used to change the perspective of the reconstructed view, since the field of vision changes depending on the viewer's position with respect to the mirror. The gaze-tracking task has already been implemented using Google's MediaPipe framework during the artist's residency in ZKM's Museumstechnik department.
3. The third step is real-time background removal from the back-side camera's video feed, since only the person's silhouette will be displayed on top of the reconstructed 3D space. For this, the method called Robust High-Resolution Video Matting with Temporal Guidance has already been applied. Its outputs are highly satisfactory in terms of real-time performance and the ability to separate fine details such as hair from the background.
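Once the matting method produces an alpha matte, placing the silhouette over the reconstructed space is standard per-pixel alpha compositing. A minimal pure-Python sketch (real pipelines would do this on GPU tensors or NumPy arrays):

```python
def composite(fg, alpha, bg):
    """Per-pixel alpha compositing: out = alpha * fg + (1 - alpha) * bg.

    fg, bg: flat lists of (r, g, b) floats in [0, 1] (foreground video feed
    and reconstructed background); alpha: matte values in [0, 1], one per pixel.
    """
    return [
        tuple(a * f + (1 - a) * b for f, b in zip(fg_px, bg_px))
        for fg_px, a, bg_px in zip(fg, alpha, bg)
    ]
```

With the matte from the video-matting network, fully opaque pixels (alpha = 1) show the silhouette and fully transparent ones (alpha = 0) show the reconstructed room.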
The second mirror will use the same AI principles as the first, but in a different manner. In this surreal mirror, eye contact with oneself results in the disappearance of the viewer. The viewer won't be able to see themselves on the screen while looking directly at it. When the interactant looks elsewhere or outside the boundaries of the mirror screen, their reflection becomes visible on the reconstructed 3D space.
This time, there will be only one camera, on the front side alongside the screen, for gaze tracking. The space will be reconstructed using the chosen AI-driven method, and background removal will be applied to the video feed coming from this single camera, as in the first mirror.
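The second mirror's disappearance logic reduces to a test of whether the tracked gaze point lies on the screen. A sketch, assuming the gaze estimate arrives in normalized screen coordinates and using an assumed fade width so the reflection doesn't pop in and out abruptly:

```python
def reflection_alpha(gaze_x, gaze_y, screen_w=1.0, screen_h=1.0, fade=0.1):
    """Mirror two: the reflection vanishes under direct gaze.

    Returns 0.0 when the gaze point lies on the screen rectangle,
    1.0 when it is clearly outside, with a linear fade of width `fade`
    (assumed) just past the edge. Coordinates are normalized: (0, 0) is
    the screen's top-left corner, (screen_w, screen_h) the bottom-right.
    """
    # How far the gaze point is outside the rectangle (0 if inside).
    dx = max(0.0 - gaze_x, 0.0, gaze_x - screen_w)
    dy = max(0.0 - gaze_y, 0.0, gaze_y - screen_h)
    outside = max(dx, dy)
    return min(outside / fade, 1.0)
```

The returned value can feed straight into the compositing alpha, so looking at the screen hides the silhouette and looking away reveals it.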
The third mirror will change the facial expressions of the observer in their reflection. To do so, deepfake research such as Face2Face: Real-time Face Capture and Reenactment of RGB Videos will be implemented. This mirror will reflect emotions that do not exist in the viewer at that moment: the facial expressions in the reflection will be manipulated in accordance with a pre-recorded video of an actor performing various emotions in an exaggerated way.
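At its simplest, transferring the actor's expression onto the viewer can be thought of as interpolating tracked facial landmarks toward the actor's exaggerated pose (the actual Face2Face pipeline transfers blendshape expression coefficients; this landmark-space sketch is my simplification):

```python
def blend_landmarks(viewer_pts, actor_pts, t):
    """Linearly interpolate facial landmarks toward the actor's expression.

    viewer_pts, actor_pts: corresponding lists of (x, y) landmark positions;
    t = 0 keeps the viewer's own face, t = 1 fully adopts the actor's pose.
    """
    return [
        (vx + t * (ax - vx), vy + t * (ay - vy))
        for (vx, vy), (ax, ay) in zip(viewer_pts, actor_pts)
    ]
```

Driving `t` from the pre-recorded actor video's timeline would let the reflection's emotion swell and fade independently of what the viewer's face is actually doing.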
For constructing the background space in the reflection, the chosen AI view-synthesis method will be applied as in the previous mirrors. Gaze tracking will again be key, both for calibrating the facial expressions and for changing the viewpoint of the background space in accordance with the interactant's position.