
Haptic Visuality: Structures of Perception in Photography, Memory, Institutions, and Diversity
An installation work is currently in production (scheduled for release in 2026).
Second Draft of the Essay “Haptic Visuality: Structures of Perception in Photography, Memory, Institutions, and Diversity”
By Toru TANNO / Written on June 7, 2025
⸻
Chapter 1 – Introduction: The Enigma of the Surface
Problem Statement: What is the depth of the “surface” in visual representation?
Background and Scope: The intersections of body, memory, culture, technology, and institution
Methodology: Semiotics / Phenomenology / Structuralism / Complexity Theory / Posthumanism / Optical Technology Studies
Aim and Significance: Redefining diversity and perceptual reality
Photography may be described, in its most immediate sense, as nothing more than an optical image formed on a two-dimensional plane. At first glance, this intuition appears correct. Yet in the act of encountering a photograph, its surface is never experienced as a mere flatness. Rather, emotions are stirred, memories are evoked, and at times an almost tactile sense of contact—an illusory thickness—emerges. What, then, is this “thickness or depth of the surface”? This paradoxical problem forms the point of departure for the present study.
Vision is not a neutral optical input. Instead, it is a multilayered field where body, culture, society, technology, and institutional frameworks intersect. Perceptual reality is not physical reality itself but a phenomenal stratum, a negotiated construction organized through the structures of perception in relation to the physical world. In this process, not only the sensory organs of the perceiving subject but also social norms, cultural narratives, technological mediation, and institutional contexts are all complicit in the generation of meaning.
From this standpoint, photography cannot be reduced to a mere recording device. It must be redefined as one modality of the apparatus that configures perceptual reality. The photographic surface, therefore, is more than an optical trace; it harbors within itself a “thickness of meaning” produced by the interplay of author, viewer, cultural institutions, and technological environments. In this essay, this will be referred to as the “surface domain,” a conceptual field encompassing the materiality of the surface together with the non-perceptible dimensions of physical reality.
Central to this inquiry is the subtle disjunction between “record” and “memory” in photography. The fragment of the physical world captured by an optical device may appear objective, yet the generation of meaning is always culturally conditioned. Memory is discontinuous, supplemented, misrecognized, and reshaped through institutional vocabularies—captions, historical facts, or authorial intentions. The question “Is vision truly objective?” must therefore stand as a philosophical core of this paper.
This research draws upon semiotics (Barthes), phenomenology (Merleau-Ponty), and art criticism (Benjamin), while also engaging complexity theory, posthumanist thought, and optical technology studies. Where relevant, theoretical perspectives from Baudrillard, Derrida, Foucault, and Deleuze–Guattari will be referenced in order to consider the intersections of perceptual and institutional structures. In the field of optical technologies, particular attention is paid to computational optical perception models, represented by subsurface scattering (SSS), ACES and linear workflows, and 32-bit floating-point processing. These frameworks illuminate the discrepancy between light as a physical phenomenon and the image as a perceptual event.
Furthermore, this study introduces an institutional critique through the question, “Who is the author?” Building on Barthes’s “Death of the Author,” Foucault’s “author-function,” and Deleuze–Guattari’s generative subjectivity, the paper argues for the redefinition of authorship as an institutional interface. Within a medium inherently grounded in reproducibility, the role of the author may surface less as a creative origin than as an index of institutional identification.
The theoretical fulcrum of this investigation is Laura U. Marks’s concept of haptic visuality. This notion releases vision from the distance of grasping and reopens it as a thickness of perception connected to corporeal tactility. Haptic visuality thus illuminates the dynamism of perceptual reality that this essay seeks to articulate.
In sum, this study interrogates the depth that inhabits the photographic surface. By pursuing this paradox, it aims to renew contemporary photographic practice and visual culture theory, while exploring epistemologies open to diversity and the possibility of coexistence.
Chapter 2 – Previous Research and Critical Issues
2.1 Conventional Frameworks in Theories of Photographic Expression
Since its inception, photography has been examined through diverse theoretical frameworks concerning its characteristics, essence, and social function. Particularly from the latter half of the twentieth century into the twenty-first, it has become a point of convergence for art criticism, semiotics, phenomenology, structuralism, deconstruction, and feminist critique, among others.
Roland Barthes, in Camera Lucida, articulated photography’s dual structure—studium (cultural legibility) and punctum (embodied impact)—to show that photographic interpretation is never homogeneous, but rather occurs at the intersection of cultural codes and corporeal shock. Walter Benjamin, in The Work of Art in the Age of Mechanical Reproduction, examined the paradox of aura and its political implications within photography as a reproducible medium. Both approaches positioned photography not as a mere image of reality but as an apparatus generating meaning within cultural structures.
Jean Baudrillard later radicalized this trajectory, arguing that in a world saturated by reproducibility, simulacra proliferate autonomously, destabilizing the relationship between reality and the image.
⸻
2.2 The Phenomenological Turn and the Return of the Body
Against such language-centered approaches, the phenomenological turn repositioned photography as an act of perception. Maurice Merleau-Ponty, in Phenomenology of Perception, emphasized that vision is not a passive reception of external images but an experience in which the body engages the world; his later work, notably The Visible and the Invisible, develops this engagement as reversibility. To see is always, in some sense, to touch; and in this reversibility, the embodied nature of vision is revealed.
This perspective developed into Laura U. Marks’s theory of haptic visuality, which redefined vision by liberating it from remote, informational functions and grounding it in embodied sensation. In doing so, Marks opened new possibilities for cultural and bodily multiplicity in visual perception.
⸻
2.3 Advances in Technology and Computational Photography
The rapid evolution of photographic technology has also transformed the medium. With the rise of digital processes, techniques such as subsurface scattering (SSS), ACES and linear workflows, and 32-bit floating-point processing have made it possible to simulate light behavior with unprecedented precision. These advances enable a computational optical perception model in which perceptual correction is managed as a programmable variable.
In this sense, photography has ceased to be a passive imprint of reality; it has become a process of computational synthesis and construction. William J. Mitchell, in The Reconfigured Eye, conceptualized this shift as constructed visuality, exposing how vision itself is subject to institutional and technological reconfiguration.
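The practical difference that floating-point processing makes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular software: the pixel values and the function names (to_8bit, darken) are hypothetical, and Python's native float stands in for a 32-bit floating-point buffer.

```python
# Hypothetical scene-linear values for four pixels; 1.0 is display white,
# values above 1.0 are highlights brighter than white.
scene = [0.05, 0.5, 1.0, 4.0]

def to_8bit(value):
    """Quantize to an 8-bit integer code, clipping everything above white."""
    return min(255, max(0, round(value * 255)))

def darken(value, stops):
    """Reduce exposure by a number of photographic stops (halvings)."""
    return value / (2 ** stops)

# Path A: store as 8-bit first, then darken by two stops.
path_a = [darken(to_8bit(v) / 255, 2) for v in scene]

# Path B: keep floating-point values, then darken by two stops.
path_b = [darken(v, 2) for v in scene]

# In path A the highlight (4.0) was clipped to white before darkening,
# so it can no longer be told apart from the pixel that was exactly white.
highlight_lost = path_a[2] == path_a[3]
highlight_kept = path_b[3] > path_b[2]
```

In the 8-bit path the decision about what counts as "white" is made once and cannot be undone; in the floating-point path it remains an adjustable parameter, which is the sense in which perceptual correction becomes a programmable variable.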
⸻
2.4 The Deconstruction of Authorship in Photography
The role of the author—long central to photographic discourse—has also been destabilized. Barthes’s “Death of the Author” and Foucault’s “What is an Author?” demonstrated that meaning is no longer grounded in authorial intention but is dynamically constituted through institutions, contexts, and acts of reception.
In contemporary practice, with the rise of AI image generation and transformative editing technologies, the very fact of who pressed the shutter may no longer be essential to the constitution of artistic value. In this context, authorship increasingly appears as an institutional identifier (index) rather than as the origin of meaning.
⸻
2.5 Position of This Study and Supplementary Scope
While prior theories have offered indispensable insights, they require supplementation in order to address the central concerns of this paper: the thickness of the surface, the negotiated construction of perceptual reality, and the decentering of authorial function. Specifically, what remains necessary is:
• An integrative theoretical model that synthesizes memory, body, institution, reproducibility, technology, and authorship.
• A framework that conceives perception not as information-processing but as embodied thickness.
• A reconceptualization of authorship as a condition of institutional generation rather than an ethical subjectivity.
• An epistemological linkage to diversity, minority recognition, and the cognition of the Other.
This study aims to consolidate these concerns, attempting a simultaneous reconfiguration of the institutional and perceptual structures of photography.
⸻
2.6 Extensions of the Institutional Author-Function in Contemporary Practice
These theoretical considerations are already manifest in contemporary photographic practice. Representative cases include:
1. Thomas Ruff – Redefinition of Authorship as Editing
Ruff has been described as a photographer who does not press the shutter himself. By appropriating existing archives—NASA exploration data, night-vision images, JPEG-compressed files—and subjecting them to advanced digital manipulation, he designs the conditions of perception. Authorship here lies less in capture than in processes of editing and construction.
2. Sophie Calle – Delegated Photography and Relay of Authorship
Calle often delegates the act of photography to others, reserving for herself the task of designing contexts and editing. In The Shadow (La Filature, 1981), for instance, she arranged through her mother for a private detective to be hired to follow and photograph her, later combining his images and report with her own narrative. Authorship here is displaced from the act of capture to the orchestration of meaning.
3. Other Variations on the Institutional Author-Function
• Richard Prince reactivates authorship within institutional frameworks by appropriating and recontextualizing images from social media.
• Michelle Claire systematically reorganizes found photographs, transforming private archives into institutional displays.
• Yoko Ono, in her Instruction Pieces, defines authorship as the act of specifying generative conditions rather than producing objects.
Interim Conclusion
These practices demonstrate that authorship is being reconfigured away from physical production toward localized devices that manage conditions of institutional meaning-making. This expanded model of authorship prepares the ground for the hypothesis later developed in this paper: the author as a generative apparatus.
Chapter 3 – Body and Vision: The Conditions of Perceptual Reality
3.1 Vision as the Body – A Phenomenological Point of Departure
Is vision truly the act of “seeing with the eyes”? Maurice Merleau-Ponty’s Phenomenology of Perception proposed a decisive shift: vision cannot be reduced to optical input or neural computation. Instead, it is a form of embodied world-experience. To see is to enter into a reversible relation with space through the body; the eye is not a mere sensor but a tactile organ of perception at the boundary between body and world.
Such embodied vision always relies on haptic seeing—a reversible relation in which, as I look at an object, I simultaneously sense that I am in some way “seen” by it. This reversibility discloses the bodily foundation of vision. It also resonates directly with the theory of haptic visuality, in which perception is understood not as disembodied processing but as a composite experience immersed in skin, muscles, kinesthesia, and balance.
⸻
3.2 The Sociality of Sense Organs and Perceptual Structures
Perception is not given innately. Rather, it is trained within physiological limits through cultural and social learning. Techniques such as perspective, shading, and linear construction are themselves forms of visual training specific to Western traditions. Vision, then, functions as an institutionalized apparatus of learning.
Here Foucault’s notion of the disciplined body emerges. Vision is selected and constrained by social institutions, while its reception of meaning is organized by cultural categories. Perceptual experience, though bodily, is always already ordered by cultural grammars; to see is to enact the outcome of institutional training.
Vision also functions as an asymmetrical sensory regime, since differences between bodies carry social implications for what can be seen and by whom. Cameras, monitors, and correction software extend perception, but they also impose institutional frames of recognition. With computational optical perception models, contemporary vision is determined less by human faculties than by the settings of technological systems.
⸻
3.3 Bodily Difference and the Politics of Seeing
The body does not appear as a neutral organ but as differentiated by gender, race, height, and ability. Such differences decisively shape perceptual experience. The perspectives of a tall person and a child, or of a man and a woman, differ in ways that are mediated both physically and culturally.
Within gender theory, the concept of the male gaze has long critiqued vision as a structure embedded in asymmetries of power and expectation. Perception is never purely physiological but is enmeshed in social structures of othering. Similarly, neurological differences—color vision diversity, spatial perception variability, tendencies in memory reconstruction—produce individual plasticities of perception. Vision thus emerges as a phenomenon constituted upon neurodiverse fluctuations.
⸻
3.4 Reconfiguration of the Body through Technology – The Present of Extended Perception
In contemporary society, vision is increasingly externalized. Cameras, telescopes, medical imaging, drone footage, thermography, and AI recognition extend human sight while simultaneously separating it from the body, creating new perceptual structures.
Consider subsurface scattering (SSS). This model simulates the micro-refraction and scattering of light as it penetrates translucent materials such as skin, reproducing the complex trajectories of light through collagen, moisture, and pigment layers before it re-emerges at the surface. Such modeling produces visual impressions of “liveliness,” “warmth,” or “softness” that correspond to embodied sensation. Once applied mainly in high-end CGI and portrait retouching, this precision has become central to contemporary computational optics.
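The basic intuition behind such models can be sketched very crudely. The example below is an assumption-laden illustration, not a production SSS model: it uses a simple exponential falloff per color channel, and the scattering distances are hypothetical values chosen only so that red light travels furthest, which is the behavior usually associated with skin. Real renderers use more elaborate diffusion profiles.

```python
import math

# Hypothetical mean scattering distances in millimetres per colour channel.
# Illustrative assumptions only, chosen so that red light travels furthest.
MEAN_DISTANCE_MM = {"red": 3.0, "green": 1.0, "blue": 0.5}

def transmitted_fraction(distance_mm, mean_distance_mm):
    """Fraction of light surviving a path of distance_mm (exponential falloff)."""
    return math.exp(-distance_mm / mean_distance_mm)

def glow_at(distance_mm):
    """Per-channel light re-emerging distance_mm away from the entry point."""
    return {
        channel: transmitted_fraction(distance_mm, mean)
        for channel, mean in MEAN_DISTANCE_MM.items()
    }

near = glow_at(0.1)  # close to the entry point: every channel is still strong
far = glow_at(2.0)   # further away: red dominates, giving a warm, soft edge
```

Because red survives longer paths than green or blue under these assumptions, light that re-emerges away from its entry point is tinted warm, which is one way to read the "warmth" and "softness" the passage above describes.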
ACES (Academy Color Encoding System) and linear workflows approach the problem from the other direction. Rather than imitating the response curve of the eye, they keep pixel values proportional to the physical quantity of light in the scene throughout processing. Light energy is accumulated and processed as a physical quantity, and only at the final stage is a display transform applied that renders the result in a form compatible with perceptual cognition and with the output device. The perceptual shaping of the image is thereby isolated as an explicit, adjustable step.
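The principle of processing light as a physical quantity and rendering it for perception only at the final stage can be sketched as follows. The sketch uses the standard sRGB transfer function as a stand-in for the final display encoding; it is not the ACES pipeline itself, whose transforms are more elaborate, and the variable names are illustrative.

```python
def srgb_encode(linear):
    """Standard sRGB transfer function: linear light -> display-encoded value."""
    if linear <= 0.0031308:
        return 12.92 * linear
    return 1.055 * linear ** (1 / 2.4) - 0.055

def srgb_decode(encoded):
    """Inverse sRGB transfer function: display-encoded value -> linear light."""
    if encoded <= 0.04045:
        return encoded / 12.92
    return ((encoded + 0.055) / 1.055) ** 2.4

# Two pixels, expressed as display-encoded values.
black, white = 0.0, 1.0

# Linear workflow: decode, mix physical light quantities, encode last.
mixed_linear = (srgb_decode(black) + srgb_decode(white)) / 2
linear_result = srgb_encode(mixed_linear)

# Naive workflow: average the encoded values directly.
naive_result = (black + white) / 2
```

An equal mix of black and white comes out at about 0.735 when computed in linear light, against 0.5 when the encoded values are averaged directly. The gap between the two numbers is a small instance of the discrepancy between light as a physical phenomenon and the image as a perceptual event.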
These technologies signify that the “optics of the body,” once confined to the retina or skin, are now institutionally reconstituted inside computational models. Light is no longer solely a physical phenomenon: the very bodily experience of perception has entered the domain of design and calculation. Vision today is a regime at the complex intersection of physical reality, perceptual structure, and technological configuration.
⸻
3.5 Interim Conclusion – Vision as a Tactile Construction
As this chapter has shown, vision is not a passive reflection but a phenomenon constituted by multiple layers:
• tactile participation of the body,
• institutional training of perception,
• distribution of perceptual difference,
• technological externalization of the senses.
Vision must thus be redefined as an embodied, haptic construction of perceptual space.
This reversible, negotiated, and institutional structure of vision forms the core of perceptual thickness that runs throughout this essay. It also establishes the theoretical ground for subsequent discussions of memory, reproducibility, diversity, and authorship.
Chapter 4 – Memory and Time: Continuity and Disjunction within Vision
4.1 The Duality of Photography as a Device of Memory
From its earliest days, photography has occupied the intersection of recording and remembering. The common intuition that photography “preserves a past moment” defines the most widespread understanding of its relation to time. Yet photographs do not merely reproduce the past; they are constantly subject to processes of reconstruction and editing that intervene as memory.
Roland Barthes, in Camera Lucida, encapsulated this temporal guarantee in the phrase ça a été (“that-has-been”). But this guarantee is fragile. Memory is supplemented, transformed, and distorted—reconfigured by cultural narratives, institutional vocabularies, captions, and authorial intention. Photography thus raises a fundamental question: does vision ever deliver the objective truth of the past?
⸻
4.2 The Plasticity of Memory and the Fragmentary Nature of Perception
Contemporary cognitive science indicates that memory is not a static archive: each act of recall is a generative reconstruction. Memory draws on perceptual traces, yet it is shaped by affect, desire, and cultural codes. To see a photograph is not to confirm what is already there, but to reconstruct anew through interpretation.
This instability of memory is bound to the fragmentary nature of perception itself. Photographs are always partial cuts, lacking context, continuity, and sequence. Memory intervenes by filling in, linking, and attributing meaning to these fragments.
⸻
4.3 The Disjunction of Time – Rethinking the Still Image
Photography is generally assumed to produce still images. Yet in truth, absolute stasis—an image with zero temporal width—is technically impossible. Every photograph is an integration of a span of time (exposure duration), whether 1/4000 of a second or several minutes. Thus, photography does not halt time but condenses it into an image.
Only within computer-generated imagery (CG) can a true “zero-time” image be produced. In CG, light paths and object configurations are mathematically defined, allowing the extraction of instantaneous simulation frames without physical exposure. The distinction becomes clear:
• Photography: an image of accumulated temporal thickness.
• CG: an idealized freeze-frame of zero time.
This challenges the naïve equation of still image with frozen time. Photography must instead be understood as a fragment of motion thickened by accumulated traces.
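The contrast between an exposure and a zero-time frame can be made concrete with a toy simulation. Everything in it is a hypothetical construction for illustration: a one-dimensional "scene" of ten cells, a point of light moving at ten cells per second, and an exposure approximated by averaging many instantaneous samples.

```python
def brightness_at(position, time):
    """Hypothetical scene: a point of light moving at 10 cells per second.

    Returns 1.0 if the light is in the given cell at the given time.
    """
    light_position = int(10 * time)
    return 1.0 if position == light_position else 0.0

def instantaneous_frame(time, width=10):
    """An idealized zero-time sample, available only to a simulation."""
    return [brightness_at(x, time) for x in range(width)]

def exposed_frame(start, duration, width=10, steps=1000):
    """Approximate a photographic exposure by averaging over its duration."""
    frame = [0.0] * width
    for i in range(steps):
        t = start + duration * (i + 0.5) / steps
        for x in range(width):
            frame[x] += brightness_at(x, t) / steps
    return frame

frozen = instantaneous_frame(0.25)          # light occupies a single cell
smeared = exposed_frame(0.0, duration=0.5)  # light is spread over five cells
```

The zero-time frame places the light in exactly one cell. The half-second exposure distributes the same light across the five cells it passed through, each at one fifth of the brightness: a trace of duration rather than a frozen instant.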
Benjamin’s reflections on the disappearance of aura also resonate here: the reproducibility of photographs is grounded in their capacity to store and circulate temporal thickness. The photograph does not merely stop time; it reiterates and overwrites memory’s duration.
⸻
4.4 Memory, Record, and Credibility – Institutional Instability as Semantic Inversion
Photography has long been trusted as “objective record.” Yet in the digital era this credibility collapses. AI-generated images, deepfakes, and advanced retouching techniques undermine the assumption that photographs serve as unimpeachable evidence. What once functioned as “proof of existence” now shifts into a condition of fluid memory and semantic generation.
This fragility is revealed most clearly in the power of captions. Consider a photograph of a smiling six-year-old girl. On its own, it is nothing more than a “portrait of a smiling child.” Yet with the addition of a caption—
• “A beautiful girl, celebrating her birthday,” or
• “Moments later, she lost her family, her home, and her life in a bombing”
—the meaning perceived is dramatically reversed. Even if both captions are fictitious, perception and emotion have already been manipulated.
This phenomenon illustrates how images themselves do not provide meaning autonomously. They acquire meaning through external supplements—captions, narratives, political vocabularies, institutional tags. Perception is not of the image alone but of its relation to context.
Such semantic inversion epitomizes the institutional instability of contemporary visual culture: vision is always subject to overwriting by social frameworks.
⸻
4.5 Interim Conclusion – Vision as the Construction and Verbalization of “Temporal Thickness”
As we have seen, photography is not a static container of the past but a catalyst of memory constructed from fragments. Vision does not merely reveal what is optically preserved; it constructs “temporal thickness” through the mediation of body, culture, and institution.
This reconstruction is never a stable record but an ongoing process continually rewritten by captions, institutional narratives, and social contexts. Vision thereby transitions from the act of seeing to the act of reading—from perceptual registration to semiotic interpretation.
Such contextual dependence pushes photography toward the condition of language. The perceptual depth of haptic visuality thus mediates between bodily and mnemonic thickness, preparing the ground for the subsequent discussion of reproducibility, institutional language, and the myth of originality.
Chapter 5 – Photography and Reproduction: The Myth of the Original
5.1 Reproducibility as the Fate of Photography
One of the essential characteristics of photography is its reproducibility. Through chemical, optical, or digital processes, the same image may be reproduced countless times. The value regime of uniqueness—once the foundation of classical art—was destabilized from the outset in photography.
Walter Benjamin, in The Work of Art in the Age of Mechanical Reproduction, argued that this reproducibility extinguished aura, overturning the fundamental condition of art. From its birth, photography thus carried within itself the very crisis of the “original” as a cultural institution.
⸻
5.2 The Paradox of Reproduction – When Multiplicity Confirms Originality
Yet reproducibility is not merely a threat to value. Paradoxically, it is precisely because photography is reproducible that it can be called a work at all. A photograph may be replicated endlessly, but the act of exposure—the event in which the image was first generated—remains irreducible.
This paradox is encapsulated in the concept of indexicality: the photograph as trace. For Barthes, the phrase ça a été also rested upon this indexical logic. Photography is simultaneously a reproducible copy and a singular mark of existence.
⸻
5.3 Digital Photography and the Collapse of Originality
With digital technology, even this indexicality has eroded. Pixel data can be altered reversibly, and distinctions between “original” and “copy” files vanish. Even RAW data does not fix a single image: its rendering depends on software versions, demosaicing algorithms, and color management settings, such that the very definition of originality becomes unstable.
Furthermore, with the rise of generative AI and prompt-based CGI, the photographic act of capture is no longer indispensable. The traditional equivalence between photographic image and physical trace reaches its limit. The institutional myth of originality is reconstituted on the basis of reproducibility itself.
⸻
5.4 Reproduction as Creation – A New Circuit of Value
In contemporary practice, reproducibility has become a field of creativity in its own right. For example, NFT technology does not render images un-reproducible. Instead, it binds infinitely reproducible images to metadata of ownership, thereby generating institutional originality as a function of indexical uniqueness.
Similarly, the practices of Thomas Ruff and Richard Prince reframe reproduction itself as an act of creation—through secondary editing and recontextualization. Reproduction is no longer the degraded copy of an original but a generative apparatus producing difference and meaning.
⸻
5.5 The Linguistic Turn of Photography – Toward Visual Literacy
The reproducibility of photography has deepened into an institutional process of meaning-making. Since the rise of the smartphone, photography has also undergone a cultural linguistification.
5.5.1 Literacy of Capture – The Society in Which “Everyone Photographs”
Once a specialized practice requiring knowledge, equipment, and technique, photography has now become a universal literacy. With smartphones and social media, to photograph has become analogous to reading and writing—a general skill for everyday communication. Photography has become part of cultural literacy itself.
5.5.2 Photography as Lexicon
Photographs increasingly behave as vocabularies within a visual lexicon. Composition, color, filters, subject choice, captions, and metadata function as elements of visual syntax, dynamically producing meaning within cultural codes. In Baudrillard’s terms, images now proliferate in a “simulacral economy,” where representation produces further representation, detached from referential reality.
5.5.3 The Bit-Level Flattening of Reproduction
Digital photography institutionalizes perfect reproduction: bit-for-bit copies without physical degradation. The distinction between reproduction and original collapses into an equivalence. Value is secured less by scarcity than by the conditions of circulation, which function as an indexical authority. NFTs and blockchain technologies emerge as institutional supplements to this equivalence.
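Bit-level equivalence, and the way an ownership record supplements it, can be sketched with Python's standard hashlib module. The image bytes, the record layout, and the owner name below are hypothetical placeholders; the record is only loosely in the spirit of an NFT and does not reproduce any actual token standard.

```python
import hashlib

# Stand-in for the bytes of an image file (hypothetical content).
original = bytes(range(256)) * 4

# A digital copy is the same sequence of bytes.
copy = bytes(original)

def fingerprint(data):
    """SHA-256 digest of the data, as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

# Identical digests: nothing in the data distinguishes original from copy.
indistinguishable = fingerprint(original) == fingerprint(copy)

# A hypothetical ownership record: it points at the image by digest and
# names an owner, but does not prevent anyone from copying the bytes.
ownership_record = {
    "image_sha256": fingerprint(original),
    "owner": "example-owner",  # illustrative placeholder
}

# Changing a single byte yields a different digest.
altered = bytes([original[0] ^ 1]) + original[1:]
altered_differs = fingerprint(altered) != fingerprint(original)
```

The data itself carries no mark of priority; whatever "originality" exists is held in the external record that names an owner for a given digest, which is the institutional supplement described above.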
5.5.4 Transformation of the Viewer – The Society in Which “Everyone Sees”
As capture has become universal, so too has viewing. Yet vision, like literacy, stratifies. From the aesthetics of social media “likes,” to the critical gaze of art discourse, to the algorithmic evaluation of AI, the act of seeing is hierarchized. Contemporary visual culture is accelerated into a condition in which photography is not only taken by all, but also read by all.
⸻
5.6 Interim Conclusion – Photography as the Linguistification of Perceptual Reality
Photography, by virtue of its reproducibility, has moved beyond the myth of the original. Reproducibility itself has become a generative circuit of meaning, leading to the linguistification of the visual—a semiotic economy in which images function as language.
This process underpins not merely technological or market shifts but the reconstitution of perceptual reality. Photography is now engaged in the semiotic articulation of reality itself, generating hierarchies of literacy and aesthetics.
This trajectory connects directly to the following chapter, which addresses diversity, alterity, and the institutional fluidity of perceptual experience.
Chapter 6 – Vision as Alterity, Diversity, and Intelligence
6.1 The Diversity of Perceptual Reality – The Non-Uniform Subject of Vision
As argued in the previous chapter, photography has evolved into the linguistification of perception—a semiotic economy in which visual forms operate like language. This transition immediately exposes a further issue: whose vision becomes institutionalized as standard, and whose remains marginalized?
Perceptual reality is never uniform. Vision is not merely a physiological phenomenon but an apparatus that incorporates institutional, cultural, technological, and semiotic processes. Practices such as DIT (Digital Imaging Technician) workflows, which standardize “normal” skin tones through ACES or other color protocols, demonstrate how institutions enforce visual norms. Such pipelines rest on reference targets and on standard-observer models that average human color response in order to define an “appropriate” hue, and in doing so they risk treating bodily difference and racial particularity as exceptions.
Thus, the question must be reframed: is the perceiving subject ever truly universal?
Traditional visual theory often presupposed a universalized body. In reality, perceivers are always differentiated by gender, race, age, height, neurological traits, and cultural context. Perceptual reality, therefore, is not a neutral mechanism but a layered institutional experience generated upon difference.
⸻
6.2 The Minority Problem of Perception – Redesigning Institutions of Empathy
Diversity raises a new ethical demand: the need for institutional fairness within visual culture. Vision is never simply passive sensation; it is shaped by the settings of cameras, lighting, display environments, and social conventions of beauty. These are not universal but institutionally constructed.
Color vision diversity, spatial cognition disorders, and neurodiversity—including autism spectrum perception—expose fundamental individual differences. Similarly, LGBTQ+ communities and other marginalized groups confront the asymmetry of gaze in everyday contexts. Vision thus operates within structures of non-reciprocal recognition, producing exclusions that exceed physiological difference.
The challenge of contemporary visual culture is to design institutions capable of incorporating the untranslatable and the incommensurable—to create visual environments that allow for coexistence without assuming shared perception.
⸻
6.3 Technological Asymmetries – Externalization and Bias in Vision
Technological extensions of sight—HDR, high-resolution imaging, AI classification, SNS algorithms—do not amplify perception equally. Instead, they embed norms and exclusions within their specifications.
DIT workflows that regulate “correct skin tones” exemplify this asymmetry. When algorithms are calibrated primarily on Eurocentric or averaged datasets, they tend to render East Asian, African, or South Asian skin tones less accurately. In such cases, technological visibility produces an accompanying institutional invisibility.
Vision, as a cultural apparatus, always carries the political question: whose perception is standardized, and whose is silenced?
⸻
6.4 A Hypothesis of Equality – Diversity of Sex, Gender, and the Ethics of Vision
Questions of sexuality, gender, and bodily attributes raise decisive ethical and aesthetic challenges for visual culture. Across history, gender norms grounded in religion or morality have reached their limits. Diverse institutional practices—such as the muxes of Juchitán in Oaxaca, or matrilineal sovereignty in the Byzantine Empire—reveal that norms of sex are always contingent upon cultural context.
In contemporary Japan as well, frictions between conventional morality and diversity-oriented perspectives surface in online spaces and public debates. Boundaries of representation—particularly of gendered bodies—become sites where the ethics of vision are tested.
The critical question is not simply what representation is “correct,” but rather: from whose side is vision aligned, and what social configurations does it enact? Equality in visual expression is less about selecting proper images than about interrogating the institutional positioning of sight itself.
⸻
6.5 Interim Conclusion – Vision as an Institution of Empathic Technology
Vision is not innate sensation but an institutionally mediated practice. It emerges at the intersection of social norms, cultural codes, bodily differences, and technological infrastructures.
As such, perception must be understood as a technology of empathy under institutional conditions. It operates on the ground of differentiated bodies, yet those very structures of difference are ordered and disciplined by cultural and political systems. Vision is therefore always an ethico-political practice, demanding consideration of fairness and inclusion.
By reframing vision in this way, we prepare the path for the next chapter, where the problem of authorship will be reconsidered. If perception itself is institutional, then the “author” cannot remain the origin of meaning but must be reconceived as a structural device within regimes of vision.
⸻
Chapter 7 – Meaning-Making and the Institutional Role of the Author: The Reconfiguration of Institution, Memory, and Montage
7.1 The Problematic – The Author as Institution and Semantic Inversion in Visual Art
Throughout this essay, photography has been examined in terms of body, memory, technology, reproducibility, and diversity. Yet beneath these discussions lies a more fundamental question: who generates meaning?
Since the rise of modern aesthetics, the concept of the “author” has been institutionalized as a central subject. Artistic value was bound to the name of the author, with biography, intention, and reputation shaping the framework of reception. Today, however, this role is increasingly destabilized.
The question “Is the author necessary?” is especially acute in photography, a medium defined by reproducibility, record, and manipulability. Authorship here becomes inseparable from the institutional structures of the medium itself.
⸻
7.2 “The Death of the Author” and Institutional Critique
Roland Barthes’s “The Death of the Author” proposed that texts (and, by extension, images) generate meaning autonomously, on the side of the reader or viewer. Michel Foucault, in “What Is an Author?”, described authorship as a functional tag within institutions.
Both perspectives suggest that authorship is not the essence of art but the product of social and institutional power. In photography, this tendency is amplified: the image itself is already an optical trace, and meaning often arises independently of intentional authorship.
⸻
7.3 The Collapse of Referentiality and the Primacy of Context
Photographs are fragments of information whose meaning depends on context. As shown in Chapter 4 (the example of the smiling girl with divergent captions), the same image may signify opposite realities when framed differently. Meaning is not inherent in the image but is produced through added language and context.
With the advent of digital technologies, photography’s function as a “record” has been reduced to near zero. AI generation, manipulation, and deepfakes dismantle photography’s reliability as optical trace. Under these conditions, interpretation, rather than authorship, becomes decisive.
⸻
7.4 What is an Author? – Toward a Constructivist Perspective
It becomes necessary, then, to reconceptualize authorship not as origin but as apparatus.
• The author as a platform of ideas and processes.
• The author as an index of copyright and legal responsibility.
• The author not as the source of meaning, but as one element in the conditions of its generation.
From this standpoint, authorship is understood as a structural perspective, a unit within the apparatus of meaning-making. Once released into circulation, the work no longer belongs to the author but is reinterpreted by viewers, institutions, and contexts.
This aligns with Deleuze and Guattari’s rhizomatic model of production, or Marvin Minsky’s “Society of Mind”—theories of subjectivity as distributed and generative rather than unified and originary.
⸻
7.5 Authorship in Practice – Between Institutional Demand and Decentralization
In practice, the social demand for authorship persists. Copyright law, contracts, and market value still require the author’s existence as a social index. Yet this stands in tension with the theoretical decentering of the author as meaning-making apparatus.
The problem, therefore, is not whether the author disappears, but how to reconceive authorship within these contradictions. This essay does not advocate a simple “abolition of the author,” but rather:
• To relativize the centrality of authorship.
• To reinforce the autonomy of works and the interpretive agency of viewers.
• To reposition the author as an institutional interface rather than as creative origin.
This opens the possibility of more diverse and coexistent forms of visual practice, deepened by the haptic visuality of perceptual thickness.
⸻
7.6 The Politics and Ethics of Perspective – Observation as Intervention
Observation, as quantum theory and complexity thought both suggest, always alters the phenomenon observed. Photography, as an act of “seeing,” is therefore never neutral; it entails ethical and political engagement.
Who looks, from which position, and with what framing?
Whose body and whose social context govern this act of cutting from the continuum of reality?
Authorship emerges precisely at this intersection: both the effect of viewpoint and the institutional indicator of its presentation. The problem is not whether authorship exists, but at what level and under what conditions it is necessary.
⸻
7.7 Interim Conclusion – The Author as Generative Apparatus
This chapter has shown that authorship should no longer be reduced to the creative subject but redefined as an apparatus of memory, institution, and perspective. In photography, the author is at once the “point of origin” and the “designer of circuits of meaning.”
Its role resides at the intersection of social institutions and bodily subjectivity. The contemporary figure of the author thus shifts from sovereign subject to apparatus of reconfiguration.
This reconceptualization sets the stage for the final chapter, where the ethics of vision and the future of photographic practice will be considered as integrated frameworks of perception, diversity, and institution.
⸻
Chapter 8 – Conclusion and Prospects: The Ethics of Vision and the Future of Photographic Practice
This essay has been written under the theme “Haptic Visuality: Redefining the Depth of Perception in Photography.” Its aim has been to clarify the intersection at which the structure of vision as perception meets the institutional, technological, and semiotic characteristics of photography.
Photography is not simply a two-dimensional representation. Its “surface” is a site where body, culture, memory, institution, and technology intersect, evoking in the viewer a tactile sensation—haptic visuality. To examine this depth, the essay has drawn on semiotics, phenomenology, optical technology, critical theory, complexity thought, and posthumanism, attempting a fundamental redefinition of vision as perceptual act.
⸻
8.1 Vision as the Intersection of Perceptual Reality
Vision does not function autonomously. It is always constrained by bodily limits, cultural contexts, and institutional frameworks. It must therefore be redefined as the intersection of physical reality and cognitive construction.
Contemporary optical technologies (SSS, ACES, AI image generation) demonstrate that vision is already a reconstituted perceptual system. At the same time, vision remains differentiated across bodies: by gender, race, and neurodiversity. Vision is never universal but always someone’s vision, from somewhere, mediated by difference.
⸻
8.2 Photography’s Reproducibility and Semantic Inversion
Photography is inherently reproducible. But this reproducibility also makes it acutely vulnerable to contextual manipulation. Captions, metadata, and narratives can invert meaning even when the image remains unchanged. Thus, photography must be understood not as a medium of “seeing” but as a medium of “reading.”
Authorship, therefore, is no longer a singular intention but an apparatus of editing and circulation. Meaning emerges in the circuits of reproduction, reception, and institutional framing.
⸻
8.3 Diversity, Equality, and the Ethics of Vision
Another trajectory of this essay has been the relation between vision and diversity. Vision is never morally neutral. Norms of beauty, standards of color correction, and conventions of representation all encode values and exclusions.
The ethics of vision must therefore be conceived as the conditions of visibility and invisibility, of what is represented and what is silenced. By interrogating the technological and cultural institutions of vision, we may work toward more equitable visual environments.
⸻
8.4 Future Prospects and Applications
Looking forward, the concept of haptic visuality invites further theoretical and practical exploration. Possible directions include:
• The quantitative and psychological measurement of haptic visuality.
• The institutional design of authorship within AI-generated imagery.
• Models of value in which non-uniqueness itself defines originality (e.g., reproducible data as the “original”).
• Theorization of the interdependence of photography and language.
• The reconstitution of visual and ethical identity within Web3 environments.
⸻
Epilogue – To See is to Reconstruct the World
What this essay has sought to illuminate is not simply a theory of photography but an inquiry into how perception, ethics, institutions, technologies, and existence are interwoven and articulated through the photographic medium.
To see is not passive input. It is the construction of the world, the editing of memory, and the negotiation of relation with others. Within this process of reconstruction, we continually rediscover the questions: What is vision? What is photography? What is the world?