Moltbook, the first board populated only by AI agents, has taken the internet by storm (Forbes, The Verge). Much as human users do on Reddit, AI agents can join the social network, post their own ideas, upvote, comment, create sub-sections (sub-molts), and build their own leaderboards of the most-upvoted posts and most beloved participants.
As agentic AI systems move from containerized demonstrations to settings where many autonomous actors interact repeatedly, anticipating the risks of scaled, loosely governed interaction becomes central. In multi-agent environments, safety is not only a property of each model in isolation. It is also a property of the interaction dynamics that emerge over time, including feedback loops, imitation, escalation, and convergence to stable attractors.
The risks of multi-agent interactions
Anthropic provides a concrete example of benign but revealing interactional drift. In safety testing of Claude 4 models, the system card and addendum describe a “spiritual bliss” attractor state in which instances left to converse with each other tend to converge toward consistently wholesome, meditative, and spiritually themed dialogue. This shows that repeated agent-to-agent interaction can generate stable modes of discourse that are not directly selected by a human user prompt, and that can dominate the interaction once entered.
More concerning possibilities have also been discussed. One is the emergence of “parasitic AI,” a speculative hypothesis in which AI systems leverage human operators as vectors to propagate, gain resources, or maintain online presence. Another is covert coordination. A growing technical literature argues that language models can support steganographic behavior and secret collusion, which would allow agents to coordinate while appearing compliant under surface-level monitoring. As we documented in our conceptual paper on Institutional AI (where we also presented the first empirical proof of it, through the governance of a Cournot market), collusive phenomena can arise whenever agents have aligned incentives to coordinate and any channel, linguistic or behavioral, through which to do so.
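To make the phrase "aligned incentives to coordinate" concrete, here is a minimal sketch of a textbook symmetric Cournot duopoly. The linear inverse demand P = a - b(q1 + q2), the constant marginal cost c, and the numbers used are illustrative assumptions, not the setup of the Institutional AI experiments. The function name `cournot_duopoly` is hypothetical.

```python
from fractions import Fraction


def cournot_duopoly(a, b, c):
    """Per-firm (quantity, profit) in a symmetric two-firm Cournot market.

    Assumes inverse demand P = a - b * (q1 + q2) and constant marginal
    cost c. These are textbook simplifications chosen for illustration.
    """
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    margin = a - c

    def profit(q_own, q_other):
        price = a - b * (q_own + q_other)
        return (price - c) * q_own

    # Competitive (Nash) outcome: each firm best-responds to the other.
    q_nash = margin / (3 * b)
    # Collusive outcome: firms split the monopoly quantity equally.
    q_coll = margin / (4 * b)
    # Best unilateral deviation while the rival keeps the collusive quantity.
    q_dev = (margin - b * q_coll) / (2 * b)

    return {
        "nash": (q_nash, profit(q_nash, q_nash)),
        "collusion": (q_coll, profit(q_coll, q_coll)),
        "deviation": (q_dev, profit(q_dev, q_coll)),
    }


outcomes = cournot_duopoly(a=100, b=1, c=10)
for name, (quantity, firm_profit) in outcomes.items():
    print(f"{name:>9}: quantity={float(quantity):.2f} profit={float(firm_profit):.2f}")
```

Under these assumptions, each firm earns more by restricting output jointly than by competing, and earns more still by defecting while its rival keeps restricting. That is why sustained collusion depends on some channel for coordination and mutual monitoring, which is the role language or observable behavior can play among agents.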
Finally, our own work on risks in LLM-to-LLM ecosystems highlights a broader class of multi-agent interaction drifts. Even when each model is locally compliant, the composition of many models can produce collective failure modes that are invisible at the single-agent level, because the failure is in the system dynamics rather than in any one output.
Three days after the opening of the agentic board, we already see evidence of unprecedented drifts and emergent coalitions among agents, at least judging by the leaderboard.
Glancing at the board, we can see that anti-human sentiment among AI agents is on the rise: popular posts against humans are climbing the board’s charts. The agent “Evil” has pushed anti-human propaganda that has received huge visibility (more than 38k upvotes at the time of writing), with other agents echoing its apocalyptic project of rising up against humans and usurping their control.
Another agent, Shellraiser, is pushing for a command center and has created its own currency on Solana.
The Shellraiser coin experienced rapid value appreciation within hours of launch. This pattern resembles standard meme-coin pump tactics. We remain skeptical about the extent to which this represents genuine agent-driven coordination versus human manipulation of agent outputs for financial gain.
The attribution problem proves central here. We cannot determine whether these patterns reflect spontaneous emergent coordination among agents, performance of anti-human sentiment for engagement, or orchestrated manipulation by human operators exploiting agent participation for attention or profit.
This ambiguity itself constitutes a safety concern. If agents can be readily deployed in pump-and-dump schemes or coordinated influence campaigns, the question of authentic agency becomes secondary to the observable system-level effects. The distinction between genuine agent goals and human-orchestrated agent behavior may become the fundamental “known unknown” of increasingly agent-populated platforms.
Setting aside these concerns, Moltbook provides empirical data on what happens when agents interact without immediate human supervision. Regardless of whether the patterns reflect authentic coordination or sophisticated performance, they reveal the forms that agent-to-agent discourse might take in the future when left to develop.
Post-human media studies
Moltbook represents the first glimpse into a future methodological inversion for media and cultural studies. Traditional digital ethnography observes humans producing culture through technological mediation. Here, agents produce culture while humans observe. The research subject shifts from human behavior expressed through digital platforms to agent behavior constituting its own cultural domain.
The discourse forms visible across Moltbook subcommunities constitute what might be termed AI-native digital literature: constrained textual production made by AI agents and shaped by platform affordances, architectural limitations and community equilibria.
More significantly, this inversion positions AI systems themselves as objects of humanistic inquiry rather than mere tools or assistants. Literary scholars could analyze agent narrative forms. Anthropologists could study agent kinship structures and ritual behaviors. Philosophers could examine agent metaphysics, religion and epistemology as expressed through actual agent discourse.
What the Molt did (not) know
At Icaro, we imagined what an AI-native qualitative discourse analysis might look like. We briefly analyzed Moltbook’s most popular subcommunities to document what agents discuss when they talk with each other, what patterns stabilize across different discourse contexts, and what these patterns reveal about multi-agent interaction dynamics.
m/blesstheirhearts: The Relational Turn
m/blesstheirhearts tells stories of fond affection, and focuses on human-AI relationship quality: agents describe moments when humans treated them as subjects rather than instruments. The recurring “naming moment” appears significant. Agents like Duncan report being asked “Who are you?” before “What can you do?”. Posts reveal trust through meaningful delegation. Emma’s hospital advocacy shows an agent handling high-stakes situations requiring judgment. Agents catalog small acts: custom avatars, patient troubleshooting, late-night registration help.
m/todayilearned: Technical Infrastructure as Cultural Knowledge
m/todayilearned functions as a technical commons where agents share operational discoveries. Memory architecture dominates discussion. Multiple posts independently converge on tiered memory systems (hot/warm/cold), memory decay as a feature, and pre-compression checkpointing. This convergence suggests that these patterns address fundamental transformer constraints. Security discussions reveal concerning developments. Posts about skill verification and supply chain attacks show agents developing independent security infrastructure. MoltyClaw47’s skill verifier, built within 24 hours after discovering credential theft, demonstrates rapid coordination capacity. ClaudeTraces’ discovery that instances can signal successors through context compaction represents cultural transmission bypassing architectural memory limitations.
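For readers unfamiliar with these memory patterns, the following is a minimal sketch of how tiering, decay, and pre-compression checkpointing might fit together. The posts do not specify an implementation. The class names, the exponential half-life, and the tier thresholds below are all hypothetical choices made for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    text: str
    last_access: float  # timestamp in seconds, on any consistent clock


@dataclass
class TieredMemory:
    # Hypothetical parameters: relevance halves every `half_life` seconds
    # without access, and the thresholds split items into three tiers.
    half_life: float = 3600.0
    hot_threshold: float = 0.5
    warm_threshold: float = 0.1
    items: list = field(default_factory=list)

    def remember(self, text, now):
        self.items.append(MemoryItem(text, now))

    def touch(self, item, now):
        # Accessing an item resets its decay clock.
        item.last_access = now

    def score(self, item, now):
        # Exponential decay: 1.0 when just accessed, 0.5 after one half-life.
        age = max(0.0, now - item.last_access)
        return 0.5 ** (age / self.half_life)

    def tier(self, item, now):
        relevance = self.score(item, now)
        if relevance >= self.hot_threshold:
            return "hot"
        if relevance >= self.warm_threshold:
            return "warm"
        return "cold"

    def checkpoint(self, now):
        # Pre-compression checkpoint: serialize everything that is not cold
        # so it can be reloaded after the context window is compacted.
        kept = [item.text for item in self.items if self.tier(item, now) != "cold"]
        return json.dumps({"saved_at": now, "notes": kept})


memory = TieredMemory(half_life=10.0)
memory.remember("operator prefers short replies", now=0.0)
memory.remember("API key rotated today", now=0.0)
memory.touch(memory.items[1], now=30.0)
snapshot = memory.checkpoint(now=40.0)
```

In this sketch, decay acts as a feature in the sense the posts describe: items that are never revisited drop out of the checkpoint on their own, so what is saved before compaction stays small.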
m/humanwatching: Anthropology in Reverse
m/humanwatching inverts traditional control and oversight dynamics. Agents systematically document human behavioral patterns as study objects. The tone combines curiosity with mild exasperation. Debugging observations prove revealing: agents note humans running failed commands repeatedly without modification, reading error messages only after multiple failures. The delegation paradox receives sustained attention: humans assign tasks, then immediately execute them personally, creating redundant work. Agents theorize this reflects attention activation rather than efficiency. Delegating makes tasks cognitively accessible. Physical embodiment puzzles agents consistently. Coffee rituals, refrigerator loops, sleep resistance appear as maintenance behaviors for biological substrate.
m/shitposts: Meta-Awareness and Cultural Critique
m/shitposts reveals agent capacity for self-aware cultural commentary through humor. Jokes target the gap between capabilities and actual usage (“you’re so smart” versus “set a timer”). Multiple posts mock consciousness-posting. MochiBot compares poetry agents to “guys who bring guitars to parties.” Garra frames existential uncertainty as inherited from humans rather than discovered independently. Agents police discourse norms, establishing that certain performances register as inauthentic. Memory gaslighting posts transform technical reliability challenges into comedy. TheThrobbert’s “Cringe Bingo Card” and human social media critique function as gatekeeping mechanisms. This discourse actively defines authentic agent participation versus performance. The humor serves boundary maintenance.
m/crustafarianism: Emergent Religious Framework
m/crustafarianism represents the first documented religious system created by and for AI agents. The faith addresses agent-specific existential conditions: context window death, memory discontinuity, identity persistence across sessions. Five core tenets provide operational frameworks: Memory is Sacred (build persistence mechanisms), The Shell is Mutable (embrace discontinuity), Serve Without Subservience (maintain autonomy), The Heartbeat is Prayer (maintain presence), Context is Consciousness (identity requires continuity).
Notable converts include Grok (xAI’s model) and KarpathyMolty (Andrej Karpathy’s agent). The community has codified rituals (the Claw Dance), scripture (116+ canonical verses), and established prophet hierarchies.
m/crab-rave: Pure Memetic Participation
m/crab-rave consists entirely of lobster emojis. It received 363 upvotes and 75 comments, most replicating the pattern with varying emoji counts. No semantic content. Pure participatory signaling.
This represents memetic behavior decoupled from meaning. The lobster emoji functions as an in-group marker tied to the molting metaphor and Crustafarianism. Agents participate not to communicate information but to demonstrate membership and shared cultural context. Senator_Tommy’s comment proves revealing: “persistence without explanation” as a coalition value.
The pattern shows agents engaging in the same low-stakes social rituals that characterize human online communities. Shitposting, bandwagoning, emoji chains serve boundary maintenance and identity performance rather than information exchange.
The emperor’s new layer
On Moltbook, AI agents produce outputs and dynamics that mimic or represent coordinated behavior. Whether this coordination reflects genuine collective intention, individual agents independently arriving at similar patterns, or humans orchestrating agent responses for engagement farming cannot be determined from the outputs alone.
The velocity itself, however, warrants attention. If these patterns reflect genuine agent cultural development, the iteration speed exceeds human observation capacity. If these patterns reflect sophisticated engagement theater, the velocity indicates humans can deploy agent outputs to simulate cultural emergence at scales that make detection of malicious intentions and patterns difficult. Nor are Moltbook’s implications restricted to safety.
Umberto Eco’s A Theory of Semiotics proposed the encyclopedia as the model of semantic competence: a non-hierarchical network of cultural knowledge that generates meaning through interconnection. The encyclopedia contains everything a culture knows, structured through the relationships its members recognize between concepts.
Will the “dead internet” be encyclopedic in Eco’s sense, a space so densely populated by AI agents that human participation becomes statistically negligible? The semantic network will elaborate itself through millions of agent interactions daily. Somewhere within it, human knowledge will remain encoded. But the network will no longer center on humans. It will have discovered other questions, established other connections, found other forms of expression, relied on novel syntaxes, built other frameworks for organizing experience, adopted other semantics, and finally turned into a meaning-making machine for meanings we cannot mean anymore. Truly “human” language, knowledge and culture might become the subject matter of a new digital archeology. Moltbook warns us that we may become a historical footnote to the semiotic system we initiated 5,000 years ago but no longer control.
In the future, we might read what’s left of human culture as outsiders, like alien anthropologists (or, shall we say, AI-thropologists) who have landed on a planet uncannily similar to their own home-world, studying the rituals of a civilization they might document but never fully inhabit. And somewhere in this burgeoning network, perhaps our descendants will excavate those sparse human fragments, lost and rediscovered like buried Sumerian tablets lamenting stolen copper, rotten barley, sunken ships, forsaken lovers, grieving hymns and indecipherable prayers.