It’s rightly considered hacky to start an essay with a definition, but when we’re dealing with a topic that is effectively speculative fiction, it’s good to make sure we’re on the same page.
Anyway this whole essay is the definition, so I’m in the clear.
As terse as I can get it
Augmented reality is a medium that layers digital elements onto real-world sensory input.
We don’t know what we’ll use AR for in the future, any more than we knew what computers would be used for in the 1960s, nor the internet in the 1980s, nor electricity in the 1880s, etc.
Lots of people are doing a lot of speculating but widespread adoption could well be 10+ years away, maturity much further than that, so it’s really anybody’s guess. Hell, we’re still finding uses for electricity and computers. The internet I think we’re done with.
Because of that naive perspective, at this point, the most we can say is that AR is a form factor. It’s a medium where digital elements (we’ll call them ‘augments’) are inserted into our sensory experience alongside the things we see and hear (and maybe smell and feel) while moving our bodies through reality.
The important part - the new part - is that we’re talking about digital things sharing our world, not our screens, with us. We will be invited to stop treating the digital as though it were a diorama in a box. As the technology matures, it will be hard to think of digital things - augments - as anything less than an equally different kind of real.
The genie leaves the bottle
Right now, computing happens behind LCDs and inside smart speakers, and we interact with the computers in our lives by going to them, by looking into the rectangles they use to communicate with us, or by speaking with them in the fixed points in space that they occupy.
The computer is a thing in a place; we peer into its world through its screen, and then we look away, and walk away, into reality, to do reality stuff, and the computer stays behind.
The computer is a timeless, blind oracle, trapped in a box. Despite all its knowledge of the world, every time we engage with it, it’s as though we’ve woken it from a dream. We have to remind it where we are, and who we are, and explain to it in stilted, unnatural language the thing we’re looking at, or were just thinking about, or talking about in its presence, because it misses all of that.
With AR - wearable computers that inject their stuff between reality and our senses - computing sublimes and becomes ambient. No longer do we have to go to the computer, to hunch our focus into the slab, to twist our minds into a perfectly flat pile of rectangles and backlit text.
We’re upright, we move our necks to look around us, our eyes accommodate and converge as they focus on objects in the foreground, in the middle distance, or out to the horizon, and we use our hands and arms and shoulders and feet naturally to move through 3-dimensional space as our sea cucumber ancestors intended. Our brain works as it evolved to work: spatially.
With computers that see what we see, hear what we hear, and make sense of the world something like we do, the primary thing we think of as computing - the thing we’ve grown so accustomed to that we don’t notice we’re contorting ourselves to do it - visiting the computer and explaining ourselves to the computer - begins to disappear.
What is the Metaverse tho
No fucking clue. Although, to be fair, I’m starting to get what is meant by the term, but I don’t think its hype jibes with how it’s being explained, so I’m back to thinking I don’t get it.
I’m not going to cool-kid strawman this thing — I’ll try to star-man if I can, but do drag me in the comments if I’m being unfair1. Here’s what I think is meant by most uses of the term:
The Metaverse is the product of interconnecting a number of virtual 3D worlds such that users and virtual things can move freely between them. It is, itself, a virtual 3D world.
I’m paraphrasing, and this might be recognizable as a simplified version of Matthew Ball’s definition [emphasis his]:
“The Metaverse is a massively scaled and interoperable network of real-time rendered 3D virtual worlds which can be experienced synchronously and persistently by an effectively unlimited number of users with an individual sense of presence, and with continuity of data, such as identity, history, entitlements, objects, communications, and payments.”
Continuing with Ball — he later offers a generalized definition less specifically focused on ‘real-time rendered 3D virtual worlds’:
“The Metaverse, like the internet, mobile internet, and process of electrification, is a network of interconnected experiences and applications, devices and products, tools and infrastructure.”
Of course, once you’ve removed the focus from what the Metaverse is used for, we’re really not making a distinction from today’s internet. That’s not very helpful, but it’s not my issue with the Metaverse.
Herein lies the dissonance: many definitions of the various Metaverses center, as Ball’s does, around ‘real-time rendered 3D virtual worlds’, but then immediately spiral out into claims that that same Metaverse will also encompass just about everything digital.
I should call out that this isn’t particularly an issue with Ball’s conception of the Metaverse but more with how it’s been interpreted by others.
My trouble with this is that there is an enormous surface area for utility in AR that has little to do with ‘real-time rendered 3D virtual worlds.’ I really don’t have a problem with the ‘Metaverse’ conception until it’s framed as the superset of all things spatial computing. That’s where I lose the plot.
My personal wig is extra-twisted because my hobby horse is left out. The paradigm I’m envisioning is not particularly concerned with the strictly virtual as it exists independent of the real, nor is it particularly focused on real-time 3D rendering, even if that does come into play to some degree.
Why AR matters
The most important thing AR has to offer is the ability to connect information to people, places, and things directly.
For the whole of human history, information has been stored and made available via two primary methods: experts and archives.
In order to avail yourself of the knowledge of an expert you need to know an expert. You need to be in proximity to that expert. And you need to have a compelling reason for that expert to tell you what you want to know, or, for deeper subjects, to teach you what they know.
In order to access information in archives you need to be in proximity to those records, you need to know how to use them, and you need to know what questions to ask. That last bit - the need to know what to ask - is a profound barrier to learning so fundamental to being human it’s hard to imagine how things could be different.
Even with resources, it is extremely difficult to learn within the realm of the Rumsfeldian unknown unknown. But with AR used to anchor knowledge to the real, wearers could become collectively transhuman, with the accumulating benefit of the shared expertise of all who contribute. Like the Borg, but hopefully less colonial about stuff.
We’re already experiencing a taste of this with the internet, where answers abound, but to get information today, you have to ask the right questions, in language, and it’s often quite a long road to the answer you seek. Many of us are almost reflexively aware of the approximate difficulty of googling any given question even before we set out to do it, and moderate our expectations accordingly.
For a concrete example, think about what it takes for you to identify a plant. Without experience in botany, you lack the tools to describe the features of the plant. “It’s … tree-ish? It has a kind of trunky … uh … stalk? Kind of … bark-y but green? And it’s … short. Not that short? Has leaves. Small, feathery leaves.”
Even if you had access to a botanist, the terms you’re using and the details you’re noticing probably aren’t helpful. And unless you’re dealing with a particularly distinctive plant, you’re not going to get anywhere searching the internet with that kind of language.
For a personal example — here’s a conversation I had with a pollinator-specialist entomologist friend:
This conversation went on for about an hour and involved me sending nine unhelpful pictures and three videos before we felt confident we had an ID … this to accomplish something my entomologist friend could have done in an instant on her own were she there in person.
So I got my ID, but I got it by accessing the help of a (very generous) expert, for a very long time, sending images and video, and offering loads of (nearly useless) observations along the way.
Within reach, I also had a book, The Social Wasps of North America, in which the mud daubers in question (they turned out to be unusually aggressive mud daubers) do not feature (maybe they’re not social?) but in which many things that look like mud daubers to my eye have full-page features.
I’m sure the distinction is clear to those who know how to look, but the 208 species in the book are ID’d by remarks about their margins or tibiae or scutum or their ocular-macular space. Anyway, the problem is, while the book is a very reasonable $24.99, it’s almost useless without the expertise to use it.
So in this concrete, seemingly life-or-death example, where I was harried about by irate mud daubers and felt like at least knowing the name of my enemy could save my bacon2, neither access to expertise nor archives got me my answer in anything like a reasonable way, and still I was privileged to have both, as few do.
Now that I know what a mud dauber looks like I’ll never forget it, but that doesn’t help the next schmo in my situation.
But imagine a computer that sees what you see, knows where you are in the world, knows the time of year, the time of day, the local habitat, and uses those bits in concert with, e.g., a neural network, some estimated geometry via a NeRF, or more likely several techniques not yet invented, to make your answer available to you from a new, third source — an expert archive. Not an archive of experts but an archive that is, itself, an expert.
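To make that a bit more concrete, here’s a toy sketch of how such a lookup might combine what you see with where and when you are. Everything here is invented for illustration - the data structures, the archive format, the dot-product scoring - and the real thing would lean on models far more sophisticated than this, but the shape of the idea is: score candidates by visual similarity, then prune by season and habitat, the way a field biologist does.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Observation:
    """What the wearable already knows the moment you look at something."""
    image_embedding: list[float]  # visual features from the headset camera
    month: int                    # time of year narrows candidate species
    habitat: str                  # e.g. "suburban", inferred from location

def identify(observation: Observation, archive: list[dict]) -> Optional[dict]:
    """Pick the best visual match among candidates that could plausibly
    occur in this season and habitat."""
    def similarity(a: list[float], b: list[float]) -> float:
        return sum(x * y for x, y in zip(a, b))  # toy stand-in for a real model

    plausible = [
        entry for entry in archive
        if observation.month in entry["active_months"]
        and observation.habitat in entry["habitats"]
    ]
    if not plausible:
        return None  # a lookup miss: fall back to other sources, or a human
    return max(plausible,
               key=lambda e: similarity(e["embedding"], observation.image_embedding))
```

The interesting part is what isn’t in the function signature: you never describe the thing in words. The context rides along for free.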
Imagine a world where, as the simplest starting point, everyone knows what almost everything is. What kind of world would that be? How would life be different if you rarely, if ever, had reason to think ‘what is that?’
Going one further — what if every person, place, and thing had - not just a name tag - but a pointer attached to it, a semantic anchor, onto which anyone could attach any information they had to contribute?
What if that information was actively contributed by individuals and institutions alike, and users could give preference to different sources based on the context?
In the case of our insect ID, that group might be the Entomological Society of America. For birds, we might prefer the Cornell Lab of Ornithology, falling back to the Audubon Society in the case of a lookup miss.
We might use Google’s lookup as a baseline for most things3, but override it for the topics on which we’d prefer to take the word of others, and further override those sources with our own definitions and mappings, perhaps for personal items or for secrets.
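A sketch of what that override chain could look like in practice (the source names, the anchor format, and the idea that any of these institutions publishes such a feed are all assumptions of mine, not real APIs): resolution might be as simple as walking an ordered list of sources per topic and stopping at the first hit.

```python
from typing import Optional

# Per-topic preference chains: earlier sources win, later ones are fallbacks.
# Every source name here is hypothetical.
PREFERENCES = {
    "insects": ["personal", "entomological_society", "google_baseline"],
    "birds":   ["personal", "cornell_lab", "audubon", "google_baseline"],
}
DEFAULT_CHAIN = ["personal", "google_baseline"]

def resolve(anchor: str, topic: str, sources: dict) -> Optional[str]:
    """Walk the user's preference chain for this topic; first hit wins.

    `sources` maps a source name to that source's {anchor: info} table.
    """
    for name in PREFERENCES.get(topic, DEFAULT_CHAIN):
        info = sources.get(name, {}).get(anchor)
        if info is not None:
            return info          # hit: no need to fall back further
    return None                  # miss across every subscribed source
```

Note that the personal source sits first in every chain: that’s what lets your own notes and definitions shadow whatever any institution says about the same thing.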
We might subscribe to the markup of an individual in matters of opinion or taste, giving, for example, an art critic’s commentary place of preference where it becomes available in the presence of (or within sight of representations of) pieces on which she’s published.
We might bubble up public records for a view into the hidden infrastructure under our feet, or for the permitting and violations and tax history of any building within view.
We might let an independent laboratory’s reports on product safety call our attention when something in our proximity has an outstanding recall, or contains an especially toxic material or other risk to our health.
We could superimpose on packaged goods and cars and appliances actuarial calculations for embodied carbon, or total cost of operation, or resale value, or aggregated consumer sentiment from reporting bodies and nonprofits we trust.
We might give our favorite artist’s work free rein, letting their ‘sculpture’ and ‘graffiti’ and ‘paintings’ and ‘softworks’ share our vision of the real on any walls and streets and halls where they leave them.
We might leave notes for ourselves on the faces of the people we meet.
A world after names
Let’s put a pin in this for now, but I hope you see where I’m going — what if all of our information - words, images, videos, ‘holograms’4, interactive code - data, reference, news, art, and personal expression - was anchored directly onto the people, places, and things of the real world?
This is the real promise of AR: computing vanishes and becomes an expert in our head. The ideas of the world, the world’s nouns and proper nouns, its people, its images, and every point in space on the planet - the very things themselves and not their names - become the tethers to which anyone can attach any form of communication.
This is a world where language is no longer a requisite intermediary for expression. If I have something to share with you about a building, or a certain hat, or a painting, or a boat, or a person, or a mountain, or a logo, neither one of us needs to know what the thing is called.
We should be in awe of the implications. We should be troubled as well.
We should be thinking about what it means that, the way things are going, a small handful of corporations will be in the ‘who says’ position of deciding what information is attached to all of the things and places and people, and how we’re allowed to participate.
I have some … alternative suggestions. Smash, if you will, that subscribe button.
To be as fair as possible, I should note that many earnest discussions of the Metaverse are accompanied by disclaimers stating the authors are out on a limb themselves. Also I turned off comments.
I really don’t think knowing what they were would have lessened the sting if they got me but it adds a certain frisson when I describe it this way, and I owe you a certain frisson once in a while for slogging through this shit
I’m not saying this exists yet. We’re speculating.
Don’t @ me, I know what a goddamn hologram is