30 minutes into the future
Apple Vision Pro launched in Australia on Friday 12 July 2024, but somehow I had forgotten all about it, despite the barrage of advertising on the App Store. The US launch was massively hyped on YouTube and on tech blogs, but the wider release was considerably more muted. The following Friday, Jess and I were in Sydney for a show, five minutes’ walk from an Apple Store. “Wait a minute”, I thought to myself, “is Vision Pro out yet? Can you book a demo session?” The answer to both questions was “yep”, so I booked us in for a demo the following day. I was happy that Jess agreed to join me, giving her the full context for my overexcited ravings in the days and weeks to come.
Full disclosure: I don’t have much experience with virtual reality (VR) or augmented reality (AR) devices. I used the HTC Vive years ago at a friend’s house, and thought it was fun. But it was fleeting fun, just like Rock Band, Kinect and Wii Fit were fun for a while, before the peripherals became e-waste cluttering your cupboards.1 The Vive is the extent of my VR experience; I don’t know anyone who even owns a Meta Quest (or will publicly admit to it). That’s part of the appeal of Vision Pro for me: to take cool technology and give it a compelling, lasting use case. Apple is often criticised for not being ‘first’ to new technologies, but it’s often first to give those technologies a thoughtful purpose.
We arrived at the Apple Store and checked in, and a staff member sat us down so we wouldn’t bump into anything in our VR stupor. I scanned my face with my own iPhone to measure the right light seal for the headset. The assistant scanned my glasses (you’re asked when booking the demo if you have a prescription) and magnetically clipped ZEISS lenses to the inside of Vision Pro. I put the headset on, went through a short calibration process in the darkness, and then I was suddenly thrust into the world of ‘spatial computing’.
Deja view
Vision Pro is a VR headset masquerading as an AR headset. True AR would be something like a holographic projector, layering information on top of what your eyes can see in the real world, like Microsoft HoloLens. Instead, Vision Pro uses cameras and screens to create an illusion of augmented reality: cameras capture the outside world, layering a user interface on top of this representation. Your eyes see only two ultra-high-resolution displays, 50x the density of an iPhone’s screen, with the appearance of your app windows floating in the real world. Whether you find this a convincing illusion or not will determine what you think of the experience in general: personally, I found it breathtaking. I quickly forgot that I was seeing the world through a camera. I didn’t feel any latency between my head movements and the camera feed. I didn’t notice the edges of the field of view, and as I moved around, the glassy spatial windows remained locked in place2.
One thing I did notice: a slight fuzziness to objects when I wasn’t staring straight at them. This is a side effect of foveated rendering, a smart way to reduce the processing load on Vision Pro (PlayStation VR2 uses it too). If you’re not staring directly at something, there’s no need to render details you can’t perceive. It makes sense from a technical perspective, but it distracted my literal perspective. It’s a little aggressive. I was never quite sure if my prescription was right or Vision Pro was dialled in properly.
I aim with my eye
Spatial computing requires new paradigms of interaction. You can’t rely on a mouse to point in 3D space3: when you think about aiming a gun in a videogame, you’re pointing a crosshair to shoot through a line within that space, but you don’t have direct control of selecting along that line. You control visionOS with your eyes and hands: look at what you want to interact with, and pinch together your finger and thumb to select it. Vision Pro is riddled with sensors and cameras, inside and out, so you don’t need to hold your hands in front of you like you’re conducting an orchestra. You can rest your hands on your lap; it will see them just fine.
Eye tracking is an alien experience. You don’t realise how accustomed you are to the separation of eyes and hands for computing until you need to use them in unison. Now, every glance at the user interface has weird control implications. You can’t click in one place while you look at another because looking and clicking are now intrinsically linked.
These kinds of unnatural interactions aren’t new, of course. When the iPad came along, wet hands or resting your fingers on the wrong corner of the screen would lead to uncontrolled screen spasms. We had to learn how to type reliably, swipe, and pinch on that giant screen. These adjustments take time: I upgraded to an Apple Watch with an always-on screen almost a year ago, after eight years without one, and I still reflexively raise my wrist like I’m revving a motorbike to ‘wake up’ the screen.
I’m confident that in time, I could adjust to the new paradigm of pointing with my eyes. But from my limited demo experience, it just isn’t reliable enough. Even during the calibration process, Vision Pro wasn’t accurately tracking my eyes towards the periphery of my view – and that’s where most of your controls are. Similar to macOS, buttons and widgets lurk around the edges of the windows. Perhaps I could have recalibrated for better success but, within the limited demo time, this alone puts me off buying Vision Pro. It was always a little jerky and inaccurate, like moving a mouse with a dwindling battery, or trying to type with a broken keyboard. Yet, it’s worse than either of those examples because at least those have a mechanical root cause. Vision Pro’s problems emerge from its reliance on algorithms and trust.
Vision Pro does support the ‘Magic’ Trackpad for mouse-like control within apps, and third-party keyboards4 are essential for serious productivity (just like on tablets). There’s a virtual keyboard you can poke at, and they’ve tried their best, but it’s barely suitable for the lightest of typing.
Tears on Tape
Once I was calibrated and oriented with Vision Pro, the highly scripted demo began for real. My guide could see everything I could, from an iPad tethered to the headset. She showed me how to open Safari and manipulate windows: there’s a grab handle in the corner you can pinch to move them anywhere within space, including above you, which arcs the windows in an (invisible) dome around you like the Millennium Falcon’s gun turret.
On to the Photos app, where I stretched out a panorama to a massive field of view. Panoramas look pretty neat, but not as cool as the ‘spatial’ photos and videos. These can be taken while wearing Vision Pro if you’re that weird dad from the adverts, looking like Robocop patrolling a birthday party, but more likely you’ll want to use an iPhone 15 Pro or newer.
Spatial photos look almost holographic. They’re not the same as stereoscopic 3D: they have a real depth and shape to them, more than just a parallax effect. Spatial videos are astonishing: they’re like gazing through an uncanny portal into the past, closer to the full fidelity of a memory than a typical photo or video.
I am not sure if I’m ready to see lost loved ones in such a potent way. If I could watch spatial videos of Spyro on this thing, I’d fill the inside of Vision Pro with tears. visionOS recently5 added a new feature to ‘spatialise’ regular photos. I’m not sure if processed photos will be as compelling as photos initially captured as spatial images. But I am sure that, once you’ve experienced memories this way, you will find it difficult to go back to the older, flatter ways. Much like Apple’s Live Photos, they add a little extra flavour, and photos seem blander without it.
Face/Off
Apple is primarily marketing Vision Pro as the ultimate video entertainment device, like strapping a cinema to your face. I can’t remember what clip I watched in the TV app, but I remember how I watched it. I was surrounded by a virtual theatre with a very convincing cinemascape, right down to the textured sound-dampening ceiling tiles. You can choose to ‘sit’ in the middle, back, or front of the cinema – I’m not sure why the front is an option, but it’s there for masochists6. You can also control whether you’re in the stalls with the plebs, or elevated into a royal box.
This is a superlative movie experience: a convincing illusion of an enormous, perfectly lit cinema screen, without the dickheads scrolling on their phones or muttering to themselves. Even the sound, which comes from speakers built into the Vision Pro strap, is surprisingly good – albeit impossible to objectively judge in a noisy Apple Store.
Just as spatial photos are a quantum leap over regular photos, ‘immersive video’ is a paradigm shift over regular video. Remember 3D movies? Those years when every movie was in 3D, glasses wearers couldn’t attend the cinema without neck braces to support their six-eyed screen experience, and TV manufacturers were desperately trying to make 3DTV in the home a thing?
Well, this isn’t that. Immersive video makes Avatar look like a children’s pop-up book. While watching the demo reel, it’s like you are teleporting to the side of a basketball court, or a mountain peak to watch a tightrope walker defying death. Afterwards, the guide commented she’d loved watching my reactions to the underwater footage and the herd of elephants on safari. I was absolutely enthralled. This was almost good enough to make me interested in watching live sports. Almost.
Virtual reality check
The immersive video reel marked the end of the demo and, since you’re reading about a Vision Pro demo and not a Vision Pro review, I obviously didn’t buy one. Vision Pro is an astonishing glimpse into the future, and it will probably be a part of my future, just not today.
There are thousands of reasons not to buy a Vision Pro today – and I don’t just mean the AU$6000 asking price, which is enough money to buy a new iPhone, iPad, and MacBook Pro. The eye tracking isn’t reliable enough. The immersive video is spectacular, but there’s not a lot of it7. Third-party software is in a dire state due to Apple’s years of developer hostility.
My biggest problem with Vision Pro is the sheer loneliness of it. It only supports one user, like an iPad; yet unlike an iPad, did I mention it costs $6000? You might say Apple are encouraging families to purchase more than one, but let’s get real: for most families the choice is one Vision Pro, or zero.
As great as the movie-viewing experience is, who wants to watch movies alone? I only watch TV, and play a few games, together with my wife. That’s not possible with Vision Pro: you can look at spatial video representations of old family memories, but it’s a time capsule for one. And while you’re walled off from the world in VR, you won’t be forming new memories with your family.
If you live alone, or you’re in a relationship and would like to spend your way closer to living alone, Vision Pro is a relatively affordable home cinema for your face. It does more than a TV and soundbar, and I think the viewing experience would be better. The weight is probably more fatiguing, but that was tough to judge from the demo.
This loneliness is the most surprising thing about Vision Pro for me, because Apple’s greatest strength is in humanising technology. Whether it’s the Mac, iPhone, iPad, or Apple Watch, they’ve all brought technological innovations to the average human, usually in thoughtful ways that seem obvious in hindsight. I hoped that Vision Pro would do the same, taking the VR possibilities of the HTC Vive and Meta Quest and turning them into a mainstream hit. I’m not sure if it will get there.
Maybe that’s because this is a first-generation product that needs time to find its feet (like the Apple Watch). Maybe we need another technological leap forward before spatial computing reaches its potential – like using regular glasses lenses as screens8. Maybe VR headsets are destined to remain niche products, and it’s a category even Apple can’t ‘solve’. For now, Vision Pro has all the technology, but it lacks a little humanity.
Tunnel vision
Here’s where this would end in a traditional review: weird intro, cover the main bases, pithy outro, publish to WordPress. Job done9. In this case, a heavily staged 30-minute demo in an Apple Store isn’t nearly enough time for a fair review, especially with an entirely new product category. It would also take more than a short demo for me to part with that much money for an unproven product. It’s different to the Apple Watch or the iPad, which were undoubtedly indulgent purchases – especially that first Watch with no GPS and the super-slow processor, what was I thinking? – but they also had clear use cases. The iPad is the perfect light computer for web browsing, email, reading and watching videos. I wish it could become a ‘heavier’ computer running macOS, but it serves a niche.
The Apple Watch is perhaps the best comparison. It was a confused product at launch before it found its calling as a health tracker and fitness motivator. It replaces the iPhone if you’re going for a run, but for everything else, it’s a companion – not the main event.
Apple are positioning Vision Pro as the new main event – “the era of spatial computing”. It’s a great video-watching experience, but is that enough of a niche?
1. Although in the case of Rock Band, those peripherals cluttered my house for a very long time.
2. If you’ve watched videos of Vision Pro, especially reviews, it might look like the windows aren’t so stable, but you’re seeing the head of the wearer making micro-adjustments without the natural corrections of your brain’s own visuo-spatial perception. From the perspective of the wearer, it’s rock solid.
3. Maya and other 3D design apps have figured this out, but those are niche user interfaces that can’t scale to a whole operating system.
4. Maybe that’s ‘keyboard’, I’m not sure if non-Apple models work.
5. Editing this essay took an embarrassingly long time…
6. I’ve only watched one movie from the front row: Star Wars Episode 1: The Phantom Menace. They were the only seats left in the cinema. Imagine sitting through 2.5 hours of trade negotiations and Jar Jar Binks, giving your teenage self permanent neck damage in the process. Never again…
7. And the Alicia Keys studio performance I caught a glimpse of in the demo reel was a little creepy, almost voyeuristic.
8. I took so long to edit this essay that Meta showed off their AR glasses during the editing process, and I’ve waffled on enough about adding my impressions. In three words: real artists ship.
9. I am omitting the final step after publishing to WordPress – “spot and correct the typos that eluded me during the editing process”.