By Matthieu Rouif, CEO, Photoroom
The UK’s decision to invest billions in AI infrastructure is the right one. Compute, the processing power behind today’s models, is as fundamental as electricity. But electricity sitting in the grid does nothing on its own – it only creates value when people can plug in and use it. In the same way, compute only drives productivity when businesses and workers can access it directly.
The issue is usability
Despite the infrastructure boom, adoption remains patchy. Large corporates may have dedicated AI teams, but most smaller firms – and the employees on the front line – are being left behind. The challenge is not willingness. Businesses understand the potential of AI, but most people are not “prompt engineers.”
Relying on carefully crafted text instructions is like asking everyone to learn a programming language before they can turn on the lights. It creates a skills barrier that slows adoption and concentrates benefits in the hands of specialists. To unlock widespread productivity, the interface must change.
The cause is text
AI today is designed to respond to written prompts. But a blank text box is not an intuitive starting point. It can be intimidating, inconsistent, and time-consuming. Creativity stalls when people are forced to translate ideas into precise wording.
Images offer the opposite experience. A photo is immediate, universal and anchored in reality. With image-first inputs, ambiguity falls because AI has something concrete to work with – a product, a space, a document. Consistency improves because outputs flow from the same visual baseline.
At Photoroom, we see this daily. A single photo of a shoe, a shelf or a receipt can be cleaned, standardised and repurposed across multiple formats in seconds. Teams without design expertise can effortlessly create assets that meet brand guidelines. Workflows move faster and results are more accurate precisely because they begin closer to the truth.
The solution is to design for images
Image-based AI does more than simplify workflows – it builds trust. In e-commerce, buyers want to zoom in on details. In regulated industries, provenance and traceability are essential. Starting from a real photo means edits are visible, approvals trackable and compliance easier to demonstrate. Trust matters, and grounding AI in reality rather than imagination makes it possible.
Momentum is already building. A TikTok shop can compete with established brands by turning phone photos into polished, studio-quality campaigns. Marketplaces scale this same approach through integrated tools that process millions of listings. In 2024 alone, Photoroom users edited 2.2 billion images, with installations surpassing 300 million across 180 countries. Partnerships with Amazon and DoorDash, and on Warner Bros’ Barbie campaign, show that this principle extends across industries: start with reality, then let AI enhance rather than hallucinate.
For the UK, the choice is clear
Billions in compute are the engine, but the steering wheel is the interface. Government pilots should prioritise image-first tools in areas where cameras are already standard, from inspections to permits and benefit verification. Public funding should back image libraries that can be safely used across key sectors such as healthcare, logistics and agriculture. And meaningful allocations of compute should go to smaller firms building visual-first tools, so they can deliver practical products rather than stall at prototypes.
The most universal input device in history is already in our hands: the camera. If photos are the global language, then the path forward is clear. Let people show what they mean and use words only to refine. That’s how the UK can turn infrastructure into adoption, and adoption into visible, measurable productivity – across shops, studios, warehouses, and council offices alike.







