Cloud AI Is In The Wrong Place For The Media Industry
As Big AI, typified by hyperscalers such as OpenAI, xAI, Meta, Microsoft and Google, continues and even accelerates its exponential growth, it's an important time to ask whether this approach is the right one for the media industry.
This is a conversation that's distinct from the one about whether AI itself is a Good Thing, and what its role should be in the creative space. This one is about something simpler: when AI does make sense, where should it be running?
Many industries assume that most of AI's power will continue to come from the cloud providers who are aggressively leading the development of new Large Language Model (LLM) technologies – and of newer, bigger approaches that may build on, or even cut against the grain of, LLMs while consuming even more aggregate GPU power.
However, incredibly powerful AI tools can now be run on premise, using standard computer hardware like the latest Mac minis and Mac Studios, high-end gaming GPUs from Nvidia, and more. These tools can make massive contributions to creative workflows, and they are distinctly better than Big Cloud AI for the media industry on several key grounds – which I will simplify as Cost, Trust, Speed and Guilt.
Cost
Cloud AI providers take advantage of the 'rent-seeking' aspects of the cloud economy to charge multiples of their underlying hardware cost for GPUs and storage. In most cases, you are either paying heavily marked-up rates – often priced per minute of footage processed – or riding on billions of dollars of short-term VC investment that will eventually force your supplier to make customer-unfriendly choices. Neither option is very attractive.
Meanwhile, on-premise AI costs, well, what it costs. The hardware itself is getting dramatically better over time – the latest Mac minis measure 5 x 5 x 2 inches and can handle supercomputer-class AI workloads – and much like the original PC-versus-mainframe wars of the '80s and '90s, this is a case where the little guys are likely to surprise us with their power over time. Those Mac minis, and affordable if sometimes clunky PC gaming machines with Nvidia cards, can already do a surprising amount of AI work while taking up far less space and power than you'd expect. Our own Axle AI Tags software can perform scene understanding, vector search, trainable face recognition, object and logo recognition, optical character recognition (OCR) and speech transcription, all using only local resources, with no cloud connection required. Meanwhile, compact versions of LLMs – and even a new breed of Small Language Models (SLMs) – are now fully capable of running on local hardware as well.
Trust
Cloud AI's power, even in the latest multimodal models, derives in large part from 'scraping' large parts of the public and semi-public Internet. When you send your media to a cloud provider, you are putting your intellectual property at risk. Exactly how much risk depends on your situation and on who your provider is. But as recent blowups over the terms of service of Amazon's Rekognition, Adobe's Sensei, ByteDance's CapCut and, most recently, even WeTransfer have shown, cloud vendors seem to be edging toward a stance where they own the right to reprocess and train on your content; all four of these vendors have put out terms of service asserting significant rights to reuse your content for training purposes.
For nearly every content owner we work with at Axle AI, this is a bridge too far. Why expose your valuable IP to the possibility of being turned into someone else's product, with no attribution? This is especially true for visual effects, or for cinematographers' styles, which are notoriously hard to protect. With the latest multimodal models, like Google's Veo 3, the risk can even extend to acting and dialogue. The possibilities are endless, and not in a good way. Content owners and creators are right to look for ways to benefit from AI's power – rather than the other way around – while keeping control of their intellectual property.
Speed
95% or more of media files are stored on premise. Typically, these are held on a mix of Network Attached Storage (NAS), loose hard drives, RAID arrays, LTO tape libraries, and occasionally high-performance storage area networks (SANs). With the increasing trend towards 4K and higher-resolution footage, these files are very large, and the bandwidth limitations of even today's high-speed internet mean that uploading them to the cloud for processing is necessarily slow and inefficient compared to local options, which can leverage wired 10 Gigabit Ethernet and even faster links.
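To put rough numbers on that gap, the arithmetic below compares moving a modest 2 TB project to the cloud over a 500 Mbps business uplink versus across a wired 10 GbE local network. All of the figures – project size, link speeds, and the 80% utilization factor – are illustrative assumptions, not measurements.

```python
# Illustrative comparison: uploading 2 TB of footage to the cloud vs. moving it
# over a local 10 GbE link. All figures are assumptions for the sake of the math.

def transfer_hours(size_tb: float, link_mbps: float, efficiency: float = 0.8) -> float:
    """Hours to move size_tb terabytes over a link_mbps link at the given utilization."""
    size_bits = size_tb * 1e12 * 8               # decimal terabytes -> bits
    effective_bps = link_mbps * 1e6 * efficiency  # usable throughput in bits/sec
    return size_bits / effective_bps / 3600

footage_tb = 2.0                                  # a modest 4K project
cloud_upload = transfer_hours(footage_tb, 500)     # ~500 Mbps internet uplink
local_10gbe = transfer_hours(footage_tb, 10_000)   # wired 10 Gigabit Ethernet

print(f"Cloud upload: {cloud_upload:.1f} h")   # -> Cloud upload: 11.1 h
print(f"Local 10GbE:  {local_10gbe:.1f} h")    # -> Local 10GbE:  0.6 h
```

Even under these generous assumptions for the internet uplink, the local transfer finishes roughly twenty times faster – and most real-world uplinks are slower than 500 Mbps.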
For many media AI applications, lower-resolution proxy versions of the original media are actually a better fit for AI processing needs, and on-premise hardware can be used to generate these proxies. But once you have hardware in place to do all that work on hundreds of terabytes or even petabytes of media, why not upgrade it a bit and run the AI locally as well? This is the realization that many media companies are increasingly coming to.
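As a sketch of what that local proxy step can look like, the loop below prints an ffmpeg command that would transcode each camera master into a 540p H.264 proxy. The file names, scale height, and CRF value are illustrative assumptions, not a prescribed workflow; remove the `echo` to actually run the transcodes.

```shell
# Dry run: print an ffmpeg proxy-generation command for each (hypothetical) master.
# Remove "echo" to actually transcode; paths and settings here are illustrative.
mkdir -p proxies
for src in master_a.mov master_b.mov; do
  proxy="proxies/${src%.mov}_proxy.mp4"
  echo ffmpeg -i "$src" -vf scale=-2:540 -c:v libx264 -crf 28 -c:a aac "$proxy"
done
```

A higher CRF (lower quality) is usually fine here, since the proxies feed AI analysis rather than final delivery.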
Guilt
Recent headlines have included “Meta will build data center the size of Manhattan with latest AI push”, “Google inks $3bn US hydropower deal as it expands energy-hungry datacenters”, and “Three Mile Island Is coming back to life – to power Microsoft's AI”. And this is all for what amounts to the first generation or two of Cloud AI. You don't have to be a hardcore environmentalist to be concerned about the long-term impact as these trends continue to accelerate. We will all pay for it, either in environmental costs or in resources diverted away from other, potentially better goals.
Many people in the media industries have chosen their work in part because it's not harmful – other professions require ethical tradeoffs, put lives at risk, or squander major resources, and media work has largely been spared these concerns. Suddenly, media organizations – along with the rapidly growing media operations in every sector of the economy, including sports, houses of worship, corporations and government – will have to reckon with the environmental impact of their work. But it doesn't have to be that way. By running its AI on powerful, increasingly efficient local computing hardware, the creative world can continue to deliver its value without harming the world at large.
Admittedly, there are benefits to be gained from using Cloud AI, and the latest multimodal and generative AI models will continue to be developed there. Nobody is suggesting an outright ban or boycott on this kind of solution. But given the clear advantages of on-premise AI, and the combination of immediate (cost, trust and speed) and big-picture (guilt) benefits of choosing local processing, the media industry has a sensible, clear path forward – and it's not the cloud.
🎥 Watch the Interview
For more on this topic, check out this short video of Sam Bogoch in conversation with Larry Jordan, where he shares his insights on why the cloud isn’t the ideal home for AI in the media industry.
By Sam Bogoch, CEO of Axle AI.