
March 22, 2024, Pedro Trillo

Lights, Camera, Sora AI: The Tech Driving the Future of Film.

Generative AI Sora video

In the heart of bustling Brooklyn, in a small Italian restaurant called «Mamma Mia,» two usual suspects from the neighborhood, Joey «Snoring» LaMotta and Frank «the Quiet» Costello, sit down to enjoy a delicious plate of pasta. Between the laughter and the tempting smells from the kitchen, an unexpected topic comes up: the revolution of generative artificial intelligence. It all starts with a very peculiar video starring Will Smith.

Joey: Hey, Frank. The other day, I ran into Vito Sora at the corner of 36th and 30th, and he told me that people are starting to make little movies with GenAI or something. Have you heard anything about this?

Frank: GenAI, what the hell is that, Joey?

Joey: It’s like a kind of magic dwarf inside a computer. You can make videos of almost anything, and when I say anything, I mean anything. He showed me a video of Will Smith eating spaghetti, can you believe it?

Frank: Will Smith eating spaghetti? What’s next, De Niro playing with a jazz quartet? Where is this world going?

Joey: Unbeliever! With today’s technology, anyone can be a Scorsese.

Frank: I don’t understand. Why the hell would anyone want to see Will Smith eating spaghetti?

Joey: Who knows, Frank? People have strange tastes. Maybe one day I will miss the old days, when stars like us sat in a place like this and ate spaghetti off a plate.

Frank: I hope those days come back, Joey. Now everything is artificial and soulless. What happened to the good times? Where’s “The Godfather”?

Joey: Times are changing, Frank. You must keep up with disruptions, even if that means seeing Will Smith eat spaghetti.

Frank: I don’t know anything about this whole generative mess, but I can assure you that it will never surpass the real, the authentic, like the spaghetti we have in front of us.

Here’s to craftsmanship, Joey!

Joey: I’ll drink to that, Frank, my friend!

AI video Will Smith eating spaghetti

OpenAI unveils Sora.

The community was eager to learn what recipe OpenAI was cooking up for generative video, and once again we have been stunned: this talented team is making the exceptional routine, roughly every six months.

Sora is not just a text-to-video model; as OpenAI explains on its website, it is a step toward a simulator of the three-dimensional world. It represents a world that we are already beginning to understand very well: one governed by the rules of physics, camera positions in a scene, and the properties of light on objects. At first glance, we can conclude that the disruption of stock video platforms is imminent, but Sora goes far beyond first impressions.

Developing a model of the world is not trivial. Other technologies have run into significant walls because they need, as badly as water, to understand the world in three dimensions before they can evolve. For autonomous driving or robotics to be deployed on a global scale, a three-dimensional world data model is needed so that machines can understand, reason, and make decisions based on enriched information. In time, there will be Soras that specialize in solving these problems through simulated video generation.

These kinds of models could completely change the way video games are created, and even the metaverses of the future could be reinvented through these new three-dimensional diffusion models. Transformer architectures tailored to each use case and technological environment will be built.

Sora will prove to be key in the race to AGI, carrying even more weight than ChatGPT-style multimodal models. No data set is richer or more informative than the video signal: video merges text, image, and sound. It makes sense that AGI will not emerge from the evolution of a single model in an isolated silo, like ChatGPT, but that the spark will ignite at the point where different evolved models of text, audio, and especially video, like Sora, collide.

The future of the video industry.

More than ten years ago, I worked in the video streaming industry. My work focused on building a CDN (Content Delivery Network) platform that provided Internet streaming services for OTT (over-the-top) providers. As a sound and image engineering student, I also did some video streaming work. All told, I have spent about four years in this industry, so I have a reasonable idea of how this world works.

It’s a highly complex industry, and a lot happens behind the scenes when you sit on your couch and press the Netflix button. The value chain begins with video production, which requires a lot of human and technological resources, so creating quality video is very expensive. Then begins the tortuous legal path of licensing that content for exploitation. Finally, the video is distributed across different channels and platforms.

The first disruption began with the digitization of content: once digital, it became replicable. Then the Internet created an uncontrollable distribution channel. Years passed, content platforms like Netflix became the norm, and today most of us consume content by paying for subscriptions. Over those years, disruption focused on changing the distribution model, but the way audiovisual content was created in the first place had barely changed.

These new text-to-video models radically change how video is created at the source. Suppose you want an aerial shot of a cliff where waves break against the rocks. You can rent a helicopter with a pilot, install an aerial camera operated by an expert camera operator, and record that shot at a cost of approximately $10,000. You can hire a high-resolution drone and plan a flight for about $2,500. Or you can go to Sora, write a prompt, launch the process, and get a spectacular video in seconds for $0.04.
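To put the figures above in perspective, the arithmetic can be sketched in a few lines of Python. The dollar amounts are the rough estimates quoted in the text, not measured prices, and the names `costs` and `cost_ratio` are hypothetical, chosen only for this illustration.

```python
# Approximate cost in USD of one aerial shot, as quoted in the article.
# These are the author's rough estimates, not real price quotes.
costs = {
    "helicopter": 10_000.00,
    "drone": 2_500.00,
    "sora": 0.04,
}


def cost_ratio(option: str, baseline: str = "sora") -> float:
    """Return how many times more expensive `option` is than `baseline`."""
    return costs[option] / costs[baseline]


if __name__ == "__main__":
    for option in ("helicopter", "drone"):
        ratio = cost_ratio(option)
        print(f"{option}: about {ratio:,.0f}x the cost of a generated clip")
```

Under these assumed figures, the helicopter shot costs roughly 250,000 times as much as the generated clip, and the drone shot roughly 62,500 times as much.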

Sora OpenAI Text To Video – Drone View Of Waves

Historically, since the Internet arrived in our lives, technological advances have always hit the video and music industries first. They are, in a way, the guinea pigs: the initial disruptions occur there and spread to other industries in later years. What is happening today with tools like Sora is changing the game in this market. The change that will take place in these audiovisual industries will be profound, and that is no small matter.

Sora opens new opportunities and new video startups.

If it does not exist, it will have to be invented, because there is a gap to fill. New video platforms will emerge based on the architecture of the creator economy; I imagine a Substack for video content creators. After all, soon we will all be Scorsese. All of us will have the capacity to create and produce high-quality audiovisual content, and everyone will be able to develop their own scripts and series. New video formats and audiovisual experiences that do not yet exist will emerge.

Just as we subscribe to creators’ newsletters, we will subscribe to content from aspiring film directors. In some remote Himalayan village, there is a nine-year-old boy with a special gift for making cinema, and today he doesn’t even know he has that innate talent. When he gains access to Sora, he will be empowered to showcase his art to the world.

Audiovisual entertainment will become a fully personalized experience: you will log in to Netflix and no longer spend half an hour looking for something to watch. Today’s recommendations will evolve into a tailor-made, highly personalized experience. You will choose the length of the film, write a short script as a prompt, define the actors, select the locations, and hit the generate button. The computational challenge will be to play that content in “almost” real time.

With GenAI, the script is never written in advance, and the possibilities are endless. Before you watch the second episode of your ultra-customized series, you will probably choose, at the end of episode 1, how you want the story to continue.

Generative artificial intelligence will open up a new world of hyper-personalization with endless possibilities, in which you will be the protagonist of the story: you will create your own movies, compose your favorite songs, and build your own custom software. This is just the beginning. Welcome to a new era of generative video.

Pedro Trillo

CEO at Vizologi

Pedro Trillo is a tech entrepreneur, telecommunications engineer, founder of the startup Vizologi, specialist in Generative Artificial Intelligence and business strategy, technologist, and author of several essays on technology.