Ever wonder what AI technology will do next? The field of AI has made impressive gains in image understanding in the past decade, and it will make considerable advances in video understanding in the next decade. How will this impact video production? There is more and more demand for video online. Younger generations face a low barrier to just trying things and starting to film, whether in Snapchat or TikTok or the next thing. We can look to the history of photography. As more people gained access to point-and-shoot film cameras, as the quality went up for single-click mobile phone cameras, and as distribution became easier, demand and usage just grew. This is true from Kodak to iPhone.
However, video production has unique challenges: time, consistency, sound, acting. Even when focus, exposure and color balance are “solved,” more of the burden falls on the creative’s shoulders, and this can be intimidating. In traditional film and video production, teams of experts overcome this through specialized roles. The role of AI in video production needs to facilitate these disparate human responsibilities, enabling non-expert creatives to indicate aesthetic preferences that guide chains of actions. Together, groups of people need to “steer” an AI technology. Or, looking at it from the other side, a small group of people can be empowered to do the work that previously required a much larger crew with more expertise. They can fill in the gaps with crowd knowledge, best practices, and AI.
Let’s look at “workflows” that facilitate production. These workflows let content move from the edges to a centralized record of re-watchable / reusable / evaluated content. At first, I expect a system of distributed human contributions to provide value with NO role required for AI. That is to say, the initial tools will be built for collaborative and social video, and AI will enter the picture at a later stage when UX, speed and cost become driving factors. In this case the media framework is an organizational tool, a contract between roles: useful, but taking no creative action. At a later stage, the digital actions of contributors can be estimated and proposed, especially in post production. At some point creative work may feel more like “evaluating alternatives” and adjusting high-level parameters. For example, rather than adjusting 8 knobs on each video clip to gain color consistency, a user may express the intent to group six clips together with a common target look and feel, and all other clips with a different look. Then an AI could propose a few different versions of the look and feel for each group. But I digress. At first the company needs to find a sustainable source of value: to be a workflow that coordinates between roles in a mobile production environment. That means simplifying the creative actions of contributors so they can focus on the essence of what matters in a good story.
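To make the “evaluating alternatives” idea a little more concrete, here is a minimal Python sketch of the color example above. Everything in it is an assumption for illustration: the Look and LookGroup types, the three parameters, and the propose_looks function are hypothetical, and the “proposer” is a plain arithmetic placeholder rather than any real AI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Look:
    """Hypothetical high-level color parameters shared by a group of clips."""
    temperature: float  # warm/cool shift, arbitrary units
    contrast: float
    saturation: float


@dataclass(frozen=True)
class LookGroup:
    """A user's intent: these clips should share one look and feel."""
    name: str
    clip_ids: tuple[str, ...]
    base: Look


def propose_looks(group: LookGroup, steps: int = 3, spread: float = 0.1) -> list[Look]:
    """Placeholder for an AI proposer: vary the base look in even steps.

    A real system would presumably derive candidates from the footage itself;
    this only shows the shape of "evaluate alternatives" from the user's side.
    """
    candidates = []
    for i in range(steps):
        # Offsets are centered on zero, so the middle candidate is the base look.
        offset = (i - (steps - 1) / 2) * spread
        candidates.append(
            Look(
                temperature=group.base.temperature + offset,
                contrast=group.base.contrast,
                saturation=group.base.saturation + offset,
            )
        )
    return candidates


# Six clips grouped under one target look (clip ids are made up).
interior = LookGroup(
    name="interior",
    clip_ids=("clip01", "clip02", "clip03", "clip04", "clip05", "clip06"),
    base=Look(temperature=0.2, contrast=1.0, saturation=1.0),
)
options = propose_looks(interior)
```

The point of the sketch is the interface, not the math: the user names a group and picks among a few candidates instead of touching every knob on every clip.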
I do think there is a role for “AI plugins” that focus on a single recurring media task. For example, filtering out bad video clips prior to reviewing footage (say, sections with unusable camera motion). Maybe color correction? The alternatives to compete with are the existing set of tools. These tools will improve with AI, and they have already discovered a workflow (UX) that aligns with their audience’s desired tradeoff between speed and control when selecting parameters. Yet plugins don’t have the customer loyalty and lack insight into their context. Even companies like Codex / Pix are looking to create a consistent workflow for the “connected set” where all the media from the production shoot migrates to the cloud. It seems that their killer feature was to maintain the ARRIRAW file as a lossless compressed file instead of ProRes, striking a balance between giving cinematographers more control and saving on the storage cost of raw media. But owning the whole workflow means owning the media needs (sync, review, commenting, security). And in many ways the smart AI for the future of media transformation should be “context aware” (of the project, the team, the roles, those users’ past requests, past transforms, etc.). This demands maintaining the chain of digital actions from RAW files through the various transforms from various users, especially including selection. The more custom tools are used, such as a custom notch filter to cut out a background sound, the more challenging it becomes to have one system keep track of all transforms. Indeed, the existing collections of digital tools inside the biggest NLEs define a relevant baseline API, because that’s what creators currently use, and it’s unlikely they want to abandon features they like. (However, doubling down on social sync might obviate the less used features …) As the best content involves money and complex teams, the approval process for content needs to mirror that social complexity.
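As a rough illustration of what “maintaining the chain of digital actions” could look like, here is a small Python sketch. It is not how any of the products named above work; the Action and Asset types, the field names, the file name and the NLE_BASELINE tool set are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Action:
    """One digital action on a piece of media (all field names are hypothetical)."""
    user: str
    role: str
    tool: str
    params: dict = field(default_factory=dict)


@dataclass
class Asset:
    """A source file plus the ordered chain of actions applied to it."""
    source_file: str
    history: list[Action] = field(default_factory=list)

    def apply(self, action: Action) -> "Asset":
        # Append only; earlier actions are never rewritten.
        self.history.append(action)
        return self

    def actions_by_role(self, role: str) -> list[Action]:
        return [a for a in self.history if a.role == role]

    def untracked_tools(self, baseline: set[str]) -> list[str]:
        # Tools outside the baseline are the ones a single system struggles to follow.
        return [a.tool for a in self.history if a.tool not in baseline]


# Hypothetical baseline standing in for the tools an NLE already exposes.
NLE_BASELINE = {"select", "trim", "color_correct", "sync_audio"}

clip = Asset("A001C003.ari")  # hypothetical file name
clip.apply(Action("dana", "editor", "select", {"in": 12.0, "out": 19.5}))
clip.apply(Action("lee", "colorist", "color_correct", {"look": "warm"}))
clip.apply(Action("sam", "sound", "custom_notch_filter", {"hz": 60}))
```

With a record like this, a “context aware” assistant would at least have something to read: who did what, in which role, with which tool. Calling clip.untracked_tools(NLE_BASELINE) surfaces the custom notch filter as the step that falls outside the shared baseline.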
Honestly, Frame.io’s approach seems spot on for acquiring the creative users that glue it all together: the editors. These users will likely have the skills to command (and train) the future content creation AIs. Certainly Adobe is thinking about AI for video: they know the recurring media tasks that will save time for their customers.
What about content production for social media channels? And I don’t mean live streaming services, but something a bit more structured and planned at the moment of the shoot. We see the world moving in this direction. TikTok exists. Challenges provide a scaffolding for what to do. Snapchat filters nudge you into action. I keep an eye on Vimeo’s smart templates for business owners (previously Magisto). Google and Apple produce videos from our camera roll and send them to us (e.g. no-parameter or low-parameter video types such as “make me a Valentine’s Day themed video with these two people”). I am fascinated by creating simple social roles for content creation, especially for camera operators that fit into a bigger picture of content production. One thing that is not consistently defined is the motivation for people to work together. I personally dream of the satisfaction of collaboration, but how many people do? The community at HitRecord is a very interesting case study. I don’t think the community is merely JGL fandom; I think these people are excited about the vision of co-creating. Yet I think a more realistic vision of content production is that some people will receive content quickly (and may pay for it), while others will complete media tasks quickly (and may get compensated for it), though possibly some of the compensation is brand equity, a credit coin, or something that is redeemable for future value within the production ecosystem. Others might take part simply because they want to be included in the process, or because they care about the outcome of the content.
But what is the problem being solved? I see three different value propositions for where a collaborative video production product might go, and they lead to three very different businesses.
Efficiency. Save time and money making video for a business. A pipeline from ideas to sharable social videos. This lets companies explore and control their messaging without sinking a ton of money into an external agency for every video. Priority is to maintain quality and control.
Aspirational. Improve skills and network with creatives. Allows creative talent to improve their skills in a gamified world of speed production. Team formation. Level up and learn new techniques. Gimbal-free pans. Soft focus on mobile. Priority is improving and sharing creative skills. Business model: like Meetup. Pay to control a project, free to join as a credited contributor. Maybe a gig economy one day? Or maybe centered on networking, with freemium mini-gigs to land serious creative contracts.
Entertainment. Have fun creating content with friends. Ridiculous situations and speed challenges. Push your creative spirit, and play together with media, even when you are in different locations. Priority is an enjoyable shared production experience. Business model: maybe in-app purchases? A marketplace for project structure / games.
It is interesting that the online video editor WeVideo seems to have 3 separate channels: for Business, Education and Life. These roughly track my three value propositions above. However, their “educational” channel is much more a tool for teachers, and less a learning tool for independent, networking semi-professionals. And their “life” user seems more focused on family self-documentation, and less on the social world of video challenges that are in vogue.