Artificial intelligence continues to break new ground. OpenAI has unveiled Sora, an AI tool that brings text-to-video generation to the creative industry. The tool can generate highly realistic videos of up to 60 seconds from simple text prompts, marking a significant leap in AI-generated content and raising both possibilities and concerns about the future of deepfakes.
Sora is currently being made available to a select group of artists and filmmakers, as well as to researchers who probe AI systems for their potential for malicious use. It represents the next evolution of OpenAI’s image-generating tool, DALL-E. The tool interprets user prompts, turns them into detailed instructions, and employs an AI model trained on video and images to craft entirely new content. With AI-generated imagery, audio, and now video advancing rapidly, the race among tech giants like Google and Meta and various startups to develop increasingly capable tools is heating up.
In an email statement, a representative from OpenAI emphasized that the model will not be widely released in the company’s products in the near future. The representative added that the company is sharing its research progress now to gather early feedback from the broader AI community.
OpenAI, known for its popular chatbot ChatGPT and text-to-image generator DALL-E, has been among several tech startups at the forefront of the generative AI boom since 2022. In a recent blog post, the company said that Sora can accurately generate multiple characters and various types of motion.
“We’re teaching AI to understand and simulate the physical world in motion, with the goal of training models that help people solve problems that require real-world interaction,” stated OpenAI in the post.
In its blog post, OpenAI acknowledged that Sora may encounter difficulties in accurately capturing the intricacies of physics or spatial details within more complex scenes. This limitation could result in the generation of illogical scenarios, such as a person running in the wrong direction on a treadmill, subjects morphing in unnatural ways, or even disappearing inexplicably.
Despite these challenges, OpenAI showcased numerous demonstrations with hyper-realistic visual detail, which could make it hard for casual internet users to distinguish AI-generated video from genuine footage. Examples included drone footage of waves crashing against the rugged coastline of Big Sur, illuminated by the setting sun, and a scene of a woman leisurely walking down a bustling Tokyo street still wet from the rain.
On Thursday, the Federal Trade Commission (FTC) proposed rules that would prohibit the impersonation of individuals, including through AI-generated representations of real people. The rules would extend existing protections against the impersonation of governments and businesses.
In a news release, the FTC stated, “The agency is taking this action in light of surging complaints around impersonation fraud, as well as public outcry about the harms caused to consumers and to impersonated individuals.”
The FTC said that emerging technologies, including AI-generated deepfakes, threaten to make this problem worse, and affirmed its commitment to using all available resources to identify, prevent, and mitigate impersonation fraud.
Relevant articles:
– Sora: Creating video from text
– OpenAI shows off life-like videos made with AI, The Washington Post
– OpenAI teases ‘Sora,’ its new text-to-video AI model, NBC News
– Meet Sora, OpenAI’s Text-to-Video Generator, CNET