
Runway Gen-4 is actually insane (text-to-video)
AI Generated Summary
Airdroplet AI v0.2
Runway just dropped their Gen 4 text-to-video AI model, and it's seriously impressive, marking a big leap in creating videos just from text prompts or images. The quality and realism shown in the examples are pretty wild, showing off major improvements in how realistic things look, how smoothly they move, and how well the AI understands the real world compared to older versions.
Here's a breakdown of what's covered and the key takeaways:
Introduction & Initial Wow Factor: Runway Gen 4 is here, pushing text-to-video generation forward significantly. The quality in the demo examples is described as "insane." An example shows a crosswalk sign figure realistically jumping off the sign and walking around, with accurate lighting and cars. Another example features unsettling bird-guy hybrids, labeled "nightmare fuel," including chicken, flamingo, and turkey versions. The flamingo-guy turning its head all the way around is highlighted as particularly creepy. The turkey version with huge necks is called "absolutely gross."
Hyper-Realistic & Cinematic Examples: Several demos look like they could be from actual movies. A scene resembling the movie Dune is shown. A woman walking through a forest demonstrates impressive object permanence – she looks correct even when briefly obscured by branches and leaves. Examples include a beautiful Chevy, a futuristic sci-fi scene (like Star Wars), a foggy environment (like Alaska), realistic water physics, and a Game of Thrones-style scene. Another car example doesn't look quite as good, but the physics are still noted as solid. A rocket launch is shown, described as "beautiful." Other impressive visuals include asteroids, an explosion, a woman realistically on fire ("fire looks crazy"), jellyfish with good physics, a Final Fantasy-esque scene, another spaceship, a stylized robot, a woman made of rope, and a monkey. A bridge collapsing on fire has some "oddness" but is generally incredible.
User-Generated Content & Flaws: An example by user Eleanor, created in 20 minutes using images from Reeve, shows people, and most of the people and hands look perfect. However, a mistake is spotted: one man's arm appears to be pointing the wrong way, and another arm seems to come out of a different person. Old cars driving by look good, and a woman's face, blinking, glasses, and hands are called "flawless." Another minor issue is noted with car wheels spinning strangely.
Official Gen 4 Announcement Details: Runway introduced Gen 4 as their new state-of-the-art AI model series for media generation and world consistency. It's positioned as a significant step up in fidelity (visual quality), dynamic motion, and controllability. Image-to-video features are rolling out immediately to paid and enterprise plans. Gen 4 is stated to be a marked improvement over the previous Gen 3 Alpha. Key strengths highlighted: generating highly dynamic videos, realistic motion, subject/object/style consistency, superior prompt adherence, and best-in-class world understanding. This "world understanding" (physics, object interaction) is noted as a crucial aspect, similar to what made OpenAI's Sora stand out.
More Diverse Examples: The presenter emphasizes that the current output is "the worst it will ever be," implying rapid future improvements. Cartoon examples that look like Pixar movies are shown and described as "gorgeous." A short film example, "The Lonely Little Flame," uses a clay-style (claymation/stop-motion) look, and the movement style matches the perceived material. It features a pigeon, a skunk, a flame character, and a rock character; in the story, the flame sets the skunk on fire and then befriends the rock. Overall, the short looks "beautiful" despite the odd story. Another short, "New York is a Zoo," shows various animals (rhino, giraffe, elephant, bear, zebras, lion, monkeys) in NYC settings. Some awkwardness in movement is noted, but overall it looks "really good." The monkeys on a traffic light look "flawless," reminiscent of Jumanji. The short film "The Herd" features a man chased by cows at night and was created using Gen 4 and image references combined with Act-One software. The cows look "flawless," but there's an "uncanny valley" feeling with the human character. A scene with gasoline, a match, and fire shows "incredible" and "very realistic" flames and smoke, and the silhouettes of cows against the fire are highlighted as impressive. The potential for generating full movies with just prompts ("vibe movie making") is discussed.
Detailed Analysis of User Examples: A subway scene shows a man standing still; the way a pole realistically passes between him and the camera while preserving details on his shirt (like folds) is called "very cool" and "very impressive." A man running looks "very natural." Paint splashing looks "really good" with beautiful detail. A car on a snowy mountain road looks "really good." Other clips show a flame in a glass ball and a guy in a room with light balls. A woman walking with an umbrella looks beautiful, but a flaw is noted: puddles react to her movement even when she doesn't touch them. A volcano scene looks "gorgeous." A fireworks/light explosion shows some minor mistakes. A guy skateboarding with an umbrella looks "really cool." An awkward scene of a mother and daughter arguing is deemed "not good actually": the characters don't look at each other, the daughter points randomly, the mother's posture changes oddly, and the daughter fumbles unrealistically with a suitcase. A fabric/felt-style animation features a bird, caterpillar, snake, porcupine, fox, deer, raccoon, and ducks. This style looks "phenomenal" and is a personal favorite of the presenter; the snake is called "unbelievable," and the cartoonish, fabric-y look stays consistent. As a minor critique, water dripping from the raccoon's mouth looks good, but the water it hits doesn't react quite right. The fabric ducks floating (instead of absorbing water) are noted but still look "so cute." A rabbit running with a light trail looks "pretty good." In a subway scene combining a hyper-realistic human, a 2D cartoon, and a fuzzy teddy bear, the lighting is praised for reflecting correctly and consistently across all three distinct styles and is called "very nice." A Pixar-style shot by Mad Pencil shows a character with electricity/lightning; the lighting is again called "beautiful." In a clip of futuristic cars backing up, the wheels spin unrealistically fast for the movement speed; it looks "okay" and "cool" but "far from realistic." A view from inside a luxury train looking out at a misty forest, with trees and scenery passing by consistently and realistically, is called "very nice." In a hyper-realistic clip of wolves, details like wet matted fur, teeth, tongue, nose, and hair look "really good," and focus shifts (defocus/unfocus) and water interaction are well handled and "really nice." In a hyper-realistic clip of a screaming woman with a turquoise glow, the teeth remain consistent, though some slight, awkward morphing of the tongue is observed; overall, it is considered "very, very good." A hyper-realistic model-shoot-style clip, with dewy skin and shadows moving across the face, is called "very impressive."
Concluding Thoughts: Gen 4 handles an immense amount of detail and nuance, making this a "huge launch" and "very, very impressive." Runway Gen 4 is definitely giving competitors like OpenAI's Sora "a run for its money."