Theme 07: The Route Dispute of AI Film and Television—2D Generation vs. 3D World Simulation
Core Argument
Sora's exit does not signal the death of AI film and television. What it shows is that the business model of generating video from scratch with brute-force compute is unviable without low-cost compute and a distribution platform. The future of AI video depends on the combination of technical route, cost structure, and distribution channels.
I. The Technical Divergence of Two Routes
Route A: 2D Generative (Sora / Veo / Seedance)
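Mechanism: Diffusion models or Transformers synthesize every pixel directly from a text description. Nothing in the pipeline is an explicit scene; the model simply predicts what each frame should look like, which is why the process is often loosely described as "predicting the next frame."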
Advantages:
- Extremely low barrier to entry—anyone who can type can generate videos.
- High visual impact, capable of producing stylized, breathtaking imagery.
- Rapid iteration, with model capabilities improving markedly every year.
Fatal Weaknesses:
- Character Inconsistency: The same character might look entirely different, wear different clothes, or have a different build across shots.
- Spatiotemporal Discontinuity: Lack of physics understanding leads to chaotic spatial and temporal relationships between frames.
- Compute Black Hole: Every generation requires massive computation from scratch; nothing can be reused.
- Uncontrollable Creativity: Generation works like a lottery. If a shot is unsatisfactory, it must be regenerated in full; specific elements cannot be precisely edited.
Route B: 3D World Simulation (Polyhedron Dimension, etc.)
Mechanism: First construct a 3D virtual world (scenery, characters, lighting, physical rules), then let AI "film" within this world. It is essentially an AI-accelerated version of the traditional CG filmmaking workflow.
Advantages:
- High Character Consistency: Fixed 3D models ensure a character's appearance stays the same from shot to shot.
- Accurate Physics: 3D engines naturally simulate gravity, lighting, and collisions.
- Reusable Assets: Once modeled, scenes and characters can be reused infinitely, driving down marginal costs.
- Controllable Creativity: Precise adjustments to camera movement, lighting, and actor performances are possible.
Weaknesses:
- High technical barrier; current maturity is far behind 2D generative models.
- 3D asset modeling still requires massive human or AI-assisted labor.
- Content quality (especially facial expressions and dynamics) still awaits a breakthrough.
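The cost contrast between the two routes (every 2D generation is computed from scratch, while 3D assets are built once and then reused) can be illustrated with a toy break-even model. This is only a minimal sketch: the function names are invented for illustration, and every number is a hypothetical placeholder, not a measured cost from any vendor or studio.

```python
import math

def cost_2d(shots, cost_per_generation, attempts_per_shot):
    """Total cost when every accepted shot needs several full regenerations."""
    return shots * cost_per_generation * attempts_per_shot

def cost_3d(shots, asset_build_cost, render_cost_per_shot):
    """Total cost when assets are built once and reused for every shot."""
    return asset_build_cost + shots * render_cost_per_shot

def break_even_shots(cost_per_generation, attempts_per_shot,
                     asset_build_cost, render_cost_per_shot):
    """Smallest shot count at which the 3D route is strictly cheaper.

    Returns None if a 3D render costs at least as much as an accepted
    2D shot, in which case the upfront asset cost is never recovered.
    """
    saving_per_shot = cost_per_generation * attempts_per_shot - render_cost_per_shot
    if saving_per_shot <= 0:
        return None
    return math.floor(asset_build_cost / saving_per_shot) + 1

# Hypothetical inputs: 2.0 per generation, 4 attempts per accepted shot,
# 5000 to build the 3D assets, 0.5 to render each shot from them.
n = break_even_shots(2.0, 4, 5000.0, 0.5)
print(n)                                   # 667
print(cost_2d(n, 2.0, 4), cost_3d(n, 5000.0, 0.5))
```

Under these assumed numbers the 3D route only pays off after several hundred shots, which matches the qualitative point above: reuse lowers marginal cost, but the upfront modeling burden is the barrier.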
II. Latest Battlefield of the 2D Route (As of March 2026)
Google Veo 3.1
- Status: Released Jan 2026, accessible via Gemini API, Vertex AI, and Gemini App.
- Features: Native audio sync (dialogue + foley), 4K upscaling, native vertical video. Integrated into YouTube Shorts and Google Ads.
- Structural Advantage: Custom TPUs slash costs + YouTube/Ad distribution → The 2D route works in Google's hands.
ByteDance Seedance 2.0
- Status: Rolling out on CapCut globally (Brazil, Indonesia, Mexico, etc.) starting March 2026.
- Features: Native A/V sync, character consistency, multi-camera generation.
- Structural Advantage: TikTok/Douyin super-distribution → Even with rented compute, distribution sustains it.
Kuaishou Kling
- Status: Topped $240M ARR with 60M users by late 2025.
- Features: Omni series models deeply integrated into professional workflows for marketing, e-commerce, and film.
- Structural Advantage: Kuaishou platform + aggressive commercialization strategy.
The Lesson of Sora's Demise
Sora's exit supports Theme 01's thesis:
It's not that the 2D route is a dead end. It's that OpenAI lacked both low-cost compute (unlike Google) and a distribution platform (unlike ByteDance/Kuaishou), possessing no structural advantages.
III. "Huo Qubing": A Warning of Data Fraud
Before discussing positive AI video cases, we must address the industry's most dangerous trap.
Incident Review
The Chinese AI mini-series Huo Qubing went viral in 2025, claiming:
- Production cost: 3,000 RMB.
- Production time: 2 days.
- Views: 500 million.
Media hailed it as a "landmark event" of AI overturning film.
Dismantling the Truth
- The director later admitted that most of the cited figures were exaggerated or based on rumor.
- The "500 million views" figure was likely an aggregate of all traffic under the related hashtags, not a count of views of the episodes themselves.
- On closer inspection, details such as clothing, weapons, and architecture were seriously historically inaccurate.
- The 3,000 RMB cost significantly understated the real effort, leaving out preparation, model training, and editing labor.
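The gap between hashtag traffic and actual episode views can be shown with a small calculation. The sketch below uses invented records and an invented data layout; none of the numbers are the real figures for Huo Qubing.

```python
# Hypothetical per-video records under one hashtag; all numbers are invented.
videos = [
    {"title": "Episode 1", "is_episode": True, "views": 1_200_000},
    {"title": "Episode 2", "is_episode": True, "views": 800_000},
    {"title": "Fan reaction clip", "is_episode": False, "views": 5_000_000},
    {"title": "News commentary", "is_episode": False, "views": 3_000_000},
]

def hashtag_traffic(records):
    """Sum views of everything posted under the hashtag."""
    return sum(r["views"] for r in records)

def episode_views(records):
    """Sum views of the actual episodes only."""
    return sum(r["views"] for r in records if r["is_episode"])

total = hashtag_traffic(videos)
own = episode_views(videos)
print(total, own, total / own)  # 10000000 2000000 5.0
```

In this made-up example the headline number is five times the real audience for the work itself, which is why the two metrics should never be reported interchangeably.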
Why This is Dangerous
When capital and attention flow toward "data-falsified successes":
- Investors make flawed decisions.
- Teams doing solid technical work are ignored.
- The entire industry's credibility is damaged.
IV. The Case for the 3D Route: Polyhedron Dimension
Company Background
Beijing-based Polyhedron Dimension has spent 12 years pursuing the 3D route and has developed the "Cyber Director" platform.
Checkable Claims ✅
- Founded in 2016, holds 16 invention patents.
- Listed in Beijing's "Specialized, Refined, Distinctive, and Innovative" SME directory in 2022.
- Served CCTV and China Mobile.
- Jointly released multimodal 3D video capabilities with Huawei Cloud.
Claims Requiring Skepticism ⚠️
- "Efficiency improved 100x," "cost down to 1%": Unverified claims straight from the company.
- "Monthly production of 1,000 titles": No independent capacity verification.
- "Releasing the first mini-series by Spring Festival with 20 titles" (promised early 2025): No concrete viewership data found.
- Mismatch between scale and vision: 37 to 70 employees versus a stated vision of "Achieving AGI."
Assessment
Unlike the Huo Qubing data fraud, this looks like a real technical foundation wrapped in overzealous marketing. The patents and clients are real, but the striking efficiency figures should be treated as marketing claims until launched works back them up.
V. Microsoft's "Zero Position" in AI Film and TV
Why is Microsoft Absent?
While Google launched Veo 3.1, ByteDance pushed Seedance, and Kuaishou pushed Kling, Microsoft has remained completely absent from the video generation market.
Reasons discussed earlier:
- Microsoft outsourced video AI entirely to OpenAI → Sora got axed → Arsenal returned to zero.
- Microsoft lacks a consumer entertainment distribution platform (no YouTube/TikTok).
- Microsoft is a "productivity company," genetically unsuited for pan-entertainment content.
What Can Microsoft Do?
Microsoft can only sell APIs in the background—providing compute via Azure to video tools. But this forever relegates it to a "cloud wholesaler" rather than the leader of the creator ecosystem.
VI. The Three Conditions for AI Video Survival
Players hoping to survive in the AI video market must fulfill:
- Can Afford It: In-house chips or ultra-low compute costs.
- Can Distribute It: Owned content distribution channels.
- Can Verify It: Works must stand the test of the market, not inflated data.
The future of AI video doesn't belong to the company with the most chips or the loudest claims—it belongs to whoever can get audiences to pay.