On December 15, 2025, Versee's Tech team officially launched an AI-powered automated video production pipeline — the result of over 6 months of continuous research and development. This is not merely a technology product, but a fundamental shift in how Versee produces content, moving the company from manual production to intelligent production.
The pipeline operates through 5 main stages. The first stage is scriptwriting, where large language models (LLMs) fine-tuned for specific content genres automatically write scripts based on input briefs. A short video script that used to take 30-45 minutes to write by hand can now be drafted by the LLM in about 2 minutes, along with multiple variations for the Content team to choose from.
The second stage is AI image generation — creating illustrations and visual assets for videos. The system uses the most advanced text-to-image models, combined with a style template library built by the Design team, ensuring generated images align with each channel's brand guidelines.
The third stage is video assembly, where AI automatically combines images, text overlays, and transitions into a complete video with appropriate rhythm and pacing.
The fourth stage is text-to-speech: an automated voiceover system with multiple natural voices, supporting both Vietnamese and English with adjustable intonation and speed.
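As a rough illustration of how the stages described so far might be chained, the following is a minimal sketch, not Versee's actual implementation. Every name in it (`VideoJob`, `write_script`, `generate_images`, `assemble_video`, `add_voiceover`, `run_pipeline`) is hypothetical, and each stage is a placeholder standing in for the real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class VideoJob:
    """Hypothetical container for a brief and the artifacts each stage adds."""
    brief: str
    script: str = ""
    images: List[str] = field(default_factory=list)
    video: str = ""
    voiceover: str = ""
    log: List[str] = field(default_factory=list)


def write_script(job: VideoJob) -> VideoJob:
    # Placeholder for the fine-tuned LLM that drafts a script from the brief.
    job.script = f"Draft script for: {job.brief}"
    job.log.append("scriptwriting")
    return job


def generate_images(job: VideoJob) -> VideoJob:
    # Placeholder for text-to-image generation guided by a style template.
    job.images = [f"image_{i}.png" for i in range(3)]
    job.log.append("image_generation")
    return job


def assemble_video(job: VideoJob) -> VideoJob:
    # Placeholder for combining images, text overlays, and transitions.
    job.video = f"video_from_{len(job.images)}_images.mp4"
    job.log.append("video_assembly")
    return job


def add_voiceover(job: VideoJob) -> VideoJob:
    # Placeholder for the text-to-speech voiceover step.
    job.voiceover = "voiceover.wav"
    job.log.append("text_to_speech")
    return job


# Stages run in the order the article describes them.
STAGES: List[Callable[[VideoJob], VideoJob]] = [
    write_script,
    generate_images,
    assemble_video,
    add_voiceover,
]


def run_pipeline(brief: str) -> VideoJob:
    """Pass one job through every stage in sequence."""
    job = VideoJob(brief=brief)
    for stage in STAGES:
        job = stage(job)
    return job
```

Calling `run_pipeline("30-second product teaser")` returns a job whose `log` lists the stages in the order they ran. Keeping each stage as a plain function over a shared job object is one simple way to let a single stage be swapped or re-run without touching the others.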
Measured results showed dramatic improvement: production time per short video dropped from 4 hours to 45 minutes, a reduction of over 80%. Average production cost decreased by 65%, while the number of videos producible per day increased fivefold, from 5 to 25. Notably, engagement rates on AI-pipeline-produced videos were no lower than those of manually produced ones, suggesting that quality was not sacrificed for speed.
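The headline figures can be checked with a few lines of arithmetic. The snippet below uses only the numbers reported above; the helper name `percent_reduction` is introduced here purely for illustration.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Production time: 4 hours (240 minutes) down to 45 minutes.
time_reduction = percent_reduction(4 * 60, 45)

# Daily output: 5 videos up to 25 videos.
throughput_multiple = 25 / 5

print(f"Time reduction: {time_reduction:.2f}%")
print(f"Throughput multiple: {throughput_multiple:.0f}x")
```

The time reduction works out to 81.25%, consistent with the "over 80%" claim, and daily output rises by a factor of 5.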
Critically, the AI pipeline does not entirely replace humans. Every video goes through a rigorous human review process before publishing. The Content team reviews scripts to ensure accurate messaging and contextual appropriateness. The Design team checks visual quality and brand consistency. The QA team performs final technical quality checks. This review process accounts for about 15 minutes of the total 45-minute production time, roughly a third, but these are the most important 15 minutes, because this is where human judgment and aesthetics ensure output quality.
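One way to picture this review step is as a gate that blocks publishing until all three teams have signed off. The sketch below is an assumption about how such a gate could be modeled, not a description of Versee's tooling; `ReviewRecord`, `REQUIRED_REVIEWS`, and the team keys are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict

# The three reviews named in the article, as hypothetical keys.
REQUIRED_REVIEWS = ("content", "design", "qa")


@dataclass
class ReviewRecord:
    """Tracks which teams have approved a video (hypothetical structure)."""
    approvals: Dict[str, bool] = field(default_factory=dict)

    def approve(self, team: str) -> None:
        # Reject sign-offs from teams that are not part of the review process.
        if team not in REQUIRED_REVIEWS:
            raise ValueError(f"Unknown review team: {team}")
        self.approvals[team] = True

    def can_publish(self) -> bool:
        # A video is publishable only when every required team has approved.
        return all(self.approvals.get(team, False) for team in REQUIRED_REVIEWS)


record = ReviewRecord()
record.approve("content")
record.approve("design")
blocked_before_qa = not record.can_publish()  # QA has not signed off yet
record.approve("qa")
ready_after_qa = record.can_publish()
```

With this design, a missing approval is treated the same as a rejection, so a video cannot be published by default.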
Next steps on the development roadmap include: integrating AI video generation to create footage instead of using only static images, building an automated A/B testing system to optimize performance, and expanding the pipeline to multilingual content production for international markets. Versee aims to have the pipeline capable of producing 100 high-quality videos per day by mid-2026.