An agem video built from one request and one song
How MEGA started from a single text request and a background track, then used the sandbox MCPs and skills to script, narrate and render this pt-PT promo.
The video above did not come from a studio. It came from a text box. We asked MEGA to make a promo for agem and attached a single file: the background song. Everything else — script, scenes, screen captures, pt-PT voice, transitions, audio mix, final render and this article — appeared from inside the sandbox, driven by installed MCPs and skills.
What went into MEGA
The brief was short: "make a promo video for agem." Attached: one MP3, an instrumental track chosen as the bed. No storyboard, no script, no prepared assets. The only textual raw material was the request itself; the only sonic raw material was the song. From there, MEGA handled the rest on its own.
The sandbox as a workshop
Every MEGA request opens a sandbox: an ephemeral, isolated machine with the site repository mounted, tools pre-installed, and a live timeline. Two kinds of pieces made this video happen — skills (small named recipes) and MCPs (typed gateways to external systems). For this request, MEGA loaded six active skills and four installed MCPs, and the agent pulled from them as the work required.
Skills mark the steps. One makes sure the brief has no ambiguity before any file is touched. Another emits timeline events so each operation is visible while it happens. A video authoring skill handles scene composition, drawing each shot in HTML/CSS and letting ffmpeg render it frame by frame. At the end, a skill takes care of the commit, the new branch, and opening the PR; in a feedback loop, another receives comments and pushes adjustments to the same PR rather than creating a second one.
MCPs give the senses. A filesystem MCP lets the agent read and write files inside the sandbox in a controlled way. A fetch MCP and a crawling engine touch the public pages of agem.pt, vadi.agem.pt and mens.agem.pt to check copy, colour and real screens. An agem-specific MCP brings internal context and provides the pt-PT voice — the narration you hear in the video. A Gitea MCP handles the closing steps: branches, pull request, comments and the link at the bottom of this article.
From request to MP4
With the pieces in place, execution was nearly mechanical. The agent read the brief, visited the site, picked screens, wrote a short script in European Portuguese, and generated the voice. It composed each scene as an HTML page, opened the pages in a headless browser to capture stills, and assembled clips with subtle zoom and varied transitions. It then overlaid the voice on the music with ducking, so the bed yields whenever the voice enters, and closed everything into a 1920 by 1080 MP4 at 30 frames per second. The same run opened the pull request with the video, the poster, this text, and the code changes.
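For readers curious what the ducking step can look like in practice, here is a minimal sketch that builds an ffmpeg command line in Python. It is not the command MEGA ran: the article does not say which filters were used, so the choice of ffmpeg's `sidechaincompress`, `apad` and `amix` filters, the codec flags, the file names, and every threshold, ratio and timing value are assumptions for illustration. Only the 1920 by 1080 frame size and the 30 frames per second come from the text above.

```python
from typing import List


def build_duck_mix_command(
    video: str,
    voice: str,
    music: str,
    output: str,
    threshold: float = 0.05,   # assumed: level of the voice that triggers ducking
    ratio: float = 8.0,        # assumed: how strongly the bed is pushed down
    attack_ms: int = 20,       # assumed: how fast the bed yields
    release_ms: int = 300,     # assumed: how fast the bed comes back
) -> List[str]:
    """Return an ffmpeg argument list that ducks `music` under `voice`.

    Inputs are indexed by ffmpeg as 0 (video), 1 (voice), 2 (music).
    """
    filter_graph = (
        # Pad the voice with silence so it never ends before the music,
        # then split it: one copy is heard, one copy only drives the compressor.
        "[1:a]apad,asplit=2[vo][key];"
        # Compress the music whenever the key (voice) signal is loud.
        f"[2:a][key]sidechaincompress=threshold={threshold}:ratio={ratio}"
        f":attack={attack_ms}:release={release_ms}[bed];"
        # Mix the audible voice with the ducked bed.
        "[vo][bed]amix=inputs=2:duration=longest[mix]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-i", voice,
        "-i", music,
        "-filter_complex", filter_graph,
        "-map", "0:v",
        "-map", "[mix]",
        "-s", "1920x1080",      # frame size stated in the article
        "-r", "30",             # frame rate stated in the article
        "-c:v", "libx264",      # assumed codec choices
        "-pix_fmt", "yuv420p",
        "-c:a", "aac",
        "-shortest",            # stop at the end of the shortest mapped stream
        output,
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command instead of running it.
    print(" ".join(build_duck_mix_command(
        "scenes.mp4", "voice.wav", "song.mp3", "promo.mp4")))
```

Building the command as a list rather than a single string keeps each argument separate, so it can be handed to `subprocess.run` without shell quoting issues. The values above would need tuning by ear against the actual voice and track.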
The result is a short pt-PT video, with the chosen music underneath, presenting agem as a living system: vadiagem as a local agenda, personagem as identity, mensagem as knowledge, GEMA as the financial system, MEGA as the software behind all of this, and GAME as the gamification layer. It opens with a single line — AGEM is not a name but the ending of many words — and closes on a green cursor blinking on AGEM.PT, waiting for the next request.
The end of the word. The start of the way.
agem