This seems like a great idea. Tools like video editors (and CAD) often impose a big learning curve - there is a big differential between "I want to do X" and actually knowing all the right buttons to press to do X. Good luck.
Who do you think your target customer is? Curious to know if you think the money is in short form, traditional YouTube videos, or even movie studios one day.
Great website btw. The onboarding was very pleasing
I recently started making videos for a loved one that lives far away, I started using CapCut and this is the kind of thing I was thinking "I wish it did that".
We use Cardboard at Vulnetic and it is an incredible product. The founders are easily accessible, and it has definitely made it easier to film feature update videos. I can't recommend them enough.
Excited to see AI integrations move beyond text-related applications (coding, spreadsheets, proofreading, etc.). As someone who only occasionally needs to edit videos for product / feature reels, I'd happily ask an AI to "sync the narration to the video, cut away irrelevant footage, and add transitions". The convenience of being able to automate simple, repeatable tasks in creative software via AI is something that gets overshadowed a lot by the agentic coding discussions. I can only imagine the nightmare it would be for a tool like Premiere to integrate effective AI features, so new AI-in-mind tools really feel like a necessity.
you understood well what we are building. non-text domains certainly have additional challenges and we're working on making it reliable without a learning curve.
also, appreciate the kind words on the site — give Cardboard a spin next time you need a product reel!
Really impressive work guys! It seems like YC has funded a few companies attacking this but I think you all might have the best approach so far. Behind the scenes is the agent just editing using text/annotated timelines? I feel like the move is probably text for roughcut/narrative, then a vlm for digesting the initial roughcut, then adding broll and fixing timing issues. Feel free to steal my FCP xml generator. https://github.com/barefootford/buttercut
happy that you liked our approach! also, i think it's a better idea to just give the agent these tools and let it figure out its course of action than to give it a specific workflow to work on - it seems like the world keeps reminding us of the bitter lesson [http://www.incompleteideas.net/IncIdeas/BitterLesson.html] more frequently these days
Theoretically I agree, but practically without guidance agents aren't really able to edit video ATM. Without hand holding Claude will just call ffmpeg and look at a few frames.
Totally fair reaction! Here's our honest thinking behind it.
We deliberately avoided credits/usage-based pricing because as founders using this in our own creative workflow, we hate the cognitive load that comes with it.
If I don't like a voiceover/variation, I should have the freedom to regenerate it until I'm happy without thinking about whether it's "worth" a credit.
That said, we could be wrong! Genuinely curious what you think would feel fair?
> We built a custom hardware-accelerated renderer on WebCodecs / WebGL2, there’s no server-side rendering, no plugins, everything runs in your browser (client-side).
Wow! congrats on the launch guys. client-side rendering is incredible, really. I saw your product somewhere and have it as an open tab in my chrome for ~2 weeks :D
I also saw another YC company, Mosaic, doing something similar. But your approach of chat-based editing is a lot closer to what I'm building.
Shameless plug: I'm also working on a chat-based media processor. https://chatoctopus.com
But you guys are way ahead! will be looking at you for inspiration.
Love this idea! I built something similar last year https://www.usecrossfade.com and know how difficult this is to get right - I'm rooting for you guys!
Thanks! Yeah, it can quickly spiral into this massive product when you take video editing, which has a base level of features you sort of expect, and add on a whole new paradigm like AI assistance. But really like your approach!
We made a deliberate decision to go client-first. Video editing happens entirely in your browser without us uploading your entire footage on our end. No bandwidth costs for you, no storing your raw video on our servers. The File System Access API is what makes that possible, and unfortunately Firefox just doesn't have it yet.
It's not a forever thing though. For cloud-based projects where files live on our end anyway, Firefox support is very much on the roadmap. But for the local-first editing flow, our hands are a bit tied until Mozilla ships it.
Hope that makes sense, and fingers crossed Firefox adds support soon!
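For anyone curious what that browser check can look like, here is a minimal sketch, not Cardboard's actual code. It assumes the only signal needed is whether the global scope exposes `showOpenFilePicker` (the File System Access API entry point that Chromium-based browsers ship); `ScopeLike`, `supportsLocalFileAccess`, and `pickStorageMode` are hypothetical names made up for illustration.

```typescript
// Hypothetical helper types and names, for illustration only.
// A "scope" is any object standing in for the browser's global object.
type ScopeLike = { [key: string]: unknown };

// True when the scope exposes the File System Access API file picker.
function supportsLocalFileAccess(scope: ScopeLike): boolean {
  return typeof scope["showOpenFilePicker"] === "function";
}

// Assumed policy: edit local files when the API exists, otherwise fall back
// to a cloud-backed project where files are uploaded instead.
function pickStorageMode(scope: ScopeLike): "local-first" | "cloud" {
  return supportsLocalFileAccess(scope) ? "local-first" : "cloud";
}

// In a browser this would be called roughly as:
//   pickStorageMode(globalThis as unknown as ScopeLike)
```

Passing the scope in as a parameter keeps the check testable outside a browser. A real app would probably also want to handle the user denying the permission prompt.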
I think you should consider putting this information in your site. I always read "we don't support Firefox" as "we are lazy", but that's not always the case.
Helpful for those who care less about the craft and more about a quick outcome. Werner Herzog said that he watches his footage a few times, takes extensive notes then edits based on his notes. That's how he crafts such extraordinary, once-in-a-lifetime stories. But for those who are working on commercial or home movies, why not use AI to build a narrative? It can be like throwing dice and the outcome could be OK. Maybe even good.
Regardless, having a tool that knows the content of your footage is a huge time saver. Good luck with the product.
I totally resonate with you. Craft takes time, and that's completely valid. We're not focused on filmmakers right now, though we'd love to have them eventually.
That's also why we built a full editor alongside the agentic experience. Use AI where it helps, like finding the right shot or removing silences, and do the rest manually. And if you'd rather finish in your editor of choice, we support XML export for Premiere, DaVinci, etc.
And agreed, there's really no substitute for the kind of intentionality Herzog brings to his work :)
We've played around with this and honestly have a lot of respect for what the Remotion team has built. Fun fact, I tinkered with it back in 2021 when they made those GitHub Wrapped videos, it was one of those projects that made me think differently about video on the web :)
Cardboard is a bit different though, aimed at non-developers who want to edit raw footage through natural language without writing any code. Motion graphics is on the roadmap and Remotion would hopefully be a natural fit when we get there.
Cool to see the space evolving from so many directions! :)
As a professional video editor (short-form and feature films) I've always thought realtime collaboration on a timeline makes no sense. Editors' decisions can be mutually destructive / conceptually incompatible.
Fair point. What we mean by collaboration is closer to how Figma works. From our user interviews, video creation almost always involves multiple people but in different ways: screenwriters, marketers, designers, directors reviewing the edit and sharing feedback.
The value might not be co-editing the timeline, it's making the feedback / iteration loops faster.
For your example videos that you made with Cardboard: can you also put up the raw material that went into those videos? Just looking at the output doesn't tell me anything. Thanks!
Sure! Will share the raw material for all the videos.
For some of the examples we shared though, we've created sample projects right within the product itself. They contain the raw assets and the exact prompts used to create the videos. You can try them out directly at https://demo.usecardboard.com and see the whole process!
My co-founder and I met in high school, and we wanted the name to carry a sense of craft. Cardboard was always that material in school projects that was firm enough to hold structure but malleable enough to build almost anything out of. That balance of structure and flexibility felt like a good metaphor for what we're building.
Also we just thought it was a cool name and bought a bunch of domains... https://cardboard.mov is one of my favorites :)
I'm currently building something in the generative AI space and am struggling with pricing. With your fixed price monthly plans, how do you deal with power users who might be blowing through more than $60/month worth of tokens? Do you eat the cost and hope the margins average out? Or have you optimized enough where that's not really a concern?
Very well-executed version of this. I think this is the right interface for video editing going into the future.
I've spent a bit of time on something related, AI-generating motion graphics videos from code, also editable/renderable in-browser. Here's a few things I ran into:
- I see you mentioned being aware of Remotion in another comment, in my experience Remotion is not the right tool for adding motion graphics to what you're building. There's a few reasons for this, but basically declarative markup is not a great language for motion graphics beyond anything very basic. Also, in-browser rendering is only going to work with canvas-based components. I also wasn't a huge fan of their license.
- WebCodecs may not be as reliable as you think. I've verified several issues where I get a different output across browsers and operating systems, and even different permutations of flags, browser and OS. Is there a reason why your tool needs to be browser-based?
- On Remotion, yeah, not sure it's the right fit, but honestly the sheer capability of models at writing code these days has surprised me. Funnily enough, this is how I used to make small graphics for videos 2-3 years back when I knew nothing about After Effects.
We've been eager to experiment with this for a while, just have to prioritize other user requests for now. Will definitely try a few approaches and see what sticks. (Also noticed they have an experimental client-side rendering version built on mediabunny, haven't tried it yet: https://www.remotion.dev/docs/client-side-rendering/)
- On WebCodecs, there are a fair set of challenges, but we wanted to take the bet. The reason we're browser-based is the same reason I love Figma and Google Docs: no install, no waiting, just open and start. That said, for broader codec support (ProRes, RAW, etc.) we'll rely on server-side transcoding with proxies where needed.
> On Remotion, yeah, not sure it's the right fit, but honestly the sheer capability of models at writing code these days has surprised me.
Just to clarify I still think code-driven graphics is the correct approach, but in my case I opted for a different library with a more powerful imperative API.
> Also noticed they have an experimental client-side rendering version built on mediabunny
Yes, I've tried it out, it was a non-starter for me because it only supports canvas-based components, and Remotion didn't seem to have good support for text on canvas because they rely on HTML for most of that.
> On WebCodecs, there are a fair set of challenges, but we wanted to take the bet
Totally understand the appeal and immediacy of a browser app, I was lured in by that too. For what it's worth I've reported showstopping WebCodecs issues in Chromium and there's basically no indication they'll get fixed on a predictable timeline.
Another issue I ran into that I just remembered is animating text on canvas. It's basically impossible to get pixel-perfect anti-aliased text animation using a canvas. I would have to dig up the exact details but it was something to do with how browsers handle sub-pixel positioning for canvas text, so there was always some jitter when animating. This coupled with the aforementioned WebCodecs issues led me to conclude that professional-quality video rendering is not currently possible in the browser environment. Aliasing, jitter and artifacts are immediately perceptible and are the type of thing that users have zero tolerance for (speaking from experience).
This is not meant to be discouraging in any way, I've just been very deep into this rabbithole and there are some very nasty well-hidden pitfalls.
> Totally understand the appeal and immediacy of a browser app, I was lured in by that too. For what it's worth I've reported showstopping WebCodecs issues in Chromium and there's basically no indication they'll get fixed on a predictable timeline.
Interestingly I have the exact opposite experience, I've reported issues both in the WebCodecs specification and the Chromium implementation, in all cases they were fixed within weeks. Simply through reports on public bug trackers and it wasn't really a major issue in any instance.
> Another issue I ran into that I just remembered is animating text on canvas. It's basically impossible to get pixel-perfect anti-aliased text animation using a canvas. I would have to dig up the exact details but it was something to do with how browsers handle sub-pixel positioning for canvas text, so there was always some jitter when animating. This coupled with the aforementioned WebCodecs issues led me to conclude that professional-quality video rendering is not currently possible in the browser environment. Aliasing, jitter and artifacts are immediately perceptible and are the type of thing that users have zero tolerance for (speaking from experience).
We're doing SOTA quality video rendering with WebCodecs + Chromium with millions of videos produced daily, or near SOTA if you consider subpixel AA a requirement for text. In general for pixel perfection of text, especially across different browsers and operating systems, you can't just use text elements in DOM or in canvas context, instead text needs to be rasterized to vector shapes and rendered as such.
Honestly not sure about potential jittering when animating text, but we've never had any complaints about anything regarding text animations and users are very often comparing our video exports with videos produced in Adobe AE or similar.
> Interestingly I have the exact opposite experience, I've reported issues both in the WebCodecs specification and the Chromium implementation, in all cases they were fixed within weeks. Simply through reports on public bug trackers and it wasn't really a major issue in any instance.
That's fair, they are responsive most of the time. I do have one major rendering issue in particular I've been waiting on with no movement for months, so I might be biased.
> We're doing SOTA quality video rendering with WebCodecs + Chromium with millions of videos produced daily, or near SOTA if you consider subpixel AA a requirement for text. In general for pixel perfection of text, especially across different browsers and operating systems, you can't just use text elements in DOM or in canvas context, instead text needs to be rasterized to vector shapes and rendered as such. Honestly not sure about potential jittering when animating text, but we've never had any complaints about anything regarding text animations and users are very often comparing our video exports with videos produced in Adobe AE or similar.
So you use a library that takes in text and vectorizes it to canvas shapes? That could work in theory, do you have a demo of this?
> So you use a library that takes in text and vectorizes it to canvas shapes? That could work in theory, do you have a demo of this?
Yea, it's harfbuzz compiled to WASM: https://harfbuzz.github.io/harfbuzzjs/
Then all text layout features must be implemented on top of it, like linebreaking, text align, line spacing, kerning, text direction, decoration etc.
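To give a feel for the "implemented on top of it" part, here is a rough sketch of one such feature, greedy line breaking. It is not the commenter's implementation. It assumes the shaper has already produced a total advance width per word; the `ShapedWord` type and `breakLines` function are hypothetical and do not call harfbuzzjs at all.

```typescript
// Hypothetical output of a shaping step: a word and its total advance width
// in pixels. A real pipeline would get these numbers from the shaper.
interface ShapedWord {
  text: string;
  advance: number;
}

// Greedy line breaking: keep adding words to the current line until the next
// word (plus one space) would exceed maxWidth, then start a new line.
// A single word wider than maxWidth is placed on its own line, unbroken.
function breakLines(words: ShapedWord[], maxWidth: number, spaceAdvance: number): string[] {
  const lines: string[] = [];
  let current: string[] = [];
  let width = 0;
  for (const word of words) {
    const extra = current.length === 0 ? word.advance : spaceAdvance + word.advance;
    if (current.length > 0 && width + extra > maxWidth) {
      lines.push(current.join(" "));
      current = [word.text];
      width = word.advance;
    } else {
      current.push(word.text);
      width += extra;
    }
  }
  if (current.length > 0) {
    lines.push(current.join(" "));
  }
  return lines;
}
```

Alignment, line spacing, bidirectional text, and decoration would each need similar logic layered on the same measurements, which is why this route is a lot more work than drawing text straight to a canvas.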
There is an undo button — it's on the bottom right of each user message in the chat. That said, sounds like it wasn't obvious enough, so I'll rethink the UX there for sure!
The 'ask more questions upfront' fix is basically a planning phase wearing different clothes. The real challenge isn't tool routing, it's verification - knowing whether the edit was actually good without needing a human in the loop. Text agents get away with cheap regeneration. Video quality feedback is expensive and the agent has no natural signal for when it's gone wrong.
Really impressive execution on the agentic workflow architecture. The challenge you mentioned about "asking more questions upfront" instead of rigid workflows resonates deeply from building production AI agents. The key insight is that agentic systems work best when they have rich context about user intent rather than trying to guess from minimal input. Video editing is particularly challenging because the feedback loops are expensive (unlike text where you can regenerate cheaply), so getting the planning phase right is critical. Your approach of treating it like distributed systems with proper error handling and recovery makes complete sense. Looking forward to seeing how you handle the "verification problem" - knowing when the agent made the right creative decisions without human review.
Very cool! A noob question about how models handle video: do you do everything via sending frames as images to the model at some framerate? Are there tricks to avoid what it seems like would be massive token use from this approach?
calebm | a day ago
[OP] sxmawl | a day ago
rd | a day ago
[OP] sxmawl | a day ago
target customers usually fall under one of these - marketers / creators / founders
moralestapia | a day ago
I'll definitely try it out. Congrats!
[OP] sxmawl | a day ago
lmk if i can help in any way :)
deklesen | a day ago
[OP] sxmawl | a day ago
danieltk76 | a day ago
[OP] sxmawl | a day ago
RobotToaster | a day ago
[OP] sxmawl | a day ago
for now, an intermediate solution is to splice and upload.
jimmis | a day ago
Great website and good luck!
[OP] sxmawl | a day ago
barefootford | a day ago
[OP] sxmawl | a day ago
will definitely check the XML exports, ty :)
barefootford | 20 hours ago
[OP] sxmawl | 19 hours ago
TimCTRL | a day ago
ishandeveloper | 22 hours ago
TimCTRL | 15 hours ago
joshribakoff | a day ago
[OP] sxmawl | a day ago
ishandeveloper | a day ago
WaylonKenning | a day ago
Cardboard looks really well polished, well done!
[OP] sxmawl | a day ago
jhatemyjob | a day ago
Aight imma head out. Holy moly.
[OP] sxmawl | a day ago
moinism | a day ago
[OP] sxmawl | a day ago
and ig it's time to revisit that chrome tab :)
adboio | a day ago
[OP] sxmawl | a day ago
michaelevensen | a day ago
ishandeveloper | a day ago
michaelevensen | a day ago
popalchemist | a day ago
[OP] sxmawl | 23 hours ago
newbeeguy | 23 hours ago
ishandeveloper | 23 hours ago
The short answer: Firefox doesn't support the File System Access API (https://caniuse.com/?search=File+System+Access+API).
nzach | 8 hours ago
telesilla | a day ago
ishandeveloper | 22 hours ago
flyingcircus3 | a day ago
https://www.remotion.dev/docs/ai/claude-code
ishandeveloper | 22 hours ago
popalchemist | 23 hours ago
ishandeveloper | 22 hours ago
1024core | 23 hours ago
ishandeveloper | 22 hours ago
regus | 23 hours ago
ishandeveloper | 22 hours ago
hbardigital | 19 hours ago
[OP] sxmawl | 18 hours ago
vivzkestrel | 18 hours ago
- https://news.ycombinator.com/item?id=42806616
- https://news.ycombinator.com/item?id=45980760
- https://news.ycombinator.com/item?id=46759180
- https://github.com/saurav-shakya/Video-AI-Agent
- going to be rather tough to differentiate
[OP] sxmawl | 2 hours ago
njoyablpnting | 18 hours ago
ishandeveloper | 16 hours ago
njoyablpnting | 15 hours ago
spuzvabob | 12 hours ago
njoyablpnting | 9 hours ago
spuzvabob | 9 hours ago
dandaka | 10 hours ago
would you mind sharing the name?
njoyablpnting | 9 hours ago
It's not really designed for the animation code to be dynamically changed on the fly, but I've hacked together this feature in my fork.
amanfromearth | 15 hours ago
dockerd | 15 hours ago
I played around on a sample video and it worked great. I wanted to undo one AI edit but couldn't find whether there is an undo button.
ishandeveloper | 14 hours ago
teodosin | 15 hours ago
ishandeveloper | 14 hours ago
vishalontheline | 13 hours ago
I would like to:
- upload a bunch of surf footage
- let it sort through the surfers
- pick the three longest waves surfed by each surfer
- create a montage grouped by surfer, ordered by shortest to longest wave for that surfer.
Thank you!
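The selection part of that request is easy to pin down once the waves are detected. Below is a small sketch of it, assuming some earlier step has already labeled each ridden wave with a surfer and start/end times (that detection is the hard part and is not shown). The `Wave` type and `montageOrder` function are hypothetical names for illustration.

```typescript
// Hypothetical record for one detected wave: who rode it and when, in seconds.
interface Wave {
  surfer: string;
  start: number;
  end: number;
}

// For each surfer, keep the `perSurfer` longest waves, then order those from
// shortest to longest. Surfers appear in the order they first show up.
function montageOrder(waves: Wave[], perSurfer: number = 3): Wave[] {
  const duration = (w: Wave): number => w.end - w.start;
  const bySurfer: { [name: string]: Wave[] } = {};
  const surferOrder: string[] = [];
  for (const wave of waves) {
    if (!bySurfer[wave.surfer]) {
      bySurfer[wave.surfer] = [];
      surferOrder.push(wave.surfer);
    }
    bySurfer[wave.surfer].push(wave);
  }
  let montage: Wave[] = [];
  for (const name of surferOrder) {
    const longest = bySurfer[name]
      .slice()
      .sort((a, b) => duration(b) - duration(a))
      .slice(0, perSurfer);
    longest.sort((a, b) => duration(a) - duration(b));
    montage = montage.concat(longest);
  }
  return montage;
}
```

The returned list is the clip order for the timeline: grouped by surfer, each group building up to that surfer's longest ride.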
[OP] sxmawl | 4 hours ago
i think it'd do a good job at it.
jamiecode | 12 hours ago
welovegreen | 9 hours ago
hal9000xbot | 9 hours ago
danenania | 5 hours ago
atentaten | 4 hours ago
stuckkeys | 2 hours ago