There are tons of behind the scenes pictures and video of the Rocky puppet being used on set, and Andy Weir talks in interviews about how almost no CG was used to enhance the puppet. I guess it's possible to fake all that, but it's a lot of lie to cover up.
Andy Weir is a wonderful novelist and was truthfully relating his understanding but he's not a VFX person.
I didn't see the quote you did but he probably confused the fact that PHM used physical elements in place of some CGI in certain scenes and the separate fact that a realistic physical puppet was used on set for reference. Some parts of that puppet are seen on-screen in some shots but most of the creature in most shots was CGI or CG enhanced (which looked great thanks to the ideal in-camera puppet reference it replaced). I explained more here: https://news.ycombinator.com/item?id=48198851
I agree PHM was great (and I loved the book before the movie). But speaking as a VFX person, please be careful not to buy into the currently popular studio PR line: "it's all real, almost no CGI". Media and influencers love this line and often unknowingly muddle the studio's very carefully crafted press release wording into outright lies by paraphrasing and making assumptions. The problem is these aren't just white lies, they deprive some very talented VFX artists of credit for amazing work.
About the misunderstood puppet: A real Rocky puppet was indeed used on set (actually a few different puppets) and some of the puppet is sometimes seen on camera. But most of the puppet was digitally replaced or enhanced in most of the scenes. Using a much more realistic puppet on set is indeed notable, but not because the character wasn't CGI. The puppet is worth talking about because it directly enabled the final mostly-CGI character to be really good CGI. It's good because shooting the physical puppet gave the VFX character animators an ideal reference that's "grounded" in the physical reality of the set, camera and lens. The subtle interplay of light, shadow, texture and specularity in the CGI is all grounded in reality. The puppet also let the actor interact with something closer to reality. It's a wonderful technique and should be celebrated instead of obfuscated to promote a falsehood that trends well on social media.
Also, PHM did use real sets (like most movies) and they were able to avoid using green screen for some of the ship exteriors but those backgrounds were still digitally replaced with CGI rendered elements, they just didn't use green screen to pull the matte. But on social media, "No green screen" (true) was conflated into "No CGI" (false). Instead of green screen they used a black backdrop with careful lighting and some hand rotoscoping to extract the digital mattes. Doing it this way had the advantage of not needing to digitally remove green spill on reflective surfaces by hand and it saved money over doing a StageCraft virtual volume at that size. Done well, a green screen could have produced the exact same shot but it would have cost more and taken longer.
But influencers and media are unintentionally perpetuating "No CGI" myths instead of focusing on the actually interesting, more nuanced reality. Using more and better physically grounded references on-set IS a breakthrough that helps turn bad CGI into great CGI. Another good example is Top Gun where "careful wording" in studio press releases grew into outright falsehoods online. Tom Cruise truthfully said in interviews that he was flown in a jet right alongside other REAL jets doing simulated dogfighting. The lost nuance is that all the other jets Cruise flew with in those dogfight scenes were old Soviet trainer jets that look quite different and are much smaller than real MiGs. So the trainer jets were entirely replaced by CGI MiGs in post and are never seen in the final film. And we couldn't tell because the digitally removed jets provided ideal grounded reference for the CGI pixels that replaced them. And that's how we ended up with several famous YouTubers proclaiming "These are REAL jets, not CGI!" while showing 100% CGI jets. Same with Wicked and the CGI tulips. The fact that Wicked used thousands of specially grown tulips on-set (true) was garbled into "ALL these tulips are real, no CGI!" (false) while showing a scene where >90% of the tulips were CGI.
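For anyone curious what pulling a matte off a black backdrop can look like in principle, here is a minimal sketch of a luma key in plain Python. It is an illustration under assumptions, not a description of what the production did: the comment above doesn't say which keying method was used, and the names and thresholds (`luma_matte`, `composite`, `low`, `high`) are made up for the example.

```python
def luma(pixel):
    """Rec. 709 luma of an (r, g, b) pixel with channels in [0, 1]."""
    r, g, b = pixel
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def luma_matte(image, low=0.05, high=0.25):
    """Build an alpha matte from brightness alone.

    Pixels at or below `low` are treated as black backdrop (alpha 0),
    pixels at or above `high` as foreground (alpha 1), with a linear
    ramp in between. Both thresholds are hypothetical values.
    """
    matte = []
    for row in image:
        matte_row = []
        for pixel in row:
            t = (luma(pixel) - low) / (high - low)
            matte_row.append(min(1.0, max(0.0, t)))
        matte.append(matte_row)
    return matte


def composite(foreground, background, matte):
    """Blend per pixel and per channel: out = a * fg + (1 - a) * bg."""
    out = []
    for fg_row, bg_row, a_row in zip(foreground, background, matte):
        out.append([
            tuple(a * f + (1.0 - a) * b for f, b in zip(fg, bg))
            for fg, bg, a in zip(fg_row, bg_row, a_row)
        ])
    return out
```

A key like this also drops any dark foreground pixels, such as deep shadows, so brightness alone is rarely enough. That is the kind of gap hand rotoscoping fills.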
AI is already in a bunch of creative workflows. Just look at modern Photoshop. Select something, hit delete, and AI infill replaces the background.
Creatives can use these video gen AIs in various ways. There are some YouTube channels of people using these in creative workflows that are really impressive, from mocap replacement, character insertion, background replacement, changing camera angle in post, animating/inserting characters from character boards, animating between stills generated with traditional methods, etc. It's not just "prompt and generate". It can be, because it's easy, but it also doesn't have to be. It's a tool.
i do photo restoration as part of my research (bizarre place to be for a math undergrad), so i do think AI is a lifesaver for very small adjustments that would be tedious or subpar otherwise. i just disagree that its creative output is of value (which isn't the case you made, anyway).
I do wonder how studios are working around consistent human faces, it's a problem on almost every discussion forum I have read for AI videos and not something that seems to be solved yet.
Do you have any examples of those creative workflows that have made it into Hollywood for example?
So it's really good, and we have reason never again to believe anything that happens in a video. Unless there's a super-product somewhere to authenticate footage?
It seems like this super-product will have to be a thing soon or we will have to just stop using video evidence in court and other critical applications.
At first usage I'm not impressed. I've probably spent a couple grand on Seedance 2 to date, and I can't find anything google omni flash does better than Seedance from running a handful of samples through the system. You can find some of the videos I've made in my HN bio link.
I have exactly the same thought. Anyone who has used Seedance 2.0 a bit can tell Gemini is a bit behind, and Seedance 2.1 is on the horizon already.
Back in the 90s, during the first wave of the desktop video revolution when desktop editing became possible and consumer camcorders got pretty good, there was a popular marketing slogan: "Now your imagination is the only limit."
I used to joke that was the moment we discovered "for most people that's a pretty big limit."
> Prompt: A skeuomorphism stop motion explainer about how the brain hippocampus works with a compelling voiceover. Don’t add seahorses. No voice cuts at the end. Don’t add text
And the fact that a transformer model can't distinguish between the two in the context of the sentence given is a point against the general nature of the intelligence.
Yes, if you watch the video closely you can see that the "lensing" effect only really covers a circular area—this prompt probably went through multiple iterations where the author was trying to improve it so that the shape of the hand was reflected more closely.
I mean if we're just blasting past our climate tipping points anyhow, why not just actively dump entire lakes' worth of water out for people to post slop for clout, right?
May as well power off the whole grid now and have the Amish start teaching us how to survive
I'm an AI optimist. But AI video is probably the one thing that does depress me. Seeing that we can make anything visually, there's nothing that impresses me visually. I watch a video that two years ago I would've thought was really cool, and now my first thought is, "Yawn, is this AI?".
Video, more than anything else, is the place where I really care if something is AI or not. If I could get a TikTok that had no AI usage -- I'd be in. Which is weird for me, because I'm typically the guy who is all-in on AI.
Yeah, I'm kinda sad about that one. Most of my friends and family are aware many of these are fake now, but argue that it still invokes the same response in us so it's okay. For me, though, however intangible or irrational it may be, I do feel a sense of loss.
Funny enough, this is actually one of the few things which has bothered me with the AI boom, and I'm mostly pro-acceleration. A lot of what's happening seems inevitable. But surprisingly, knowing that cat or dog or bird or lizard or butterfly or whatever has a strong chance of being generated really does take something out of it to my mind. And I say that also knowing the extreme amount of staging which has long gone on with traditional nature videography. Somehow, knowing the animal is real means something... I'm still trying to figure out how to better understand and express this.
I tried to watch it, but TikTok kept throwing up a dialog over top asking me to slide a puzzle piece into place. I did three or four before just closing it.
I think the opposite. It allows more people to be creative. Similar to how the DAW allowed more people to become musicians. You can produce a hit song with just a laptop now.
Now you can have people producing videos without needing a crew of people.
For a few weeks, YouTube thought I wanted to see videos of package thieves being surprised by a booby-trapped box that was actually a glitter bomb. Video after video were these AI created shorts of supposed doorbell camera footage showing a thief running away with a box that explodes into a giant pink cloud.
I eventually picked one and opened the comments and the top comment was something like "This is obviously an AI video. Who watches this?" and the reply was along the lines of "me because I like seeing thieves get what's coming to them".
So you, like me, aren't interested in AI videos but I think there's a lot of people who don't care if it's real or not.
Thankfully, YouTube eventually stopped showing those to me. Now it thinks I'm interested in road rage videos. My YouTube feed outside of the three or four channels I've subscribed to is terrible.
> and the reply was along the lines of "me because I like seeing thieves get what's coming to them".
I really wish a subject matter expert would pitch in to tell us what this is about?
like a totally made up thing that is fake, somehow gives a sense of justice and satisfaction?
is it something about imagining it happening in reality, or what?
for me, if I see that something is AI, it's like I just feel nothing. because there's nothing in it, it has nothing of real value? like it doesn't evoke anything in me, it doesn't make me think "this was a great find!" or make me want to send a link over to my friends, etc.
Do you ever feel a sense of satisfaction watching a movie? I'm thinking of scenarios like when the bad guy is finally defeated or the hero achieves their goal.
You get back as much as you put in. Just like with all generative tools, the quality of the output depends on the quality of the input. Slapping a prompt together will only get you so far; if you want the models to generate something really striking and unique you need to get your hands dirty. Gotta break out ComfyUI and build yourself a specific workflow. Once you dig deep and understand how things are put together and why, you can make really amazing stuff with any generative model. But you have to pay for that experience in patience and knowledge.
In my day job I program rigid body behaviour in real time amongst other simulations.
I think rigid body contact is hard to learn as it is inherently discontinuous, something you discover when trying to code a solver.
As such I always use this prompt as a test:
"A video of a jenga brick tower falling over as a brick is removed. The physics of each brick must be realistic."
It gave me a video where bricks suddenly disappear or morph into others[1]. The linked video is after 2-3 iterations of me insisting on realistic physics. If you are just glancing at this, you would believe it is realistic.
That said, this is still very impressive and one more step towards... IDK what. But I am a bit reassured that at least my job won't be fully replaced with AI :)
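To make the "inherently discontinuous" point concrete, here is a toy sketch in Python: a point mass dropped onto a floor, stepped with semi-implicit Euler. It is a deliberately simplified assumption of what a solver deals with (one body, one contact, no friction or rotation), and the function names and parameter values are hypothetical.

```python
def simulate_bounce(height=1.0, restitution=0.5, dt=0.001, steps=2000, g=9.81):
    """Drop a point mass onto a floor at y = 0; return a list of (time, y, velocity)."""
    y, v = height, 0.0
    history = []
    for i in range(steps):
        v -= g * dt            # gravity changes velocity smoothly, by g * dt per step
        y += v * dt            # semi-implicit Euler position update
        if y < 0.0 and v < 0.0:
            y = 0.0            # project the body back onto the floor
            v = -restitution * v   # contact impulse: velocity flips sign in a single step
        history.append((i * dt, y, v))
    return history


def largest_velocity_jump(history):
    """Largest change in velocity between two consecutive steps."""
    return max(abs(b[2] - a[2]) for a, b in zip(history, history[1:]))
```

In free flight the velocity changes by `g * dt` per step (about 0.01 m/s here), while at the first impact it jumps by several m/s in one step. That jump is the discontinuity, and a real solver has to handle many of them at once across stacked bricks.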
Such videos are essentially dreams: how it feels that the planks should move, not what equations of rigid body physics would compute. And the feeling is realistic (even if overly dramatic in the end). If "stylistic transfer" works for static pictures spread out in space, why won't it work for the character of motion spread out in time?
Does anyone else feel like Google is just always a dollar short and a day late here? Maybe not a dollar short, but it's like they've consistently been focused on the wrong thing. First they missed chatbots, now they're missing coding agents while they double down on chatbots and video gen (which OpenAI has already basically abandoned). Maybe this strategy is actually genius and I'm too stupid to grasp it.
While at a cursory glance it looks as impressive as always, subtle spatial errors and geometry that changes as it goes out of sight and comes back again hint at the fact that Google has yet to solve the problem of deep spatial understanding.
Which, considering just how pretty and detailed this whole thing looks, imo points at a fundamental issue with how these things are trained. It's as if there's no structure to its knowledge and training. An artist learning to draw would first try to understand simple 2D composition, then perspective, then light and shadow, mastering each concept and gradually building up a hierarchical understanding, whereas this seems like it's trying to learn everything at once.
I would rather see an AI model that I could give a floorplan of a building and it would generate an accurate flythrough on any path, even if it looked like butt.
I'm not just talking out of my arse. I did work for a while in data science/engineering, and one of the big lessons people needed to be reminded of is to clean/downsample the data: a dataset consisting of a million samples could very well take 1000x as long to process as the same data downsampled to just a couple of thousand samples, and we could reach the same conclusions with a fraction of the time/effort.
I'm sure there's a similar logic in RL, that if you dump a trillion samples into the datacenter that consumes the same power as a city, what the model learns is what it could've learned with a much more curated training set and directed approaches.
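The downsampling point can be shown with a toy example in Python. This is a sketch on synthetic, made-up data (noisy values around 5.0) for a single aggregate statistic, so it illustrates the data-science case only and says nothing directly about model training. The function names are hypothetical.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def compare_full_vs_downsampled(n_full=1_000_000, n_sample=2_000, seed=0):
    """Return (mean of the full dataset, mean of a random subsample)."""
    rng = random.Random(seed)
    # Synthetic data: Gaussian noise with standard deviation 2.0 around a true value of 5.0.
    full = [rng.gauss(5.0, 2.0) for _ in range(n_full)]
    # Uniform random subsample without replacement.
    sample = rng.sample(full, n_sample)
    return mean(full), mean(sample)


if __name__ == "__main__":
    full_mean, sample_mean = compare_full_vs_downsampled()
    print(f"full: {full_mean:.3f}  subsample: {sample_mean:.3f}")
```

The standard error of a sample mean shrinks like 1/sqrt(n), so with a standard deviation of 2.0 the 2,000-sample estimate is already good to roughly 0.04, using 500x less data. Whether a subsample is enough depends on the question being asked; rare events and fine-grained breakdowns need more.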
clapthewind | 3 hours ago
advisedwang | 3 hours ago
boredhedgehog | 31 minutes ago
franze | 3 hours ago
andrewstuart | 2 hours ago
This tech won’t change anything.
mrandish | 2 hours ago
nomel | 2 hours ago
Insanity | an hour ago
mrandish | 22 minutes ago
tencentshill | an hour ago
dymk | 39 minutes ago
mrandish | 14 minutes ago
mrandish | an hour ago
senko | 30 minutes ago
wcxcv | 15 minutes ago
mackeye | 2 hours ago
yojo | 2 hours ago
garciasn | 2 hours ago
mackeye | 2 hours ago
nomel | 2 hours ago
mackeye | 2 hours ago
CommanderData | an hour ago
raincole | 2 hours ago
[0] e.g. Don't Look Up
senko | 31 minutes ago
drusepth | 27 minutes ago
[OP] meetpateltech | 3 hours ago
model card: https://deepmind.google/models/model-cards/gemini-omni-flash...
franze | 3 hours ago
I did not create any videos yet.
Google, building great AI that nobody can try out.
But thx for the press release.
andrewstuart | 2 hours ago
tristanb | an hour ago
throw03172019 | 3 hours ago
nicce | 2 hours ago
SoKamil | 2 hours ago
Foomf | 2 hours ago
zarzavat | 2 hours ago
fuzzy2 | an hour ago
SyneRyder | an hour ago
dsign | 3 hours ago
svieira | an hour ago
https://blog.google/innovation-and-ai/products/identifying-a...
(and the previous SynthID: https://deepmind.google/blog/identifying-ai-generated-images...)
But it very much is "close the barn door after the horse has bolted and the barn has otherwise burned down".
spogbiper | an hour ago
adenta | 3 hours ago
kamranjon | 2 hours ago
layer8 | 2 hours ago
adenta | 2 hours ago
gowld | an hour ago
wcxcv | 18 minutes ago
red2awn | an hour ago
CommanderData | an hour ago
The other problem is Seedance is heavily censored because of copyright concerns.
andrewstuart | 2 hours ago
Certainly not me - you have to be a great artist/designer to even imagine what to do with it.
mrandish | 2 hours ago
enragedcacti | 2 hours ago
There's got to be a reason this is phrased so insanely, right?
layer8 | 2 hours ago
bar94 | an hour ago
> Prompt: A skeuomorphism stop motion explainer about how the brain hippocampus works with a compelling voiceover. Don’t add seahorses. No voice cuts at the end. Don’t add text
Seahorses???
gfaure | an hour ago
incognito124 | an hour ago
svieira | 38 minutes ago
nightpool | an hour ago
raincole | 2 hours ago
Oh god...
kordlessagain | 2 hours ago
entropicdrifter | 2 hours ago
kenjackson | 2 hours ago
raincole | 2 hours ago
slfnflctd | 14 minutes ago
sleno | 2 hours ago
criddell | 2 hours ago
impulser_ | 2 hours ago
LetsGetTechnicl | 2 hours ago
criddell | 2 hours ago
baq | an hour ago
criddell | an hour ago
criddell | 2 hours ago
r_lee | 2 hours ago
criddell | an hour ago
kenjackson | an hour ago
nowittyusername | an hour ago
manas96 | 2 hours ago
[1] https://streamable.com/2em1r3
nine_k | 2 hours ago
darkwater | an hour ago
jddj | 51 minutes ago
E-Reverance | an hour ago
I honestly can't comment with certainty that training from videos alone and whatever tokenization scheme they're using will ever get perfect dynamics.
However it is worth noting that transformers can do a pretty good job at learning dynamics with the right pipeline (not video): https://arxiv.org/pdf/2605.15305 https://arxiv.org/pdf/2605.09196
My point here being that representationally, it might be possible to learn good dynamics without a radically different approach/arch. There are already models that extract 3D tracking points from videos, so they could possibly be leveraged for learning dynamics (which on its own gives precedent for end-to-end approaches also possibly working).
christoff12 | 47 minutes ago
staindk | 18 minutes ago
We were sharing game clips with each other and after a while realised our old clips were just gone, being deleted after 30 or 90 days or something.
baq | an hour ago
dwa3592 | an hour ago
uejfiweun | 44 minutes ago
torginus | 9 minutes ago