Google’s New AI Learned To See In The Dark! 🤖

372,159 views

Two Minute Papers

1 day ago

Comments
@Ila_Aras 2 years ago
Very excited for this one... (Going to edit after finishing the video) Edit: Okay, first of all: WOW! How well the denoising technique worked on the dark images... That's insane! I could imagine that technology being used in phones eventually. This paper is one of the most exciting I've seen in the last month at least... Thank you 2MP for this showcase!
@JustJory 2 years ago
What's an edit?
@Goatfooted 2 years ago
@JustJory He's gonna change the comment.
@yarin.suissa 2 years ago
Thanks 2 mega pixels ❤️
@notthatrkelly 2 years ago
Apple has a feature called "Deep Fusion" that uses the data from the multiple frames captured in "Live Photo" mode to do something similar, but that's just the tip of the iceberg. You could theoretically utilize all three cameras at once along with the LiDAR sensor and end up with a bit of NeRF-like 3D movement capability and accurate depth-of-field magic from a single photo.
@leoff2 2 years ago
Am I the only one who thinks this AI voiceover is bad? It has a pause between every other word; you can clearly notice the difference between the intro and the rest of the video (AI).
@kylehessling2679 2 years ago
As a cinematographer and photographer, that depth-of-field technique looks like a cheat code! You could do portraits with a smartphone by taking a couple of photos from different angles, and then you could completely reframe your shots in post-production and even go as far as changing the focus point... that's so incredible.
@SyterPK 2 years ago
I was going to say the same thing! That's what I normally see in the viewfinder; it's so real.
@estebanod 2 years ago
Just imagine how simple it could become. All we would need to do it is a smartphone with two cameras placed horizontally next to each other. Well, we already have phones like that; we just need to implement the AI.
@larion2336 2 years ago
@estebanod I think there are even some with three cameras in a triangle.
@kylehessling2679 2 years ago
@estebanod You could literally set up a shoot with like six Canon T2is on tripods, which can be bought for like 100 bucks apiece, and remote-trigger them simultaneously... that would be so rad.
@circuit10 2 years ago
Modern phones already have depth sensors and they often let you refocus an image you already took like this
@servidig483 2 years ago
Imagine this being implemented in Google Street View in VR: instead of a still image, you could be totally immersed.
@FrotLopOfficial 2 years ago
This would make a world-sized GTA game very possible. Wow.
@d00mfish 2 years ago
@FrotLopOfficial Without even rendering 3D objects: just display the output video frames in an order that matches the player's movement.
@dots223_ 2 years ago
Considering this research is being done by Google, I wouldn't be surprised if this happens within the next couple of years.
@seanlim7940 2 years ago
can imagine the meme potential
@georglehner407 2 years ago
Combine the Google Street View images with images uploaded by users and you have viewable 3D models of any tourist hotspot in the world. We were already impressed by Microsoft Flight Simulator; now imagine the possibilities for that.
@Hookens 2 years ago
The craziest part to me is how it's able to handle light reflections on the glass while moving. Like it's able to tell that the furthest reflection will move faster relative to the closest because of the depth in the glass. What a time to be alive!
@thomasdemharter2153 2 years ago
It has this information though; the process relies on taking multiple images from different angles. The AI model is good at creating the intermediate positions in a stable way (no sudden changes to image elements while the perspective changes). But it has no "understanding" about a reflective surface and that it would need to create a reflection for a certain object on that surface.
@vectoralphaSec 2 years ago
After watching this channel for years and seeing how FAST the improvements in AI have been, in only a matter of months to less than 2 years, it's proof to me that AI will be beyond human capability in 10 years and beyond. I can't even imagine what AI will be capable of in the next 20, 30, and 50 years if these are the types of advancements we get in less than a year.
@be1tube 2 years ago
Consider working on AI alignment. If AI will be superhuman in 10 years, we want to make sure that such a powerful force helps rather than harms our species.
@Niksky2 2 years ago
This is not AI; this is just an algorithm with self-learning. Nothing close to AI.
@bricaaron3978 2 years ago
@Niksky2 Thank you for a dose of sanity.
@johnjack3578 2 years ago
AI is going to peak real soon, and the hopes for the future will crash and become more sober. It's like when people thought we were going to have Mars and Moon colonies 10 years after the Moon landing.
@bricaaron3978 2 years ago
@johnjack3578 For some reason, your comment got censored from the main thread.
@justanotherhotguy 2 years ago
We’re getting closer to a point where movies actually accurately portray AI power.
@lexscarlet 2 years ago
🤣🤣🤣
@David460 2 years ago
Enhance! Enhance! Enhance!!
@theguythatcoment 2 years ago
If you just knew how stupidly simple and limited NeRFs are. We live in the age of the 10,000 very stupidly simple and limited AIs; all of them look great, but none of them have any real computational value, only artistic.
@Jet_Set_Go 2 years ago
@theguythatcoment You say that as if that were a bad thing. The very fact that it is available to so many is a major advance compared to what we had 10 years ago. It seems that to accustomed eyes, progress will never be a thing.
@skeithus 2 years ago
Advance in what over 10 years, AI? It became a thing in 2014; before that, only a couple of companies/people fooled around with neural nets, but no one took them seriously.
@nahuelisimo1197 2 years ago
7:39 I'm truly amazed at how it knows to defocus the piano, but not the reflection on the piano, since the reflection is further away. Mind-blowing
@animowany111 2 years ago
This actually causes issues with current NeRF models, since it effectively treats the reflective piano surface as a hole into the reflected area 'behind' it. There's room for future improvement by actually recognizing flat reflective surfaces (and maybe even curved reflections or even refractive materials) so the neural network doesn't have to model a 'shadow world' on the other side of the mirror.
@SirPembertonS.Crevalius 2 years ago
A few photos becoming a navigable 3D environment is mind blowing. I can't imagine what the future papers will be if they are like this!
@fischX 2 years ago
It's not 3D data AFAIK (yet)
@ahtoshkaa 2 years ago
@fischX Yet! Two more papers down the line, it will be!
@Korn333 2 years ago
@fischX If a neural point cloud is similar enough to a traditional point cloud, then there are already a bunch of different algorithms to translate that into triangle mesh and texture data; we just need the first step of translating it into a traditional point cloud.
@GenericPast 2 years ago
We'll reach a point where AI can take a 2D photo and make it a very convincing 3D render
@meateaw 2 years ago
@fischX We already have the capability to take a video and convert it into a 3D representation using photogrammetry. Plug this into the front of that pipeline, and you can take a couple hundred badly framed, poorly lit photos from a few angles and turn them into a 3D model pretty easily.
@TheJonathanExp 2 years ago
3:48 that's mind blowing! The shallow depth of field looks incredible
@rijaja 2 years ago
"Do not look at where we are, look at where we will be two more papers down the line" You know what? I think I'll keep looking at where we are right now just a couple minutes more because this is mind blowing
@Kram1032 2 years ago
Imagine using this with Monte Carlo-style renders instead. Surely the noisy, lowish-sample output of those would be able to produce quite similar results as well-compressed 5D-or-more NeRFs. That way you could completely separate out rendering and camera parameters. And if you add more dimensions, you could include the lighting setup and animations as well!
@TwoMinutePapers 2 years ago
Hey there Kram, nice to see you around! The noise patterns of Monte Carlo renders are very insidious, but nothing a couple more papers down the line couldn't solve. Here's hoping!
@Kram1032 2 years ago
@TwoMinutePapers Shouldn't Monte Carlo noise be quite similar to camera shot noise? It's not gonna be identical, but it should at least have the similar property of a zero mean, in particular if it's unbiased, which is why you can add together multiple noisy renders to get a cleaner one. The exact same property, if I understand right, is why RawNeRF can work so well with noisy data. So I'd imagine you could "simply" render to a NeRF (rather than to a 2D image) and it'd work just about as well as RawNeRF. Perhaps it'd be possible to use the entire light path, rather than just the end point, in a NeRF, since it stores plenoptic data, so you can easily convert the ray path into plenoptic data, at least in principle. How it'd actually work in practice I don't know.
@Oldman_Gamer2 2 years ago
@Kram1032 I'm no scholar, but one way I can see Monte Carlo noise being different from camera noise is that if a pixel should be bright, the camera raw image will always have a bright pixel: camera noise doesn't darken bright pixels. Monte Carlo noise often produces dark pixels that should be bright. The noise looks similar to humans, but the underlying data is different. And anyway, we have some incredible denoising techniques for rendering that can use information a camera just doesn't have, like normals, pixel depth, unlit textures, etc.
@Kram1032 2 years ago
@Oldman_Gamer2 Yeah, that's true. I think it'd probably take some clever variance reduction techniques to help with that. I'm imagining a combined sampling-and-training here, very much forming the entire NeRF the way a single fixed-angle image is currently formed. It may not work 1:1 with current sampling algorithms, but I suspect it wouldn't actually be that different in the end. Doing it end to end like that would mean you get all that info (normals, depth, textures, what have you) *from any view*. Not entirely sure how to properly do that. Perhaps, instead of rendering from fixed cameras, you basically just take *any* point in space within your sampling volume, shoot a ray from there in *any* direction, and assume it's a virtual camera's path. Perhaps with constraints to stay outside of opaque bodies. And slowly the whole plenoptic volume emerges in the same way a regular render would.
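The zero-mean property this thread keeps coming back to is easy to check numerically. Below is a tiny NumPy sketch with invented numbers (purely illustrative, not from the paper): averaging many independent observations of a pixel corrupted by zero-mean noise converges to the clean value, which is the property that both multi-frame burst denoising and RawNeRF's training on noisy raw frames rely on.

```python
import numpy as np

# Toy illustration of the zero-mean property discussed above: an
# unbiased estimate (zero-mean sensor noise, or an unbiased Monte Carlo
# render) averages out as more independent noisy samples are combined.
rng = np.random.default_rng(0)

clean = 0.5                                        # "true" pixel radiance
noisy = clean + rng.normal(0.0, 0.3, size=10_000)  # zero-mean noise samples

few = noisy[:10].mean()    # averaging 10 noisy frames/samples
many = noisy.mean()        # averaging 10,000 noisy frames/samples

# With overwhelming probability, the 10,000-sample average sits far
# closer to the true value than the 10-sample one does.
print(abs(few - clean), abs(many - clean))
```

The same argument fails once the noise is biased or once a nonlinear step (tone mapping, clipping) is applied before averaging, which is why RawNeRF works in linear raw space.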
@etiennelamole9565 2 years ago
I can only imagine how this could be used in smartphones when taking night pictures. I'd love to see this on Google Pixel phones.
@littlegandhi1199 2 years ago
Huawei is the frontrunner in night-shot camera phones. It's a big reason I bought mine.
@neopabo 2 years ago
@littlegandhi1199 Current Pixels are good for night photos too. If this were included, Huawei would be a joke.
@kamranki 2 years ago
@neopabo It's true. Pixel phones are generally very good for night photos (even the very old Pixel 1 is still fantastic).
@gzy2001 2 years ago
@no AFAIK Huawei has had night mode since the Mate 9, and they updated their algorithm to support handheld exposure with the launch of the P20.
@phozel 2 years ago
@gzy2001 "Huawei introduced the night mode" - funniest quote ever. What is a night mode? 😂 You know nothing about photography.
@veggiet2009 2 years ago
6:25 I'm more impressed by the appearance of the spokes on the bike wheels than I am about the appearance of the car and the bricks behind the fence
@Rebel101 2 years ago
This is an absolutely amazing development! Not sure if people realize its full potential for consumer-grade photography. I want this on my Pixel 6 in the next update!
@jenkem4464 2 years ago
The incredible thing about the results is that they seem like what you'd expect our eyes to perceive in the same situation. So many great papers, too little time!
@jurjenbos228 2 years ago
That may be because there's a neural network coupled to your eyes. It's called the "brain".
@MrHandyBlock 2 years ago
I would love to try and experiment with this for short films. I'm afraid, however, that downloading and running the program is beyond my capabilities. If anyone knows a good tutorial for using this, please let me know! Fantastic video as always, btw!
@moverecursus1337 2 years ago
I can just imagine Google Maps, the movies, games, and experiences of the next decade with this technology. Incredible!
@ndavid42 2 years ago
The CSI "Enhance!" feature is even closer!
@maurizio_italy 2 years ago
Seems blatantly copied from the Blade Runner (1982) "enhance" scene: kzbin.info/www/bejne/nnnam5abe8h5aLM
@thePavuk 2 years ago
As an ex-photographer, my jaw just hit the table. Edit: My second thought was that it would not be very useful for shooting real dynamic movement, but new mirrorless cameras have such high fps that it could work.
@amarug 2 years ago
Technologically, this is unbelievable. But all these new crazy AI methods to produce "art" somehow also make me a bit sad. We always thought burger flippers in fast food chains would be the first to go when the robot revolution comes, but it turns out a big chunk of artists in the TV/cinema/media world are among the first heading for the chopping block. That's kind of a depressing thought. It used to be a wicked skill to replace a background in a photo or so; now it's pushing a button in a free app. And I am an engineer who can't draw a stick figure. I can't imagine how rotoscopers, matte painters, etc. must feel.
@rph_redacted 2 years ago
This is insane. Can't imagine Google putting this in GCam in the coming years.
@mongini1 2 years ago
Not to this extent... although they'll use that AI denoising to improve night shots (they would be really, really dumb if they don't, because this would elevate the Pixel to new heights).
@hztn 2 years ago
Cannot wait to play Quake on virtual streets on a world map, yeah.
@oisiaa 2 years ago
Video games of 2030 are going to be insanely realistic.
@Sanzunno 2 years ago
Bank security cams: a blessing from the Lord.
@Gomisan 2 years ago
This seems like it would be really useful for things like taking low-grade footage from security systems/dashcams and enhancing it for number plates, etc. Perhaps those devices need to be recording RAW output to have a basis for this as it becomes possible.
@Marc-io6pv 2 years ago
I thought that too. What about old footage from unsolved crimes, etc.? Would it be accepted, given the AI has interpreted it, I wonder.
@Gomisan 2 years ago
@Marc-io6pv I think the key was the RAW sensor data, though.
@purplestarinferno5119 2 years ago
This is great. I hope we can now scan 3D objects with a couple of pictures taken from multiple angles; this would greatly help solo game devs.
@armandfgs 2 years ago
PolyCam can do this already and is free to download on Apple and Android :D
@rowannadon7668 2 years ago
Lots of photogrammetry software already exists to generate meshes; it works pretty well with solid objects.
@muhammedkoroglu6544 2 years ago
@rowannadon7668 Don't they mostly require LiDAR, though?
@yesnone8813 2 years ago
@muhammedkoroglu6544 It is not required. LiDAR is very useful for real-time applications/moving objects, but for static things you just need enough angles and you can stitch them back together. Sometimes specks are painted on the object to increase the point accuracy, but the techniques are still good without it. kzbin.info/www/bejne/aqa6hKJuq5ich6M
@SwervingLemon 2 years ago
@muhammedkoroglu6544 Many, yes, and the ones that don't are still pretty fiddly.
@ReynaSingh 2 years ago
It’ll be great to see this technology integrated with smartphone cameras
@p5rsona 2 years ago
At this rate they could flat-out replace dedicated cameras. Imagine not having to carry all that heavy gear or having to spend thousands on lenses.
@fedyx1544 2 years ago
This plus a LiDAR sensor would probably be able to instantly give a full-detail 3D scan of an area.
@Dekatelon 2 years ago
@fedyx1544 Many smartphones already have LiDAR for face recognition.
@moverecursus1337 2 years ago
It's not exactly the camera but the dedicated processing power.
@danko5866 2 years ago
It would need a lot of processing power and storage
@Arkryal 2 years ago
OK, but the real question... When will Google deploy this on Google Earth? It's a cool research project, but I'd like to see it put to use in a commercial project. Advances like this are always much more exciting when they finally leave the lab. And this would be amazing for traffic cams at intersections. In my home state (NY), the Dept. of Transportation has live cameras in major cities that are publicly accessible. If you go to the feed URL and follow it with the typical "?t=xxxxx", where the xxxxx is the time in seconds (just like KZbin or other online videos), you can rewind the stream by up to a week and play back from there. It's amazing for things like insurance claims for traffic accidents. Anyone can just grab a video from the intersection in question and save it (assuming they know how; it's not well advertised). But many intersections have multiple camera views, and if they could be stitched together as a NeRF, in video format rather than stationary images, that would revolutionize traffic cams, security cameras, etc. The commercial applications are staggering. This could be a multi-billion-dollar technology just in licensing.
@kettlefleet829 2 years ago
This + Google Earth in VR.
@dylanh333 2 years ago
I can see this being added as a Pixel-exclusive feature in Google Camera or Google Photos, tbh.
@fermigas 2 years ago
@kettlefleet829 Or fix the videos of people who can't be bothered to hold the camera still on a subject long enough for the viewer to see it.
@circuit10 2 years ago
For things like insurance and security cameras you probably wouldn't want to use this, because it might make up information to fill in the gaps (good when you want it to look nice, but not for getting actual information).
@alexhydron 2 years ago
Someone please help me collect my lower jaw off the floor...
@RicoElectrico 2 years ago
I see a green haze in most of these low light reconstructions. Maybe related to Bayer pattern having 2x green subpixels? I think that if noise is non-linear (Poisson sure is), it may screw with simple linear white balance correction. I remember old Nokia phone cameras (like 3500c) had this issue that dark areas had residual green noise.
@MaxChromaColor 2 years ago
Yeah, I really wish they had originally designed sensors with more hue channels than RGB and a better matrix, but it's all about cost and efficiency. You can extract an amazing amount of hue information from linear RGB, but down at the raw Bayer matrix level you have to interpolate in different ways to try and avoid those issues. Red hues can be recovered in very black/grey/white areas (as long as it's not totally zero or maxed out), and green hues too, but with orange/yellow/lime hues there will be more noise... I am sure it is even worse when dealing with the direct data from the raw RGB matrix... but maybe different ways of blurring/interpolating/moving around the matrix can resolve the hues in the noise; you just need to remember to account for over-representation of the hues. I just want a camera with 6 or 12 hue-specific sensors and black/grey/white luminance ones, but with this kind of technological advancement with AI you really don't need that much raw info to get approximations like this... but to really get more dynamic range of various hues we need more sensors, especially in astrophotography, where it would be so helpful.
@rubenst2008 2 years ago
Nah, it's the Matrix green :D
@nsa3967 2 years ago
You can tonemap
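The interaction RicoElectrico describes can be sketched numerically. Below is a toy NumPy example with invented numbers (not the actual camera pipeline): Poisson shot noise has the correct mean in linear raw space, so purely linear operations like white-balance gains stay unbiased, but pushing each noisy pixel through a nonlinear tone curve (a gamma curve here) shifts the mean of dark pixels via Jensen's inequality, a shift that a later linear white-balance correction cannot undo. Since the Bayer pattern's channels carry different noise levels, that bias can differ per channel, which is one plausible mechanism for a residual cast, and it is also why denoising in linear raw space before tone mapping (as RawNeRF does) pays off.

```python
import numpy as np

# Dark pixel receiving few photons: heavy, signal-dependent shot noise.
rng = np.random.default_rng(1)
counts = rng.poisson(lam=5.0, size=100_000)   # photon counts per exposure
linear = counts / 250.0                       # linear raw value, true mean 0.02

# Average (denoise) first, then tone-map: unbiased estimate of the tone-mapped value.
gamma_of_mean = np.mean(linear) ** (1 / 2.2)

# Tone-map each noisy pixel first, then average: Jensen bias on a concave curve.
mean_of_gamma = np.mean(linear ** (1 / 2.2))

print(gamma_of_mean, mean_of_gamma)  # the second comes out visibly lower
```

The linear mean itself stays on target; only the order of averaging versus the nonlinearity creates the bias.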
@JustJory 2 years ago
Imagine what night photos could look like in the future if it already looks like this.
@jimj2683 2 years ago
In the future you could have smartglasses with AI night vision!
@realfangplays 2 years ago
Doesn't the night mode in our phones do something similar even now? I'm sure Google is going to use this paper to improve the night mode in their Pixel phones.
@JustJory 2 years ago
@realfangplays Yeah, but they usually turn the exposure up to higher than 0.25 seconds, so you usually have to wait until the photo is done shooting.
@realfangplays 2 years ago
@JustJory The multiple images taken here will also take some time, but there definitely will be an improvement, yeah.
@u12bdragon 2 years ago
4:42 This looks really cool, actually. I would love it if there were a video game that looked like this, where you could explore "gloomy"/"blurry" scenes.
@cihadturhan 2 years ago
Not real-time yet.
@jamescombridgeart 2 years ago
This one is packed full!! So good. Even the bike's spokes were rendered perfectly. Never mind the NeRF, the nighttime denoising impressed me enough, haha. Game dev is always my first in-mind use case for most of this stuff. Personally, I'd love what it could do to streamline photogrammetry!
@BevansDesign 2 years ago
Jeez, it's even accurately handling the reflections in that piano at the end. This is stunning.
@Waffles_Syrup 2 years ago
A bit confusing. Does the "new" denoiser work solely off of a single raw image, or does it perform better because it uses the entire collection of raw images to rebuild the data?
@jurjenbos228 2 years ago
Your guess is right: the denoising uses all images, so that it has more information available for denoising.
@tenet-rotas 2 years ago
This is insane, Károly! I was deeply impressed with the original method, and seeing this sequel is just amazing!
@mrofnoctonod 2 years ago
This is incredible! The depth of field. The denoising. The parallax effect on thin objects. Everything. Just. Wow!
@elliotthood1719 2 years ago
Imagine when people can use this to make a 3D scene for a game. Development is going to get sped up so quickly, and the results will be ridiculously realistic.
@Chuso_Skever 2 years ago
It already exists, it's called photogrammetry
@jimj2683 2 years ago
@Chuso_Skever Photogrammetry is very slow and tedious, partypooper. NeRF is much better...
@GuyWithAnAmazingHat 2 years ago
This is insane, AI learning has officially gone past the point of ENHANCE, you can now denoise, see in the dark, refocus, and even reconstruct a 3D environment out of a single photo?
@tylercasper702 2 years ago
No. You can reconstruct a 3D environment from 25-200 photos. This was stated in the video.
@zwenkwiel816 2 years ago
Lol, I was thinking that during the video. We used to laugh at CSI for doing impossible shit like this....
@licnep1 2 years ago
This view synthesis tech is cool because it could also convert every movie into a 3D movie...
@Goatfooted 2 years ago
This would be awesome for video game environments; it's such a high-quality render, if you wanted to actually have a cutscene happen IRL and show it in-game, however rare that is.
@Will-kt5jk 2 years ago
The tilt-shift focus effect is pretty impressive, and overall it feels like the works are building towards something more and more robust, with this method far less affected by the reflection artefacts and 'clouding' seen in other papers. I'm really interested in whether there's mesh extraction on this and what the processing cost is (given it's 'real-time'); it could eventually make 3D scanning from cameras simpler and more robust. If the processing is cheap enough, maybe even running on a smartphone. Then there's computational photography more generally: taking a few shots of a scene could give you a virtual 'light-field' photo for re-focussing, with the added benefit of some ability to fly through the scene.
@littlegandhi1199 2 years ago
Yeah, looking at the textures during the flybys, it seems obvious to me that it's only meshes, like every other photogrammetry implementation.
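The "virtual light-field photo" idea mentioned above has a classical counterpart, synthetic-aperture refocusing, that can be sketched in a few lines. This is a toy 1-D NumPy example with invented numbers (not the paper's NeRF pipeline): shift each view in proportion to its camera offset times a chosen disparity, then average. Content at the matching depth aligns and stays sharp; content at other depths smears across views, which reads as defocus blur.

```python
import numpy as np

def refocus(views, offsets, disparity):
    """Shift each 1-D view back by offset*disparity, then average."""
    shifted = [np.roll(v, -int(round(o * disparity))) for v, o in zip(views, offsets)]
    return np.mean(shifted, axis=0)

# Scene: a single spike whose depth gives 2 px of disparity per unit of
# camera offset. Each "camera" sees the spike shifted accordingly.
base = np.zeros(64)
base[20] = 1.0
offsets = [-2, -1, 0, 1, 2]                           # camera positions
views = [np.roll(base, int(o * 2)) for o in offsets]  # true disparity = 2

sharp = refocus(views, offsets, disparity=2)   # refocused at the spike's depth
blurry = refocus(views, offsets, disparity=0)  # refocused at a different depth

print(sharp.max(), blurry.max())  # → 1.0 0.2
```

At the correct disparity the five views align exactly and the spike keeps its full height; at the wrong disparity it is spread into five spikes of one-fifth the height, i.e. blurred.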
@coder0xff 2 years ago
Wow! The depth of field on the bookcase reflected in the piano was just *chef's kiss*.
@gregortidholm 2 years ago
This is fantastic! Thank you for all the time you invest in spreading this knowledge and promoting interest in scientific reports and the treasures waiting to be discovered!
@waterbot 2 years ago
Two Minute Papers uploaded "2 minutes ago" is always the best! Thanks again, Dr.!
@BoogerDad 2 years ago
Crazy how adding in macro focus makes normal-sized things look so miniature. It would be an excellent visual theme for video production.
@PedroCasavecchia 2 years ago
I've made an animated short film with a similar aesthetic! Maybe you would like it (: It's called Pulsion.
@the747videoer 2 years ago
Those fly-through videos are so surreal. It's like creating a CG scene where the only thing that's animated is the camera, which is sliding on simple keyframes.
@scuube7176 2 years ago
The refocusing looks absolutely lovely!
@sjoerdgroot6338 2 years ago
For my studies I had to recreate this paper with 5 students, and it's quite similar to mip-NeRF. The only difference is that it runs on raw images instead, and because of this it uses a different activation function and you have to do the post-processing afterwards. Also, they can use different exposure times and learn how to scale the RGB so that photos with different exposures look the same. This might not directly become clear from this video, but it does require you to recreate/figure out camera positions before you can train the raw NeRF, which can be challenging if it is too dark. Also, although taking 25-200 photos doesn't take too much time, training does 😅
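For anyone curious what "runs on raw images" changes in practice: the RawNeRF paper re-weights its L2 loss so that errors on dark raw values are not drowned out by bright ones. Below is a rough NumPy sketch of that idea (the function name and constants are mine, and a real implementation applies the weight through a stop-gradient inside an autodiff framework, which plain NumPy cannot express):

```python
import numpy as np

def rawnerf_style_loss(pred, target, eps=1e-3):
    """Weighted L2 on linear raw values: dividing each residual by the
    (detached) predicted intensity approximates an error measured in
    tone-mapped, log-like space, so dark regions count as much as bright
    ones. In a real framework the weight would be wrapped in stop_gradient."""
    weight = 1.0 / (pred + eps)
    return np.mean(((pred - target) * weight) ** 2)

# Same absolute error of 0.01, once on a dark pixel and once on a bright one:
dark = rawnerf_style_loss(np.array([0.02]), np.array([0.03]))
bright = rawnerf_style_loss(np.array([0.8]), np.array([0.81]))
print(dark > bright)  # → True: the dark-pixel error is penalized far more
```

With a plain unweighted L2, both cases above would contribute identically, and the network would happily ignore the shadows that this paper is all about.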
@heinerowald-sobiraj8647 2 years ago
I did my master's thesis in CS on a CV-related topic 5 years ago and haven't stayed up to date since. Incredible how far CV has come!
@ondrazposukie 2 years ago
This must come to Google Photos; I definitely see it as an auto-creation. Denoising, playing with focus, flying through: nice additions. And let's scrap the old "we made a stylized photo for you" one, which does nothing useful or beautiful. Two more papers later, this could also use photos from the past, which are always a little different (houses may have different flowers on the windows or even a different roof). So not just sticking them together, but also making the details look like they do in a couple of the photos. It would be especially useful because people sometimes take that many photos of one thing, but rarely on the same day.
@Phastraq 2 years ago
This technique is going to make photogrammetry and digital recreation of objects and people so much easier.
@alexsiemers7898 2 years ago
Image processing software like this and DALL-E are the kind of AI that genuinely blow my mind more than anything else in the field.
@KieranShort 2 years ago
On top of a lot of the RawNeRF videos looking like tilt-shift model villages [the piano at 7:35 looks like a mini piano in a doll's house], at 5:54 do you notice, around the front bike's brake cable that is sticking out, the brick pavement on the road has been smoothed flat? If you check the raw sensor data, your eyes/brain can make out that it's regular bricks. The same thing at 6:24 on the lower left-hand side: the brick pavers have been blurred again. The NeRF in the dark just removes that detail. Maybe two papers down the line, these things might be fixed.
@cahitkarahan6378 2 years ago
Get ready, Grand Theft Auto 10 is waiting 2 more papers down the line 🤣
@7bitscience 2 years ago
Here's an idea: short image sequence around an object --> NeRF algorithm --> Photography
@soul-candy-music 2 years ago
Made my first NeRF the other day! A LOT harder than I expected (I know barely anything about programming), but it was a trip and a half.
@TonyStark-ik9em 2 years ago
That makes it sound as easy as "I stitched my first pano the other day"... What? Which codebase? What renderer? I'm trying with COLMAP and the MultiNeRF featured in the video; this really is an ongoing trip right there. How did you succeed?
@mrlightwriter 2 years ago
What software and processes did you use, if you don't mind sharing?
@soul-candy-music 2 years ago
@mrlightwriter I'm honestly too programmatically illiterate to even try to explain it, but I'll gladly link to the tutorial I followed. kzbin.info/www/bejne/bqHJgYiEbbOskKc
@mrlightwriter 2 years ago
@soul-candy-music Thanks for the link!
@TonyStark-ik9em 2 years ago
@mrlightwriter I'm not OP, but I just got my first NeRF running. The Google Colab version wouldn't run on my Mac, so I switched to the NVIDIA code, and while it's still a bitch or two to install and get compiling, training, and finally running, it is very impressive. Now, I only have a measly 1080 Ti and a Titan XP, so it's slow as hell and I run out of VRAM really easily. 3090 prices need to come down a good piece more.
@JensGulin 1 year ago
At 5:56, how can the rightmost fence post be that broken? It's very far from the base already in the input photo. Was the camera already using synthesis techniques, or is it generated in reverse from video?
@ZackDeSario 2 years ago
This guy is so adorable. I want him as my life narrator... "Zack opens the door. Hold on to your papers, let's look at what's inside."
@j4ked 2 years ago
Dude, the denoising and synthesized view is INCREDIBLE.
@journeywithoutdistance 2 years ago
VR using this could produce amazingly realistic experiences. To be able to shoot a little video of a scene and then hop into it using VR would be incredible. Some of the future planned VR headsets could especially take advantage of the focal depth with light-field technology, giving realistic focusing for our eyes.
@himabimdimwim 2 years ago
Woah! The fly through video would be really cool with Google street view!
@TShearPhoto 2 years ago
A detail you have a bit wrong: there should not be more noise in the RAW file. If there's more noise, it's because your camera/software is applying some very simple noise reduction that I'm sure is also crushing details. The reason the raw looks so noisy is that someone edited it to bring the shadows up. The original image would have looked WAY noisier with the same treatment. This is something that pretty much every photographer/videographer has experience with.
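The shadow-lifting point is easy to verify numerically: brightening shadows is a linear gain, so it scales noise and signal by exactly the same factor; noise that was invisible in the dark original becomes obvious once the shadows are brought up, even though the underlying signal-to-noise ratio never changed. A small NumPy sketch with invented numbers:

```python
import numpy as np

rng = np.random.default_rng(2)

signal = 0.01                                    # deep-shadow pixel level
raw = signal + rng.normal(0.0, 0.002, 10_000)    # faint zero-mean read noise

lifted = raw * 16                                # +4 stops of shadow lift

print(raw.std(), lifted.std())                   # noise std scales by exactly 16
print(raw.mean() / raw.std(),
      lifted.mean() / lifted.std())              # SNR is unchanged by the gain
```

This is why comparing a lifted raw against an out-of-camera JPEG is apples to oranges: apply the same shadow lift to both and the raw's "extra" noise mostly disappears from the comparison.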
@slow-moe3518
@slow-moe3518 2 жыл бұрын
AI is getting crazy good suddenly. What exciting times.
@smaklilu90
@smaklilu90 2 жыл бұрын
Once the scenes start moving, they look like miniature models dangling and shot with a small camera 😂😂
@ruantristancarlinsky3851
@ruantristancarlinsky3851 2 жыл бұрын
Man, this is absolutely amazing! Technology is improving at such a rapid rate; imagine where we will be 10 years from now...
@valberm
@valberm 2 жыл бұрын
The narration voice tone is what sold it for me 😄
@djsUltra
@djsUltra 2 жыл бұрын
This guy is lucky that commas are not a finite resource.
@Some24Person
@Some24Person 2 жыл бұрын
I'm getting excited at all the new technologies that are coming out lately, imagine how helpful this could be for things like crime scene analysis
@jimurrata6785
@jimurrata6785 2 жыл бұрын
Or deepfakes! 😉
@Ampvartak
@Ampvartak 2 жыл бұрын
This is the best channel I have subscribed to in a while. So much new info and knowledge.
@maracachucho8701
@maracachucho8701 2 жыл бұрын
That part at the end where the reflection on the piano produced a different focus plane was insane!
@mar_sze
@mar_sze 2 жыл бұрын
Wow, the refocus makes it so much better even! Just beautiful.
@skybuck2000
@skybuck2000 2 жыл бұрын
The yellow bike picture at 6:20: was it truly just one picture, or multiple input pictures? Also, is this software available for PC, or is it for data centers only, meaning most folks will have to wait 10 or 20 years for it?
@arothmanmusic
@arothmanmusic 2 жыл бұрын
Holy crap. It’s like the Lytro camera’s promise, but in higher resolution and quality. Incredible.
@LennartDoering
@LennartDoering 2 жыл бұрын
Your channel is one of the greatest gifts on youtube. Thank you for your work!
@SoimulX
@SoimulX 2 жыл бұрын
I remember a similar video of yours from 2018, and to me it looks like Google just combined two AIs into one.
@TheChazas
@TheChazas 2 жыл бұрын
We need this with GCam on Pixel phones.
@GabeHandle
@GabeHandle 2 жыл бұрын
With advancements that allow fewer and fewer photographs for training, along with advancements in colorizing/upscaling old black-and-white photos and videos, I can see a time in the near future when we'll be able to create immersive environments of the past. A virtual time machine!
@Magnymbus
@Magnymbus 2 жыл бұрын
That denoising is incredible... Literally made me involuntarily vocalize my surprise
@timf5613
@timf5613 2 жыл бұрын
1 - open source the tech
2 - crowd source photos
3 - recreate a virtual version of our world
4 - step into the metaverse
@cmilkau
@cmilkau 2 жыл бұрын
Is the refocus performed by the model itself? Or is it a postprocessing step based on extracted depth information?
@OMNIDROID2995
@OMNIDROID2995 2 жыл бұрын
I actually love this channel. I just don't watch it that often because the way he talks gets on my nerves. It's exhausting, like stop-and-go city traffic. That's exactly how the guy talks. I really like your videos, but I'll definitely start watching them on mute with subtitles at some point. Just say a few more words between pauses. Don't worry, our atmosphere has enough oxygen.
@drewby4701
@drewby4701 2 жыл бұрын
This is absolutely insane, I'm in disbelief!
@Ila_Aras
@Ila_Aras 2 жыл бұрын
Same here!
@jimmy21584
@jimmy21584 2 жыл бұрын
Once this can be done in real time, it'll be a massive boost for computer vision and robotics, not to mention mobile photography. There must be a lot of other useful information about the scene that could be accessed, e.g. depth values, that the end-result images are not showing us.
@MaxChromaColor
@MaxChromaColor 2 жыл бұрын
The amazing thing about RGB sensors is that you can interpolate the data and recover hue information even in the most (almost totally) achromatic regions; only in areas of total black or total white will you be unable to extract information, and the AI can fill those in. With the denoising, it is truly amazing to see this "hue reveal" come to life. Hopefully we will someday get sensors that detect more wavelengths, but with this tech that is almost unnecessary.

What happens is that certain hues are noisier than others: reds and greens will have less noise, but an orange or yellow will have more, because the RAW sensor data has to be interpolated from the typical Bayer matrix of RGB sensors. (Camera chips weren't designed with multiple layers capturing a full R, G, and B for the whole scene; the pattern is interlocked onto one 2D plane.) Using color models like HWB, you can convert the RGB into Hue, White, Black channels, and with the denoising I assume they are doing some of this to figure out what colors and tones are hiding in the noisy RAW data, with the AI capabilities on top.

Truly exciting to see this level of color and tone extraction. I remember being amazed just looking at 16-bit images with the hue channel shown in color and seeing how much information was actually being captured by the sensors. It is really wild when we make techniques that pull out that info and fill it in where it's missing. Thanks for sharing this awesome research and these developments!
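The HWB decomposition this comment describes is easy to compute from RGB. A minimal illustrative sketch (this is just the standard color-model conversion, not the paper's actual denoising pipeline):

```python
import colorsys

def rgb_to_hwb(r, g, b):
    """Convert normalized RGB (each in 0..1) to Hue, Whiteness, Blackness.

    Hue is the same angle as in HSV; whiteness is how much white is
    mixed in (the smallest channel); blackness is how far the pixel is
    from full brightness (one minus the largest channel).
    """
    h, _, _ = colorsys.rgb_to_hsv(r, g, b)  # HWB shares HSV's hue
    w = min(r, g, b)         # whiteness
    blk = 1.0 - max(r, g, b) # blackness
    return h, w, blk

# Pure red: hue 0, no white, no black.
print(rgb_to_hwb(1.0, 0.0, 0.0))  # → (0.0, 0.0, 0.0)
# Mid gray: achromatic, equal parts white and black.
print(rgb_to_hwb(0.5, 0.5, 0.5))  # → (0.0, 0.5, 0.5)
```

For near-achromatic pixels (w + blk close to 1) the hue angle is still defined but carries very little signal, which is the regime the comment says denoising has to reason about.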
@ewerybody
@ewerybody 2 жыл бұрын
6:05 I was looking at that spot first and thought: Meeeeh. BUT that was due to 480p mobile quality. Ramping it up to 1080p, it's now flawless!! Unbelievably good!! 👌
@jenius00
@jenius00 2 жыл бұрын
I didn't quite catch from the video: does the denoising process try to utilize information from multiple images to do a better job denoising each one? Or is it just combining denoising with NeRF? Or, even better, does it iteratively converge on a solution that maximizes both the denoising and NeRF performance metrics by having the output of one cross-inform the other and vice versa?
@jenius00
@jenius00 2 жыл бұрын
Scanning the paper webpage, it appears the denoising is at the very least informed by multiple images.
@BarryRowlingsonBaz
@BarryRowlingsonBaz 2 жыл бұрын
To do smooth focus pulling/depth of field with this, do you need a range of pictures with different focus/DoF? I don't think we ever see more than one original for each processed video clip here, do we? At least not for the new technique.
@nathanweisser
@nathanweisser 2 жыл бұрын
Me as a videographer: dude, my job... Me as an editor: DUDE, GIMME
@andreask5054
@andreask5054 2 жыл бұрын
What kaaaiiind is thaaaat acceeent?
@ramakrishna5480
@ramakrishna5480 2 жыл бұрын
Where do u get all these updates?
@leonardovelho
@leonardovelho 2 жыл бұрын
This is incredible. I'd love to try this out if I can get the depth map, to extract the 3D geometry of architectural components and 3D print it as decoration, or just play in a 3D environment like a virtual museum. Bravo!!
@garycaracol4365
@garycaracol4365 2 жыл бұрын
The most impressive thing, in my opinion, is how the program recognizes the curvature of the wine glass at 3:30 and changes the light based on the angle.
@coma-body-stilllife
@coma-body-stilllife 2 жыл бұрын
The outdoor night scene with standard NeRF looks beautiful despite its terrible artifacts. Iridescent.
@CouchPotator
@CouchPotator 2 жыл бұрын
This would be amazing for nighttime security camera footage where key details don't exist in any one frame.
@USBEN.
@USBEN. 2 жыл бұрын
Man, this AI progress is moving at lightning speed. I'm really enjoying this ride.
2 жыл бұрын
7:38 The reflection focus seems off on the piano, right?
@tiusic
@tiusic 2 жыл бұрын
4:42 - The results from the old technique at night might not match reality, but I think they're really pretty. It looks like the scene is full of multi-colored clouds.
@HURSAs
@HURSAs 2 жыл бұрын
It looks like the algorithm extracts the basic geometry of the objects and the positions of the light sources from the RAW data and pushes it through photogrammetry software like Meshroom or Reality Capture, so it can later proceed with rasterization of the maps extracted from more detailed photos, ending with a light transport and camera-FOV z-depth baked simulation, with a denoising layer on top, which is typical of most 3D modeling and rendering software. The still image result cannot easily be recognized as rendered, but it definitely looks like a simulated 3D environment when moving or orbiting the camera around, especially when the global illumination layer seems to be denoised differently from the rasterization layer.
@mathijsluijten
@mathijsluijten 2 жыл бұрын
I wish I could start playing around with this technique in some approachable form. I'm an animator by trade and would LOVE an After Effects plugin giving me these powers. A world of storytelling possibilities!!!
@sergentti
@sergentti 2 жыл бұрын
Do you think the refocusing would work for video if you record with multiple cameras?
Ray Tracing: How NVIDIA Solved the Impossible!
16:11
Two Minute Papers
Рет қаралды 802 М.
Watch Tesla’s Self-Driving Car Learn In a Simulation! 🚘
13:29
Two Minute Papers
Рет қаралды 428 М.
How to treat Acne💉
00:31
ISSEI / いっせい
Рет қаралды 108 МЛН
Chain Game Strong ⛓️
00:21
Anwar Jibawi
Рет қаралды 41 МЛН
"Junior developers can't think anymore..."
13:53
Travis Media
Рет қаралды 44 М.
Midnight hailstorms | 2d Weather Sandbox
6:41
Simulated Weather
Рет қаралды 2,3 М.
The moment we stopped understanding AI [AlexNet]
17:38
Welch Labs
Рет қаралды 1,5 МЛН
DeepMind’s New AI Finally Enters The Real World! 🤖
8:56
Two Minute Papers
Рет қаралды 258 М.
Can You Put All This In a Photo? 🤳
6:38
Two Minute Papers
Рет қаралды 54 М.
New Year 2025 on ТНТ "ComedyVision!" @ComedyClubRussia
1:16:57
Hollywood is so over: The INSANE progress of AI videos
21:34
AI Search
Рет қаралды 705 М.
DeepMind’s New Gaming AI Does The Impossible!
6:54
Two Minute Papers
Рет қаралды 106 М.
NVIDIA's Ray Tracing AI - This is The Next Level! 🤯
11:13
Two Minute Papers
Рет қаралды 159 М.