nnabla Deep Learning Channel

25:15

【AI Paper Explained】Fusing Diffusion Models and Autoregressive Models, Part 1

1 month ago

26:45

【AI Paper】Perform efficient portrait animation with LivePortrait!

2 months ago

49:43

【Conference Report】Cutting-Edge Vision Trends Seen at CVPR2024

3 months ago

18:39

【AI Paper Explained】Automatic Curation of Large-Scale Datasets via Clustering

4 months ago

12:41

【AI Paper Explained】Autoregressive Image Models: Applying LLM Pre-training to Vision

4 months ago

26:21

[AI paper] CTM: Advanced Single-Step Diffusion Model for Fast and High-Quality Sampling (with Japanese subtitles)

4 months ago

28:10

[AI Paper Explained] MPGD: Enabling All Kinds of Editing with an Unmodified Diffusion Model! Explaining an ICLR-Accepted Paper | Sony's Research Minds

4 months ago

48:45

[AI Paper Explained] SAN: Boosting the Performance of Any GAN by Restructuring the Discriminator! Explaining an ICLR-Accepted Paper | Sony's Research Minds

5 months ago

32:37

【AI Paper Explained】An LLM Reinforcement Learning Method That Does Not Require RLHF: Direct Preference Optimization (+α)

5 months ago

19:07

【AI Paper Explained】Consistency Models and Rectified Flow ~Explanation, Part 1~

7 months ago

19:43

【AI Paper Explained】Consistency Models and Rectified Flow ~Explanation, Part 2~

7 months ago

9:26

【AI Paper Explained】Consistency Models and Rectified Flow ~Preface & Overview~

7 months ago

21:40

【AI Paper】Perform one-shot face reenactment with HyperReenact!

7 months ago

22:06

【AI Paper Explained】Realistic, Natural Human Image Generation! Explaining HyperHuman, a Diffusion Model That Understands Multimodal Geometric Information!

7 months ago

14:23

【AI Paper Review】Using Generative AI for Image Classification!

9 months ago

38:17

【AI Paper Explained】Hybrid Discrete + Continuous Reinforcement Learning: Hybrid Action Representation (HyAR)

10 months ago

45:55

【Conference Report】Vision Trends Seen at ICCV2023, Part 3 ~Representation Learning, Human Recognition, 3D Representation~

11 months ago

2:59

【Conference Report】Vision Trends Seen at ICCV2023, Part 1 ~Conference Overview~

11 months ago

31:39

【Conference Report】Vision Trends Seen at ICCV2023, Part 2 ~Efficient Deep Learning Models~

11 months ago

18:40

【Sony Internal Talk】Diffusion Models and Foundation Models (2023 Research Trends)

1 year ago

38:48

【ICML2022】A Hybrid of Discrete and Continuous! Introducing the New Generative Model "SQVAE" | Sony's Research Minds

1 year ago

30:00

【ICML2023】Enabling All Kinds of Data Restoration! Introducing "GibbsDDRM" | Sony's Research Minds

1 year ago

23:05

【AI Paper】Perform high fidelity portrait avatar reconstruction in real-time with StyleAvatar!

1 year ago

26:01

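【AI Paper Explained】GRES (Generalized RES): Overturning the Premise of the Referring Expression Segmentation Task! A Proposed Task and Method Supporting Multi-, Single-, and No-Target Expressions

1 year ago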

27:20

【AI Paper Explained】UNINEXT: 20 SOTA Results with a Single Set of Parameters!? A Unified Model for Instance Perception

1 year ago

30:13

【Paper Talk】Making Data Augmentation More Natural! Introducing the CVPR2023-Accepted Paper "Rawgment" | Sony's Research Minds

1 year ago

13:27

【AI Paper Explained】RetNet: A Successor to the Transformer! An Efficient Architecture for LLMs

1 year ago

35:45

【Conference Report】Cutting-Edge Vision Trends Seen at CVPR2023

1 year ago

1:45

Thank You for 10,000 Channel Subscribers

1 year ago

Comments

@UCYdVBRl4XPgxMAD5W879 12 days ago

The explanation is far too hard to follow.

@tahanymansour9931 1 month ago

Do all these steps work when performed on the phone?

@nnabla 1 month ago

Thanks for watching our video. The answer depends on how you want to use this. You can access Google Colaboratory and run this demo from a smartphone, but in that case it runs on their server, not on the phone. I hope this answers your question.

@Mrperusyaneko 1 month ago

Thank you for the clear explanation! Re-optimizing some of the parameters at every generation increases diversity. Fascinating!

@Mrperusyaneko 1 month ago

Thank you! The idea of optimizing the scaling of β by gradient descent looks usable in other deep learning methods too!

@nnabla 2 months ago

You can check the slides used in this video at www.slideshare.net/slideshow/20240819_nm_liveportrait_nnabla_youtube_final-pdf/271869953

@nnabla 3 months ago

The materials used in this video are also available here! www.slideshare.net/slideshow/cvpr2024-vision-cvpr2024-report-8a60/270524692

@yajuusennpai 4 months ago

Thank you for the wonderful video.

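@benkyu93 5 months ago

Thank you for the clear explanation. I have a question about the schematic diagram of the LSTM (slide p. 16). My understanding is that an LSTM replaces each unit in the RNN's hidden layer with a memory cell, but in this diagram it looks as though the RNN's hidden layer itself is replaced by a memory cell. How should I interpret this?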

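@RivusVirtutis 7 months ago

Because I have not studied enough, I can follow the overall flow, but I could not work out what kind of NN is used to produce a mapping that transforms the image and text matrices in a unified way. It reminded me that I need to study more, using this channel among other resources, until I can properly understand the original paper.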

@student-beer 8 months ago

Wow

@shake6321 9 months ago

do you know how to make Ai Avatars? if so, please contact me i am looking for someone

@12cgt 9 months ago

An amazing lecture

@Diagreen86 11 months ago

These are always helpful. Thank you for the carefully made video.

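@sushi-sukisuki 1 year ago

Thank you, this helped. It feels surprisingly brute-force, doesn't it? You try various things and find the parts that contribute. I got the impression that it does not rely on very difficult theory.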

@nnabla 1 year ago

3:12 In the results part, "GAN-based ADM" --> "ADM, a Diffusion Model". 4:25 The paper title on the right, "Pre-training Vision Transformers with Very Limited Synthesized Images" --> "SegRCDB: Semantic Segmentation via Formula-Driven Supervised Learning".

@白葛 1 year ago

Thank you for the clear explanation!

@net_stack4176 1 year ago

Hi, thanks for the video. This is really good. How do I run inference on a set of images and save the prediction results?

@nnabla 1 year ago

Hi, thanks for using our Colab demo. Since it's a bit hard to show you how to do that here, we've opened an issue and answered your question. Please refer to github.com/sony/nnabla-examples/issues/394.

@kojiy01 1 year ago

Showing the intermediate steps of the generation process gives a good picture of how things are handled internally. Thank you!

@h_holon 1 year ago

I had been looking for an introduction to RetNet, so this is a great help. It cleared up the parts I had not quite grasped from reading the paper alone. Thank you.

@krishnaw14 1 year ago

Congratulations everyone! おめでとうございます!

@moyamoyamoyamoya 1 year ago

I'm weak at math, so the pseudocode really helps 😢

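@かいじんZ 1 year ago

Open-world recognition is the general-purpose task that recognition ought to be in the first place. Even in the networking field, few works consider cases where something outside the classes in the training data is observed 😢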

@Nightspire1 1 year ago

Thanks for this! Does the png image and mp4 video have to be the same pixel ratio? I tried uploading an mp4 from my files and ffmpeg gave me a warning, then tried to play and it gave this error: FileNotFoundError: [Errno 2] No such file or directory: 'result/arbitrary/input_image.png_by_input_video.mp4'

@nnabla 1 year ago

Hi, sorry for the slightly late reply. Thanks for using our demo! > Does the png image and mp4 video have to be the same pixel ratio? No. We confirmed it works even when they have different pixel ratios. Note that this would affect the generation quality. Judging from the error message, it seems the inference code failed to generate the resulting video. I don't think it was because of the different ratio. There might be something wrong with the input video, such as a special codec or a non-ASCII filename. Can you try again if you're still interested? Thanks.
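
One of the possible causes named in the reply above, a non-ASCII filename, can be checked before uploading. The following is only a minimal sketch using the Python standard library; the helper names `has_ascii_name` and `safe_ascii_name` and the fallback stem `input_video` are hypothetical and are not part of the demo.

```python
from pathlib import Path


def has_ascii_name(path: str) -> bool:
    """Return True if the file name (not the directory part) is pure ASCII."""
    return Path(path).name.isascii()


def safe_ascii_name(path: str, fallback_stem: str = "input_video") -> str:
    """Suggest an ASCII-only file name, keeping the extension when it is ASCII."""
    p = Path(path)
    if p.name.isascii():
        return p.name
    # Assumption: only the name matters; the extension is kept if it is ASCII.
    suffix = p.suffix if p.suffix.isascii() else ""
    return fallback_stem + suffix


# Hypothetical file names, used purely for illustration.
print(has_ascii_name("clips/input_video.mp4"))
print(safe_ascii_name("clips/動画.mp4"))
```

Renaming the file this way does not address the other possible cause mentioned in the reply (an unusual codec), which would need to be checked separately.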

@Nightspire1 1 year ago

Thanks for replying! I've tried several different ai lipsyncing models, they all seem to warp the face. I feel ai lipsyncing is still in its infancy, and I don't know python too well so that is the biggest hurdle. SO many dependencies with so many different versions makes using any program in python very difficult, again if you don't really know what you're doing. I appreciate the reply, and the collab demo, as collabs are the only thing i can get to work!@@nnabla

@moyamoyamoyamoya 1 year ago

I learn something from these every time!

@mkii5095 1 year ago

Thank you as always for the paper introductions!

@andrewshin8704 1 year ago

Congratulations on 10,000 subscribers! 🎵

@どこかのだれか-s8g 1 year ago

That was really easy to understand! I will take this as an opportunity to start following the latest papers myself!!

@toritometo 1 year ago

Hungry hippo! What a cute name

@mariosalamanca7743 1 year ago

Currently not working. Code stops at from generate import *

@nnabla 1 year ago

Hi, thanks for using our demo. We confirmed that the problem has been solved and it works as expected. Can you try that if you're still interested? Thanks!

@demaxism 1 year ago

GPT is exploding in popularity right now. I should have watched this much earlier~

@YukioHatoyama114511 1 year ago

I wanted to try it, but I didn't have a PC with a graphics card.

@truth2 1 year ago

Brother, I am doing this on mobile. The site is the demo for the paper "First Order Motion Model for Image Animation". I was running the cells and they ran fine. After mounting Google Drive, the next cell was "Create a model and load checkpoints". When I ran it, it showed me this ImportError:

ImportError Traceback (most recent call last)
<ipython-input-6-dbd18151b569> in <module>
----> 1 from demo import load_checkpoints
      2 generator, kp_detector = load_checkpoints(config_path='config/vox-256.yaml',
      3 checkpoint_path='/content/gdrive/My Drive/first-order-motion-model/vox-cpk.pth.tar')

3 frames

/content/first-order-model/augmentation.py in <module>
     10
     11 from skimage.transform import resize, rotate
---> 12 from skimage.util import pad
     13 import torchvision
     14

ImportError: cannot import name 'pad' from 'skimage.util' (/usr/local/lib/python3.9/dist-packages/skimage/util/__init__.py

@hungah 1 year ago

11:08 Easy to understand

@googIe.com. 2 years ago

I'd love a ZeroTune that can do what UniTune does without fine-tuning.

@googIe.com. 2 years ago

I now completely understand diffusion models

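@dazof7671 2 years ago

I'm a Python beginner, so please ignore this if my question is off the point of the video. When I click the play button (run button?) for "Run the training" in Colaboratory, I get NameError: name 'env' is not defined at the line env.close(). Is there a problem with what you uploaded to GitHub?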

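@nnabla 2 years ago

Thank you for your question. This is only a guess, but did you perhaps press just the final "Run the training" play/run button? On Google Colaboratory, you prepare the environment by pressing every play/run button in order from the top, so if you first press the play buttons of the blocks above and then press "Run the training" last, training should start.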

@nnabla 2 years ago

Thanks for watching this video and trying the Colab demo! Unfortunately, as of now (Nov. 8, 2022), we observe an error when running the Colab demo. We will fix this issue very soon and let you know here.

@walidflux 2 years ago

hey man can you do update for this?

@nnabla 2 years ago

Yes, we observe there'll be an error if you run the current Colab demo as is. A simple solution is to install the latest nnabla ("!pip install nnabla-ext-cuda114" in the first cell). Anyway we will update the demo soon.

@walidflux 2 years ago

@@nnabla thank you

@mochou_p 2 years ago

good!

@dazof7671 2 years ago

At 21:18 the narration cuts off partway through, at "And in this way...". If nothing important was being explained at that point, please disregard this.

@nnabla 2 years ago

Thank you for pointing this out! We will check and upload a corrected version if necessary!

@nnabla 2 years ago

We checked, and since the content is not affected, a re-upload was not necessary. However, we have noted your point in the video description. Thank you for your continued support!

@clonegit7826 2 years ago

By next year, diffusion and GAN will probably have swapped places

@taka_49 2 years ago

I wonder what it can do? So excited! ↓ What on earth is this? 🤔

@nnabla 2 years ago

The detailed video is here! kzbin.info/www/bejne/pnvKZaSPaM2CsKM

@nnabla 2 years ago

The detailed video is here: kzbin.info/www/bejne/m3OXlKSbabdnadU

@-_-plm2232 2 years ago

I wonder if mimic, which was a hot topic recently, uses something similar to this

@javirodicio5525 2 years ago

I tried and the result is a disaster...

@nnabla 2 years ago

Quality of generation results depends on the input images or videos. One recommendation is to use well-aligned (same scale and same orientation) images as input videos (or vice versa). Or, there could be an issue when downloading the required pretrained weights file, and the AI model failed to use it. If the model was forced to run without proper weights file, generation results would be totally collapsed.

@zimma4335 2 years ago

When I try to play the video it says "FileNotFoundError: [Errno 2] No such file or directory: 'result/arbitrary/input_image.png_by_input_video.mp4'"

@nnabla 2 years ago

One possible cause is that you executed the cell before the previous cells had finished. Some cells take a long time to complete.
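
If the cause is that the result has not been written yet, one way to guard the playback step is to wait for the output file first. The sketch below is an assumption-laden illustration, not part of the demo: `wait_for_file` is a hypothetical helper, and the only detail taken from the thread is the output path quoted in the error message. A non-empty file may still be mid-write, so this is a rough check only.

```python
import time
from pathlib import Path


def wait_for_file(path: str, timeout_s: float = 600.0, poll_s: float = 1.0) -> bool:
    """Poll until `path` exists and is non-empty, or until the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    target = Path(path)
    while True:
        if target.is_file() and target.stat().st_size > 0:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Path copied from the error message above; timeout_s=0 performs a single check.
result_path = "result/arbitrary/input_image.png_by_input_video.mp4"
if wait_for_file(result_path, timeout_s=0):
    print("Result video found; it should be safe to play it.")
else:
    print("Result video not found yet; re-run the generation cell first.")
```

If the file never appears even after the generation cell has finished, the generation itself most likely failed, and its output should be inspected instead.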

@amathpati9107 2 years ago

Can we give the video as input instead of images to mimic the expression ?

@nnabla 2 years ago

Yes. Please check the latter part of the demo.

@aeaxao9973 2 years ago

Thank you for the good explanation! Great work!
