Amazing! GPU utilization tracking? That is so useful; now I can increase the batch size much more easily without fussing over nvidia-smi and so on!
@carterfendley3145 3 years ago
The "log_freq=10" made my training loop unbearably slow (on a different model than video). Granted, by most DL standards I have a slow computer. Love your stuff! Hope this saves someone a minute.
@raphaelhyde2335 1 year ago
Great video and walkthrough; I really like how you explain the details and steps, Charles!
@brandomiranda6703 3 years ago
Now I can track the gradients without any hassle? No extra get-gradients functions... nice!
@maxxrenn 1 year ago
Great Knight Rider reference: "Evil Charles with a goatee".
@vladimirfomenko489 3 years ago
Great tutorial, Charles, thanks for sharing!
@kanybekasanbekov2955 3 years ago
Does wandb support PyTorch DistributedDataParallel training? I cannot make it work...
@WeightsBiases 1 year ago
Yep, here are some docs: docs.wandb.ai/guides/track/advanced/distributed-training
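A minimal sketch of the simplest pattern, logging only from rank 0 (the project name and metric value are placeholders):

import os
import wandb

# Under a torchrun/DDP launch, each process gets a RANK env variable.
rank = int(os.environ.get("RANK", 0))

# Initialize wandb only on the main process; other ranks skip logging.
run = wandb.init(project="ddp-demo") if rank == 0 else None

# ... inside the training loop ...
if run is not None:
    run.log({"loss": 0.42})  # dummy value; log your real metrics here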
@JsoProductionChannel 5 months ago
My NN is not learning even though I have the optimizer step in my def train(model, config). Does anyone have the same problem?
@maciej12345678 1 year ago
I have a connection problem in wandb: "wandb: Network error (ConnectionError), entering retry loop." Windows 10. How do I resolve this issue?
@jakob3267 3 years ago
Awesome work, thanks for sharing!
@oluwaseuncardoso8150 11 months ago
I don't understand what "log_freq=10" means. Does it mean logging the parameters every 10 epochs, batches, or steps?
@HettyPatel 1 year ago
THIS IS AMAZING!
@brandomiranda6703 3 years ago
How does one achieve high disk utilization in PyTorch? Large batch size and num_workers?
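Something like this is what I have in mind (the numbers are guesses, not recommendations):

import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset standing in for a real disk-bound Dataset.
dataset = TensorDataset(torch.randn(1000, 32), torch.zeros(1000))

loader = DataLoader(
    dataset,
    batch_size=256,           # larger batches amortize per-batch overhead
    num_workers=8,            # workers read data in parallel with GPU compute
    pin_memory=True,          # faster host-to-device copies
    prefetch_factor=4,        # batches each worker preloads (needs workers > 0)
    persistent_workers=True,  # avoid re-spawning workers every epoch
)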
@brandomiranda6703 3 years ago
What happens if we don't call .join() or .finish()? E.g., there is a bug in the middle and it crashes... what will wandb do? Will the wandb process be closed on its own?
@WeightsBiases 3 years ago
In the case of a bug or crash somewhere in the user script, the wandb process will be closed on its own, and as part of the cleanup it will sync all information logged up to that point. If that crashes (e.g. because the issue is at the OS level or things are otherwise very on fire), the information won't be synchronized to the cloud service but it will be on disk. You can sync it later with wandb sync. Docs for that command: docs.wandb.ai/ref/cli/wandb-sync If you have more questions like these, check out the Technical FAQ of our docs: docs.wandb.ai/guides/technical-faq
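One pattern that makes the cleanup explicit is using the run as a context manager; a minimal sketch (the project name and metric are placeholders):

import wandb

# Entering wandb.init as a context manager guarantees the run is
# finished (and synced) even if the body raises an exception.
with wandb.init(project="crash-safe-demo") as run:
    for step in range(100):
        run.log({"loss": 1.0 / (step + 1)})  # dummy metric
    # a crash anywhere in this block still triggers run.finish()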
@Oliver-cn5xx 2 years ago
The gradients are enumerated like module x1.x2; what do x1 and x2 refer to?
@brucemurdock5358 6 months ago
Americans are so imprecise in their vocabulary. I understand you're trying to make the explanations more palatable, but I personally prefer someone calm, collected, and precise in their vocabulary and choice of sentences. Many academics may prefer this too. Besides that, thanks for the video.
@brandomiranda6703 3 years ago
How do things change if I am using DDP (e.g., distributed training where a bunch of different processes are running)? Do I only log from one process? That is what I usually do.
@WeightsBiases 3 years ago
There are two ways to handle it: logging from only one process is simpler, but you sacrifice the ability to see what's happening on all GPUs (which is good for debugging). Explanatory docs here: docs.wandb.ai/guides/track/advanced/distributed-training
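Here's a rough sketch of the second approach, with every process logging its own run into a shared group (the project, group, and metric names are placeholders):

import os
import wandb

rank = int(os.environ.get("RANK", 0))  # set by torchrun in a DDP launch

# Each process creates its own run; the shared group ties them together
# in the UI so you can inspect per-GPU behavior side by side.
run = wandb.init(
    project="ddp-demo",    # placeholder project name
    group="experiment-1",  # same group string for all processes
    name=f"rank-{rank}",   # distinguish the runs per GPU
)
run.log({"gpu_rank": rank})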
@asmirmuminovic5420 2 years ago
Great!
@user-jl4wk5ms4f 3 years ago
How do you count the number of classes in each image?
@user-zs1sy5nr6h 2 years ago
Are these clips based on deep learning articles?
@HarishNarayanan 3 years ago
wand-bee
@user-or7ji5hv8y 3 years ago
Fonts are so small
@WeightsBiases 3 years ago
Thanks for the feedback! We're making sure that future tutorials don't have this issue.
@FeddFGC 3 years ago
Go 720p or higher; that should do the trick. It's already perfectly readable at 720p.
@amitozazad1584 3 years ago
@FeddFGC I second this; it works at high resolution.