SAM - Segment Anything Model by Meta AI: Complete Guide | Python Setup & Applications

69,708 views

1 day ago

Comments: 133

@dloperab 1 year ago

Great video... thanks Piotr and Roboflow for all the great videos you generate. I am resuming my interest in CV thanks to you!

@Roboflow 1 year ago

This is big! If I managed to convince you even a little bit I am proud of myself.

@SS-zq5sc 1 year ago

This was a great explanation and so was your blog entry. You gained another subscriber today. Thank you!

@Roboflow 1 year ago

Hi, it is Peter from the video :) That's awesome to hear! Thanks a lot!

@deeplearningexplained 25 days ago

Really solid video, loved the intro haha

@anestiskastellos4150 1 year ago

Very nice video. Next video -> Grounded Segment Anything !! 👏

@Roboflow 1 year ago

🚨 SPOILER ALERT: That's the plan!

@abdshomad 1 year ago

Second that! +1

@Roboflow 1 year ago

@abdshomad I think we should have something by Friday/Monday.

@user-rm9ml1th8h 11 months ago

Nice video! I have one question: can you please suggest which is better for real-time detection in terms of accuracy, "SAM" or "YOLOv6-v3"? My requirement is to detect car parts (e.g. a Michelin tire). Thank you in advance.

@Roboflow 11 months ago

If you want to run in real time, then you can't use SAM. It will be too slow.

@user-rm9ml1th8h 11 months ago

@Roboflow Thank you very much for your quick response! Our specific requirement is to detect car parts (e.g. wheel type: alloy wheels or not, a specific accessory, etc.) after a captured image (taken with a mobile camera) has been uploaded. Can you please suggest the best algorithm based on your vast experience in this area? Do you recommend YOLOv6-v3, GroundingDINO, or something else? Tons of thanks again in advance!

@mnx64 1 year ago

You said it’s real time ready - but what IS real time ready is only the prediction for various prompts on the *same* image. Generating the embeddings for each new image is actually really slow (multiple seconds depending on image size and hardware) and cannot be done in browser or in real time. This makes it less useful for live/video analysis, of course, but it’s still great to generate segmentation mask training sets! Thanks for the video

@Roboflow 1 year ago

Hi 👋🏻 It's Peter from the video. You are right; I really shouldn't have said that because it is confusing. The decoding part is very fast and can be executed in real time, but the encoding is quite slow. It can be faster if you use version B instead of version H, but still… I wish I had been more precise. Apologies for that.
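The slow-encoder, fast-decoder split described above can be sketched as follows. This is a minimal sketch, assuming a predictor object that behaves like `SamPredictor` from the `segment_anything` package, where `set_image` computes the image embedding and `predict` decodes a mask for a prompt. The helper name `masks_for_boxes` is hypothetical.

```python
from typing import Any, List, Sequence


def masks_for_boxes(predictor: Any, image: Any, boxes: Sequence[Any]) -> List[Any]:
    """Run the slow image encoder once, then the fast decoder once per box.

    `predictor` is assumed to expose `set_image` and `predict` the way
    segment_anything's SamPredictor does; this helper is illustrative only.
    """
    # Slow step: compute the image embedding a single time for this image.
    predictor.set_image(image)

    results = []
    for box in boxes:
        # Fast step: decode one mask for this prompt from the cached embedding.
        masks, _scores, _low_res = predictor.predict(box=box, multimask_output=False)
        results.append(masks[0])
    return results
```

With this shape, the per-image cost is paid once, so trying many prompts on the same image stays cheap, while every new image (for example, every video frame) pays the encoder cost again.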

@mattizzle81 1 year ago

There is a fully ONNX-optimized and quantized model out there that is faster. Still not ultra fast, but at least 1 FPS on my RTX 2080 Ti, which is not bad. Semi real-time.

@Roboflow 1 year ago

@mattizzle81 Thanks for that info, I was actually not aware of that… but I still think it would be much cleaner without that sentence in the vid :)

@froukehermens2176 9 months ago

One use case is the annotation of eye tracking data. Per video frame one would like to annotate whether a person is looking at other people or objects in the environment. One could use YOLO and bounding boxes, but these are less precise than regions.

@samzhu6728 10 months ago

Thanks for the wonderful video! Is it possible to annotate specific objects (with labels) in a few frames of a video (fixed perspective) and keep tracking those objects in the entire video?

@willemweertman1178 1 year ago

You should try it with data taken in an underwater marine context. Lots of models struggle with that.

@Roboflow 1 year ago

Cool idea! 💡 I'm working on the next vid, and I'll try to take that into consideration.

@ish694 1 year ago

Trying to work on the same use case with coral segmentation

@diyorpardaev 6 months ago

It is really the best video ever :) I am making a great project using sv.

@Roboflow 6 months ago

This is so kind! Thank you very much!

@JenishaThankaraj 1 year ago

Can we annotate polygon shapes instead of rectangles using SAM?

@kobic8 1 year ago

Thanks so much for the clear video! Are you also planning to integrate it with some tools to get an output that includes labels for each mask?

@Roboflow 1 year ago

We already did. Take a look here: colab.research.google.com/github/roboflow-ai/notebooks/blob/main/notebooks/automated-dataset-annotation-and-evaluation-with-grounding-dino-and-sam.ipynb

@gbo10001 1 year ago

wow that's really great waited for that...

@Roboflow 1 year ago

I’m super happy you like it!

@sanketgarg6910 1 month ago

Hey, great explanation! I have a question: can we do multiple segmentations within a bounding box, for example different layers of liquid in a vial's bounding box? If yes, can you explain how? Thanks!

@geniusxbyofejiroagbaduta8665 1 year ago

I can't wait to see how it can be used for annotations

@Roboflow 1 year ago

Stay tuned for the RF update ;) We also plan to drop one more video, probably Friday/Monday, where we will dive deep into auto annotation in Colab.

@Roboflow 1 year ago

We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!

@mithilanavishka4531 5 months ago

Hi, I am in the process of learning the SAM model by following your video; it is very helpful. I am planning to use this model to segment characters in historical documents. To your knowledge, will that be possible, or is it a waste of time?

@Roboflow 5 months ago

SAM is not really good at document segmentation

@badrinarayanan686 4 months ago

Great video!! I do have a question. How do we use MaskAnnotator to annotate only one specific mask instead of the entire set of masks in sam_result?
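One way to approach the question above is to filter `sam_result` before it reaches the annotator. This is a minimal sketch, assuming `sam_result` is the list of dictionaries returned by `SamAutomaticMaskGenerator.generate` in the `segment_anything` package, where each entry carries keys such as `segmentation`, `area`, and `predicted_iou`. The helper names are hypothetical.

```python
from typing import Any, Dict, List


def largest_mask(sam_result: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Return the single entry that covers the most pixels."""
    if not sam_result:
        raise ValueError("sam_result is empty")
    return max(sam_result, key=lambda entry: entry["area"])


def masks_above_iou(sam_result: List[Dict[str, Any]], min_iou: float) -> List[Dict[str, Any]]:
    """Keep only entries whose predicted IoU is at least `min_iou`."""
    return [entry for entry in sam_result if entry["predicted_iou"] >= min_iou]
```

The filtered list (for example `[largest_mask(sam_result)]`) can then be passed on in place of the full `sam_result`, so only the chosen mask is drawn.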

@VenyushTandon-k7e 1 year ago

How can I tag objects in a picture using SAM? For example, in the picture that was used in the video (a man holding a dog), I want to identify all the objects in the picture, like the man, the dog, the building, etc.

@Roboflow 1 year ago

Stay tuned for our video tomorrow. 🔥 I’m going to show how to auto annotate images with Grounding DINO and SAM

@drm8164 1 year ago

Help please, I need to learn computer vision, but I struggle a lot. Is the OpenCV certificate worth it? It's around 1200 US dollars. Thanks

@Roboflow 1 year ago

Take a look here: github.com/SkalskiP/courses. In general, the Internet is full of free resources. It is not worth paying 1200 USD for a course like that.

@GohOnLeeds 1 year ago

Since SAM is trained on photos, any idea how well it does on synthetic images, like artwork or games? Cheers

@Roboflow 1 year ago

Great question. Unfortunately I didn’t experiment with those :/ sorry

@SweetShotGaming 1 year ago

Is there any way you can create an auto-labeler using SAM? (SAM would take care of everything with no human intervention). My specific need would be to label lane markings, but for an entire dataset of raw images.

@SweetShotGaming 1 year ago

Forgot to mention, great video! Is there any functionality in SAM where you can give it a few examples of what the label is, and then it will infer the labels for the rest of the dataset? Thanks!

@Roboflow 1 year ago

Yes! Stay tuned to our next vid. We are doing full auto dataset generation and generation of masks from boxes. Should be on Monday.

@Aziz-bg4ph 1 year ago

How can I extract the segmented object produced by SAM?

@Roboflow 1 year ago

You can find masks inside `sv.Detections` object. `detections.masks`

@darinkumarnsit4780 1 year ago

@Roboflow Could you show me how to use 'detections.masks', please? I tried to use it and got AttributeError: 'Detections' object has no attribute 'masks'
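The `AttributeError` above is consistent with the attribute on `sv.Detections` being named `mask` (singular) rather than `masks`. This is a minimal sketch, assuming a detections object whose `mask` attribute is either `None` or a stack of per-object boolean masks. The helper name `iter_masks` is hypothetical.

```python
from typing import Any, Iterator, Tuple


def iter_masks(detections: Any) -> Iterator[Tuple[int, Any]]:
    """Yield (index, mask) pairs from a Detections-like object."""
    # Note the attribute name: `mask`, not `masks`.
    masks = getattr(detections, "mask", None)
    if masks is None:
        return  # box-only detections carry no masks
    for index, mask in enumerate(masks):
        yield index, mask
```

Each yielded mask can then be used to cut the matching object out of the image or to compute its area.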

@TheArkLade 1 year ago

For [Single Image Bounding Box to Mask] what should be changed if we have more than 1 class and want to see detection for all classes?

@Roboflow 1 year ago

Hi it is Peter from the video 👋🏻 So you want to have multiple boxes converted into multiple masks?

@TheArkLade 1 year ago

@Roboflow Yes. So, a two-part question: (1) The current script returns one mask at a time. How can I change it so that it returns all detected masks? (2) Let's suppose I have 5 classes. What should I do so that all detections for all 5 classes are shown?
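Both parts of the question above come down to looping over every box instead of taking only the first one. This is a minimal sketch, assuming a predictor that behaves like `SamPredictor` from `segment_anything`, where `predict` returns masks together with their quality scores. The helper name `segment_labeled_boxes` and the idea of carrying a class id next to each box are illustrative assumptions.

```python
from typing import Any, List, Sequence, Tuple


def segment_labeled_boxes(
    predictor: Any,
    image: Any,
    boxes: Sequence[Any],
    class_ids: Sequence[int],
) -> List[Tuple[int, Any]]:
    """Return one (class_id, mask) pair per input box."""
    if len(boxes) != len(class_ids):
        raise ValueError("boxes and class_ids must have the same length")

    predictor.set_image(image)  # embed the image once for all boxes

    labeled = []
    for box, class_id in zip(boxes, class_ids):
        # Ask for several candidate masks and keep the best-scoring one.
        masks, scores, _ = predictor.predict(box=box, multimask_output=True)
        best = max(range(len(scores)), key=lambda i: scores[i])
        labeled.append((class_id, masks[best]))
    return labeled
```

Because the class id travels with each mask, detections for all classes can be drawn or exported together afterwards.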

@javier_medel 1 year ago

Great video. Do you think that you can share the Jupyter notebook?

@Roboflow 1 year ago

It is linked in description ;) all our demo notebooks are open sourced

@alaaalmazroey3226 6 months ago

Does SAM segment all objects in the scene very well when there is an occlusion?

@unknown-wm9ru 1 year ago

This still doesn't work in live action, does it? Like if I connected it to a camera or a VR headset like the Meta Quest Pro / Pico 4 and used their cameras for AR powered by SAM. That would definitely be awesome!

@Roboflow 1 year ago

You should get a few fps. But if you want 30 fps, then we are not there yet.

@unknown-wm9ru 1 year ago

@Roboflow Hmm, I see, but the fact that it's there already is awesome in itself! The future is here and it's really exciting. I love it

@Roboflow 1 year ago

@unknown-wm9ru True that!

@MrJesOP 1 year ago

First of all, thank you so much for the content, amazing contribution to the community! I wonder if it is possible to implement the negative point prompt in the SAM model similarly as it can be done in the website, where you can choose several points belonging to the object that you are interested in as well as points that do not belong to it... Some help would be amazing!! Thanks in advance!!

@Roboflow 1 year ago

Hi, thanks a lot for those kind words :) As for your question, "implement the negative point prompt": I was looking for any project that implements that functionality, and I didn't find anything :/
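For the negative point prompt discussed above, the `segment_anything` predictor itself may already cover it: `SamPredictor.predict` accepts `point_coords` and `point_labels`, where label 1 marks a foreground point and label 0 a background point. This is a minimal sketch of building those two inputs. The helper name `build_point_prompt` is hypothetical, and the lists would need wrapping in NumPy arrays before being passed to the predictor.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def build_point_prompt(
    foreground: Sequence[Point],
    background: Sequence[Point],
) -> Tuple[List[List[float]], List[int]]:
    """Return (point_coords, point_labels) with 1 = foreground, 0 = background."""
    coords = [list(point) for point in foreground] + [list(point) for point in background]
    labels = [1] * len(foreground) + [0] * len(background)
    return coords, labels
```

The two lists stay aligned by position, so the i-th label always describes the i-th (x, y) point.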

@EkaterinaGolubeva-pr9ih 1 year ago

Thank you! Can SAM handle 3D images? Any advice on how to approach it?

@anandhsaspect4560 1 year ago

Great. Thanks.

@Roboflow 1 year ago

Thanks a lot 🙏

@kgylsd 1 year ago

Please let us know anytime when the SAM/ Roboflow integration is accomplished 😊

@Roboflow 1 year ago

It will for sure be part of our weekly newsletter! ;)

@Roboflow 1 year ago

We have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!

@ruiteixeira2324 1 year ago

Very nice work. How do you see SAM being used in practice? Do you see it as a model to be integrated into a tool that generates training data for your task, or as your final model for a certain task?

@Roboflow 1 year ago

Hi 👋! It is Peter from the video. I think we will see broad use of SAM in image and video editors. But I also think it will be the default feature in all major annotation tools. It is a bit too slow for real-time usage. But we will transfer the knowledge it provides into datasets that we use for training real-time models. What is your prediction?

@ruiteixeira2324 1 year ago

@Roboflow Yes, I totally agree with you that it will be the default tool to annotate data. Since you think it is too slow, what is, in your opinion, the current state-of-the-art model for semantic segmentation in real-time applications?

@Roboflow 1 year ago

@ruiteixeira2324 Hahaha, hard question. According to Papers with Code, that would be the latest version of YOLOv6.

@Roboflow 1 year ago

Hi Rui - we have released a feature enabling you to use SAM in Roboflow to label images as you mentioned: blog.roboflow.com/label-data-segment-anything-model-sam/ Let us know what you think!

@Xfunfunny8605k 1 month ago

Please make a video using a custom dataset.

@BabasahebMane-d9v 1 year ago

Is it possible to get a segmented image without passing its bounding box?

@Roboflow 1 year ago

You don't need to pass a box. If you don't pass any prompt, the whole image gets segmented.

@ifeanyianene6770 1 year ago

Thanks so much for this video!! Is there another way to draw the bounding box (in, like, a single Python file where you just run your main function) that doesn't require Jupyter widgets? Oh, btw, liked and subscribed. You guys are awesome!

@Roboflow 1 year ago

Thanks for the like and sub ;) As for your question: that was the only interactive way that I could come up with. But if you don't want to do it in an interactive way, then you have plenty of options.

@shaneable1 1 year ago

Great video! Thank you! What hardware are you running this on?

@mohammedinnat1247 1 year ago

Nice. Thanks.

@Roboflow 1 year ago

Thank you! 🙏

@DTM6559 9 months ago

How can I train with different colored masks rather than black-and-white masks?

@iflag9775 1 year ago

Great! Could you give a talk on the possibility of object detection with SAM?

@Roboflow 1 year ago

Hi 👋🏻 Could you please explain a bit more what you mean? It is a segmentation model. Would you like to convert masks into boxes?
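Converting a mask into a box, as suggested above, amounts to finding the extreme rows and columns that contain mask pixels. This is a minimal dependency-free sketch in which nested lists of booleans stand in for the array a segmentation model would return. The helper name `mask_bounds` is hypothetical, and in practice a library routine such as the `sv.mask_to_xyxy` call quoted elsewhere in this thread does the same job on arrays.

```python
from typing import Optional, Sequence, Tuple


def mask_bounds(mask: Sequence[Sequence[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (x_min, y_min, x_max, y_max) of the True pixels, or None if empty."""
    # Rows that contain at least one mask pixel give the vertical extent.
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not ys:
        return None
    # Column indices of every mask pixel give the horizontal extent.
    xs = [x for row in mask for x, value in enumerate(row) if value]
    return min(xs), ys[0], max(xs), ys[-1]
```

The returned coordinates are inclusive pixel indices in xyxy order, which is the box layout detection models usually work with.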

@EkaterinaGolubeva-pr9ih 1 year ago

Can you make a video on MedLSAM (medical localize and segment anything model)?

@jamtomorrow457 1 year ago

Hi thanks for the great tutorial! How can I download the masks created using SAM and upload them into roboflow?

@welingtongoncalvessilva2498 12 days ago

I got this error:

AttributeError Traceback (most recent call last)
in ()
----> 1 box_annotator = sv.BoxAnnotator(color=sv.Color.red())
      2 mask_annotator = sv.MaskAnnotator(color=sv.Color.red(), color_lookup=sv.ColorLookup.INDEX)
      3
      4 detections = sv.Detections(
      5     xyxy=sv.mask_to_xyxy(masks=masks),

AttributeError: type object 'Color' has no attribute 'red'
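The error above is consistent with the installed `supervision` release no longer offering the `Color.red()` method. This is a minimal sketch of a version-tolerant lookup, assuming (not confirmed in this thread) that newer releases expose the color as a constant named `Color.RED` while older ones expose it as `Color.red()`. The helper name `red_color` is hypothetical.

```python
from typing import Any


def red_color(color_cls: Any) -> Any:
    """Return a red color from either the constant-style or method-style API."""
    constant = getattr(color_cls, "RED", None)
    if constant is not None:
        return constant      # assumed newer style: Color.RED
    return color_cls.red()   # older style, as used in the traceback: Color.red()
```

Under that assumption, `red_color(sv.Color)` could be passed wherever the notebook wrote `sv.Color.red()`. Checking the release notes of the installed version remains the reliable way to confirm the current name.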

@Mnephisto 12 days ago

I'll look into it. Check back in a few hours!

@Roboflow 11 days ago

Thanks a lot Linus!

@imimiliades629 4 months ago

Hello! Can anybody explain how I can evaluate this model after training? What commands can I run?

@tomtouma 1 year ago

Can it be used to segment and label objects in a video stream from a live camera? I've been reading a lot of the feedback and people are saying it's computationally heavy and will run too slow at a meaningful refresh rate. I noticed the Meta advertisement was doing it in realtime and labeling and tracking stuff. What are your thoughts on this? Is it easier to stick to OpenCV/Yolo for a live video feed?

@Roboflow 1 year ago

> I noticed the Meta advertisement was doing it in realtime

Could you point me to the resource?

@tomtouma 1 year ago

@Roboflow At about 0:06 in your video. Looks like real time?

@Roboflow 1 year ago

@tomtouma I wish! That was just a lot of work online to produce it. It is not real time.

@tomtouma 1 year ago

@Roboflow Could you explain what you mean by "work online"? Do you mean the Meta team recorded a video and then post-processed it offline? Also, is there a way to pass video frames (even if it is very slow, like 1 Hz or slower) to SAM and have it segment them? I then want to run some Python scripts to get geometry information about the segmented masks.
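Feeding a video to a slow model at roughly 1 Hz, as asked above, mostly means deciding which frames to skip. This is a minimal sketch of that selection step only. The helper name `frames_to_process` is hypothetical, and reading the frames and running the segmentation are left to whatever video and model code is already in use.

```python
from typing import List


def frames_to_process(total_frames: int, video_fps: float, sample_hz: float) -> List[int]:
    """Return the indices of the frames to segment at about `sample_hz` per second."""
    if video_fps <= 0 or sample_hz <= 0:
        raise ValueError("video_fps and sample_hz must be positive")
    # Keep every `step`-th frame; never skip below one frame per step.
    step = max(1, round(video_fps / sample_hz))
    return list(range(0, total_frames, step))
```

For a 30 fps clip sampled at 1 Hz this keeps every 30th frame, so the segmentation only runs once per second of footage.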

@tomtouma 1 year ago

Just thought I'd bump this.

@shamukshi 1 year ago

Do you do freelancing? My ACADEMIC project is "solar panel detection and counting using SAM."

@Roboflow 1 year ago

Nope. We do not do freelancing. :/

@chinnagadilinga5742 1 year ago

Hi sir, I'm a beginner. I saw your computer vision videos, but they are all mixed together. Could you please arrange the videos in order, one by one, so that we can follow them easily? Thank you.

@Roboflow 1 year ago

Hi :) Are you only interested in auto annotation videos? Or all of them?

@lorenzoleongutierrez7927 1 year ago

Great!

@user-ld8lc4ex4m 1 year ago

Thank you so much

@유영재-c9c 1 year ago

How can I do semantic segmentation labeling using SAM?

@cedricvillani8502 1 year ago

So how many SAMs Dics are you expecting to come in and out of your model? You just seem to really enjoy SAMs Dics, but I suppose using research from Michael J Black, about 10 years ago, and it makes sense why you really enjoy utilizing SAMs Dics

@MisterWealth 1 year ago

How do we switch it out for our own videos?

@saharabdulalim 1 year ago

Thank you for that ❤ I have a question: when I annotate more than one object in an image and pass it to SAM, it only masks one object, like the brain cell in your project, and doesn't mask the other objects in the same image. I use one class for all of them.

@saharabdulalim 1 year ago

I think it's because of xyxy[0], but what if I want to get masks for multiple objects?

@Roboflow 1 year ago

@saharabdulalim I'm actually working on the next vid right now, and we will cover more general auto-annotation use cases. The vid should be out tomorrow.

@itayhilel2168 1 year ago

Let's do this on a superstore dataset

@Roboflow 1 year ago

send a link :D I'll take a look

@cedricvillani8502 1 year ago

Seriously, you don't need to pay these people for API garbage. This is old; look up panoptic segmentation. Stop giving these people your money and your data. That's all an API is: an application programming interface. In other words, you're just paying them when you don't need to.

@yatinarora9650 1 year ago

Great video. Please create a video explaining how SAM works internally.

@husseinali-yx7uf 1 year ago

Thanks for your video. I have learnt a lot from you, but this time, whenever I try to follow your steps, I run into this error:

OutOfMemoryError: CUDA out of memory. Tried to allocate 14.40 GiB (GPU 0; 15.90 GiB total capacity; 6.53 GiB already allocated; 7.95 GiB free; 7.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@Roboflow 1 year ago

Is that happening in the notebook?

@saharabdulalim 1 year ago

try using parallel gpu

@husseinali-yx7uf 1 year ago

@Roboflow Yes, on both Google Colab and Kaggle.

@husseinali-yx7uf 1 year ago

@saharabdulalim How do I do that?

@Roboflow 1 year ago

@husseinali-yx7uf Is that happening with your own image?
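If the out-of-memory error in this thread does come from a very large custom image, which is an assumption and not a diagnosis, one common mitigation is to shrink the image before segmenting it. This is a minimal sketch that only computes the target size. The helper name `fit_longest_side` and the default limit of 1024 pixels are illustrative choices.

```python
from typing import Tuple


def fit_longest_side(width: int, height: int, max_side: int = 1024) -> Tuple[int, int]:
    """Return (width, height) scaled down so the longest side is at most `max_side`."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave untouched
    scale = max_side / longest
    # Round to whole pixels and keep each side at least one pixel long.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting size can be handed to any image resize routine. Masks produced on the smaller image then need scaling back up if they must match the original resolution.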

@lalamax3d 1 year ago

The first tutorial using SAM should be the mask of yourself that you showed at 0:36 of this video. But thanks anyways.

@Roboflow 1 year ago

Just convince my boss to do it and you’ll have it ;)

@lalamax3d 1 year ago

@Roboflow I like your boss. Now, when you have time for that, (imho) please keep a few pointers in mind:
a- do it with an image sequence, a video, or both
b- show a progress bar / status (tqdm) / queue
c- give the area/prompt in one go
d- how to achieve consistency, etc., if the subject moves out of frame or something comes in front of it (if possible)

@swipeshark5311 7 months ago

The code doesn't work, dislike.

@fintech1378 1 year ago

How can this model be used to detect the type and quantity of inventory in a shop?

@Roboflow 1 year ago

Hi it is Peter from the video. I think that if you look for type and quantity of inventory in a shop you will be much better off with using detection models like YOLOv8 or YOLO-NAS.

@aarontan5434 11 months ago

TypeError Traceback (most recent call last)
in ()
----> 1 mask_annotator = sv.MaskAnnotator(color_map='index')

TypeError: MaskAnnotator.__init__() got an unexpected keyword argument 'color_map'

I got this error. May I know what went wrong?

@Roboflow 11 months ago

Hi! I just fixed the notebook. Feel free to try it. :)

@karlvanbree4882 10 months ago

guess he meant: sv.MaskAnnotator(color_lookup = "index")