Asking the model how confident it is in the label (6:40) isn't necessarily a valid way to assess its confidence, unless they've given the language component access to the logits of the detection model. If it was trained multimodally, it doesn't necessarily know this either; it could basically be repeating what it learned from image-text pairings about a person's confidence, not reporting its own (or maybe they were able to elicit this capability with RLHF?). Is there any calibration study out there showing that large multimodal language models can assess their own confidence on images in a way that lines up with their accuracy?
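A minimal sketch of the calibration check the commenter is asking about: collect the model's stated confidence and whether its answer was correct, then measure how far stated confidence drifts from empirical accuracy (expected calibration error). The `results` data here is hypothetical example input, not from any particular model or benchmark.

```python
import numpy as np

# Hypothetical (stated_confidence in [0, 1], answer_was_correct) pairs,
# e.g. gathered by asking a multimodal model to label images and report confidence.
results = [
    (0.95, True), (0.90, True), (0.85, False), (0.70, True),
    (0.60, False), (0.99, True), (0.50, False), (0.80, True),
]

confidences = np.array([c for c, _ in results])
correct = np.array([ok for _, ok in results], dtype=float)

def expected_calibration_error(conf, acc, n_bins=5):
    """Bucket answers by stated confidence and compare the mean confidence
    in each bucket with the empirical accuracy there; a well-calibrated
    model is right about p% of the time when it claims p% confidence."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            gap = abs(conf[mask].mean() - acc[mask].mean())
            ece += (mask.sum() / len(conf)) * gap
    return ece

print(f"accuracy: {correct.mean():.2f}")
print(f"mean stated confidence: {confidences.mean():.2f}")
print(f"ECE: {expected_calibration_error(confidences, correct):.3f}")
```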
@msstateman2003 (a month ago)
Until AI can autonomously flatten its input-to-output structure, creating new classifications and agents, it is not learning. The issue with most probabilistic models is at the boundary of conflating space and time (kurtosis inference). Shortest path may not be useful for modeling reality if the model gradient is greater than 9 state-space parameters.
@AlgoNudger (a month ago)
Thanks.
@deeplearningpartnership (a month ago)
o1 can do all that, so long as it's trained on the right data sets. In other words, it can reason about its world.
@mzlittle (a month ago)
You don't have to take over every 13 miles with Tesla FSD.
@harriehausenman8623 (a month ago)
I still don't understand why everyone assumes humans are intelligent. How would WE know?! Maybe to a truly intelligent being we are like apes-with-a-stick: cute, interesting, definitely on the right path, but not 'really' intelligent. 😉
@rkara2 (a month ago)
You have a very isolated, linear point of view about intelligence. In reality, people or beings cooperate with one another and with their environment, so where are your models for that?