So is it basically a variant of a pixel attack? Or an attack based on adding noise; I forget the corresponding name. Thank you for all your work on this channel, you are doing a great service to the community.
@JoaoVitor-mf8iq 4 years ago
33:45 Inducing correlated weights might be useful for a form of distillation, since you could read off the main characteristics of the "professor neural network" and induce those correlations and weight distributions in the "student".
@EditorsCanPlay 4 years ago
This paper is such an AI troll I love it!
@julespoon2884 4 years ago
If the cosine difference between unmarked and marked data in the same class is significant, it should be just as easy to tell the two apart in the black-box test by comparing cosine differences between the feature vectors of samples within the same class. Or, say, take all pairwise differences between the feature vectors of samples of the same class and run a PCA or something; we should expect one of the eigenvectors to be a signature of the marked data. For thwarting the defence to be worthwhile, though, the effort to thwart it has to be less than the benefit of using said dataset. Given that you have to train a decent model just to detect whether the data has been marked in the first place, I'd say this defence is effective? Somewhat? EDIT: Ooh, your suggestion does make it way more sneaky.
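For concreteness, here is a rough sketch (my own illustration, not from the video or paper) of the pairwise-difference PCA probe described above, assuming feature vectors for the samples of one class are available as a NumPy array:

```python
import numpy as np

def suspect_directions(feats, n_components=5):
    """feats: (n, d) feature vectors of one class. Returns the strongest principal
    directions of the within-class pairwise differences; if a subset of the class is
    marked, one of these directions should roughly align with the carrier."""
    n, d = feats.shape
    diffs = (feats[:, None, :] - feats[None, :, :]).reshape(n * n, d)
    diffs -= diffs.mean(axis=0)
    _, sing_vals, vt = np.linalg.svd(diffs, full_matrices=False)
    return sing_vals[:n_components], vt[:n_components]
```

Whether the top component actually stands out would depend on the fraction of marked samples and the strength of the mark.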
@YannicKilcher 4 years ago
There are certainly many defenses that would work here, yours included.
@herp_derpingson 4 years ago
You already talked about everything I wanted to say. Nice.
@ulm287 4 years ago
What happens if you use a bigger model on the radioactive data? Or just a different architecture? Shouldn't that break the whole thing, assuming a different architecture learns different features, e.g. an FFN vs. a CNN?
@emuccino 4 years ago
I was confused too. But I think you just need to feed the model a sample that only contains the radioactive feature for a particular class and see if it tends to classify it as such.
@YannicKilcher 4 years ago
Not necessarily, because adversarial examples are known to generalize across datasets and architectures.
@zihangzou5802 3 years ago
@YannicKilcher But how can you compute the cosine similarity when the feature sizes are different? The transformation M would not be d×d. In that case, do you need to train a model with the same architecture and find out?
@zihangzou5802 3 years ago
And in that case, you cannot guarantee that your training process matches the other trainer's, given the prior assumption that the training process is unknown.
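One way to handle mismatched feature dimensions, in the spirit of the linear mapping M mentioned above, is to fit a rectangular map by least squares on features of the same held-out images and then compare carriers in the suspect model's space. A minimal sketch (my own, not the paper's code; `phi0`, `phi1`, the carrier `u`, and the suspect's classifier weights `w_c` are assumed given):

```python
import numpy as np

def fit_alignment(phi0, phi1):
    """phi0: (n, d0) features from the marking network, phi1: (n, d1) features from
    the suspect network, computed on the same held-out images. Least squares gives a
    rectangular map M of shape (d1, d0) with phi0 @ M.T ~ phi1."""
    Mt, *_ = np.linalg.lstsq(phi0, phi1, rcond=None)
    return Mt.T

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage: map the d0-dimensional carrier u into the suspect's feature
# space and check its alignment with that model's classifier weights w_c (d1-dim).
# M = fit_alignment(phi0, phi1)
# score = cosine(M @ u, w_c)
```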
@simongiebenhain3816 4 years ago
I didn't really get your idea at the end. If your alterations to the data are not bound to a specific class, how would you force the network to pay attention to the alterations?
@YannicKilcher 4 years ago
Because the features themselves would be bound to specific combinations of classes.
@zihangzou5802 3 years ago
Can you explain how to read Figure 4 and Figure 5? And since alignment is mentioned throughout the paper, why not use the angle between the translation vector (\phi_0(x) - \phi_t(x)) and u to determine whether the marked data were used? What is the benefit of bringing in the beta distribution?
@zihangzou5802 3 years ago
And what do the x-axis and y-axis represent in those figures?
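On the beta-distribution question: under the null hypothesis that the model never saw marked data, the relevant direction should be random with respect to the carrier u, and the squared cosine between a fixed direction and a uniformly random unit vector in R^d follows Beta(1/2, (d-1)/2). That is what turns an observed cosine into a p-value. A small sketch of that computation (my own illustration, not code from the paper):

```python
import numpy as np
from scipy.stats import beta

def cosine_pvalue(c, d):
    """One-sided p-value P(cos >= c) for a uniformly random unit vector in R^d against
    a fixed direction: the squared cosine is Beta(1/2, (d-1)/2)-distributed and the
    cosine itself is symmetric around 0. Small values mean the observed alignment is
    unlikely to be chance."""
    c = float(np.clip(c, -1.0, 1.0))
    tail = 0.5 * beta.sf(c ** 2, 0.5, (d - 1) / 2.0)
    return tail if c >= 0 else 1.0 - tail

# e.g. cosine_pvalue(0.2, 512) is tiny, while cosine_pvalue(0.2, 10) is around 0.3
```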
@tarunpaparaju5382 4 years ago
First comment and first like. Finally 💥🔥 (great video!)
@neur303 4 years ago
Is there a difference from the concept of watermarking?
@drdca8263 4 years ago
This seems like it is supposed to be like watermarking pictures, in that it allows you to demonstrate that a network was trained on the data you marked (analogous to demonstrating that a picture someone used was watermarked by you, by pointing at the watermark), but different in that, without knowledge of how it was marked, one can't tell whether it was marked? Or, wait, is watermarking already an established idea in the context of training data?
@YannicKilcher 4 years ago
Watermarking tags the datapoint; this tags the model trained on the datapoint.
@sunnyguha2 4 years ago
It's always Alice and Bob!
@GilaadE 4 years ago
imgs.xkcd.com/comics/protocol.png
@Markste-in 4 years ago
Doesn't this just mean that we intentionally add some bias towards a certain class in the data? (Something that we actually want to avoid?)
@YannicKilcher 4 years ago
In a way, yes.
@sacramentofwilderness6656 4 years ago
Well, what I feel uneasy about is the assumption that the feature extractors are related by a simple linear transformation. I may be wrong, but there was a video on your channel showing that even different initializations of a neural network with the same architecture can lead to drastically different results after training, ending up in completely different regions of the weight space. And with a different architecture, the internal behaviour and feature extraction would seem to have little in common with the setup trained by those who want to protect their data.
@YannicKilcher 4 years ago
Just because the position in weight space is different doesn't mean that different features are learned, but still a good observation!
@mgostIH 4 years ago
My network when it trains on Russian data
@alexanderchebykin6448 4 years ago
I'm highly skeptical about this whole data-marking idea: whatever you do, it needs to be invisible to the eye, i.e. small. And if it's small, it'll surely disappear after converting the image to JPEG, blurring it, or applying some other slight modification. To me it seems downright impossible to get around this problem.
@YannicKilcher 4 years ago
That's true, but as you deteriorate the images to defend yourself, you will also make your classifier less accurate.
@tubui9389 1 year ago
I thought the same. Deteriorating the data would not only help defend against this kind of membership-inference attack, it could also make the classifier more robust to noise. I wish the authors had explored the effect of more data augmentations on attack performance, beyond just crop and resize. As for getting around this problem, the watermark needs to be robust to noise at marking time; hence Eq. 7 in the paper should take that into account.
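If one wanted marks that survive such augmentations, one option is an expectation-over-transformations style marking loop: apply random crops and blur inside the optimization so the perturbation is only rewarded for aligning with the carrier when it survives the transforms. A rough sketch of that idea (my own illustration, not the paper's Eq. 7; `feature_extractor` and the unit carrier `u` are assumed given):

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def mark_image(x, u, feature_extractor, steps=200, lr=0.01, eps=8 / 255):
    """x: image tensor (1, C, H, W) in [0, 1]; u: unit carrier in feature space (d,).
    Optimizes a small perturbation whose features align with u even under random
    crops and blur, so the mark is less likely to wash out from mild processing."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    augment = T.Compose([
        T.RandomResizedCrop(x.shape[-1], scale=(0.7, 1.0)),
        T.GaussianBlur(kernel_size=3),
    ])
    for _ in range(steps):
        x_marked = (x + delta).clamp(0, 1)
        feats = feature_extractor(augment(x_marked))      # new random transform each step
        loss = -F.cosine_similarity(feats, u.unsqueeze(0)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                       # keep the mark visually small
    return (x + delta).clamp(0, 1).detach()
```

A fuller version would presumably also penalize distance from the original image in feature space, so the marked data stays useful for training.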