So is it basically a variant of a pixel attack? Or an attack based on adding noise; I forget the corresponding name. Thank you for all your work on this channel, you are doing a great service to the community.
@JoaoVitor-mf8iq 4 years ago
33:45 Inducing correlated weights might be useful for a form of distillation, since you could read off the main characteristics of the "professor neural network" and induce those correlations and weight distributions in the "student".
@EditorsCanPlay 4 years ago
This paper is such an AI troll I love it!
@julespoon2884 4 years ago
If the cosine difference between unmarked and marked data in the same class is significant, it should be just as easy to tell the two apart in the black-box test by comparing cosine differences between the feature vectors of samples within the same class. Or, say, take all pairwise differences between the feature vectors of samples of the same class and run a PCA or something; we should expect one of the eigenvectors to be a signature of the marked data. For thwarting the defence to be worthwhile, though, the effort to thwart it has to be less than the benefit of using said dataset. Given that you have to train a decent model just to detect whether the data has been marked in the first place, I'd say this defence is effective? Somewhat? EDIT: Ooh, your suggestion does make it way more sneaky.
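For concreteness, here is a rough sketch (my own illustration, not from the video or paper) of the pairwise-difference PCA probe described above, assuming feature vectors for the samples of one class are available as a NumPy array:

```python
import numpy as np

def suspect_directions(feats, n_components=5):
    """feats: (n, d) feature vectors of one class. Returns the strongest principal
    directions of the within-class pairwise differences; if a subset of the class is
    marked, one of these directions should roughly align with the carrier."""
    n, d = feats.shape
    diffs = (feats[:, None, :] - feats[None, :, :]).reshape(n * n, d)
    diffs -= diffs.mean(axis=0)
    _, sing_vals, vt = np.linalg.svd(diffs, full_matrices=False)
    return sing_vals[:n_components], vt[:n_components]
```

Whether the top component actually stands out would depend on the fraction of marked samples and the strength of the mark.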
@YannicKilcher 4 years ago
There are certainly many defenses that would work here, yours included.
@herp_derpingson 4 years ago
You already talked about everything I wanted to say. Nice.
@ulm287 4 years ago
What happens if you use a bigger model on the radioactive data? Or just a different architecture? Shouldn't that break the whole thing, assuming a different architecture learns different features, e.g. an FFN vs. a CNN?
@emuccino 4 years ago
I was confused too. But I think you just need to feed the model a sample that only contains the radioactive feature for a particular class and see if it tends to classify it as such.
@YannicKilcher 4 years ago
Not necessarily, because adversarial examples are known to generalize across datasets and architectures.
@zihangzou5802 3 years ago
@YannicKilcher But how can you compute the cosine similarity when the feature sizes are different? The transformation M would not be d×d. In that case, do you need to train a model with the same architecture and find out?
@zihangzou5802 3 years ago
And in that case, you cannot guarantee that your training process matches the other trainer's, given the prior assumption that the training process is unknown.
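One way to handle mismatched feature dimensions, in the spirit of the linear mapping M mentioned above, is to fit a rectangular map by least squares on features of the same held-out images and then compare carriers in the suspect model's space. A minimal sketch (my own, not the paper's code; `phi0`, `phi1`, the carrier `u`, and the suspect's classifier weights `w_c` are assumed given):

```python
import numpy as np

def fit_alignment(phi0, phi1):
    """phi0: (n, d0) features from the marking network, phi1: (n, d1) features from
    the suspect network, computed on the same held-out images. Least squares gives a
    rectangular map M of shape (d1, d0) with phi0 @ M.T ~ phi1."""
    Mt, *_ = np.linalg.lstsq(phi0, phi1, rcond=None)
    return Mt.T

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical usage: map the d0-dimensional carrier u into the suspect's feature
# space and check its alignment with that model's classifier weights w_c (d1-dim).
# M = fit_alignment(phi0, phi1)
# score = cosine(M @ u, w_c)
```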
@simongiebenhain3816 4 years ago
I didn't really get your idea at the end. If your alterations to the data are not bound to a specific class, how would you force the network to pay attention to the alterations?
@YannicKilcher 4 years ago
Because the features themselves would be bound to specific combinations of classes.
@zihangzou5802 3 years ago
Can you explain how to read Figure 4 and Figure 5? And since alignment is mentioned throughout the paper, why not use the angle between the translation vector (\phi_0(x) - \phi_t(x)) and u to determine whether the marked data were used? What is the benefit of bringing in the beta distribution?
@zihangzou5802 3 years ago
And what do the x-axis and y-axis represent in those figures?
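On the beta-distribution question: under the null hypothesis that the model never saw marked data, the relevant direction should be random with respect to the carrier u, and the squared cosine between a fixed direction and a uniformly random unit vector in R^d follows Beta(1/2, (d-1)/2). That is what turns an observed cosine into a p-value. A small sketch of that computation (my own illustration, not code from the paper):

```python
import numpy as np
from scipy.stats import beta

def cosine_pvalue(c, d):
    """One-sided p-value P(cos >= c) for a uniformly random unit vector in R^d against
    a fixed direction: the squared cosine is Beta(1/2, (d-1)/2)-distributed and the
    cosine itself is symmetric around 0. Small values mean the observed alignment is
    unlikely to be chance."""
    c = float(np.clip(c, -1.0, 1.0))
    tail = 0.5 * beta.sf(c ** 2, 0.5, (d - 1) / 2.0)
    return tail if c >= 0 else 1.0 - tail

# e.g. cosine_pvalue(0.2, 512) is tiny, while cosine_pvalue(0.2, 10) is around 0.3
```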
@tarunpaparaju5382 4 years ago
First comment and first like. Finally 💥🔥 (great video!)
@neur303 4 years ago
Is there a difference from the concept of watermarking?
@drdca8263 4 years ago
This seems like it is supposed to be like watermarking pictures, in that it allows you to demonstrate that a network was trained on the data you marked (analogous to demonstrating that a picture someone used was watermarked by you, by pointing at the watermark), but different in that, without knowledge of how it was marked, one can't tell whether it was marked? Or, wait, is watermarking already an established idea in the context of training data?
@YannicKilcher 4 years ago
Watermarking tags the datapoint; this tags the model trained on the datapoint.
@sunnyguha2 4 years ago
It's always Alice and Bob!
@GilaadE 4 years ago
imgs.xkcd.com/comics/protocol.png
@Markste-in 4 years ago
Doesn't this just mean that we intentionally add some bias towards a certain class in the data? (Something that we actually want to avoid?)
@YannicKilcher 4 years ago
In a way, yes.
@sacramentofwilderness6656 4 years ago
Well, what I feel uneasy about is the assumption that the feature extractors are related by a simple linear transformation. I may be wrong, but there was a video on your channel showing that even different initializations of a neural network with the same architecture can lead to drastically different results after training, ending up in completely different regions of the weight space. And with a different architecture, the internal behaviour and feature extraction would seem to have little in common with the setup trained by those who want to protect their data.
@YannicKilcher 4 years ago
Just because the position in weight space is different doesn't mean that different features are learned, but still a good observation!
@mgostIH 4 years ago
My network when it trains on Russian data
@alexanderchebykin6448 4 years ago
I'm highly skeptical about this whole data-marking idea: whatever you do, it needs to be invisible to the eye, i.e. small. And if it's small, it'll surely disappear after converting the image to JPEG, blurring it, or applying some other slight modification. To me it seems downright impossible to get around this problem.
@YannicKilcher 4 years ago
That's true, but as you deteriorate the images to defend yourself, you will also make your classifier less accurate.
@tubui9389 1 year ago
I thought the same. Deteriorating the data would not only help defend against this kind of membership-inference attack, it could also make the classifier more robust to noise. I wish the authors had explored the effect of more data augmentations on attack performance, beyond just crop and resize. As for getting around this problem, the watermark needs to be robust to noise at marking time; hence Eq. 7 in the paper should take that into account.
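If one wanted marks that survive such augmentations, one option is an expectation-over-transformations style marking loop: apply random crops and blur inside the optimization so the perturbation is only rewarded for aligning with the carrier when it survives the transforms. A rough sketch of that idea (my own illustration, not the paper's Eq. 7; `feature_extractor` and the unit carrier `u` are assumed given):

```python
import torch
import torch.nn.functional as F
import torchvision.transforms as T

def mark_image(x, u, feature_extractor, steps=200, lr=0.01, eps=8 / 255):
    """x: image tensor (1, C, H, W) in [0, 1]; u: unit carrier in feature space (d,).
    Optimizes a small perturbation whose features align with u even under random
    crops and blur, so the mark is less likely to wash out from mild processing."""
    delta = torch.zeros_like(x, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    augment = T.Compose([
        T.RandomResizedCrop(x.shape[-1], scale=(0.7, 1.0)),
        T.GaussianBlur(kernel_size=3),
    ])
    for _ in range(steps):
        x_marked = (x + delta).clamp(0, 1)
        feats = feature_extractor(augment(x_marked))      # new random transform each step
        loss = -F.cosine_similarity(feats, u.unsqueeze(0)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)                       # keep the mark visually small
    return (x + delta).clamp(0, 1).detach()
```

A fuller version would presumably also penalize distance from the original image in feature space, so the marked data stays useful for training.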