Classifying sound using Machine Learning (Artificial Summit February 2020 @ KnowIt)

Views: 570

Comments: 9

@ChintanShah-z3w 1 year ago

Hey there! Thanks for the amazing talk. I am still confused about labelling training data for the model. In my scenario, I have to detect the whistle sound made by the referee during a live sports event. How do I approach building my training dataset? I have audio clips from a sample match, which last over an hour. If I use a clip to label whistle sounds, won't the labels also take in the other sounds at that exact moment? Suppose the referee blows the whistle at 2:00-2:10. If I label that as a whistle sound, won't it also include the noise coming from the crowd? How do I approach this problem?

@Jononor 1 year ago

Hi! Thanks for the question. Yes, the sound of interest may sometimes appear together with other sounds. That is just part of the nature of what you are trying to do! As long as you have enough samples, this is not a problem at all for modern Machine Learning methods - the model will learn to ignore the irrelevant parts. In fact, the goal should be to cover as many variations as possible of how the sound might appear during use, to build a robust dataset and detector.
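
A minimal sketch of how the labeling in this scenario might look, assuming the annotations are stored as (start, end) times in seconds and the classifier works on fixed-length windows. The function name `label_windows`, the window and hop lengths, and the 50% overlap threshold are illustrative assumptions, not something taken from the talk. Crowd noise inside a positive window is simply left in the audio; only the label marks the whistle.

```python
import numpy as np

def label_windows(events, clip_duration, window=1.0, hop=0.5, min_overlap=0.5):
    """Turn event annotations into one binary label per analysis window.

    events: list of (start_s, end_s) for the sound of interest,
            assumed not to overlap each other.
    A window gets label 1 when at least `min_overlap` of its length
    is covered by annotated events, otherwise 0.
    """
    # Start time of every window that fits entirely inside the clip
    starts = np.arange(0.0, clip_duration - window + 1e-9, hop)
    labels = np.zeros(len(starts), dtype=int)
    for i, w_start in enumerate(starts):
        w_end = w_start + window
        # Total time inside this window that is covered by events
        covered = sum(
            max(0.0, min(w_end, e_end) - max(w_start, e_start))
            for e_start, e_end in events
        )
        if covered / window >= min_overlap:
            labels[i] = 1
    return starts, labels

# Hypothetical annotation: whistle from 2:00 to 2:10 in a one-hour clip
events = [(120.0, 130.0)]
starts, labels = label_windows(events, clip_duration=3600.0)
print("positive windows:", int(labels.sum()), "of", len(labels))
```

The windows labeled 0 serve as negative examples, so the crowd noise that appears in both classes carries no information the model can use to separate them.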

@Jononor 1 year ago

I have a Jupyter notebook that demonstrates a very similar task: detecting badminton hits in a match. You can find the code here: github.com/jonnor/machinehearing/blob/master/handson/badminton/BadmintonSoundEvents.ipynb (I hope the comment and link appear, KZbin is super finicky about it)

@ChintanShah-z3w 1 year ago

@Jononor Yes, the link is visible. Thank you so much for the quick response and the reference! I'll look into it. :)

@peterm.4026 3 years ago

Wow, how am I just coming across this lol. Super helpful addition to your other one I saw. Thanks, Jon. Around 48:50 you're talking about labeling for multiple sounds, so I just want to clarify. Let's say I want to detect birds, but it's in an area with a lot of wolves that howl a lot. Let's also assume that those are the only two things making noise at all times, wolves and birds. Before watching this video, I would have thought that when I label the dataset to train my model on, I should make two classes, "Birds" and "No Birds". But if there are a lot of wolves howling, would I improve my detection if I have "Birds", "No Birds", and "Birds and Wolves"? In other words, is adding a third class, for a mix of the sound I want and the interfering audio, going to improve my ability to know when birds are chirping around?

@Jononor 3 years ago

Adding another class to the model may improve performance - or it may hurt. It really depends on the particular model and data, and is hard to predict. But even if you choose to have only two classes in the model, it can often be advantageous to annotate more. For example, it allows error analysis, so you can see whether errors in bird classification are correlated with the presence of $other-label - a type of confusion.
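
A minimal sketch of that kind of error analysis, assuming each evaluated clip has a true bird label, a predicted bird label, and an extra annotation saying whether the other sound (here, wolves) is present. The function name `error_rate_by_context` and the example data are made up for illustration.

```python
from collections import defaultdict

def error_rate_by_context(y_true, y_pred, other_present):
    """Error rate of the detector, split by presence of another annotated sound.

    Returns a dict mapping True/False (other sound present or not)
    to the fraction of misclassified clips in that group.
    """
    counts = defaultdict(lambda: [0, 0])  # context -> [errors, total]
    for truth, pred, other in zip(y_true, y_pred, other_present):
        group = counts[bool(other)]
        group[1] += 1
        if truth != pred:
            group[0] += 1
    return {ctx: errors / total for ctx, (errors, total) in counts.items()}

# Hypothetical evaluation data: 1 = bird present, 0 = no bird
y_true = [1, 1, 0, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 1, 0, 1, 0]
wolf   = [0, 1, 0, 1, 0, 0, 1, 0]  # extra annotation, not a model class

rates = error_rate_by_context(y_true, y_pred, wolf)
for present, rate in sorted(rates.items()):
    print(f"wolf present={present}: error rate {rate:.2f}")
```

A clearly higher error rate in the group where the other sound is present would suggest that it confuses the detector, which is useful to know before deciding whether a third class is worth adding.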

@peterm.4026 3 years ago

@Jononor thanks!