I can't lie to you all, Unfold Data Science is one of the best data science learning platforms. I learned many useful skills from his videos.
@sanketadamapure802 1 year ago
Distance- and density-based methods are well suited to outlier detection. Here are a few commonly used algorithms:
1] k-nearest neighbors (k-NN): As a classifier, k-NN assigns each data point the majority class among its k nearest neighbors. For outlier detection, the same neighbor search is used differently: a point is scored by the distance to its k-th nearest neighbor, and points that have few or no neighbors within a certain distance are identified as outliers.
2] Local Outlier Factor (LOF): LOF compares the local density of a data point with the local density of its neighbors. It identifies outliers as data points with significantly lower density than their neighbors. LOF uses the distances to the k nearest neighbors and provides an outlier score for each data point.
3] Isolation Forest: Isolation Forest builds random trees to isolate outliers, so it is tree-based rather than strictly distance-based. It measures the number of random splits required to isolate a data point from the rest of the data. Outliers are identified as data points with a shorter average path length across the trees.
4] DBSCAN (Density-Based Spatial Clustering of Applications with Noise): DBSCAN groups together data points that are close to each other based on a density criterion. Outliers are the data points that do not belong to any dense cluster; DBSCAN labels them as noise.
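To make the k-NN idea above concrete, here is a minimal sketch in plain Python. It scores each point by the distance to its k-th nearest neighbor and flags points whose score exceeds a cutoff. The function names, the toy points, and the values `k=2` and `threshold=3.0` are assumptions chosen for illustration, not something from the video. In practice, scikit-learn's `LocalOutlierFactor`, `IsolationForest`, and `DBSCAN` would likely be used for the other three methods.

```python
import math

def kth_neighbor_distance(points, k):
    """For each point, return the Euclidean distance to its k-th nearest neighbor."""
    scores = []
    for i, p in enumerate(points):
        # Distances from p to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def flag_outliers(points, k=2, threshold=3.0):
    """Flag points whose k-th neighbor distance exceeds the (assumed) threshold."""
    return [d > threshold for d in kth_neighbor_distance(points, k)]

# Toy data (hypothetical): four points on a unit square and one far-away point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = kth_neighbor_distance(points, k=2)
flags = flag_outliers(points, k=2, threshold=3.0)
print(flags)
```

This brute-force version compares every pair of points, so it is only suitable for small datasets. The threshold also has to be chosen for the scale of the data, for example from a percentile of the scores.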
@UnfoldDataScience 1 year ago
Thanks Sanket for adding your points.
@rajasekaranm1198 6 months ago
What a beautiful explanation.
@ozan4702 6 months ago
Thank you for the video. Do you recommend combining multiple outlier treatment methods? For example, log transform + winsorization? Or log transform + winsorization + standard scaler? If so, what should be the order of applying these methods?
@balajikomma541 1 year ago
Sir, I'm following your "Big Data Hadoop and Unix" playlist, but after the Sqoop installation video there are no further videos. Could you please tell me where the continuation of this playlist is, and kindly update it? Also, one question: is Big Data still important for data science in 2023, or can it be handled with cloud technologies such as Databricks and PySpark on AWS, Azure, or GCP? Kindly reply, sir.
@umeshtiwari800 1 year ago
Tx, Aman
@UnfoldDataScience 1 year ago
Welcome Umesh.
@manjeerag868 1 year ago
Hi Aman, thank you so much for your valuable videos. I pinged you on LinkedIn. Please reply 🙏