Outlier
  • Z-Score or Extreme Value Analysis (parametric)

    z = \frac{x-\mu}{\sigma}

  • Probabilistic and Statistical Modeling (parametric)

  • Linear Regression Models (PCA, LMS)

  • Proximity Based Models (non-parametric)

  • Information Theory Models

  • High Dimensional Outlier Detection Methods (high dimensional sparse data)

  • How to check whether the model is stable (validate performance)

    Cross-validation

    1. Randomly split your entire dataset into k "folds"
    2. For each fold, build your model on the other k-1 folds of the dataset. Then test the model on the held-out fold to check its effectiveness
    3. Record the error you see on each of the predictions
    4. Repeat this until each of the k folds has served as the test set
    5. The average of your k recorded errors is called the cross-validation error and will serve as your performance metric for the model
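
The z-score rule and the cross-validation steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names `zscore_outliers` and `k_fold_cv_error`, the cutoff of 3 standard deviations, and the toy "model" that simply predicts the mean of the training folds are all made up for this example and are not from the original notes.

```python
import random
import statistics


def zscore_outliers(values, threshold=3.0):
    """Return the values whose |z| = |x - mu| / sigma exceeds the threshold.

    The threshold of 3 is a common convention, assumed here for illustration.
    """
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    if sigma == 0:
        return []  # all values identical, so nothing can be an outlier
    return [x for x in values if abs((x - mu) / sigma) > threshold]


def k_fold_cv_error(ys, k=5, seed=0):
    """Cross-validation error (mean squared error averaged over k folds).

    The "model" is a hypothetical stand-in: it predicts the mean of the
    targets seen in the k-1 training folds.
    """
    indices = list(range(len(ys)))
    random.Random(seed).shuffle(indices)           # step 1: random split
    folds = [indices[i::k] for i in range(k)]      # k roughly equal folds

    fold_errors = []
    for test_idx in folds:                         # steps 2-4: each fold is the test set once
        held_out = set(test_idx)
        train_y = [ys[i] for i in indices if i not in held_out]
        prediction = statistics.fmean(train_y)     # "fit" on the other k-1 folds
        squared = [(ys[i] - prediction) ** 2 for i in test_idx]
        fold_errors.append(statistics.fmean(squared))  # step 3: record the error

    return statistics.fmean(fold_errors)           # step 5: average of the k errors


if __name__ == "__main__":
    data = [10.0] * 20 + [100.0]
    print(zscore_outliers(data))
    print(k_fold_cv_error([float(i) for i in range(20)], k=5))
```

In practice the mean predictor would be replaced by a real model, and the per-fold errors are worth inspecting individually: a large spread between folds suggests the model is not stable.
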
Author: shixuan liu
Link: http://tedlsx.github.io/2019/10/17/outlier/
Copyright Notice: All articles in this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.