🌀 Dimensionality Reduction (PCA, t-SNE, UMAP)
Dimensionality Reduction is a family of multivariate techniques that transform a high-dimensional dataset into a lower-dimensional representation (often 2D or 3D, so it can be shown in a scatter plot) while preserving as much of the data's structure as possible.
Why reduce dimensions?
- Remove multicollinearity
- Deal with the curse of dimensionality
- Remove redundant features
- Interpretation & Visualization
- Make computations easier
- Identify Outliers
Method | Name | Based on | Speed |
---|---|---|---|
PCA | Principal Component Analysis | Linear (maximize variance) | Fast |
t-SNE | t-distributed Stochastic Neighbor Embedding | Neighbors | Slow |
LargeVis | LargeVis | Neighbors | |
ISOMAP | Isometric Mapping | Neighbors (geodesic distances) | |
UMAP | Uniform Manifold Approximation and Projection | Neighbors | Faster than t-SNE |
AE | Autoencoder (2- or 3-unit bottleneck layer) | Neural | |
VAE | Variational Autoencoder | Neural | |
LSA | Latent Semantic Analysis | Linear (SVD of a term-document matrix) | |
SVD | Singular Value Decomposition | Linear | |
LDA | Linear Discriminant Analysis | Linear | |
MDS | Multidimensional Scaling | Distances | |
Principal Component Analysis (PCA)
a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The first component captures the largest share of the variance, the second the largest remaining share, and so on.
from sklearn.decomposition import PCA

pca = PCA(n_components=2)             # keep the two directions of largest variance
X_pca = pca.fit_transform(X)          # project the data onto the principal components
print(pca.explained_variance_ratio_)  # share of variance explained by each component
t-SNE
Read "How to Use t-SNE Effectively"
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

tsne = TSNE(n_components=2, random_state=0)
X_tsne = tsne.fit_transform(X)  # note: t-SNE has no separate transform() for unseen data

# And plot it:
plt.scatter(X_tsne[:, 0], X_tsne[:, 1])
plt.show()
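UMAP
UMAP (listed in the table above) follows the same fit_transform idiom. A minimal sketch, assuming the third-party umap-learn package is installed and X is the same feature matrix as above:
import umap  # third-party package: umap-learn

reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2, random_state=0)
X_umap = reducer.fit_transform(X)  # 2D embedding, analogous to X_tsne above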
Independent Component Analysis (ICA)
a statistical technique for revealing hidden factors that underlie sets of random variables, measurements, or signals.
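A minimal sketch with scikit-learn's FastICA; here X is assumed to be a matrix of mixed signals (one observation per row) and the number of sources is a guess:
from sklearn.decomposition import FastICA

ica = FastICA(n_components=2, random_state=0)  # assume two independent sources
S = ica.fit_transform(X)                       # estimated source signals
A = ica.mixing_                                # estimated mixing matrix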
Principal Component Regression (PCR)
a technique for analyzing multiple regression data that suffer from multicollinearity. The basic idea behind PCR is to compute the principal components of the predictors and then use some of these components as predictors in a linear regression model fitted with ordinary least squares.
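A minimal sketch of this idea with scikit-learn, assuming predictors X and response y; the number of retained components (3) is arbitrary:
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# PCR: standardize, keep the first few principal components, then fit least squares on them
pcr = make_pipeline(StandardScaler(), PCA(n_components=3), LinearRegression())
pcr.fit(X, y)
y_pred = pcr.predict(X)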
Partial Least Squares Regression (PLSR)
PCR creates components to explain the observed variability in the predictor variables, without considering the response variable at all. On the other hand, PLSR does take the response variable into account, and therefore often leads to models that are able to fit the response variable with fewer components.
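A comparable sketch with scikit-learn's PLSRegression, again assuming X, y, and an arbitrary component count:
from sklearn.cross_decomposition import PLSRegression

# PLS chooses components that explain both the predictors and the response
pls = PLSRegression(n_components=3)
pls.fit(X, y)
y_pred = pls.predict(X)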
Sammon Mapping
an algorithm that maps a high-dimensional space to a lower-dimensional one while trying to preserve the structure of inter-point distances of the original space in the projection. Sometimes we have to ask which non-linear transformation is optimal for a given dataset: while PCA simply maximizes variance, we may instead want to maximize some other measure of how well complex structure is preserved by the transformation. Various such measures exist, and one of them defines the so-called Sammon Mapping. It is particularly suited for use in exploratory data analysis.
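There is no Sammon mapping in scikit-learn; the sketch below is a naive NumPy/SciPy implementation that minimizes the Sammon stress directly, workable only for small datasets:
import numpy as np
from scipy.optimize import minimize
from scipy.spatial.distance import pdist

def sammon_stress(flat_y, d_high, n, dim):
    # Weighted mismatch between original and projected inter-point distances
    d_low = pdist(flat_y.reshape(n, dim))
    return np.sum((d_high - d_low) ** 2 / d_high) / np.sum(d_high)

def sammon(X, dim=2, random_state=0):
    n = X.shape[0]
    d_high = pdist(X)
    d_high[d_high == 0] = 1e-12  # guard against division by zero for duplicate points
    y0 = np.random.default_rng(random_state).normal(scale=1e-2, size=n * dim)
    res = minimize(sammon_stress, y0, args=(d_high, n, dim), method="L-BFGS-B")
    return res.x.reshape(n, dim)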
Multidimensional Scaling (MDS)
a means of visualizing the level of similarity of individual cases of a dataset: the cases are placed in a low-dimensional space so that their pairwise distances are preserved as well as possible.
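A minimal sketch with scikit-learn's MDS (metric MDS by default), assuming the same feature matrix X:
from sklearn.manifold import MDS

mds = MDS(n_components=2, random_state=0)  # place points in 2D so pairwise distances are preserved
X_mds = mds.fit_transform(X)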
Projection Pursuit
a type of statistical technique that involves finding the most “interesting” possible projections in multidimensional data. Often, projections which deviate more from a normal distribution are considered to be more interesting.
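There is no general projection pursuit routine in scikit-learn; the sketch below searches for a single direction using excess kurtosis as a simple "interestingness" index (the choice of index and the optimizer are assumptions):
import numpy as np
from scipy.optimize import minimize
from scipy.stats import kurtosis

def most_interesting_direction(X, random_state=0):
    # Find a unit vector w whose 1-D projection X @ w deviates most from a normal distribution
    Xc = X - X.mean(axis=0)
    w0 = np.random.default_rng(random_state).normal(size=X.shape[1])

    def neg_index(w):
        w = w / np.linalg.norm(w)      # restrict the search to directions (unit vectors)
        return -abs(kurtosis(Xc @ w))  # larger |excess kurtosis| = less Gaussian = more interesting

    res = minimize(neg_index, w0, method="Nelder-Mead")
    return res.x / np.linalg.norm(res.x)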
Linear Discriminant Analysis (LDA)
If you need a classification algorithm you should start with logistic regression. However, logistic regression is traditionally limited to two-class problems, so if your problem involves more than two classes you should consider LDA. LDA also works as a dimensionality reduction algorithm: it reduces the number of dimensions from the original number of features to at most C - 1, where C is the number of classes.
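A minimal sketch of LDA used as a supervised dimensionality reducer in scikit-learn; X and the class labels y are assumed:
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis(n_components=2)  # at most C - 1 components
X_lda = lda.fit_transform(X, y)                   # supervised: needs class labels y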
Mixture Discriminant Analysis (MDA)
An extension of linear discriminant analysis; it is a supervised classification method based on mixture models.
Quadratic Discriminant Analysis (QDA)
Linear Discriminant Analysis can only learn linear boundaries, while Quadratic Discriminant Analysis can learn quadratic boundaries and is therefore more flexible. Unlike LDA, QDA does not assume that the covariance of each class is identical.
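A minimal classification sketch with scikit-learn's QDA, assuming training data X, labels y, and new points X_new:
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

qda = QuadraticDiscriminantAnalysis()  # one covariance matrix per class, so boundaries are quadratic
qda.fit(X, y)
y_pred = qda.predict(X_new)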
Flexible Discriminant Analysis (FDA)
a classification model based on a mixture of linear regression models. It uses optimal scoring to transform the response variable so that the data are in a better form for linear separation, and multiple adaptive regression splines (MARS) to generate the discriminant surface.
Reference
- https://www.analyticsvidhya.com/blog/2018/08/dimensionality-reduction-techniques-python/
- Read Curse of dimensionality
- Manifold learning
- Matrix factorization