Euclidean Distance and Normalization of a Vector

Photo by CHUTTERSNAP on Unsplash

Euclidean distance

Euclidean distance in N-D space

What does it mean to normalize an array ?

from sklearn import preprocessing
import numpy as np
X = [[ 1., -1., 2.],
[ 2., 0., 0.],
[ 0., 1., -1.]]
X_l1 = preprocessing.normalize(X, norm='l1')
X_l1
# array([[ 0.25, -0.25, 0.5 ],
# [ 1. , 0. , 0. ],
# [ 0. , 0.5 , -0.5 ]])
X_l2 = preprocessing.normalize(X, norm='l2')
X_l2
# array([[ 0.40824829, -0.40824829, 0.81649658],
# [ 1. , 0. , 0. ],
# [ 0. , 0.70710678, -0.70710678]])
np.sqrt(np.sum(X_l2**2, axis=1)) # verify that L2-norm is indeed 1
# array([ 1., 1., 1.])

Normalizing a Vector

V/|V| = (x/|V|, y/|V|, z/|V|).

| V/|V| | = sqrt((x/|V|)*(x/|V|) + (y/|V|)*(y/|V|) + (z/|V|)*(z/|V|))
= sqrt(x*x + y*y + z*z) / |V|
= |V| / |V|
= 1

L1 Vs L2? Which one to use?

L1 norm

||X || = |3| + |4| = 7

L2 norm

Differences between Norm of a Vector and distance between two points

Key point to remember — Distance are always between two points and Norm are always for a Vector.

That means Euclidean Distance between 2 points x1 and x2 is nothing but the L2 norm of vector (x1 — x2)

DataScience | ML | 2x Kaggle Expert. Ex Fullstack Engineer and Ex International Financial Analyst. https://www.linkedin.com/in/rohan-paul-b27285129/

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store