Hi,

After learning about feature scaling, I have some questions regarding normalization.

Standardization: rescales data to have a mean of 0 and a standard deviation of 1 after a data distribution is selected. (Please correct me if I understood it wrong.)

Normalization: rescales data into the range 0-1. (Please correct me if I understood it wrong.)

Questions:

1. In what cases do we use standardization? When do we use normalization? When do we use a combined method?

2. In class, features were standardized first using N(0,1), and then the Z-scores were rescaled to numbers between 0 and 1. How are the Z-scores rescaled? (Which Z-score becomes 0 and which becomes 1?) If we use a different data set, will the same Z-score always map to the same rescaled result? Can this scaling formula be stored so that we can reproduce the same scaling for prediction purposes?

Thanks a lot for your help.

1 Answer

Best answer

This depends on the data set. Whether your data is in a pandas DataFrame, a 1-D NumPy array, or a multi-dimensional NumPy array will make some difference in which technique(s) to use, as will whether you use code or not.

What we did in class was meant to show the math behind the coding. The Z-score was taught with the variables set/known, and it was used to demonstrate one of the techniques. There are also Manhattan (L1) and Euclidean (L2) normalization, and min-max scaling, which can be used when the standard deviation and mean are unknown.
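As a minimal sketch of two of the techniques named above, here are the Z-score and min-max formulas in NumPy. The array `x` is made up for illustration and is not the data used in class.

```python
import numpy as np

# Hypothetical feature values, made up for illustration.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

# Standardization (Z-score): subtract the mean, divide by the standard deviation.
mean = x.mean()
std = x.std()  # population standard deviation (ddof=0)
z = (x - mean) / std

# Min-max normalization: the smallest value maps to 0, the largest to 1.
x_min, x_max = x.min(), x.max()
x_minmax = (x - x_min) / (x_max - x_min)

print(z)         # mean 0, standard deviation 1
print(x_minmax)  # values between 0 and 1
```

Note that min-max scaling sends the minimum of whatever it is applied to (raw values or Z-scores) to 0 and the maximum to 1, so the result depends on the minimum and maximum of the data set it was computed from.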

Remember the purpose of standardization and normalization: they put the features on a comparable scale, which helps with modeling as well as with visualization (graphing). The rescaled points/coordinates will be different from the original points in the data set.

 

If you are coding, then depending on which libraries you prefer, normalization or standardization most likely takes two lines: the first usually computes the norm (or other scaling parameters), and the second applies the transformation.
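The two-step pattern can be sketched with plain NumPy. The helper names `fit_standardizer` and `apply_standardizer` and the sample arrays are hypothetical, chosen only for illustration. The point is that the parameters learned in the first step can be stored and reused on new data.

```python
import numpy as np

def fit_standardizer(train):
    """Step 1: learn the scaling parameters from the training data only."""
    train = np.asarray(train, dtype=float)
    return {"mean": train.mean(axis=0), "std": train.std(axis=0)}

def apply_standardizer(data, params):
    """Step 2: transform any data using the stored parameters."""
    data = np.asarray(data, dtype=float)
    return (data - params["mean"]) / params["std"]

# Made-up training data: 3 samples, 2 features on very different scales.
train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
# Made-up new sample arriving at prediction time.
new = np.array([[2.0, 400.0]])

params = fit_standardizer(train)                 # store these with the model
train_scaled = apply_standardizer(train, params)
new_scaled = apply_standardizer(new, params)     # same formula, same parameters

print(train_scaled)
print(new_scaled)
```

Because `params` is just a pair of arrays, it could be saved alongside a model so that data at prediction time is scaled in exactly the same way as the training data.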

This is code I wrote that uses only NumPy to perform Manhattan normalization.
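The code from the original answer is not reproduced here. As a stand-in (not the author's code), this is a minimal sketch of Manhattan (L1) normalization in NumPy, assuming made-up sample values: each vector is divided by the sum of the absolute values of its entries.

```python
import numpy as np

# Hypothetical 1-D vector, made up for illustration.
v = np.array([3.0, -4.0, 5.0])

# Line 1: compute the Manhattan (L1) norm, i.e. the sum of absolute values.
l1 = np.linalg.norm(v, ord=1)
# Line 2: apply the transformation by dividing by the norm.
v_l1 = v / l1

# The same idea applied row by row to a 2-D array (one sample per row).
X = np.array([[1.0, 1.0, 2.0],
              [0.0, -5.0, 5.0]])
row_l1 = np.linalg.norm(X, ord=1, axis=1, keepdims=True)
X_l1 = X / row_l1

print(v_l1)
print(X_l1)
```

After this transformation, the absolute values in each vector (or each row) sum to 1. A row consisting entirely of zeros would cause a division by zero and would need special handling.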

