Starting off as a muggle who is naïve to the world of Math and Data Science.

Day 47

Exploratory Data Analysis (EDA)

1. Configure the dataset Role accordingly


2. From the Explore tab, drag in the StatExplore node
3. Link the nodes as shown
4. Run StatExplore


Result

  • If the target variable is binary, the Chi-Square statistic is computed.
  • If the target variable is numerical, the Pearson correlation coefficient is computed.
  • Classification statistics consist of N-missing frequency, mode, mode frequency, second mode and second mode frequency.
  • Numerical statistics consist of mean, std, N-missing frequency, min, median, max, skew and kurtosis.
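
To check my own understanding of what these numbers mean, here is a rough sketch in plain Python of the same kinds of statistics. This is not what the StatExplore node runs internally; the function names (`numeric_summary`, `class_summary`, `pearson`, `chi_square`) and the use of `None` for a missing value are my own assumptions. Skew and kurtosis are left out because their exact definitions vary between tools.

```python
import math
from collections import Counter
from statistics import mean, median, stdev


def numeric_summary(values):
    """Summarise a numeric column; None marks a missing value (assumption)."""
    present = [v for v in values if v is not None]
    return {
        "n_missing": len(values) - len(present),
        "mean": mean(present),
        "std": stdev(present),  # sample standard deviation
        "min": min(present),
        "median": median(present),
        "max": max(present),
    }


def class_summary(values):
    """Summarise a class column: missing count, mode and second mode."""
    present = [v for v in values if v is not None]
    ranked = Counter(present).most_common(2)
    mode, mode_freq = ranked[0]
    second, second_freq = ranked[1] if len(ranked) > 1 else (None, 0)
    return {
        "n_missing": len(values) - len(present),
        "mode": mode,
        "mode_freq": mode_freq,
        "second_mode": second,
        "second_mode_freq": second_freq,
    }


def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def chi_square(xs, ys):
    """Pearson chi-square statistic for the cross-table of two class columns."""
    n = len(xs)
    observed = Counter(zip(xs, ys))
    row_totals, col_totals = Counter(xs), Counter(ys)
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            expected = r_total * c_total / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat
```

For example, `chi_square(input_column, binary_target)` gives the statistic for a class input against a binary target, and `pearson(input_column, numeric_target)` covers the numerical-target case.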



Tips

Replace values in a column using the Replacement and Impute nodes

1. From the Modify tab, drag in the Replacement and Impute nodes
2. Link the nodes as shown


3. Configure the Replacement node accordingly


4. In the Replacement Editor for the Replacement node, edit the column; here we are replacing “0” with “Missing”


5. In Variable for the Impute node, edit the column; here we are replacing “Missing” with “Mean”
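
The two nodes together amount to "turn 0 into missing, then fill missing with the mean". A minimal sketch of that idea in plain Python is below, assuming `None` stands for a missing value. The helper names `replace_with_missing` and `impute_mean` are hypothetical and only illustrate the logic, not how the nodes are implemented.

```python
from statistics import mean


def replace_with_missing(values, target=0):
    """Replacement step: every value equal to `target` becomes missing (None)."""
    return [None if v == target else v for v in values]


def impute_mean(values):
    """Impute step: fill missing entries with the mean of the non-missing ones.

    Raises statistics.StatisticsError if every value is missing.
    """
    present = [v for v in values if v is not None]
    fill = mean(present)
    return [fill if v is None else v for v in values]


column = [10, 0, 20, 0, 30]
with_missing = replace_with_missing(column)  # zeros are now None
imputed = impute_mean(with_missing)          # None filled with the mean of 10, 20, 30
```

The order matters: if the zeros were left in place, they would drag the mean down before it is used as the fill value.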
