## Heteroscedasticity & Transformations

Financial data often suffers from heteroscedasticity (also spelled heteroskedasticity): on a scatterplot, the spread of the points fans out into a funnel shape. A transformation is often needed to fix this; however, I often have a hard time explaining what the transformation is actually doing. This video helps show what is happening during a log transformation.

After watching the data literally “transform”, it was much easier to grasp exactly what was happening and how it would help. Just remember you’ll have to transform it back if you wish to interpret the values. While the video below was created using SYSTAT, I transform data so often that I have some killer SPSS macros which transform data effortlessly.

You want your data to be scattered evenly around the fitted line, with roughly the same spread along its whole length; this is “homoscedasticity”.
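For readers who would rather see numbers than a video, here is a minimal Python sketch of the same idea. It is not the SYSTAT procedure from the video or my SPSS macros; the simulated data, the `fit_line` and `spread_ratio` helpers, and the choice of multiplicative noise are all assumptions made purely for illustration. It compares the residual spread in the upper half of the x range with the spread in the lower half, before and after a log transformation, then back-transforms one prediction.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def spread_ratio(xs, ys):
    """Residual SD above the median of x divided by residual SD below it.

    A ratio near 1 suggests an even spread (homoscedasticity);
    a ratio well above 1 suggests a funnel that widens with x.
    """
    slope, intercept = fit_line(xs, ys)
    cutoff = statistics.median(xs)
    low = [y - (slope * x + intercept) for x, y in zip(xs, ys) if x <= cutoff]
    high = [y - (slope * x + intercept) for x, y in zip(xs, ys) if x > cutoff]
    return statistics.pstdev(high) / statistics.pstdev(low)


# Simulated (hypothetical) funnel-shaped data: the noise is multiplicative,
# so the scatter around the line grows as x grows.
rng = random.Random(42)
x = [rng.uniform(1, 100) for _ in range(2000)]
y = [2.0 * xi * math.exp(rng.gauss(0, 0.3)) for xi in x]

raw_ratio = spread_ratio(x, y)

# Log-transform both variables and measure the spread again.
log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]
log_ratio = spread_ratio(log_x, log_y)

print(f"spread ratio, raw scale: {raw_ratio:.2f}")
print(f"spread ratio, log scale: {log_ratio:.2f}")

# Predictions from the log-scale fit are in log units, so transform back
# with exp() before interpreting them.
slope, intercept = fit_line(log_x, log_y)
predicted_log = slope * math.log(50) + intercept
predicted = math.exp(predicted_log)
print(f"back-transformed prediction at x = 50: {predicted:.1f}")
```

One caveat on the last step: exponentiating a log-scale prediction gives a typical (median-like) value on the original scale rather than the arithmetic mean, which is worth keeping in mind when reporting back-transformed figures.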

If you’re interested in reading more, SAGE has a book specifically on heteroscedasticity and transformations.

## Correlation Analysis

A few years back I fielded a national study on how often people visit restaurant chains. I ran a correlation analysis in SPSS; however, as is always the case, showing a correlation matrix to the client is not an option (if I want them to understand anything).

So I took the matrix, imported it into SYSTAT, and computed an additive tree. Additive trees examine the response patterns across variables and group them, according to their similarities, in the shape of a two-dimensional tree. The closer items appear to each other, the higher the correlation between them. (The color coding is subjective and was added only to aid interpretation.)
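To make the “closer means more correlated” idea concrete, here is a small Python sketch of the first step any tree-building method needs: turning correlations into distances. It does not reproduce SYSTAT’s additive tree algorithm. The chain names and correlation values are made up for illustration (they are not from the study), and `1 - r` is just one common, assumed way to convert a correlation into a distance.

```python
from itertools import combinations

# Hypothetical labels and correlations, for illustration only.
chains = ["Chain A", "Chain B", "Chain C", "Chain D"]
corr = [
    [1.00, 0.82, 0.15, 0.10],
    [0.82, 1.00, 0.20, 0.05],
    [0.15, 0.20, 1.00, 0.67],
    [0.10, 0.05, 0.67, 1.00],
]


def correlation_to_distance(matrix):
    """Convert each correlation r into a distance 1 - r (assumed convention)."""
    return [[1.0 - r for r in row] for row in matrix]


dist = correlation_to_distance(corr)

# List every pair of chains, closest (most highly correlated) first.
pairs = sorted(
    (dist[i][j], chains[i], chains[j])
    for i, j in combinations(range(len(chains)), 2)
)

for d, a, b in pairs:
    print(f"{a} - {b}: distance {d:.2f}")
```

In this toy matrix, Chain A and Chain B come out closest, followed by Chain C and Chain D, so a tree drawn from these distances would put each of those pairs on neighboring branches.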

The Additive Tree below is much easier to evaluate as clear patterns can be seen in how consumers “see” the chains.

Here’s a great book to get ideas on conveying data visually

And here is a book utilizing SYSTAT by one of the programmers. I learned the vast majority of my statistics from this great book!

## Payment Distributions

In 2010 a national study examined payment distributions across various industries. The additive tree below shows the correlations by industry. When reading an additive tree, follow the path along the branches from one item to another. The shorter the path, the more highly correlated the two items are. The color-coding is subjective and was added to aid interpretation of the tree. I used SYSTAT to create the tree.

In reviewing the tree below you can see that, overall, there are two main branches that split at the root.  After that, there are some pretty clear patterns.

Here’s a great book to get ideas on conveying data visually

And here is a book utilizing SYSTAT by one of the programmers. I learned the vast majority of my statistics from this great book!