SPSS Statistics is a software package used for statistical analysis. Long produced by SPSS Inc., it was acquired by IBM in 2009. The current versions (2015) are officially named IBM SPSS Statistics. Companion products in the same family are used for survey authoring and deployment (IBM SPSS Data Collection), data mining (IBM SPSS Modeler), text analytics, and collaboration and deployment (batch and automated scoring services).
The software name originally stood for Statistical Package for the Social Sciences (SPSS), reflecting the original market, although the software is now popular in other fields as well, including the health sciences and marketing.
I frequently play around with multivariate techniques like K-means cluster analysis in SPSS. There were some big holes in the SPSS procedure that performs cluster analysis, so I wrote an SPSS macro to automate what I wanted done. Below is a quick demo of the macro in use. It can really save a bunch of time! A few things mine does differently:
Assigns labels to the segments (these will be changed later, but it is better than just 1, 2, 3)
Computes frequencies on the size of the segments (Why SPSS doesn’t do this automatically is beyond me)
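To make those two extras concrete, here is a minimal sketch in Python rather than SPSS macro syntax. It is not the macro itself: the function name `segment_sizes`, the "Segment" label prefix, and the sample cluster assignments are all made up for illustration.

```python
from collections import Counter

def segment_sizes(cluster_ids, prefix="Segment"):
    """Return (label, count, percent) rows, one per cluster, ordered by cluster id."""
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    return [
        (f"{prefix} {cid}", n, round(100.0 * n / total, 1))
        for cid, n in sorted(counts.items())
    ]

# Hypothetical cluster assignments, e.g. the membership variable saved by a K-means run.
cluster_ids = [1, 2, 2, 3, 1, 2, 3, 2]

for label, n, pct in segment_sizes(cluster_ids):
    print(f"{label}: n={n} ({pct}%)")
```

The placeholder labels are only there so the frequency table reads better than bare numbers; they would be replaced once the segments have been profiled.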
Older online vendor tools and databases would frequently put multi-select questions into one column with a pipe, tab, semicolon, or comma delimiter (the real fun was when they used a comma as the delimiter in a CSV file).
This can be very problematic in nearly any tool. In this video I demonstrate how easy it can be to move data from one column to many with an SPSS macro.
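The idea of moving one delimited column into many can be sketched in Python as follows. This is an illustration only, not the SPSS macro from the video: the function name `split_multiselect`, the choice of one 0/1 indicator column per option, and the sample answers are assumptions.

```python
def split_multiselect(values, delimiter="|"):
    """Turn delimited multi-select strings into one 0/1 column per distinct option."""
    # Split each cell, trimming whitespace and dropping empty pieces.
    parsed = [
        [part.strip() for part in value.split(delimiter) if part.strip()]
        for value in values
    ]
    # Every distinct option becomes its own column.
    options = sorted({opt for row in parsed for opt in row})
    rows = [{opt: int(opt in row) for opt in options} for row in parsed]
    return options, rows

# Hypothetical multi-select answers stored in a single pipe-delimited column.
raw = ["red|blue", "green", "", "blue|green|red"]
options, rows = split_multiselect(raw)

print(options)
for row in rows:
    print([row[opt] for opt in options])
```

Indicator columns are one reasonable target layout because each option can then be tabulated on its own; a positional layout (first pick, second pick, and so on) would need a different split.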
I frequently need to randomize my lists and save them into separate text files. This SPSS macro makes it a breeze!
If you haven’t already played with macros, you’ll want to get familiar with them by reviewing this post on Intro to Macros. The macro needs a few parameters from you: the path to save the files, the stem (beginning) of the file names to create, the number of groups to create, and whether or not to keep the variable used to create the groups.
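For readers who want to see the logic outside SPSS, here is a rough Python sketch that takes the same four parameters. It is not the macro: the function name `randomize_to_files`, the `seed` argument, the round-robin split, and the tab-separated group column are all assumptions made for the example.

```python
import random
import tempfile
from pathlib import Path

def randomize_to_files(items, out_dir, stem, n_groups, keep_group=False, seed=None):
    """Shuffle items, deal them into n_groups, and write one text file per group."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for g in range(n_groups):
        group = shuffled[g::n_groups]  # round-robin deal keeps group sizes within one of each other
        path = out_dir / f"{stem}{g + 1}.txt"
        with path.open("w", encoding="utf-8") as fh:
            for item in group:
                # Optionally keep the group number next to each item.
                line = f"{item}\t{g + 1}" if keep_group else str(item)
                fh.write(line + "\n")
        paths.append(path)
    return paths

# Demo: write to a temporary folder so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    for p in randomize_to_files(range(1, 11), tmp, "list_", n_groups=3, seed=42):
        print(p.name, p.read_text(encoding="utf-8").split())
```

Passing a fixed `seed` makes the shuffle repeatable, which helps when the same split has to be recreated later.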
Financial data often has problems with heteroscedasticity, also spelled heteroskedasticity (a funnel-shaped spread). Performing transformations is necessary; however, I often have a hard time explaining what the transformation is actually doing. This video helps show what is happening during a log transformation.
After watching the data literally “transform”, it was much easier to grasp exactly what was happening and how it would help. Just remember you’ll have to transform it back if you wish to interpret the values in their original units. While the video below was created using SYSTAT, I transform data so often that I have some killer SPSS macros which transform data effortlessly.
You want your data to be evenly spread around the line, which is “homoscedasticity”.
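A small numeric illustration may help alongside the video. The Python sketch below uses made-up values where the noise is multiplicative, so the raw spread widens as the level rises (the funnel), while the spread of the logged values stays the same at every level. The names `spreads`, `levels`, and `factors` are hypothetical, and real financial data will not be this tidy.

```python
import math

def spreads(level, factors):
    """Return (raw spread, log spread) for values built as level * factor."""
    ys = [level * f for f in factors]
    logs = [math.log(y) for y in ys]
    return max(ys) - min(ys), max(logs) - min(logs)

levels = [10, 100, 1000]      # made-up levels, e.g. account sizes
factors = [0.5, 1.0, 2.0]     # made-up multiplicative noise around each level

for level in levels:
    raw, logged = spreads(level, factors)
    print(f"level={level}: raw spread={raw}, log spread={logged:.4f}")

# Back-transform: exp() undoes log(), returning values to their original units.
y = 250.0
print(math.exp(math.log(y)))
```

The raw spread grows with the level, but the log spread is log(2) - log(0.5) = log(4) everywhere, which is the constant spread the transformation is meant to produce.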
If you’re interested in reading more, SAGE has a book specifically on heteroscedasticity and transformations.