## SPSS macro performs K-means Cluster analysis and does the “heavy lifting”

I frequently play around with multivariate techniques like K-means cluster analysis in SPSS. There were some big holes in the SPSS procedure that performs cluster analysis, so I wrote an SPSS macro to automate what I wanted done. Below is a quick demo of the macro in use. It can really save a bunch of time! A few things mine does differently:

1. Assigns labels to the segments (these will be changed later, but they are better than just 1, 2, 3)
2. Computes frequencies on the size of the segments (Why SPSS doesn’t do this automatically is beyond me)
3. Color-codes and merges the significance testing in with the profile-plot showing what is different.
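The first two items in the list can be illustrated outside SPSS. What follows is a minimal Python sketch, not the macro itself: it assumes the cluster memberships already exist as a list of integers, and the function names `label_segments` and `segment_sizes` are hypothetical. It attaches placeholder labels and tabulates segment sizes.

```python
from collections import Counter


def label_segments(cluster_ids, prefix="Segment "):
    """Map numeric cluster ids to placeholder labels such as 'Segment 1'."""
    return [f"{prefix}{cid}" for cid in cluster_ids]


def segment_sizes(labels):
    """Return (label, count, percent) rows, sorted by label text."""
    counts = Counter(labels)
    total = len(labels)
    return [
        (label, n, round(100.0 * n / total, 1))
        for label, n in sorted(counts.items())
    ]


# Hypothetical cluster assignments for eight respondents.
cluster_ids = [1, 2, 2, 3, 1, 2, 3, 2]
labels = label_segments(cluster_ids)
for label, n, pct in segment_sizes(labels):
    print(f"{label}: n={n} ({pct}%)")
```

The labels are only placeholders, in the same spirit as the macro: they get renamed once the segments have been profiled.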

## Moving data from one column to many; Parsing a variable using an SPSS macro

Older online vendor tools and databases would frequently put multi-select questions into one column with a pipe, tab, semicolon, or comma delimiter (the real fun was when they used a comma as the delimiter in a CSV file).

This can be very problematic in nearly any tool. In this video I demonstrate how easy it can be to move data from one column to many with an SPSS macro.

## Move one column to many

Here is the SPSS macro demonstrated in the video:

```
*///////////////.
DEFINE !Parse (Var !TOKENS (1)  / Stem !TOKENS (1) /Del !TOKENS (1))
STRING #(A1000).
VECTOR !Stem(25A500).
COMPUTE #=CONCAT(RTRIM(!Var),!Del).
COMPUTE #cnt=1.
LOOP IF INDEX(#,!Del)>0.
COMPUTE !Stem(#cnt)=SUBSTR(#,1,INDEX(#,!Del)-1).
COMPUTE #cnt=#cnt + 1.
COMPUTE #=SUBSTR(#,INDEX(#,!Del)+1).
Var width !Concat(!Stem,1) to !Concat(!Stem,25) (10).
END LOOP.
EXECUTE.
!ENDDEFINE.
*///////////////.
/* !Parse Var=NAIC_All Stem=NAIC  Del=";".
```
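
For readers who want to check the logic outside SPSS, here is a rough Python sketch of the same idea: split one delimited value into a fixed number of columns and leave the unused columns empty. The function name `parse_column` and the sample values are assumptions for illustration. The 25-column limit and the `NAIC` stem mirror the macro above.

```python
def parse_column(value, delimiter=";", max_parts=25):
    """Split one delimited string into exactly max_parts columns.

    Pieces fill the columns from left to right; unused columns stay empty.
    """
    # Trailing blanks are trimmed first, as RTRIM does in the macro.
    parts = value.rstrip().split(delimiter) if value.strip() else []
    if len(parts) > max_parts:
        raise ValueError(f"found {len(parts)} pieces, but only {max_parts} columns")
    return parts + [""] * (max_parts - len(parts))


# Hypothetical multi-select answers stored in a single column.
rows = ["111;222;333", "444", ""]
for row in rows:
    columns = parse_column(row, delimiter=";", max_parts=4)
    named = {f"NAIC{i}": piece for i, piece in enumerate(columns, start=1)}
    print(named)
```

An empty piece between two delimiters (for example `a;;b`) keeps its position, so later pieces do not shift left.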

## Automating the creation of randomly split-out text files with an SPSS macro

I frequently need to randomize my lists and save them into separate text files. This SPSS macro makes it a breeze!

If you haven’t already played with macros, you’ll want to get familiar with them by reviewing this post on Intro to Macros. The macro needs a few parameters from you: the path where the files are saved, the stem (beginning) of the file names to create, the number of groups to create, and whether to drop the helper variables used to create the groups.

## Below is the SPSS Macro which will automate the process.

```
*/////////////////////.
DEFINE !Rand_Gp_txt (Path !Tokens(1)/Stem !Tokens (1)/Groups !TOKENS (1)  /DropNewVars !Tokens(1))
*/////drop GP if already exists .
Match files file=* / DROP Temp_GP.
Compute Rand=RV.UNIFORM(0,1).
RANK  VARIABLES=rand (A) /NTILES (!Groups) into Temp_GP /PRINT=Yes /TIES=HIGH.
Freq Temp_GP.
***********************break out and save in groups********************************.
!DO !cnt=1 !TO !Groups.
Temp.
Select if Temp_GP=!cnt.
SAVE TRANSLATE OUTFILE = !Path+!stem+!QUOTE(!CONCAT(!cnt,'.txt'))
/TYPE=TAB /FIELDNAMES /replace /KEEP= ALL  /drop=Temp_GP rand.
!DOEND

***Drop unwanted vars if *******.
!IF (!Unquote(!Upcase(!DropNewVars))="Y") !THEN
Match files file=* / DROP Rand Temp_GP.
!IFEND

!ENDDEFINE.
*/////////////////////.
!Rand_Gp_txt Path="c:\temp\" Stem="Rand_" Groups=2  DropNewVars="Y".
```

Remember to define the SPSS macro before calling it!
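
The same workflow can be sketched in Python for comparison. This is not a translation of the macro: instead of ranking a uniform random variable into n-tiles, it shuffles the rows and deals them out in turn, which also gives random groups of near-equal size. The function name `random_split_to_files`, the `seed` argument, and the sample rows are assumptions for illustration.

```python
import csv
import os
import random
import tempfile


def random_split_to_files(rows, header, path, stem, groups, seed=None):
    """Shuffle rows, cut them into `groups` parts, write one tab-delimited file per part."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    written = []
    for g in range(groups):
        # Taking every `groups`-th row gives parts whose sizes differ by at most one.
        part = shuffled[g::groups]
        filename = os.path.join(path, f"{stem}{g + 1}.txt")
        with open(filename, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle, delimiter="\t")
            writer.writerow(header)  # field names on the first line
            writer.writerows(part)
        written.append(filename)
    return written


# Hypothetical list of seven records, split into two files in a temporary folder.
rows = [[i, f"name_{i}"] for i in range(1, 8)]
with tempfile.TemporaryDirectory() as tmp:
    for name in random_split_to_files(rows, ["id", "name"], tmp, "Rand_", groups=2, seed=1):
        with open(name, encoding="utf-8") as handle:
            print(os.path.basename(name), len(handle.readlines()) - 1, "rows")
```

Passing a fixed `seed` makes the split repeatable, which helps when the same groups need to be recreated later.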

## Heteroscedasticity & Transformations

Financial data often has problems with heteroscedasticity (also spelled heteroskedasticity), the funnel-shaped pattern. Performing transformations is necessary; however, I often have a hard time explaining what the transformation is actually doing. This video helps show what is happening during a log transformation.

After watching the data literally “transform”, it was much easier to grasp exactly what was happening and how it would help. Just remember you’ll have to transform it back if you wish to interpret the values. While the video below was created using SYSTAT, I transform data so often that I have some killer SPSS macros which transform data effortlessly.

You want your data to be spread evenly around the line, with roughly the same variance along its whole length, which is “homoscedasticity”.
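
A small numeric illustration may help here. The Python sketch below uses made-up numbers (the `levels` and `factors` values are assumptions, not real financial data) in which the noise is proportional to the level. The raw spread grows with the level, which is the funnel. After a log transformation the spread is the same at every level, and `exp` transforms a value back.

```python
import math

# Hypothetical data: the same relative noise is applied at every level,
# so the absolute spread grows as the level grows (the funnel shape).
levels = [10.0, 100.0, 1000.0]
factors = [0.8, 0.9, 1.0, 1.1, 1.25]


def spread(values):
    """Range of a list of numbers (max minus min)."""
    return max(values) - min(values)


def raw_and_log_spread(level):
    """Return the spread at one level before and after a natural-log transformation."""
    raw = [level * f for f in factors]
    logged = [math.log(v) for v in raw]
    return spread(raw), spread(logged)


for level in levels:
    raw_spread, log_spread = raw_and_log_spread(level)
    print(f"level={level:>7}: raw spread={raw_spread:8.2f}  log spread={log_spread:.4f}")

# Transforming back: exp undoes the natural log.
value = 250.0
restored = math.exp(math.log(value))
print(f"back-transformed: {restored:.1f}")
```

One caution when transforming back: the exponential of an average of logged values is the geometric mean of the original values, not their arithmetic mean.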

If you’re interested in reading more, SAGE has a specific book on heteroscedasticity and transformations.