Metamajors clustering variations

# Clusters	Student Choices	Staff Choices	Combined
2 Clusters	students (sized)	teachers (sized)	combined (sized)
3 Clusters	students (sized)	teachers (sized)	combined (sized)
4 Clusters	students (sized*)	teachers (sized*)	combined (sized)
5 Clusters	students (sized)	teachers (sized)	combined (sized)
6 Clusters	students (sized)	teachers (sized)	combined* (sized)
7 Clusters	students (sized)	teachers (sized)	combined (sized)
8 Clusters	students (sized)	teachers (sized)	combined (sized)
9 Clusters	students (sized)	teachers (sized)	combined (sized)
10 Clusters	students (sized)	teachers (sized)	combined (sized)

Staff Clusters with Names

Group	Grid	Sized
Group 1	clusters	sized
Group 2	clusters	sized*
Group 3	clusters	sized
Group 4	clusters	sized
Group 5	clusters	sized
Group 6	clusters	sized
Group 7	clusters	sized
Group 8	clusters	sized
Group 9	clusters	sized
Group 10	clusters	sized
Group 11	clusters	sized
Group 12	clusters	sized
Group 13	clusters	sized*
Group 14	clusters	sized
Group 15	clusters	sized
Group 16	clusters	sized
Group 17	clusters	sized

More Information

This data comes from a series of surveys of both students and staff, who were asked to group majors into clusters that made sense to them. I primarily looked at how often a major was put in the same group as another major. From the survey results, I built one correlation matrix for student input and another for staff input. If you open it in Excel, you can shade the cells based on how frequently two majors were put in the same group. It gives you a giant table that looks like this.
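As a rough illustration of the counting step described above, here is a minimal sketch that tallies how often each pair of majors was placed in the same group. The data layout (a list of responses, each a list of groups), the function names, and the sample majors are all assumptions made for this example; the original source code may be organized quite differently.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(responses):
    """Count how often each pair of majors lands in the same group.

    `responses` is a list of survey responses; each response is a list of
    groups; each group is a list of major names (hypothetical layout).
    """
    counts = Counter()
    for groups in responses:
        for group in groups:
            # sorted() gives every pair one canonical order; set() drops repeats
            for a, b in combinations(sorted(set(group)), 2):
                counts[(a, b)] += 1
    return counts

def to_matrix(counts, majors):
    """Expand the pair counts into a symmetric square table (list of rows)."""
    index = {major: i for i, major in enumerate(majors)}
    matrix = [[0] * len(majors) for _ in majors]
    for (a, b), n in counts.items():
        i, j = index[a], index[b]
        matrix[i][j] = matrix[j][i] = n
    return matrix

# Made-up responses from two respondents
responses = [
    [["Biology", "Chemistry"], ["History", "English"]],
    [["Biology", "Chemistry", "History"], ["English"]],
]
majors = ["Biology", "Chemistry", "English", "History"]
counts = cooccurrence_counts(responses)
matrix = to_matrix(counts, majors)
```

Each row of `matrix` can be pasted into a spreadsheet and shaded by value. The diagonal is left at zero here; whether to fill it (for example with the number of respondents) is a separate choice.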

To find clearer trends in the clusters, I used Principal Component Analysis (PCA) to reduce these many dimensions to two (which we can easily graph) while maintaining, as far as possible, the relative "distances" between majors. I also ran the K-Means clustering algorithm with various numbers of clusters, although this was less enlightening.
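The two steps above can be sketched in plain Python. This is not the original implementation: it assumes each major is represented by its row of the matrix, fills the diagonal with a made-up respondent count, finds the principal components by power iteration, and seeds K-Means with a simple farthest-first rule so the result is deterministic. All names and numbers are illustrative; in practice a library such as scikit-learn would be the more usual choice.

```python
import math
import random

def center(rows):
    """Subtract each column's mean so the data is centered on the origin."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance(rows):
    """Covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def matvec(matrix, v):
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

def top_eigenvector(matrix, rng, iterations=500):
    """Power iteration: multiply by the matrix and renormalize, repeatedly."""
    v = [rng.random() for _ in matrix]
    for _ in range(iterations):
        w = matvec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w)) or 1.0
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, matvec(matrix, v)))
    return eigenvalue, v

def pca_2d(rows, seed=0):
    """Project each row onto the two directions of greatest variance."""
    rng = random.Random(seed)
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(2):
        value, vec = top_eigenvector(cov, rng)
        components.append(vec)
        # Deflation: remove the direction just found so the next pass
        # converges to the following component.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]

def kmeans(points, k, iterations=100):
    """K-Means with farthest-first seeding (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Made-up co-occurrence rows for Biology, Chemistry, English, History;
# the diagonal holds a hypothetical respondent count of 6.
matrix = [
    [6, 5, 0, 1],
    [5, 6, 1, 0],
    [0, 1, 6, 5],
    [1, 0, 5, 6],
]
coords = pca_2d(matrix)
# Try more than one cluster count, as described in the text
labels_by_k = {k: kmeans(coords, k) for k in (2, 3)}
```

`coords` holds one (x, y) pair per major, ready to plot. The sign and orientation of the axes are arbitrary, which is why only closeness is meaningful in the charts.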

How to read these charts

For the clusters above, for student choices, staff choices, or everyone combined, keep a few things in mind:

Locations don't matter. Closeness does. The closer two majors are, the more often people grouped them together.
Roll your mouse over the squares to see the major. If you're on a tablet, I apologize.
The 'sized' versions show (relatively) how many degrees of that type were awarded since 2009.
Please load this on your computer and spend a few minutes investigating. There are no answers here, just hints.
* An asterisk marks the ones I found interesting or enlightening.

Caveats

There are many ways to process and interpret these results. This is just one way.
With the staff set especially, there are only a few responses, and the respondents had quite different ideas of how to group things. What I'm showing is a sort of average, which isn't especially useful in that situation.
The staff groups included possible titles for the clusters. Since the responses are few and disparate, I included the titles in their entirety.

Next Steps

With more time, a possible next step would be a similar analysis based on how many classes each pair of degrees has in common.
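One way such an analysis might start is sketched below: for every pair of degrees, count the shared classes and also compute the share of classes in common (Jaccard overlap). The degree names, course codes, and function name are invented for illustration; nothing here comes from real catalog data.

```python
from itertools import combinations

def shared_classes(requirements):
    """For each pair of degrees, return (shared count, Jaccard overlap)."""
    result = {}
    for a, b in combinations(sorted(requirements), 2):
        common = requirements[a] & requirements[b]
        union = requirements[a] | requirements[b]
        ratio = len(common) / len(union) if union else 0.0
        result[(a, b)] = (len(common), ratio)
    return result

# Hypothetical required classes per degree
requirements = {
    "Biology": {"BIOL 101", "CHEM 101", "MATH 110"},
    "Chemistry": {"CHEM 101", "CHEM 102", "MATH 110"},
    "History": {"HIST 101", "ENGL 101"},
}
overlap = shared_classes(requirements)
```

The resulting pairwise values could be arranged into the same kind of square table as the survey counts and run through the same charts.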

source code

Based on similar work I did on US & California Senate, House and Assembly voting patterns.