Metamajors clustering variations

# Clusters   | Student Choices    | Staff Choices      | Combined
2 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
3 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
4 Clusters   | students (sized*)  | teachers (sized*)  | combined (sized)
5 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
6 Clusters   | students (sized)   | teachers (sized)   | combined* (sized)
7 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
8 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
9 Clusters   | students (sized)   | teachers (sized)   | combined (sized)
10 Clusters  | students (sized)   | teachers (sized)   | combined (sized)


Staff Clusters with Names

Group 1   | clusters | sized
Group 2   | clusters | sized*
Group 3   | clusters | sized
Group 4   | clusters | sized
Group 5   | clusters | sized
Group 6   | clusters | sized
Group 7   | clusters | sized
Group 8   | clusters | sized
Group 9   | clusters | sized
Group 10  | clusters | sized
Group 11  | clusters | sized
Group 12  | clusters | sized
Group 13  | clusters | sized*
Group 14  | clusters | sized
Group 15  | clusters | sized
Group 16  | clusters | sized
Group 17  | clusters | sized


More Information

This data comes from a series of surveys in which students and staff were asked to group majors into clusters that made sense to them. I primarily looked at how often each major was put in the same group as another major. From the survey results, I built one correlation matrix for the student input and another for the staff input. In Excel, you can shade the cells based on how frequently two majors were put in the same group, which gives you a giant table that looks like this.
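As a rough illustration of the counting step, here is a minimal Python sketch. It is not the original source code: the `responses` data, the major names, and the helper names `cooccurrence_counts` and `to_matrix` are all hypothetical, and it assumes each survey response can be represented as a list of groups of majors.

```python
from collections import Counter
from itertools import combinations

# Hypothetical survey responses (invented for illustration): each response
# is a list of groups, and each group lists the majors placed together.
responses = [
    [["Biology", "Chemistry"], ["History", "English"]],
    [["Biology", "Chemistry", "History"], ["English"]],
    [["Biology", "English"], ["Chemistry", "History"]],
]


def cooccurrence_counts(responses):
    """Count how often each unordered pair of majors shares a group."""
    counts = Counter()
    for groups in responses:
        for group in groups:
            # Sort so that each pair is always stored in the same order.
            for a, b in combinations(sorted(set(group)), 2):
                counts[(a, b)] += 1
    return counts


def to_matrix(counts, majors):
    """Expand the pair counts into a symmetric matrix (list of lists)."""
    index = {major: i for i, major in enumerate(majors)}
    matrix = [[0] * len(majors) for _ in majors]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return matrix


majors = sorted({m for groups in responses for group in groups for m in group})
counts = cooccurrence_counts(responses)
matrix = to_matrix(counts, majors)

for major, row in zip(majors, matrix):
    print(f"{major:<10}", row)
```

Each row of `matrix` could then be pasted into a spreadsheet and shaded by value to get the kind of table described above.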

To find clearer trends in the clusters, I used Principal Component Analysis (PCA) to reduce these many dimensions to two, which we can easily graph, while preserving the relative "distances" between majors as far as possible. I also ran the K-Means clustering algorithm with various numbers of clusters, although this was less enlightening.
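The two techniques can be sketched in a few lines of Python with NumPy. This is a sketch under assumptions, not the code behind the charts: the matrix values are invented, the diagonal is set to a hypothetical respondent count of 10, K-Means is run on the full rows rather than on the 2D coordinates, and `pca_2d` and `kmeans` are illustrative helpers. A library such as scikit-learn provides ready-made versions of both.

```python
import numpy as np

# Hypothetical co-occurrence matrix for six majors (rows and columns share
# this order). All values are invented; the diagonal assumes 10 respondents.
majors = ["Biology", "Chemistry", "Physics", "History", "English", "Art"]
cooc = np.array([
    [10, 9, 8, 1, 0, 1],
    [9, 10, 9, 0, 1, 0],
    [8, 9, 10, 1, 0, 0],
    [1, 0, 1, 10, 9, 7],
    [0, 1, 0, 9, 10, 8],
    [1, 0, 0, 7, 8, 10],
], dtype=float)


def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row of X."""
    rng = np.random.default_rng(seed)
    # Start from k distinct rows chosen at random.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each row to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its rows (keep it if it is empty).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


coords = pca_2d(cooc)        # one (x, y) point per major, ready to plot
labels = kmeans(cooc, k=2)   # one cluster label per major

for major, (x, y), label in zip(majors, coords, labels):
    print(f"{major:<10} x={x:7.2f} y={y:7.2f} cluster={label}")
```

Changing `k` in the call to `kmeans` corresponds to the different rows of the table at the top of this page.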

How to read these charts

For the cluster charts above, whether for student choices, staff choices, or everyone combined, keep a few things in mind:


Next Steps

With more time, a possible next step would be a similar analysis based on how many classes each pair of degrees has in common.
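A minimal sketch of what that comparison might look like, assuming each degree can be represented as a set of required courses. The `requirements` data, the course codes, and the `shared_class_scores` helper are hypothetical, and the Jaccard ratio is one possible normalization I have added for illustration, not something the analysis above used.

```python
from itertools import combinations

# Hypothetical degree requirements (invented): degree -> required courses.
requirements = {
    "Biology": {"BIOL 101", "CHEM 101", "MATH 110"},
    "Chemistry": {"CHEM 101", "CHEM 102", "MATH 110"},
    "History": {"HIST 101", "ENGL 101"},
}


def shared_class_scores(requirements):
    """For each pair of degrees, return (shared class count, Jaccard ratio)."""
    scores = {}
    for a, b in combinations(sorted(requirements), 2):
        shared = requirements[a] & requirements[b]
        union = requirements[a] | requirements[b]
        # Jaccard ratio: shared classes divided by all classes in either degree.
        ratio = len(shared) / len(union) if union else 0.0
        scores[(a, b)] = (len(shared), ratio)
    return scores


scores = shared_class_scores(requirements)
for pair, (count, ratio) in scores.items():
    print(pair, count, round(ratio, 2))
```

The resulting pair scores could be arranged into a matrix in the same way as the survey counts and then run through the same PCA and K-Means steps.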

  source code

Based on similar work I did on US & California Senate, House and Assembly voting patterns.