What are PC1 and PC2? Are they player clusters you've made? I notice that both Phils have the same PC1 and PC2. PC2 is nicely correlated with Attacks and Kills but not as much for the others. PC1 is really separating the diggers and blockers but PC2 is not.

Also, while watching that Austin video, it was clear that the wind was a big advantage. It might be interesting to infer wind advantage by determining the total points won on each side. Then you could characterize players on ability to play with and against the wind advantage. There are so many opportunities for playing with various combinations. Thank you for starting this blog.

Thanks for the comment. PC1 and PC2 are the top 2 "principal components" of the dataset. Principal components "compress" the information in the variables used for clustering so that the players can be visualized in 2 dimensions. You can read more about it here: https://en.wikipedia.org/wiki/Principal_component_analysis
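
For a rough idea of what that compression looks like, here is a minimal sketch in Python. The stat columns and all of the numbers are made up for illustration (they are not the dataset from the post), and `top_two_components` is a hypothetical helper, not necessarily how the plot was produced.

```python
import numpy as np

def top_two_components(stats):
    """Project rows (players) of `stats` onto the first two principal components."""
    X = np.asarray(stats, dtype=float)
    # Standardize each column so no single stat dominates because of its scale.
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    # SVD of the standardized data: rows of Vt are the principal directions.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = X @ Vt[:2].T                 # PC1 and PC2 value for each player
    explained = S**2 / np.sum(S**2)       # share of variance per component
    return scores, Vt[:2], explained[:2]

# Hypothetical per-set averages: attacks, blocks, kills, digs (invented numbers).
player_stats = np.array([
    [8.1, 0.9, 5.2, 0.4],
    [7.4, 1.1, 4.8, 0.6],
    [3.2, 4.5, 2.1, 2.9],
    [2.8, 5.0, 1.9, 3.3],
    [5.5, 2.7, 3.6, 1.5],
])

scores, loadings, explained = top_two_components(player_stats)
print(scores)      # one (PC1, PC2) pair per player
print(loadings)    # how strongly each original stat contributes to PC1 and PC2
print(explained)   # fraction of total variance captured by PC1 and PC2
```

The `loadings` rows are what tell you which original stats each component is tracking, which is the kind of pattern you noticed between PC2 and Attacks/Kills.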

Thanks for the thoughts on the wind conditions. That's a great storyline, and I'd like to see if I could somehow correlate historical conditions with player performance! However, I don't think the data would ever be able to discern the side advantage, given that it likely contains only the average wind speed per day (plus there isn't any info on court orientation, etc.).

Thanks. I was thinking you wouldn’t need to know anything about the reported wind or court orientation. You should be able to infer a side advantage from the number of points scored by each team on each side. You know they switch every 7 points, so you can detect whether there is an advantage. You’d have to group by set, not match, since you don’t know which side they start on, but that isn’t a problem. This would also incorporate other side advantages, like sun in the eyes or lines that were not set up correctly. But I think the majority of side advantage is wind.
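
To make the idea concrete, here is a minimal sketch in Python. It assumes point-by-point data is available as an ordered sequence of point winners for one set, which is an assumption about the data format; `side_point_counts` and the example set are hypothetical, and the 7-point switch is taken from the comment above and left as a parameter.

```python
def side_point_counts(point_winners, switch_every=7):
    """Count points won from each side of the court in a single set.

    point_winners: ordered sequence of "A" / "B", one entry per point.
    Side 0 is whichever side team A starts the set on; side 1 is the other.
    Returns [points won from side 0, points won from side 1].
    """
    won = [0, 0]
    for i, winner in enumerate(point_winners):
        switches_so_far = i // switch_every
        a_side = switches_so_far % 2          # team A's side for this point
        winner_side = a_side if winner == "A" else 1 - a_side
        won[winner_side] += 1
    return won

def side_share(point_winners, switch_every=7):
    """Fraction of the set's points won from side 0."""
    won = side_point_counts(point_winners, switch_every)
    return won[0] / sum(won)

# Invented example: whoever is on side 0 wins 5 of every 7 points.
example_set = "AAAAABB" + "BBBBBAA"
counts = side_point_counts(example_set)
share = side_share(example_set)
print(counts, round(share, 3))
```

A share near 0.5 would suggest no side advantage in that set, while a share far from 0.5 in either direction would suggest one. Because side 0 is only defined relative to where team A starts, each set has to be scored separately, as noted above.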

That might be possible with the international data, but not with AVP. I don’t have the point-by-point data, just the final set scores.

Right. That makes sense. I forgot AVP doesn’t have score history like that. Crazy, but true. I did hear that BVB might have that for AVP because they were scraping the AVP website as the live stats were broadcast.
