While performing agglomerative hierarchical clustering on a large GPS trajectory data set, you notice that the dendrogram connects two geographically distant city clusters through a long sequence of points located along an interstate highway. The resulting cluster is a snake-like chain formed by stepwise merging of individual observations, even though most points are far apart overall. This phenomenon is known as the chaining effect.
Which linkage criterion is most likely to have produced this behavior?
Mean pairwise distance between all points in the two clusters, also called average linkage
Maximum pairwise (farthest-neighbor) distance between clusters, also called complete linkage
Minimum pairwise (nearest-neighbor) distance between clusters, also called single linkage
Merge that yields the smallest increase in total within-cluster variance, also known as Ward's method
Single (nearest-neighbor) linkage defines the distance between two clusters as the distance between their closest pair of points, one from each cluster, and at every step merges the two clusters for which that distance is smallest. Because only one close pair is needed, clusters can accrete observations one at a time along a path of locally close neighbors, creating elongated, chain-shaped clusters that join regions that are globally far apart.
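In symbols, the single-linkage rule can be written as follows. The notation is an assumption introduced here for illustration and does not come from the question: A and B are clusters, and d(x, y) is the distance between two individual points.

```latex
d_{\text{single}}(A, B) = \min_{x \in A,\; y \in B} d(x, y)
```

Since the minimum looks at only one pair, a single highway point lying close to the edge of a cluster is enough to pull that point in, whatever the distance between the remaining points.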
Complete linkage uses the farthest pair, which produces compact clusters and rarely exhibits chaining. Average linkage averages all pairwise distances and therefore sits between the extremes of single and complete linkage. Ward's method chooses the merge that causes the smallest increase in total within-cluster variance and generally favors similarly sized, compact clusters. None of these alternatives is as prone to the chaining effect as single linkage, making single linkage the most likely culprit in the scenario.
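The contrast between single and complete linkage can be reproduced on a toy version of the scenario. The sketch below is a naive, pure-Python illustration, not a production implementation. The coordinates, the `agglomerate` helper, and the distance threshold of 1.5 are all hypothetical choices made for this example: two tight "city" groups sit 10 units apart and are joined by "highway" points spaced 1 unit apart.

```python
from itertools import combinations
from math import dist


def linkage_distance(a, b, linkage):
    """Distance between clusters a and b (lists of points) under a linkage."""
    pairwise = [dist(p, q) for p in a for q in b]
    # Single linkage takes the closest pair, complete linkage the farthest.
    return min(pairwise) if linkage == "single" else max(pairwise)


def agglomerate(points, linkage, threshold):
    """Merge the closest pair of clusters until that pair exceeds threshold."""
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        # Find the indices (i < j) of the two closest clusters.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage_distance(
                clusters[ij[0]], clusters[ij[1]], linkage
            ),
        )
        if linkage_distance(clusters[i], clusters[j], linkage) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical data: two tight "cities" joined by evenly spaced "highway" points.
city_a = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2)]
highway = [(float(x), 0.0) for x in range(1, 10)]
city_b = [(10.0, 0.0), (10.2, 0.0), (10.0, 0.2)]
points = city_a + highway + city_b

single = agglomerate(points, "single", threshold=1.5)
complete = agglomerate(points, "complete", threshold=1.5)

print("single-linkage clusters:  ", len(single))
print("complete-linkage clusters:", len(complete))
```

Under these assumptions, single linkage chains everything into one cluster, because each point has a neighbor within the threshold. Complete linkage keeps the two cities apart, because it only merges clusters whose farthest pair of points is within 1.5 units, so no cluster can span the 10-unit gap.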