Case-based learning (reasoning)
Kernel trick: we can substitute any similarity function in place of the dot product
classification by similarity
e.g. an image as a vector of pixel intensities 0 < i < 1: the dot product between such vectors is a similarity function
similarity can also be measured by distance (smaller distance = more similar)
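A minimal sketch of the two similarity measures above, on made-up "image" vectors with intensities in (0, 1) (the vectors and values are illustrative, not from the notes):

```python
import math

# Two toy "image" vectors with pixel intensities in (0, 1).
a = [0.9, 0.1, 0.8, 0.2]
b = [0.8, 0.2, 0.9, 0.1]

# Dot product as a similarity: larger means more alike.
dot = sum(x * y for x, y in zip(a, b))

# Euclidean distance as a dissimilarity: smaller means more alike.
dist = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(dot, dist)
```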
kNN - non-parametric; the answer is local (determined only by nearby examples)
Trade-offs: small k uses only the most relevant neighbors (risk of overfitting); large k gives smoother decision functions (a more moderate fit)
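A minimal kNN classifier sketch illustrating the local, majority-vote answer; the training points and the `knn_predict` name are illustrative assumptions:

```python
from collections import Counter
import math

def knn_predict(train, query, k):
    """Classify `query` by majority vote among the k nearest training points."""
    # train: list of (vector, label) pairs
    neighbors = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.9, 1.0), "B"), ((1.0, 0.8), "B")]
print(knn_predict(train, (0.2, 0.1), k=3))  # decided entirely by the local neighborhood
```

Varying `k` here shows the trade-off directly: k=1 follows individual points, larger k averages over a wider region.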
k-means clustering
- 1. pick k random centers
- 2. assign each point to its nearest center (initial membership)
- 3. move each center to the middle of its cluster (mean of the coordinates)
- 4. reassign points to their now-nearest center
- repeat
- stop when nothing changes
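The steps above can be sketched as Lloyd's algorithm; the function name, seed, and toy points are assumptions for illustration:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """k-means on n-D points given as tuples, following the steps above."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)                      # 1. pick k random centers
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:                                 # 2./4. assign to nearest center
            i = min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            clusters[i].append(p)
        new = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centers[i]
               for i, cl in enumerate(clusters)]         # 3. move to mean of coordinates
        if new == centers:                               # stop when nothing changes
            break
        centers = new                                    # repeat
    return centers

pts = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
print(sorted(kmeans(pts, 2)))
```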
Agglomerative clustering
- there is a distance between every pair of points
- merge the closest pair first
- a merged cluster is then treated as a single point, and merging repeats
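A sketch of the merging loop above, where a merged cluster is represented by its centroid so it can be treated as a single point (one of several possible linkage choices; the function name and stopping rule "merge until `target_k` clusters remain" are assumptions):

```python
import math

def agglomerative(points, target_k):
    """Repeatedly merge the two closest clusters until target_k remain."""
    clusters = [([p], p) for p in points]  # (members, centroid-as-point)
    while len(clusters) > target_k:
        # find the pair of clusters with the smallest centroid distance
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: math.dist(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        members = clusters[i][0] + clusters[j][0]
        centroid = tuple(sum(c) / len(members) for c in zip(*members))
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append((members, centroid))   # merged cluster is now one "point"
    return [m for m, _ in clusters]

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.0, 5.1)]
groups = agglomerative(pts, 2)
print(sorted(sorted(g) for g in groups))
```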
Parametric models - a fixed set of parameters; non-parametric models - no fixed limit, the classifier grows as the data grows
Kernelization
the weight vector (the primal representation) can be recovered from the per-example update counts (the dual representation): w = sum_i alpha_i * y_i * x_i
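A sketch of a dual (kernel) perceptron, assuming the standard mistake-driven update: `alpha[i]` counts updates on example i, prediction uses only kernel evaluations, and with a linear kernel the primal weights are recovered as w = sum_i alpha_i * y_i * x_i. The data and function names are illustrative:

```python
def kernel_perceptron(data, kernel, epochs=10):
    """Dual perceptron: keep one update count alpha[i] per training example
    instead of a weight vector; predict via kernel evaluations only."""
    alpha = [0] * len(data)
    for _ in range(epochs):
        for i, (x, y) in enumerate(data):
            s = sum(a * yj * kernel(xj, x) for a, (xj, yj) in zip(alpha, data))
            if y * s <= 0:          # mistake: count one more update on example i
                alpha[i] += 1
    return alpha

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

data = [((1.0, 1.0), 1), ((2.0, 1.5), 1), ((-1.0, -1.0), -1), ((-1.5, -0.5), -1)]
alpha = kernel_perceptron(data, dot)

# With the linear kernel, recover the primal weight vector from the dual counts.
w = [0.0, 0.0]
for a, (x, y) in zip(alpha, data):
    w = [wk + a * y * xk for wk, xk in zip(w, x)]
print(alpha, w)
```

Replacing `dot` with any other kernel (the "kernel trick" above) changes the similarity without changing the algorithm, at the cost of no longer having an explicit primal w.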