Variance captures the effect of the random nature of the finite dataset, not of the model family. It measures how much a model changes if we train it on a different training set, i.e. the sensitivity of the model to the randomness in the dataset.
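To make this sensitivity concrete, here is a minimal sketch that refits the same model on many freshly drawn training sets and measures how much its prediction at one point moves. Everything specific in it is an assumption chosen for illustration and is not taken from the notes: the sine target, the fixed input grid, the noise level, the polynomial degrees, and the helper name `prediction_variance`.

```python
import numpy as np
from numpy.polynomial import Polynomial


def prediction_variance(degree, n_points=20, n_datasets=500,
                        noise_std=0.3, x0=0.5, seed=0):
    """Variance of the fitted prediction at x0 across resampled training sets.

    Hypothetical helper: inputs are kept on a fixed grid and only the label
    noise is redrawn, so each iteration is a new training set drawn from the
    same assumed data-generating process.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)
    true_y = np.sin(2 * np.pi * x)  # assumed ground-truth function

    preds = np.empty(n_datasets)
    for i in range(n_datasets):
        # Draw a new noisy training set and refit the polynomial model.
        y = true_y + rng.normal(0.0, noise_std, size=n_points)
        model = Polynomial.fit(x, y, degree)
        preds[i] = model(x0)

    # Spread of the prediction caused purely by dataset randomness.
    return preds.var()


var_simple = prediction_variance(degree=1)
var_flexible = prediction_variance(degree=7)
print(f"degree 1 prediction variance: {var_simple:.5f}")
print(f"degree 7 prediction variance: {var_flexible:.5f}")
```

Under these assumptions the more flexible degree-7 fit should show the larger spread, because it follows the noise in each individual training set more closely.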
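$$
\begin{aligned}
V(X \pm Y) &= E[(x \pm y)^2] - \mu_{x \pm y}^2 \\
&= E[(x \pm y)^2] - (\mu_x \pm \mu_y)^2 \\
&= E[x^2 \pm 2xy + y^2] - (\mu_x^2 \pm 2\mu_x\mu_y + \mu_y^2) \\
&= E(x^2) \pm 2E(xy) + E(y^2) - \mu_x^2 \mp 2\mu_x\mu_y - \mu_y^2 \\
&= E(x^2) - \mu_x^2 \pm 2\,\bigl(E(xy) - \mu_x\mu_y\bigr) + E(y^2) - \mu_y^2 \\
&= V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)
\end{aligned}
$$

$$
V(X \pm Y) = V(X) + V(Y) \pm 2\,\mathrm{Cov}(X, Y)
$$

$$
V(X + Y) = V(X) + V(Y), \quad \text{if } X \text{ and } Y \text{ are independent}
$$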
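The identity V(X ± Y) = V(X) + V(Y) ± 2 Cov(X, Y) can be checked numerically. The sketch below is only an illustration: the sample size, the seed, and the way `y` is made to correlate with `x` are arbitrary assumptions. Variance and covariance use the same normalization (`ddof=0`), in which case the identity holds for the sample statistics up to floating-point rounding.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Assumed toy data: y is built from x so that the two are correlated.
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(size=n)

# Population-style covariance (ddof=0) to match np.var's default.
cov_xy = np.cov(x, y, ddof=0)[0, 1]

lhs_plus = np.var(x + y)
rhs_plus = np.var(x) + np.var(y) + 2 * cov_xy

lhs_minus = np.var(x - y)
rhs_minus = np.var(x) + np.var(y) - 2 * cov_xy

print(f"V(X+Y) = {lhs_plus:.6f}, V(X)+V(Y)+2Cov = {rhs_plus:.6f}")
print(f"V(X-Y) = {lhs_minus:.6f}, V(X)+V(Y)-2Cov = {rhs_minus:.6f}")
```

If `y` were drawn independently of `x`, `cov_xy` would be close to zero (though not exactly zero in a finite sample), and both sides would be approximately V(X) + V(Y).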