Universal approximation theorem (Cybenko '89): any continuous function on a compact domain can be approximated arbitrarily well by a feed-forward network with at least one hidden layer and a finite number of hidden units.
One sufficiently wide hidden layer is therefore enough in principle, but such a network tends to act as a memorizer rather than learn a generalizing representation.
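A minimal formal statement of the approximation form, assuming a sigmoidal activation sigma as in Cybenko's original result (F_N and K are notation introduced here, not from the text above):

```latex
% Cybenko (1989): one-hidden-layer network with N sigmoidal units.
% For any continuous f on a compact K \subset \mathbb{R}^n and any
% \varepsilon > 0, some finite N and parameters (\alpha_i, w_i, b_i) give:
\[
  F_N(x) = \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + b_i\right),
  \qquad
  \sup_{x \in K} \bigl| F_N(x) - f(x) \bigr| < \varepsilon .
\]
```

And a minimal numerical sketch of the same idea (not Cybenko's construction, which is non-constructive): fitting a one-hidden-layer sigmoid network to a continuous target by plain gradient descent. The width, learning rate, and step count are illustrative assumptions.

```python
# Sketch: approximate sin(x) on [-pi, pi] with one hidden sigmoid layer.
import numpy as np

rng = np.random.default_rng(0)

# Target: a continuous function sampled on a compact interval.
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

N = 50                           # hidden width (finite number of units)
W = rng.normal(0, 1, (1, N))     # input-to-hidden weights
b = rng.normal(0, 1, N)          # hidden biases
a = rng.normal(0, 0.1, (N, 1))   # hidden-to-output weights

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 0.05
for step in range(20000):
    h = sigmoid(x @ W + b)       # hidden activations, shape (200, N)
    pred = h @ a                 # network output F_N(x)
    err = pred - y               # residual
    # Gradients of mean squared error w.r.t. a, W, b.
    grad_a = h.T @ err / len(x)
    grad_h = err @ a.T * h * (1 - h)
    grad_W = x.T @ grad_h / len(x)
    grad_b = grad_h.mean(axis=0)
    a -= lr * grad_a
    W -= lr * grad_W
    b -= lr * grad_b

print("max |F_N(x) - sin(x)| =", np.abs(pred - y).max())
```

The network interpolates the samples well, but nothing in this fit constrains behavior off the training grid, which is the "memorizer" caveat above.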