The first step in that program is to identify the correct components to analyze.
Unfortunately, the most natural computational unit of the neural network – the neuron itself turns out not to be a natural unit for human understanding. This is because many neurons are polysemantic(Superposition). Superposition can arise naturally during the course of neural network training if the set of features useful to a model are sparse in the training data.
Safety features
- High-Level Actions
- Planning
- Social Reasoning
- Personas