AI Neuron Activation

Creator
Seonglae Cho
Created
2023 Nov 12 18:44
Edited
2025 Feb 3 12:43
Unfortunately, the most natural computational unit of the neural network, the neuron itself, turns out not to be a natural unit for human understanding. This is because many neurons are polysemantic (see Superposition Hypothesis): a single neuron responds to several unrelated features. Superposition can arise naturally during neural network training if the features useful to a model are sparse in the training data.
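As a rough illustration of the paragraph above, here is a minimal NumPy sketch of superposition. It is a hand-built toy, not a trained model: the weight matrix `W`, the choice of 5 features in 2 neurons, and the helper names `encode` and `decode` are all assumptions made for this example. It shows that each neuron carries weight for several features (polysemanticity), that a single active feature can still be read back out, and that two simultaneously active features interfere.

```python
import numpy as np

# Assumption: 5 features squeezed into 2 neurons, with feature directions
# spaced evenly around the unit circle (a hand-picked toy layout).
n_features, n_neurons = 5, 2
angles = 2 * np.pi * np.arange(n_features) / n_features
W = np.stack([np.cos(angles), np.sin(angles)])  # shape (2, 5); column i = direction of feature i


def encode(x):
    """Map a feature vector (length 5) to neuron activations (length 2)."""
    return W @ x


def decode(h):
    """Read features back out with the transposed weights and a ReLU."""
    return np.maximum(W.T @ h, 0.0)


# Polysemanticity: count how many features each neuron has a non-trivial weight for.
features_per_neuron = (np.abs(W) > 0.1).sum(axis=1)

# Sparse case: exactly one feature active at a time. The strongest readout
# is the feature that was actually active.
recovered = [int(np.argmax(decode(encode(np.eye(n_features)[i]))))
             for i in range(n_features)]

# Dense case: features 0 and 2 active together. Their directions interfere,
# and the strongest readout lands on feature 1, which was not active.
dense_input = np.zeros(n_features)
dense_input[[0, 2]] = 1.0
dense_readout = decode(encode(dense_input))
dense_top = int(np.argmax(dense_readout))

print("features per neuron:", features_per_neuron.tolist())
print("recovered (sparse):", recovered)
print("top feature (dense):", dense_top)
```

In this toy layout the readout is only reliable when few features are active at once, which is one way to see why sparsity in the data makes superposition a workable strategy and why individual neurons end up hard to interpret.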
AI Neuron Activation Notion

Safety features

There are features representing more abstract properties of the input. Might there also be more abstract, higher-level actions that trigger behaviors over the span of multiple tokens?
  • High-Level Actions
  • Planning
  • Social Reasoning
  • Personas

Monosemanticity

Runtime monitoring

 
 
 
