Personal Interest

Knowing how to explain my own interests
is very important in small talk.

AI Safety, Activation Engineering

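A humanoid being told to "kill people" is a good example of a jailbreak. Giving instructions in human language is prompt engineering, but because it is natural language, the instruction can be implicit. We need an explicit, unambiguous way to deal with dangerous prompts, and that is activation engineering.

Like the human neural net, an AI neural net has neurons, and by manually adding a vector to their activation values we can induce the behavior we want. This is called a steering vector. It explicitly makes the model perform or block a behavior, like turning a switch on and off. For example, if you insert a Chinese vector, the model answers only in Chinese, as if a Chinese switch had been turned on, and if you turn on an ocean topic switch, it keeps talking about the ocean whether you ask for a poem or a story. Controlling AI in this manual and reproducible way is activation engineering.

The theoretical background for this is mechanistic interpretability, the AI field that mathematically analyzes neuron activations, the principles by which learning happens, and the roles that a model's internal components play in producing intelligence.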
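A minimal sketch of the steering idea above, using toy 3-dimensional activations in plain Python. The difference-of-means recipe for building the vector, the function names, and all numbers are my own assumptions for illustration; real steering would act on a hidden layer of an actual model.

```python
# Toy activation steering. All names and numbers here are hypothetical.

def mean_vector(vectors):
    """Average a list of equal-length activation vectors, dimension by dimension."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]

def steering_vector(with_concept, without_concept):
    """Assumed recipe: mean activation with the concept minus mean without it."""
    on = mean_vector(with_concept)
    off = mean_vector(without_concept)
    return [a - b for a, b in zip(on, off)]

def steer(hidden, vector, strength=1.0):
    """Add the scaled steering vector to a hidden activation (the 'switch')."""
    return [h + strength * v for h, v in zip(hidden, vector)]

# Pretend activations recorded on ocean-related prompts and on neutral prompts.
ocean_acts = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
neutral_acts = [[1.0, 0.0, 0.0], [3.0, 0.0, 0.0]]

ocean_vector = steering_vector(ocean_acts, neutral_acts)
hidden = [0.5, -1.0, 0.0]

switched_off = steer(hidden, ocean_vector, strength=0.0)  # activation left unchanged
switched_on = steer(hidden, ocean_vector, strength=1.5)   # pushed along the "ocean" direction
```

The `strength` argument plays the role of the switch: zero leaves the activation alone, and a larger value pushes it further along the concept direction. The same inputs always give the same output, which is what makes the method reproducible.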

Interpretable AI, Mechanistic Interpretability

I am studying how intelligence emerges from the combination of matrix multiplications and activation functions. What is really fascinating is that, even though AI and the brain are built from different computational blocks, similar circuits are found in both.
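As a small illustration of what "matrix multiplication plus activation function" can do, here is a sketch of a two-layer network that computes XOR, a function that no single linear map can compute. The weights are hand-picked by me for the example, not learned, and the names are hypothetical.

```python
# Two matrix multiplications with a ReLU in between, computing XOR.
# Weights are hand-picked for illustration, not trained.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def add(a, b):
    """Element-wise vector addition (used for the bias)."""
    return [x + y for x, y in zip(a, b)]

def relu(values):
    """Activation function: keep positive values, zero out the rest."""
    return [max(0.0, v) for v in values]

W1 = [[1.0, 1.0], [1.0, 1.0]]  # first layer weights
b1 = [0.0, -1.0]               # first layer bias
W2 = [[1.0, -2.0]]             # second layer weights

def tiny_network(x):
    hidden = relu(add(matvec(W1, x), b1))
    return matvec(W2, hidden)[0]

outputs = [tiny_network([a, b]) for a, b in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]]
```

Without the `relu` call the two layers would collapse into one linear map and the XOR pattern would be impossible, which is why the activation function matters as much as the matrix multiplication.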

The interesting finding is that neuron activation is really sparse. It means LLMs use only a small part of their neurons at a time, which is similar to the human brain. Have you heard that humans use only 10% of their brain? That is actually not true, but it is true on a very short time scale. Human neurons are sparsely activated at any given moment, yet over a longer time span every neuron is used. In other words, different neurons serve different functions, so activating every neuron at the same time would not mean anything. The pattern of which neurons fire is itself the signal for processing information. The point is that AI also uses its activations sparsely to compute information.

One more thing is that sparsity saves space, since world knowledge is too diverse and brain or model size is limited by cost. Features share dimensions, which means a single neuron can carry multiple meanings. This is called the superposition hypothesis, and it is thought to have emerged as a way to use a limited number of neuron dimensions efficiently. It is possible because neuron activation is sparse, and it has theoretical support from compressed sensing.
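A toy sketch of superposition under my own assumptions: three features are stored in only two dimensions by giving each feature a direction 120 degrees apart, and each feature is read back with a dot product followed by a ReLU. The readout rule and all names are hypothetical, and this is an illustration of the intuition, not a compressed sensing algorithm.

```python
import math

def feature_directions(count):
    """Spread `count` unit vectors evenly around a circle in 2 dimensions."""
    return [(math.cos(2 * math.pi * i / count), math.sin(2 * math.pi * i / count))
            for i in range(count)]

def encode(features, directions):
    """Superpose features: sum each feature value times its direction."""
    x = sum(f * d[0] for f, d in zip(features, directions))
    y = sum(f * d[1] for f, d in zip(features, directions))
    return (x, y)

def decode(point, directions):
    """Read each feature back with a dot product, clipping negatives (ReLU)."""
    return [max(0.0, point[0] * d[0] + point[1] * d[1]) for d in directions]

directions = feature_directions(3)

# Sparse input: only one feature active, so it is recovered without interference.
sparse_readout = decode(encode([1.0, 0.0, 0.0], directions), directions)

# Dense input: two features active at once, so they interfere with each other.
dense_readout = decode(encode([1.0, 1.0, 0.0], directions), directions)

sparse_rounded = [round(v, 6) for v in sparse_readout]
dense_rounded = [round(v, 6) for v in dense_readout]
```

In this toy setup the sparse case comes back intact while the dense case reads each active feature at about half its true value, which is the sense in which sharing dimensions only works when activations are sparse.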

Artificial Intelligence

The o1-preview model is reportedly the first model to surpass an IQ of 100, which is the average human IQ. Since postmodernism, IQ has often been regarded as a harmful and biased metric, but statistically it is actually useful for predicting earnings and several other aspects of a person's life.

Terence Tao has also said that the o1 model is roughly at the level of a postgraduate student. OpenAI claims that the o1 model is at the level of a PhD student.

AI is a dumb but highly knowledgeable intelligence, like a five-year-old boy with all the world's information.

In-context learning and phase changes, deep double descent, grokking

The Transformer is special for exactly one reason: it scales. Any AI model that scales can surpass human intelligence.

Recommendations
