UST D3CODE 2024

Created

2024 Sep 27 23:5

Creator

Seonglae Cho

Editor

Seonglae Cho

Edited

2024 Nov 23 14:25

Scale

scalable solutions that create a positive social impact

innovation, problem-solving, design thinking, and programming skills

not only address immediate challenges but also create lasting, positive social impact

~~Scaling the foundation of the future~~

~~Local beginnings to global influence~~

Building ethical frameworks that scale

~~Scaling impact~~

~~Exploring future frontiers~~

~~Human in the loop~~

UST D3CODE 2024 Plans

UST D3CODE 2024 Ideagen

UST D3CODE 2024 Prototype

AI Governance principles

Explainability: Mechanistic Interpretability is central to the project, providing a promising path for understanding how AI models make decisions

Safety: Activation Engineering offers a mechanism for making AI models safer, especially when handling advanced systems like AGI, which may pose existential risks

Transparency: By directly manipulating activations and making changes explicit, the system provides a transparent method for controlling AI

Reproducibility: Unlike Prompt Engineering, whose results depend on the variability of natural-language wording, Activation Engineering provides more reproducible results through mathematical feature manipulation

Robustness: Activation Engineering intervenes on internal activations rather than on the prompt, so its effect can take precedence over the input text, providing a more robust system for resisting jailbreaking and malicious inputs
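
The principles above can be illustrated with a minimal sketch of activation steering, assuming a toy two-layer network in plain Python. The weights `W_IN` and `W_OUT`, the `forward` function, and the `steer` vector are all hypothetical and chosen only for illustration; they are not the project's model or code. The sketch shows the intervention as an explicit, deterministic edit to hidden activations rather than a change to the prompt.

```python
from typing import List, Optional

Vector = List[float]

# Hypothetical toy weights; a real model would have learned parameters.
W_IN: List[Vector] = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 hidden units x 2 inputs
W_OUT: Vector = [1.0, -1.0, 0.5]                            # linear readout over 3 hidden units


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def forward(x: Vector, steering: Optional[Vector] = None, alpha: float = 1.0) -> float:
    """Run the toy model, optionally adding alpha * steering to the hidden layer."""
    hidden = [max(0.0, dot(row, x)) for row in W_IN]  # ReLU hidden activations
    if steering is not None:
        # The intervention: an explicit, inspectable edit to the activations.
        hidden = [h + alpha * s for h, s in zip(hidden, steering)]
    return dot(W_OUT, hidden)


steer = [0.0, 0.0, 2.0]  # hypothetical direction standing in for some behavior
baseline = forward([1.0, 2.0])
steered = forward([1.0, 2.0], steering=steer)
print(baseline, steered)
```

In this toy setup the output shift equals `alpha * dot(W_OUT, steer)` for every input, because the edit is applied after the nonlinearity and the readout is linear. Real models have further nonlinear layers after the intervention point, so the effect there is less uniform and has to be measured.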

Social Impact

Explicitly control AI behaviors by switching specific activation patterns on and off

Improve AI safety by allowing for robust control, making it possible to prevent harmful actions by AI, particularly in AGI systems

Ensure transparency through mechanisms that clearly illustrate how internal activations affect outputs

Enhance reproducibility by bypassing the randomness inherent in Prompt Engineering and relying on more deterministic feature manipulation
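
Switching an activation pattern on or off, as described above, can be sketched as clamping the component of an activation vector along a feature direction. This is a minimal illustration under stated assumptions: `clamp_feature` and `FEATURE` are hypothetical names, and the direction is assumed to be unit length (in practice it might come from an interpretability method such as a learned feature dictionary).

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def clamp_feature(activation: Vector, direction: Vector, value: float) -> Vector:
    """Set the component of `activation` along unit-length `direction` to `value`.

    Components orthogonal to `direction` are left unchanged.
    """
    current = dot(activation, direction)  # how strongly the feature fires now
    return [a + (value - current) * d for a, d in zip(activation, direction)]


FEATURE = [0.0, 1.0, 0.0]  # hypothetical unit-length feature direction

act = [3.0, 4.0, 0.0]
switched_off = clamp_feature(act, FEATURE, 0.0)   # feature forced to zero
switched_on = clamp_feature(act, FEATURE, 10.0)   # feature forced to a fixed strength
print(switched_off, switched_on)
```

Because the clamp writes a fixed value regardless of what the input produced along that direction, the resulting feature strength is the same for any starting activation, which is the sense in which the control is deterministic in this sketch.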
