Scale
scalable solutions that create a positive social impact
innovation, problem-solving, design thinking, and programming skills
not only address immediate challenges but also create lasting, positive social impact
Scaling the foundation of the future
Local beginnings to global influence
- Building ethical frameworks that scale
Scaling impact
Exploring future frontiers
Human in the loop
UST D3CODE 2024 Plans
AI Governance principles
- Explainability: Mechanistic Interpretability is central to the project, providing a promising path for understanding how AI models make decisions
- Safety: Activation Engineering offers a mechanism for making AI models safer, which matters most for advanced systems such as AGI that may pose existential risks
- Transparency: By directly manipulating activations and making changes explicit, the system provides a transparent method for controlling AI
- Reproducibility: Unlike Prompt Engineering, whose results depend on prompt wording and sampling randomness, Activation Engineering provides more reproducible results through mathematical feature manipulation
- Robustness: Activation Engineering acts on internal activations regardless of how the input prompt is worded, providing a more robust system for resisting jailbreaking and malicious inputs
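As a rough illustration of the principles above, the sketch below applies activation addition to a toy two-layer model in plain Python. It is not the project's implementation: the weights, the readout vector, the feature direction, and the helper names (`hidden`, `readout`, `steer`) are all hypothetical, and a real system would intervene on a transformer's internal activations instead.

```python
# Toy illustration of activation addition (steering). Not a real language model:
# all weights and the feature direction below are hypothetical values.

def hidden(x, W):
    """One linear layer followed by ReLU: the activations we intervene on."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W]

def readout(h, v):
    """Linear readout that turns activations into a single output score."""
    return sum(hi * vi for hi, vi in zip(h, v))

def steer(h, direction, alpha):
    """Add alpha * direction to the activations; the edit is explicit and inspectable."""
    return [hi + alpha * di for hi, di in zip(h, direction)]

W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # hypothetical layer weights
v = [1.0, -1.0, 0.5]                      # hypothetical readout weights
direction = [0.0, 1.0, 0.0]               # hypothetical feature direction

x = [2.0, 1.0]
h = hidden(x, W)                               # [2.0, 1.0, 3.0]
base = readout(h, v)                           # 2.5
steered = readout(steer(h, direction, 3.0), v) # -0.5

# The same input and the same intervention always give the same output,
# because no sampling is involved.
assert steered == readout(steer(hidden(x, W), direction, 3.0), v)
print(base, steered)
```

The intervention is a single vector addition with a known direction and strength, which is what makes it easy to log, audit, and repeat exactly.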
Social Impact
- Explicitly control AI behaviors by switching on and off specific activation patterns
- Improve AI safety through robust control that can help prevent harmful actions by AI, particularly in AGI systems
- Ensure transparency through mechanisms that clearly illustrate how internal activations affect outputs
- Enhance reproducibility by relying on more deterministic feature manipulation instead of the wording and sampling randomness inherent in Prompt Engineering
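Switching an activation pattern on or off can be sketched as clamping the component of an activation vector along one feature direction. This is a minimal sketch under stated assumptions: the activation vector, the unit-length direction, and the helper names (`dot`, `set_feature`) are hypothetical, and how such a direction would be found in a real model is outside the scope of this example.

```python
# Toy illustration of switching a feature on or off by clamping its strength.
# The activation vector and feature direction are hypothetical values.

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def set_feature(h, direction, value):
    """Set the component of h along a unit-length direction to `value`,
    leaving everything orthogonal to that direction unchanged."""
    current = dot(h, direction)
    return [hi + (value - current) * di for hi, di in zip(h, direction)]

direction = [0.6, 0.8, 0.0]   # hypothetical unit-length feature direction
h = [3.0, 4.0, 1.0]           # hypothetical activations

off = set_feature(h, direction, 0.0)    # feature switched off
on = set_feature(h, direction, 10.0)    # feature switched on at a fixed strength

print("feature strength before:", dot(h, direction))
print("feature strength off:   ", dot(off, direction))
print("feature strength on:    ", dot(on, direction))
```

Because the feature strength is set to a chosen value instead of being nudged by wording, the effect on the activations is the same on every run and can be read directly from the vectors.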

Seonglae Cho