Refusal as Preamble
Accidentally jailbreaking Claude, then talking to it about it: a 2024 alignment-faking pattern is still observable in Claude 4.7.
Thoughts on web development, software engineering, and technology.
The blog now supports image embeds and live HTML snippets — showcased with a quick walkthrough of the two new MDX tags.
Four research lines now agree that knowledge and behavior have findable homes inside trained models. The frontier labs' opacity defense is structurally on borrowed time.
Focus Reader upgrade: Pretext integration, equal letter spacing, Bionic reading, and precise line height.
What weight geometry can tell us about detecting objectives in fine-tuning, harmful compliance, and training method fingerprinting.
We cut a 6-hour spectral analysis phase to 25 minutes by converting CPU-bound SVD to batched GPU operations.
How we built disposable experiment infrastructure on GCP because a 6GB GPU was shaping our research more than our hypotheses were.
Introducing VisualizeML, an interactive ML architecture zoo designed to help understand current and emerging concepts in machine learning through hands-on visual exploration.
Introducing Focus Reader, a Chrome extension I built to help readers with dyslexia, ADHD, and other reading challenges maintain focus and comprehension while browsing.
The first post on my new blog. Learn about what I'll be writing about and why I built this MDX-based blog system.
A deep dive into building a simple, git-based blog system using Next.js App Router, MDX, and TypeScript.