About
Apertus Claritas is a platform for collecting and sharing interpretability research on Apertus, Switzerland's open multilingual language model. It brings together researchers, students and independent contributors to document what we are learning: what works, what fails and where understanding remains incomplete.
What makes Apertus Claritas unique is that it offers both an inside view and an open view. Rather than only showcasing polished results, it creates space for exploratory findings, intermediate insights, negative results and technically grounded reflections that help others understand this model more deeply.
Topics
Topics we cover include, but are not limited to:
- Features, circuits, latent representations and geometry
- Sparse autoencoders and transcoders
- Probing, activation steering and interventions
- Training dynamics across checkpoints and parameter scales
- Safety monitoring including hallucinations, anthropomorphic concepts and behavioural drift
- Agentic interpretability and automated monitoring
- Tools, datasets and interactive interpretability interfaces
Get to know us
No team members yet.
No advisors yet.
No reviewers yet.
No contributors yet.




