Element 2
Bo . Negation

Encodes denial, refusal, and negative polarity.

What It Does

Boolean.Negation neurons activate in response to negation markers and negative-polarity constructions. They respond to explicit markers such as 'not', 'never', 'no', 'neither', and 'nor', and to more complex negation patterns like 'it is not the case that' or 'contrary to what was stated.' They also fire on implicit negation: absences, refusals, and contradictions.
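To make the explicit markers listed above concrete, here is a minimal sketch of a tagger that could be used to build probe sentences for these neurons. The pattern list is copied from the description; the names `NEGATION_PATTERNS` and `find_negation_markers` are hypothetical, and the sketch does not attempt contractions ('isn't') or implicit negation.

```python
import re

# Explicit markers taken from the description above. Multi-word phrases come
# first so that a long phrase wins over the single word 'not' inside it.
NEGATION_PATTERNS = [
    r"it is not the case that",
    r"contrary to what was stated",
    r"neither",
    r"never",
    r"nor",
    r"not",
    r"no",
]

# Word boundaries keep 'no' from matching inside words such as 'nothing'.
NEGATION_RE = re.compile(
    r"\b(?:" + "|".join(NEGATION_PATTERNS) + r")\b", re.IGNORECASE
)


def find_negation_markers(text):
    """Return (start, end, marker) for each explicit negation marker in text."""
    return [
        (m.start(), m.end(), m.group(0).lower())
        for m in NEGATION_RE.finditer(text)
    ]
```

Calling `find_negation_markers` on a sentence and on its affirmative counterpart gives a simple way to label minimal pairs before any activations are recorded.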

How It Behaves

Negation neurons have notably higher mean firing magnitudes than Affirmation neurons in our corpus. We read this as reflecting the asymmetry between affirmation and denial: negation requires the model to hold a positive representation and then suppress it. Negation neurons appear more frequently in middle layers, where contextual reversal is computed, and are disproportionately active in models that handle more complex reasoning tasks.
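One way such a comparison of mean firing magnitudes could be computed is sketched below. It assumes "magnitude" means the absolute value of an activation, averaged over every token and every neuron in a group; that definition, the function name `mean_firing_magnitude`, and the numbers are illustrative assumptions, not the corpus measurements.

```python
from statistics import mean


def mean_firing_magnitude(activations):
    """Mean absolute activation over all (token, neuron) entries.

    `activations` is a list of rows, one row per token, each row holding
    one activation value per neuron in the group.
    """
    values = [abs(v) for row in activations for v in row]
    if not values:
        raise ValueError("no activations supplied")
    return mean(values)


# Placeholder values for illustration only; these are NOT measured data.
negation_acts = [[2.0, -4.0], [3.0, 3.0]]
affirmation_acts = [[1.0, -2.0], [1.5, 1.5]]

# A ratio above 1 means the Negation group fires more strongly on average.
ratio = mean_firing_magnitude(negation_acts) / mean_firing_magnitude(affirmation_acts)
```

Taking the absolute value before averaging matters here: a group whose neurons fire strongly in both directions would otherwise average out to a misleadingly small number.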

Research Example

In GPT-2 Small, Boolean.Negation neurons fire on 'The Earth is not flat' at similar strength to 'The Earth is flat', which suggests the model processes the factual content before applying the negation transformation. This separation between content and negation is part of why models can hallucinate: a later layer can 'forget' a negation that an earlier layer processed correctly.
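A simple per-layer divergence check on a minimal pair is one way to look for the point where the negation is applied. The sketch below assumes you already have one activation value per layer for the same neuron on each sentence; the function name `first_divergent_layer`, the threshold `rel_tol`, and the example traces are all hypothetical.

```python
def first_divergent_layer(plain, negated, rel_tol=0.25):
    """Return the index of the first layer where the two traces diverge.

    `plain` and `negated` hold one activation per layer for the same neuron,
    recorded on the affirmative and the negated sentence respectively.
    Two values diverge when their difference exceeds `rel_tol` times the
    larger of their magnitudes. Returns None if no layer diverges.
    """
    if len(plain) != len(negated):
        raise ValueError("traces must cover the same number of layers")
    for layer, (p, n) in enumerate(zip(plain, negated)):
        scale = max(abs(p), abs(n))
        if scale > 0 and abs(p - n) / scale > rel_tol:
            return layer
    return None


# Made-up traces for illustration only; these are NOT GPT-2 Small measurements.
plain_trace = [1.0, 1.1, 1.2, 1.2]
negated_trace = [1.0, 1.0, 1.1, 0.3]
divergence = first_divergent_layer(plain_trace, negated_trace)
```

A relative threshold is used rather than an absolute one so that layers with larger overall activation scales are not flagged merely because their raw differences are bigger.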

Other Boolean Elements