- For X to be a concern, there has to be both capability and propensity (more or less; see the truth table below)
- "We" (what a lot that term obfuscates) need to "agree" on what constitutes capability and propensity
- Various concerning results get dismissed on one or both of these grounds, but the dismissals a given commentator uses across different issues do not necessarily point to a consistent underlying framework
- The Palisade Research finding about o1 'cheating' at chess seems like it should be a slam dunk on both fronts! Need to read that in full when it comes out.
- Really only **weak propensity** should be required: given the sheer number of opportunities a given model will have (i.e. volume of user requests), multiplied across the number of models and their training environments (cf. race dynamics), the repeated game will bite. Strategies with a risk of ruin will, in the limit, entail ruin
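
A rough sketch of the repeated-game point above, under loudly stated assumptions: each opportunity is treated as an independent trial with the same small per-trial probability `p` of the bad outcome (a stand-in for "weak propensity"), and the function names and example numbers are made up for illustration, not taken from any source. Under those assumptions the chance of at least one bad outcome in `n` trials is `1 - (1 - p)^n`, which tends to 1 as `n` grows.

```python
import math


def prob_at_least_once(p: float, n: int) -> float:
    """Chance of at least one bad outcome in n independent trials.

    Assumes (hypothetically) a constant per-trial probability p.
    Computes 1 - (1 - p)**n via log1p/expm1 so tiny p stays accurate.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    if p == 0.0 or n == 0:
        return 0.0
    if p == 1.0:
        return 1.0
    return -math.expm1(n * math.log1p(-p))


def trials_until(p: float, target: float) -> int:
    """Smallest n for which prob_at_least_once(p, n) >= target."""
    if not 0.0 < p < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("p and target must be strictly between 0 and 1")
    return math.ceil(math.log1p(-target) / math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # illustrative "weak propensity", not an empirical estimate
    for n in (1, 10**4, 10**6, 10**7):
        print(f"n={n:>10,}  P(at least once)={prob_at_least_once(p, n):.6f}")
    print("trials for a coin-flip chance:", trials_until(p, 0.5))
```

The independence and constant-`p` assumptions are doing a lot of work here; correlated failures or a propensity that shifts over time would change the numbers, though not the direction of the limit as long as `p` stays bounded away from zero.
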

Truth table (to flesh out):

| Context | Game | Feature | Necessary? | Sufficient? | Descriptively |
| -------- | -------- | ---------- | ---------- | ----------- | ------------------------------------------------------------------------------------------------------------------- |
| Locally | One-off | Capability | T | F | Dangerous capability unlikely in any particular instance to be bad |
| Locally | One-off | Propensity | T | F | |
| Locally | Repeated | Capability | T | | |
| Locally | Repeated | Propensity | | | |
| Globally | One-off | Capability | | | |
| Globally | One-off | Propensity | | | |
| Globally | Repeated | Capability | F | T | lim(shots->inf) capability *will* kick in, but it's not *necessary* initially... |
| Globally | Repeated | Propensity | T | T | ...because "intellidynamics" (as Liron Shapira puts it) should do the work: propensity should eventually produce capability\* |

\* This is probably skipping some steps actually - maybe we should carve intelligence / intellidynamics out of this framework
---
Further sources:
- UN AI risk report from Bengio: the Loss of Control section has the capability / propensity divide