Red lines as capability-propensity combinations

- For X to be a concern, there has to be both capability and propensity (more or less; refer to the truth table below)
- "We" (what a lot that term obfuscates) need to "agree" on what constitutes capability and propensity
- Various concerning results get dismissed on one or both of these grounds, though the dismissals a given commentator uses across issues don't necessarily point to a consistent underpinning framework...
- The Palisade Research finding about o1 'cheating' at chess seems like it should be a slam dunk on both fronts! Need to read that in full when it comes out.
- Really only **weak propensity** should be required: with the sheer number of opportunities (i.e. volume of user requests) a given model will have, multiplied across the number of models and their training environments (cf. race dynamics), the repeated game will bite. Strategies with risk of ruin will, in the limit, entail ruin.

Truth table (to flesh out)

| Context  | Game     | Feature    | Necessary? | Sufficient? | Descriptively |
| -------- | -------- | ---------- | ---------- | ----------- | ------------- |
| Locally  | One-off  | Capability | T          | F           | Dangerous capability unlikely in any particular instance to be bad |
| Locally  | One-off  | Propensity | T          | F           |               |
| Locally  | Repeated | Capability | T          |             |               |
| Locally  | Repeated | Propensity |            |             |               |
| Globally | One-off  |            |            |             |               |
| Globally | One-off  |            |            |             |               |
| Globally | Repeated | Capability | F          | T           | lim(shots → ∞): capability *will* kick in, but it's not *necessary* initially... |
| Globally | Repeated | Propensity | T          | T           | ...because "intellidynamics" (as Liron Shapira puts it) should do: propensity should eventually produce capability* |

\* This is probably skipping some steps actually. Maybe we should carve intelligence / intellidynamics out of this framework.

---

Further sources:

- UN AI risk report from Bengio
  - Loss of Control section has the capability / propensity divide
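
A rough numerical sketch of the weak-propensity point in the notes above (risk of ruin in a repeated game). It assumes, purely for illustration, that each opportunity is independent and carries the same small per-opportunity probability `p` of the bad outcome; neither assumption comes from the notes, and the function names and the example value of `p` are hypothetical.

```python
import math


def p_at_least_once(p: float, n: int) -> float:
    """Chance the bad outcome happens at least once in n opportunities.

    Assumes independent opportunities with a constant per-opportunity
    probability p (an illustrative simplification, not a claim about
    real models).
    """
    # 1 - (1 - p)**n, computed with log1p/expm1 so tiny p does not
    # get lost to floating-point rounding.
    return -math.expm1(n * math.log1p(-p))


def opportunities_needed(p: float, target: float) -> int:
    """Approximate smallest n at which p_at_least_once(p, n) reaches target."""
    return math.ceil(math.log1p(-target) / math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # hypothetical "weak propensity": one in a million per request
    for n in (10**3, 10**6, 10**7):
        print(f"n={n:>10,}  P(at least once)={p_at_least_once(p, n):.4f}")
    print("opportunities for a coin-flip chance:", opportunities_needed(p, 0.5))
```

Under these assumptions, a one-in-a-million propensity reaches roughly a 63% chance of occurring at least once (1 - 1/e) after a million opportunities, and the probability tends to 1 as the number of opportunities grows. Correlated or changing propensities would shift the numbers but not the direction of the limit, provided `p` stays bounded away from zero.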