When you use an AI to check on an AI, assuming they operate independently, their combined success rate is multiplicative.
E.g. an 80% success rate, applied twice, gives a 64% success rate.
(They might not be independent; training for sycophancy might make it worse.)
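The arithmetic in the post can be sketched in a couple of lines (the function name and figures are illustrative, and it assumes the checks are truly independent):

```python
def combined_success(p_each: float, n_checks: int) -> float:
    """Success rate when ALL n independent checks must succeed."""
    return p_each ** n_checks

# 80% success, applied twice -> 0.8 * 0.8 = 0.64
print(round(combined_success(0.80, 2), 2))
```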
-
@icing If they are truly configured to operate as redundant “checks”, wouldn’t it be the *failure* rate (1 - P_success) that multiplies?
-
@marshray When each check can ruin the outcome, the success rates multiply. When any single positive check counts as overall success, the failure rates multiply, no?
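The two compositions described above can be written out side by side (a minimal sketch, assuming independence; names are illustrative):

```python
def series_success(p1: float, p2: float) -> float:
    """Both checks must succeed: success rates multiply."""
    return p1 * p2

def parallel_success(p1: float, p2: float) -> float:
    """Any one success suffices: failure rates multiply."""
    return 1 - (1 - p1) * (1 - p2)

# With 80% each: series gives 0.64, parallel (redundant) gives 0.96.
```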
-
@marshray Back to AI.
When AI 1 writes a vulnerability report and you use AI 2 to check those reports, the overall assessment is only good when both do the right thing.
-
@icing I see the critical assumptions as:
1. AI 1 is operated to not create a bad report in the first place
2. AI 2 is operated to reject a bad report
3. The AIs' probabilities of failure at (1) and (2) are uncorrelated
If these assumptions were somehow validated, then they would constitute a “belt and suspenders” type of redundant system.
But such assumptions are rarely justified in practice.
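Assumption 3 can be probed with a toy Monte Carlo (all names, rates, and the "shared hard case" coupling are illustrative assumptions, not a model of any real system): when some inputs are hard for both AIs, the joint failure rate exceeds the independent product (1 - p1) * (1 - p2).

```python
import random

def joint_failure(p_good_report: float, p_catch: float,
                  hard_case_rate: float, trials: int = 200_000) -> float:
    """Rate at which AI 1 emits a bad report AND AI 2 accepts it."""
    random.seed(0)
    bad_outcomes = 0
    for _ in range(trials):
        # On a "hard case", both AIs fail together (correlated failure).
        hard = random.random() < hard_case_rate
        ai1_bad = hard or random.random() > p_good_report
        ai2_misses = hard or random.random() > p_catch
        if ai1_bad and ai2_misses:
            bad_outcomes += 1
    return bad_outcomes / trials

# Independent baseline with p1 = p2 = 0.8 would be 0.2 * 0.2 = 0.04;
# even a 5% shared hard-case rate pushes the joint failure near 0.09.
```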