Trustworthy AI

Speaker: Jeannette M. Wing

Existing formal verification methods need to be adapted for AI.
Suggested model (see the judgment sketch below):
  • D: model of the data, e.g. the stochastic process/distribution generating the data
  • M: model of the system (i.e. code)
  • Logic: probabilistic logics / interval analysis
  • Properties P: probabilistic/stochastic
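A minimal sketch of the two judgment forms, assuming the standard ⊨ ("satisfies") notation; pairing the list above with the symbols D, M, P is my reading of the notes:

```latex
\[ M \models P \]     % traditional verification: system model M satisfies property P
\[ D, M \models P \]  % AI analogue: the data model D joins the judgment, and P is a
                      % probabilistic/stochastic property, checked with probabilistic
                      % logics or interval analysis
```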
 
The role of data:
  • Collecting and partitioning data (see the split sketch after this list)
  • Specifying "unseen" data
  • What do we quantify over?
  • How do we verify?
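On the first bullet, a minimal sketch of one common partitioning scheme. This is illustrative only: the 60/20/20 ratios and the synthetic data are assumptions, not from the talk.

```python
import numpy as np

def partition(data, seed=0, ratios=(0.6, 0.2, 0.2)):
    """Shuffle and split a dataset into train/validation/test subsets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(data))
    n_train = int(ratios[0] * len(data))
    n_val = int(ratios[1] * len(data))
    return (data[idx[:n_train]],
            data[idx[n_train:n_train + n_val]],
            data[idx[n_train + n_val:]])

# Example: 1000 points drawn from some (unknown) generating distribution.
data = np.random.default_rng(42).normal(size=(1000, 8))
train, val, test = partition(data)
print(len(train), len(val), len(test))  # 600 200 200
```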
 

Specifying Unseen Data

How do we specify the properties of the data?
We may know the distribution exactly, but that is unlikely for real-world problems.
If we make assumptions/approximations, how do they differ from our model?
Approaches:
  1. Use a repertoire of statistical tools (see the fitting sketch below).
  2. Assume that the initial specification is small and simple, and then use it to bootstrap an iterative refinement process.
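A minimal sketch of approach 1, assuming scipy is available: fit candidate distributions to the observed sample, score them with a goodness-of-fit test, and treat the best fit as a first, refinable specification of unseen data. The normal/lognormal candidate families and the synthetic sample are illustrative assumptions.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.lognormal(mean=0.0, sigma=0.5, size=500)  # observed data

# Candidate families for the data specification (illustrative choices).
candidates = {"normal": stats.norm, "lognormal": stats.lognorm}

best = None
for name, family in candidates.items():
    params = family.fit(sample)                          # MLE fit
    ks = stats.kstest(sample, family.cdf, args=params)   # goodness of fit
    print(f"{name}: KS statistic={ks.statistic:.3f}, p={ks.pvalue:.3f}")
    if best is None or ks.statistic < best[2].statistic:
        best = (name, params, ks)

# The fitted distribution serves as the initial specification of unseen
# data; iterate by adding candidate families if every fit is rejected.
print("initial specification:", best[0], best[1])
```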

What do we quantify over?

In traditional formal methods we strive to prove ∀X. P(X), for some property P and a variable X representing e.g. all parameters.
But for AI systems we don't expect properties to hold for all input data.
Possible solutions:
  • Restrict X to come from a class of distributions.
    • E.g. for robustness, in the adversarial ML setting, we might want to show that M is robust to all norm-bounded perturbations of D (see the interval-analysis sketch after this list). Or even, we might want to show M is robust to all "semantic" or "structural" perturbations for the task at hand (e.g. CV).
    • E.g. for fairness, we might want to show the model is fair on a given dataset and all unseen datasets that are "similar" (for some formal notion of "similar") → cites research indicating that small changes in a classifier can turn it from fair to unfair.
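One way to check the norm-bounded claim is the interval analysis mentioned earlier in these notes: propagate the L∞ ball [x − ε, x + ε] through the network and confirm the true class's worst-case logit still dominates. A minimal numpy sketch for a hypothetical two-layer ReLU net; the weights, input, and ε are made up.

```python
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate an axis-aligned box through x -> W @ x + b."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def certify(x, eps, W1, b1, W2, b2, label):
    """Sound but incomplete check: does every point in the L-inf ball of
    radius eps get classified as `label`? True means certified robust."""
    lo, hi = x - eps, x + eps
    lo, hi = interval_affine(lo, hi, W1, b1)
    lo, hi = np.maximum(lo, 0), np.maximum(hi, 0)  # ReLU is monotone
    lo, hi = interval_affine(lo, hi, W2, b2)
    # Worst case: true logit at its lower bound vs. rivals at their upper bounds.
    rivals = np.delete(hi, label)
    return lo[label] > rivals.max()

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(16, 4)), np.zeros(16)
W2, b2 = rng.normal(size=(3, 16)), np.zeros(3)
x = rng.normal(size=4)
label = int(np.argmax(W2 @ np.maximum(W1 @ x + b1, 0) + b2))
print("certified at eps=0.01:", certify(x, 0.01, W1, b1, W2, b2, label))
```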

Robust by Construction

Add a noise layer à la differential privacy, so robustness is built into the model rather than checked after the fact.
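The notes don't name a specific scheme; this echoes randomized-smoothing-style defenses (e.g. PixelDP), where the classifier votes over many Gaussian-noised copies of the input, which makes its decisions provably stable under small perturbations. A minimal sketch around a hypothetical base_classifier:

```python
import numpy as np

def base_classifier(x):
    """Hypothetical brittle base model: a fixed 3-class linear classifier."""
    W = np.array([[1.0, -2.0], [-1.5, 0.5], [0.3, 1.2]])
    return int(np.argmax(W @ x))

def smoothed_classifier(x, sigma=0.25, n_samples=1000, seed=0):
    """Noise layer a la differential privacy: majority vote over
    Gaussian-noised copies of x. Larger sigma trades accuracy for stability."""
    rng = np.random.default_rng(seed)
    votes = np.zeros(3, dtype=int)
    for _ in range(n_samples):
        votes[base_classifier(x + rng.normal(scale=sigma, size=x.shape))] += 1
    return int(np.argmax(votes))

x = np.array([0.4, -0.1])
print("base:", base_classifier(x), "smoothed:", smoothed_classifier(x))
```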