Random Hyperparameter Search
Random hyperparameter search approach:
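- Discrete: Bernoulli (two options) or multinoulli (several options) distribution
- Continuous: sample the exponent from a uniform distribution, then exponentiate (log-uniform), e.g. learning rate = 10^u with u ~ U(-5, -1)
At each trial, draw one value from every hyperparameter's distribution to form a complete configuration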
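A minimal sketch of the sampling step described above, using only the Python standard library. The hyperparameter names (use_dropout, n_layers, learning_rate), their ranges, and the sample_config helper are assumptions chosen for illustration, not taken from the notes.

```python
import random


def sample_config(rng):
    """Draw one configuration by sampling every hyperparameter once."""
    return {
        # Discrete with two options: a Bernoulli draw (hypothetical parameter).
        "use_dropout": rng.random() < 0.5,
        # Discrete with several options: a multinoulli (categorical) draw.
        "n_layers": rng.choice([1, 2, 3, 4]),
        # Continuous: draw the exponent uniformly, then exponentiate,
        # so the learning rate is log-uniform on [1e-5, 1e-1].
        "learning_rate": 10 ** rng.uniform(-5, -1),
    }


rng = random.Random(0)  # fixed seed so the trials are reproducible
configs = [sample_config(rng) for _ in range(20)]
```

Each entry of configs would then be trained and evaluated on a validation set, keeping the best one.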
Why is random search better than grid search?
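Exponentially more efficient when several hyperparameters have little effect on the result: grid search spends many trials repeating the same few values of the important hyperparameters, while every random trial tests a new value of each one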
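A small illustration of this argument, assuming a hypothetical two-parameter problem where only the first parameter matters and both live in [0, 1]. With the same budget of 9 trials, a 3 x 3 grid tests only 3 distinct values of the important parameter, while random search tests 9. With k unimportant parameters the grid needs 3^(k+1) trials to cover the same 3 values, which is where the exponential gap comes from.

```python
import itertools
import random

budget = 9

# Grid search: 3 values per axis, 3 x 3 = 9 trials.
grid_values = [0.0, 0.5, 1.0]
grid_trials = list(itertools.product(grid_values, grid_values))

# Random search: 9 trials, each drawing both parameters independently.
rng = random.Random(0)
random_trials = [(rng.random(), rng.random()) for _ in range(budget)]

# Count how many distinct values of the first (important) parameter were tried.
grid_distinct = len({x for x, _ in grid_trials})
random_distinct = len({x for x, _ in random_trials})

print(grid_distinct, random_distinct)
```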
Debugging
Goodfellow's debugging tips
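- Test cases
  - Design a test case so simple that the correct behaviour can be predicted (e.g. a tiny dataset)
  - Design a test case that exercises one part of the model in isolation
- Visualisations
  - The model in action (look at actual outputs, not only aggregate metrics)
  - Examples where the model performs worst (and best)
  - Model statistics (e.g. histograms of activations and gradients)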
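A sketch of the "tiny dataset" test case from the list above. The model here is a hand-written logistic regression standing in for whatever model is being debugged, and the four-point AND dataset, learning rate, and step count are assumptions for illustration. The idea is that a correct training loop must be able to fit four linearly separable points perfectly, so a failure points to a bug rather than to a hard problem.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, lr=1.0, steps=2000):
    """Full-batch gradient descent on the mean logistic loss."""
    w = [0.0] * len(xs[0])
    b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for x, y in zip(xs, ys):
            # Prediction error for this example drives the gradient.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    return int(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5)


# Tiny, linearly separable dataset (logical AND): the right answer is known.
xs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ys = [0, 0, 0, 1]

w, b = train_logistic(xs, ys)
accuracy = sum(predict(w, b, x) == y for x, y in zip(xs, ys)) / len(xs)
assert accuracy == 1.0, "cannot fit four separable points: suspect a bug"
```

The same pattern applies to a larger model: shrink the data until perfect training accuracy is the only acceptable outcome, then assert on it.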