If we bring bias and variance into the picture: a larger H means higher capacity and is easier to overfit, because high capacity lets us find a hypothesis with lower training loss. On the other hand, a smaller H means lower capacity and high bias: there are fewer candidate models to choose from, so it is harder to find a suitable function and the loss tends to be higher. But precisely because the choices are limited, the chosen hypothesis varies less across training sets, so the gap between training and test loss should be smaller?
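A minimal numerical sketch of this tradeoff (my own illustration, not from the lecture), using polynomial degree as a stand-in for the size of H: higher degree plays the role of a larger hypothesis set. The target function, noise level, and sample sizes below are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Noisy samples from an assumed target function sin(pi * x)
    x = rng.uniform(-1, 1, n)
    y = np.sin(np.pi * x) + rng.normal(0, 0.2, n)
    return x, y

x_tr, y_tr = make_data(20)     # small training set D_train
x_te, y_te = make_data(5000)   # large held-out set approximates the true loss

for degree in [1, 3, 9, 15]:   # growing capacity, i.e. "larger H"
    coef = np.polyfit(x_tr, y_tr, degree)                 # pick the best h on D_train
    tr = np.mean((np.polyval(coef, x_tr) - y_tr) ** 2)    # training loss
    te = np.mean((np.polyval(coef, x_te) - y_te) ** 2)    # ~ test loss
    print(f"degree={degree:2d}  L_train={tr:.3f}  L_test={te:.3f}  gap={te - tr:.3f}")
```

Running it, the training loss falls as the degree grows while the train/test gap widens, which matches the intuition above: small H, higher loss but small gap; large H, lower training loss but a larger gap.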
Hoeffding's Inequality provides a bound based solely on the sample size and the bounded range of the random variables; it does not account for the variance introduced by model complexity. While the latter is critical in practical scenarios, that increased variance is not captured by applying Hoeffding's inequality: the inequality does not rely on the distribution of the training loss beyond its support being [0, 1]. Model complexity does affect Pr{D_train is bad} through |H|, indeed. However, that is separate from the core objective (inferring and minimizing the test-loss bound \delta at 1:11:17), so it is a bit confusing. Kindly correct me if I am wrong.
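To make the point concrete, here is a hedged sketch of the union bound in question, assuming the standard form Pr{D_train is bad} <= |H| * 2 * exp(-2 * N * eps^2) for losses supported on [0, 1]. Note the bound depends only on |H|, N, and eps, never on the variance of any particular h in H:

```python
import math

def bad_set_bound(H_size, N, eps):
    """Upper bound on Pr{some h in H has |L_train(h) - L(h)| > eps},
    via Hoeffding's inequality plus a union bound over H."""
    return H_size * 2 * math.exp(-2 * N * eps ** 2)

# Illustrative values (assumed, not from the lecture): fixed N and eps,
# varying only |H| to show where model complexity enters the bound.
for H_size in [10, 10_000, 10_000_000]:
    print(f"|H|={H_size:>10}  bound={bad_set_bound(H_size, N=1000, eps=0.05):.4f}")
```

For large |H| the bound quickly exceeds 1 and becomes vacuous, which is exactly the sense in which complexity enters through |H| alone rather than through any variance term.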