7

Natural Policy Gradient for Exponential Families

Recent work has highlighted how a misalignment between the support of the policy and the action space of the reinforcement learning problem can introduce bias and unnecessary variance into policy gradient estimates. To better align the support of the …

Nonparametrically Learning Activation Functions in Deep Neural Nets

We provide a principled framework for non-parametrically learning activation functions in deep neural networks. Currently, state-of-the-art deep networks treat choice of activation function as a hyper-parameter before training. By allowing activation …