... approaches have been proposed (J. Nie and Jin, 1995; Chen and Bai, 1998; Wu and Jiang, 2000; Peng et al., 2004; Chen and Ma, 2002; Zhou, 2005; Goh et al., 2003; Fu and Luke, 2004; Wu et al., 2011). ... online learning, we try to use more refined learning rates than SGD training. Instead of using a single learning rate (a scalar) for all weights, we extend the learning rate scalar to a learning ... proposed ADF online learning algorithm. q, c, α, and β are hyper-parameters. q is an integer representing the window size. c is for initializing the learning rates. α and β are the upper and lower bounds...
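
To make the idea of per-weight learning rates concrete, the following is a minimal sketch, not the algorithm as specified in the paper. The excerpt above is truncated before it states what α and β bound, so the sketch assumes they bound a per-feature decay factor applied once every q samples, with features seen more often in the window decaying faster. The function name `adf_train`, the callback `grad_fn`, and the default values are hypothetical and chosen only for illustration.

```python
from collections import Counter


def adf_train(samples, grad_fn, n_features, q=10, c=0.1, alpha=0.995, beta=0.6):
    """Illustrative per-feature learning-rate training loop (assumed form).

    samples    : iterable of training samples
    grad_fn    : hypothetical callback (weights, sample) -> {feature_index: gradient}
    n_features : number of weights
    q          : window size (number of samples between learning-rate updates)
    c          : initial value of every learning rate
    alpha/beta : ASSUMED upper/lower bounds of the per-feature decay factor
    """
    assert 0 < beta < alpha < 1
    w = [0.0] * n_features
    gamma = [c] * n_features   # learning-rate vector: one rate per weight
    counts = Counter()         # how many samples in the window used each feature

    for t, sample in enumerate(samples, start=1):
        grad = grad_fn(w, sample)
        for k, g in grad.items():
            w[k] -= gamma[k] * g   # gradient step with the feature's own rate
            counts[k] += 1

        if t % q == 0:             # end of a window: decay the rates
            for k, v in counts.items():
                u = v / q                          # relative frequency in (0, 1]
                eta = alpha - u * (alpha - beta)   # decay factor in [beta, alpha)
                gamma[k] *= eta
            counts.clear()

    return w, gamma
```

A call such as `adf_train([{0}, {0, 1}], lambda w, s: {k: 1.0 for k in s}, 3, q=2, c=1.0, alpha=0.9, beta=0.5)` treats each sample as a set of active feature indices with a constant gradient. Under the assumed rule, the rate of a feature that never appears in the window stays at c, while the rate of the more frequent feature shrinks the most.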