Talk:Temporal difference learning
From Scholarpedia
This is an excellent article about an important learning rule, written by one of its main pioneers.
My main concern is the readability of the mathematical symbols. The size and typeface of the symbols varied somewhat between equations, and some were too small to read easily (I viewed the article in Firefox under SUSE Linux).
My only other, minor concern is the role of the state information (x): it seems somewhat orphaned in the initial description of the algorithm. I wondered why the author chose not to introduce the algorithm first via a rewarded Markov process and only later add more generality.
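To make the suggestion concrete, here is a rough sketch of what a Markov-process-first introduction might look like. It is not taken from the article: the five-state random walk, the function name td0_random_walk, and the values of alpha and gamma are all assumptions chosen for illustration. The state is a plain index, so no separate observation vector x is needed until the general case is introduced.

```python
import random


def td0_random_walk(n_states=5, episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Tabular TD(0) on a hypothetical random-walk Markov reward process.

    States 0..n_states-1 lie in a row. Each step moves left or right with
    equal probability. Stepping off the right end gives reward 1, stepping
    off the left end gives reward 0, and both ends terminate the episode.
    """
    rng = random.Random(seed)
    values = [0.5] * n_states  # initial value estimates (an arbitrary choice)
    for _ in range(episodes):
        state = n_states // 2  # every episode starts in the middle state
        while True:
            next_state = state + rng.choice((-1, 1))
            if next_state < 0:
                reward, next_value, done = 0.0, 0.0, True
            elif next_state >= n_states:
                reward, next_value, done = 1.0, 0.0, True
            else:
                reward, next_value, done = 0.0, values[next_state], False
            # TD error: one-step bootstrapped target minus current estimate
            td_error = reward + gamma * next_value - values[state]
            values[state] += alpha * td_error
            if done:
                break
            state = next_state
    return values


if __name__ == "__main__":
    print([round(v, 2) for v in td0_random_walk()])
```

For this particular chain the true values are 1/6, 2/6, ..., 5/6, so the estimates should settle near those figures, with some residual noise because the step size is held constant.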