The vast majority of RL work
concentrates on the completely observable case, in which there is
noise in the agent's actions, but none in its observation of the
environment. When the state space is not completely observable,
additional methods need to be used beyond those which apply to
MDPs. Partially observable Markov decision processes (POMDPs) extend
the MDP model with an observation model: a stochastic function of the
current state. The true state is hidden because the number of distinct
observations is typically far smaller than the number of possible
states, so many states can give rise to the same observation.
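Because the state is hidden, a POMDP agent typically maintains a belief state, a probability distribution over states, and updates it by Bayes' rule after each action and observation. The following sketch illustrates this update on a hypothetical two-state problem; the transition matrix `T`, observation matrix `O`, and all probabilities are illustrative assumptions, not from the text.

```python
import numpy as np

# Hypothetical toy POMDP: 2 hidden states, 1 action, 2 observations.
# T[s, s']  = P(s' | s) under the single action (assumed numbers).
# O[s', o]  = P(o | s'), the stochastic observation model (assumed numbers).
T = np.array([[0.9, 0.1],
              [0.2, 0.8]])
O = np.array([[0.85, 0.15],
              [0.30, 0.70]])

def belief_update(b, o):
    """Bayes-filter update: b'(s') is proportional to O(o|s') * sum_s T(s,s') b(s)."""
    b_pred = b @ T              # predict: push the belief through the transition model
    b_new = b_pred * O[:, o]    # correct: weight by the observation likelihood
    return b_new / b_new.sum()  # renormalize to a probability distribution

b0 = np.array([0.5, 0.5])       # uniform prior over the hidden states
b1 = belief_update(b0, o=0)     # observing o=0 shifts belief toward state 0
```

Because observation 0 is much more likely in state 0, the updated belief concentrates there even though the prior was uniform; the belief state is a sufficient statistic for the observation history, which is what lets POMDP methods plan over beliefs rather than raw histories.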