
Kalman Filtering: Theory and Practice Using MATLAB (Part 3)

A random variable assigns real numbers to outcomes. There is an integral
number of dots on each face of the die. This defines a ``dot function'' d : S → ℜ on
the sample space S, where d(o) is the number of dots showing for the outcome o of
the statistical experiment. Assign the values

    d(o_a) = 1,   d(o_c) = 3,   d(o_e) = 5,
    d(o_b) = 2,   d(o_d) = 4,   d(o_f) = 6.

This function is an example of a random variable. The useful statistical properties of
this random variable will depend upon the probability space defined by statistical
experiments with the die.
Events and sigma algebras. The statistical properties of the random variable d
depend on the probabilities of sets of outcomes (called events) forming what is
called a sigma algebra¹ of subsets of the sample space S. Any collection of events
that includes the sample space itself, the empty set (the set with no elements), and the
set unions and set complements of all its members is called a sigma algebra over the
sample space. The set of all subsets of S is a sigma algebra with 2^6 = 64 events.
The probability space for a fair die. A die is considered ``fair'' if, in a large
number of tosses, all outcomes tend to occur with equal frequency. The relative
frequency of any outcome is defined as the ratio of the number of occurrences of that
outcome to the number of occurrences of all outcomes. Relative frequencies of
outcomes of a statistical experiment are called probabilities. Note that, by this
definition, the sum of the probabilities of all outcomes will always be equal to 1. This
defines a probability p(e) for every event e (a set of outcomes) equal to

    p(e) = \frac{\#e}{\#S},

where #e is the cardinality of e, equal to the number of outcomes o ∈ e. Note
that this assigns probability zero to the empty set and probability one to the sample
space.
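The counting rule p(e) = #e/#S can be sketched directly; in the following Python fragment (an illustration, with the labels a–f standing in for the outcomes o_a, …, o_f of the text) every event is a subset of the sample space and its probability is the ratio of cardinalities:

```python
from fractions import Fraction

# Illustrative sample space for one fair die; the labels stand in for
# the outcomes o_a, ..., o_f of the text.
sample_space = {"a", "b", "c", "d", "e", "f"}

def prob(event):
    """p(e) = #e / #S: cardinality of the event over that of the sample space."""
    assert event <= sample_space, "an event must be a subset of the sample space"
    return Fraction(len(event), len(sample_space))

print(prob(set()))            # empty set -> 0
print(prob(sample_space))     # whole sample space -> 1
print(prob({"a", "c", "e"}))  # e.g., the odd-score outcomes -> 1/2
```

Using exact rational arithmetic (`Fraction`) keeps the probabilities identical to the hand computations in the text, with no floating-point rounding.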
The probability distribution of the random variable d is a nondecreasing function
P_d(x) defined for every real number x as the probability of the event for which the
score is less than x. It has the formal definition

    P_d(x) \overset{\mathrm{def}}{=} p(d^{-1}((-\infty, x))),

    d^{-1}((-\infty, x)) \overset{\mathrm{def}}{=} \{o \mid d(o) < x\}.
¹Such a collection of subsets e_i of a set S is called an algebra because it is a Boolean algebra with respect
to the operations of set union (e_1 ∪ e_2), set intersection (e_1 ∩ e_2), and set complement (S \ e),
corresponding to the logical operations or, and, and not, respectively. The ``sigma'' refers to the
summation symbol Σ, which is used for defining the additive properties of the associated probability
measure. However, the lowercase symbol σ is used for abbreviating ``sigma algebra'' to ``σ-algebra.''
60 RANDOM PROCESSES AND STOCHASTIC SYSTEMS
For every real value of x, the set \{o \mid d(o) < x\} is an event. For example,

    P_d(1) = p(d^{-1}((-\infty, 1)))
           = p\{o \mid d(o) < 1\}
           = p\{\,\} \quad \text{(the empty set)}
           = 0,

    P_d(1.0\cdots01) = p(d^{-1}((-\infty, 1.0\cdots01)))
           = p\{o \mid d(o) < 1.0\cdots01\}
           = p\{o_a\} = \tfrac{1}{6},

    \vdots

    P_d(6.0\cdots01) = p(S) = 1,

as plotted in Figure 3.2. Note that P_d is not a continuous function in this particular
example.
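The staircase shape of Figure 3.2 can be reproduced by direct evaluation. This short sketch (assuming the dot-function values assigned earlier) computes P_d(x) = p({o : d(o) < x}) by counting:

```python
from fractions import Fraction

# Dot function from the text: d(o_a) = 1, ..., d(o_f) = 6.
d = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "f": 6}

def P_d(x):
    """P_d(x) = p({o : d(o) < x}): a step function that jumps by 1/6
    at each attainable score of the fair die."""
    event = [o for o, dots in d.items() if dots < x]
    return Fraction(len(event), len(d))

print(P_d(1))      # no outcome scores below 1 -> 0
print(P_d(1.001))  # only o_a scores below 1.001 -> 1/6
print(P_d(7))      # every outcome scores below 7 -> 1
```

The strict inequality in the definition is what makes P_d right-discontinuous at the integer scores, exactly as the figure shows.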
3.2.2 Probability Distributions and Densities
Random variables f are required to have the property that, for every real a and b such
that -∞ ≤ a ≤ b ≤ ∞, the outcomes o such that a < f(o) < b are an event
e ∈ 𝒜. This property is needed for defining the probability distribution function P_f
of f as

    P_f(x) \overset{\mathrm{def}}{=} p(f^{-1}((-\infty, x))),                      (3.2)

    f^{-1}((-\infty, x)) \overset{\mathrm{def}}{=} \{o \in S \mid f(o) < x\}.      (3.3)
Fig. 3.2 Probability distribution of scores from a fair die.
3.2 PROBABILITY AND RANDOM VARIABLES 61
The probability distribution function may not be a differentiable function. However,
if it is differentiable, then its derivative

    p_f(x) = \frac{d}{dx} P_f(x)                                                   (3.4)

is called the probability density function of the random variable f, and the
differential

    p_f(x)\,dx = dP_f(x)                                                           (3.5)

is the probability measure of f defined on a sigma algebra containing the open
intervals (called the Borel² algebra over ℜ).
A vector-valued random variable is a vector with random variables as its
components. An analogous derivation applies to vector-valued random variables,
for which the analogous probability measures are defined on the Borel algebras
over ℜⁿ.
3.2.3 Gaussian Probability Densities
The probability distribution of the average score from tossing n dice (i.e., the total
number of dots divided by the number of dice) tends toward a particular type of
distribution as n → ∞, called a Gaussian distribution.³ It is the limit of many such
distributions, and it is common to many models for random phenomena. It is
commonly used in stochastic system models for the distributions of random
variables.
Univariate Gaussian Probability Distributions. The notation 𝓃(x̄, σ²) is used to
denote a probability distribution with density function

    p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}
           \exp\!\left[-\frac{1}{2}\,\frac{(x - \bar{x})^2}{\sigma^2}\right],      (3.6)

where

    \bar{x} = E\langle x \rangle                                                   (3.7)

is the mean of the distribution (a term that will be defined later on, in Section 3.4.2)
and σ² is its variance (also defined in Section 3.4.2). The ``n'' stands for ``normal,''
²Named for the French mathematician Félix Borel (1871–1956).
³It is called the Laplace distribution in France. It has had many discoverers besides Gauss and Laplace,
including the American mathematician Robert Adrian (1775–1843). The physicist Gabriel Lippman
(1845–1921) is credited with the observation that ``mathematicians think it [the normal distribution] is a
law of nature and physicists are convinced that it is a mathematical theorem.''
another name for the Gaussian distribution. Because so many other things are called
normal in mathematics, it is less confusing if we call it Gaussian.
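As a quick numerical illustration of Equation 3.6 (a Python sketch rather than the book's MATLAB; the helper name is ours), the density integrates to 1 and peaks at 1/(√(2π)σ):

```python
import math

def gauss_pdf(x, mean=0.0, sigma=1.0):
    """Univariate Gaussian density of Equation 3.6."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

# Crude quadrature check that the density integrates to (nearly) 1
# over [-8, 8], well past the point where the tails are negligible.
h = 0.001
grid = [-8.0 + k * h for k in range(int(16.0 / h) + 1)]
total = h * sum(gauss_pdf(x) for x in grid)
print(round(total, 6))   # ~1.0
print(gauss_pdf(0.0))    # peak value 1/sqrt(2*pi)
```

The step size and integration range here are arbitrary choices for the check; the Gaussian tails decay fast enough that truncating at eight standard deviations costs essentially nothing.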
Gaussian Expectation Operators and Generating Functions. Because the
Gaussian probability density function depends only on the difference x − x̄, the
expectation operator

    E_x\langle f(x)\rangle = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx               (3.8)

    = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty}
      f(x)\,e^{-(x-\bar{x})^2/2\sigma^2}\,dx                                       (3.9)

    = \frac{1}{\sqrt{2\pi}\,\sigma}\int_{-\infty}^{\infty}
      f(x+\bar{x})\,e^{-x^2/2\sigma^2}\,dx                                         (3.10)

has the form of a convolution integral. This has important implications for problems
in which it must be implemented numerically, because the convolution can be
implemented more efficiently as a fast Fourier transform of f, followed by a
pointwise product of its transform with the Fourier transform of p, followed by an
inverse fast Fourier transform of the result. One does not need to take the numerical
Fourier transform of p, because its Fourier transform can be expressed analytically in
closed form. Recall that the Fourier transform of p is called its generating function.
Gaussian generating functions are also (possibly scaled) Gaussian density functions:

    p(\omega) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}
                p(x)\,e^{i\omega x}\,dx                                            (3.11)

    = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}
      \frac{e^{-x^2/2\sigma^2}}{\sqrt{2\pi}\,\sigma}\,e^{i\omega x}\,dx            (3.12)

    = \frac{\sigma}{\sqrt{2\pi}}\,e^{-(1/2)\omega^2\sigma^2},                      (3.13)

a Gaussian density function with variance σ⁻². Here we have used a probability-
preserving form of the Fourier transform, defined with the factor of 1/√(2π) in front
of the integral. If other forms of the Fourier transform are used, the result is not a
probability distribution but a scaled probability distribution.
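Since normalization conventions for the Fourier transform vary (as the paragraph above notes), a robust numerical check is to compare only the ω-dependence: whatever the scale factor, the transform of a zero-mean Gaussian with variance σ² must fall off as e^{−ω²σ²/2}, i.e., as a Gaussian in ω with variance σ⁻². A brute-force quadrature sketch (our own naming):

```python
import cmath
import math

sigma = 2.0

def p(x):
    """Zero-mean Gaussian density with variance sigma**2."""
    return math.exp(-x * x / (2.0 * sigma**2)) / (math.sqrt(2.0 * math.pi) * sigma)

def generating_fn(omega, h=0.001, half_width=12.0):
    """Fourier transform of p (as in Equation 3.11), approximated by a
    Riemann sum over [-half_width, half_width]."""
    n = int(2.0 * half_width / h)
    acc = sum(p(-half_width + k * h) * cmath.exp(1j * omega * (-half_width + k * h))
              for k in range(n + 1))
    return acc * h / math.sqrt(2.0 * math.pi)

omega = 0.7
ratio = abs(generating_fn(omega)) / abs(generating_fn(0.0))
print(abs(ratio - math.exp(-0.5 * omega**2 * sigma**2)))  # should be tiny
```

Dividing by the value at ω = 0 cancels whichever normalization constant the chosen transform convention introduces, leaving only the claimed Gaussian shape in ω.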
3.2.3.1 Vector-Valued (Multivariate) Gaussian Distributions. The formula
for the n-dimensional Gaussian distribution 𝓃(x̄, P), where the mean x̄ is an n-
vector and the covariance P is an n × n symmetric positive-definite matrix, is

    p(x) = \frac{1}{\sqrt{(2\pi)^n \det P}}\,
           e^{-(1/2)(x-\bar{x})^T P^{-1} (x-\bar{x})}.                             (3.14)
The multivariate Gaussian generating function has the form

    p(\omega) = \frac{1}{\sqrt{(2\pi)^n \det P^{-1}}}\,
                e^{-(1/2)\omega^T P \omega},                                       (3.15)

where ω is an n-vector. This is also a multivariate Gaussian probability distribution
𝓃(0, P⁻¹) if the scaled form of the Fourier transform shown in Equation 3.11 is
used.
3.2.4 Joint Probabilities and Conditional Probabilities
The joint probability of two events e_a and e_b is the probability of their set
intersection p(e_a ∩ e_b), which is the probability that both events occur. The joint
probability of independent events is the product of their probabilities.
The conditional probability of event e, given that event e_c has occurred, is
defined as the probability of e in the ``conditioned'' probability space with sample
space e_c. This is a probability space defined on the sigma algebra

    \mathcal{A}|e_c = \{e \cap e_c \mid e \in \mathcal{A}\}                        (3.16)

of the set intersections of all events e ∈ 𝒜 (the original sigma algebra) with the
conditioning event e_c. The probability measure on the ``conditioned'' sigma algebra
𝒜|e_c is defined in terms of the joint probabilities in the original probability space by
the rule

    p(e \mid e_c) = \frac{p(e \cap e_c)}{p(e_c)},                                  (3.17)

where p(e ∩ e_c) is the joint probability of e and e_c. Equation 3.17 is called Bayes'
rule⁴.
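A one-die illustration of Bayes' rule (Equation 3.17), with the two events chosen by us for the example:

```python
from fractions import Fraction

scores = {1, 2, 3, 4, 5, 6}        # fair-die scores, each with probability 1/6

def p(event):
    return Fraction(len(event), len(scores))

e = {2, 4, 6}                      # "the score is even"
e_c = {4, 5, 6}                    # conditioning event: "the score exceeds 3"

# Equation 3.17: p(e | e_c) = p(e ∩ e_c) / p(e_c) = (2/6) / (3/6).
conditional = p(e & e_c) / p(e_c)
print(conditional)                 # 2/3
```

Conditioning renormalizes the probabilities over the reduced sample space e_c, which is why the answer 2/3 exceeds the unconditional p(e) = 1/2.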
EXAMPLE 3.2: Experiment with Two Dice. Consider a toss with two dice in
which one die has come to rest before the other and just enough of its face is visible
to show that it contains either four or five dots. The question is: What is the
probability distribution of the score, given that information?
The probability space for two dice. This example illustrates just how rapidly the
sizes of probability spaces grow with the ``problem size'' (in this case, the number of
dice). For a single die, the sample space has 6 outcomes and the sigma algebra has
64 events. For two dice, the sample space has 36 possible outcomes (6 independent
outcomes for each of two dice) and 2^36 = 68,719,476,736 possible events. If each
⁴Discovered by the English clergyman and mathematician Thomas Bayes (1702–1761). Conditioning on
impossible events is not defined. Note that the conditional probability is based on the assumption that e_c
has occurred. This would seem to imply that e_c is an event with nonzero probability, which one might
expect from practical applications of Bayes' rule.
die is fair and their outcomes are independent, then all outcomes with two dice have
probability (1/6) × (1/6) = 1/36, and the probability of any event is the number of outcomes
in the event divided by 36 (the number of outcomes in the sample space). Using the
same notation as the previous (one-die) example, let the outcome from tossing a pair
of dice be represented by an ordered pair (in parentheses) of the outcomes of the first
and second die, respectively. Then the score s(o_i, o_j) = d(o_i) + d(o_j), where o_i
represents the outcome of the first die and o_j represents the outcome of the second
die. The corresponding probability distribution function of the score x for two dice is
shown in Figure 3.3a.
The event corresponding to the condition that the first die have either four or five
dots showing contains all outcomes in which o_i = o_d or o_e, which is the set

    e_c = \{(o_d, o_a), (o_d, o_b), (o_d, o_c), (o_d, o_d), (o_d, o_e), (o_d, o_f),
            (o_e, o_a), (o_e, o_b), (o_e, o_c), (o_e, o_d), (o_e, o_e), (o_e, o_f)\}

of 12 outcomes. It has probability p(e_c) = 12/36 = 1/3.
Fig. 3.3 Probability distributions of dice scores.
By applying Bayes' rule, the conditional probabilities of all events corresponding
to unique scores can be calculated as shown in Figure 3.4. The corresponding
probability distribution function for two dice with this conditioning is shown in
Figure 3.3b.
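Example 3.2 can be reproduced by enumeration. In the sketch below (numeric die faces stand in for the outcome labels o_a, …, o_f), the two-dice score is conditioned on the first die showing four or five, as in Figures 3.3b and 3.4:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))        # 36 equally likely pairs

def p(event):
    return Fraction(len(event), len(outcomes))

e_c = {(i, j) for (i, j) in outcomes if i in (4, 5)}   # first die shows 4 or 5
print(p(e_c))                                          # 12/36 = 1/3

# Conditional probability of each total score, by Bayes' rule (Equation 3.17).
conditional = {}
for s in range(2, 13):
    e_s = {(i, j) for (i, j) in outcomes if i + j == s}
    conditional[s] = p(e_s & e_c) / p(e_c)

print(conditional[9])   # only (4,5) and (5,4) remain possible -> 1/6
```

Totals below 5 get conditional probability zero (the first die already shows at least four), and the twelve remaining conditional probabilities sum to 1, as any probability distribution must.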
3.3 STATISTICAL PROPERTIES OF RANDOM VARIABLES
3.3.1 Expected Values of Random Variables
Expected values. The symbol E is used as an operator on random variables. It is
called the expectancy, expected value, or average operator, and the expression
E_x⟨f(x)⟩ is used to denote the expected value of the function f applied to the
ensemble of possible values of the random variable x. The symbol under the E
indicates the random variable (RV) over which the expected value is to be evaluated.
When the RV in question is obvious from context, the symbol underneath the E will
be eliminated. If the argument of the expectancy operator is also obvious from
context, the angular brackets can also be dispensed with, using Ex instead of E⟨x⟩, for
example.
Moments. The nth moment of a scalar RV x with probability density p(x) is
defined by the formula

    \zeta_n(x) \overset{\mathrm{def}}{=} E_x\langle x^n\rangle
    \overset{\mathrm{def}}{=} \int_{-\infty}^{\infty} x^n\,p(x)\,dx.              (3.18)
Fig. 3.4 Conditional scoring probabilities for two dice.
The nth central moment of x is defined as

    m_n(x) \overset{\mathrm{def}}{=} E\langle (x - Ex)^n \rangle                   (3.19)

    = \int_{-\infty}^{\infty} (x - Ex)^n\, p(x)\,dx.                               (3.20)
The first moment of x is called its mean⁵:

    \zeta_1 = Ex = \int_{-\infty}^{\infty} x\,p(x)\,dx.                            (3.21)
In general, a function of several arguments such as f(x, y, z) has first moment

    E f(x, y, z) = \iiint_{-\infty}^{\infty}
                   f(x, y, z)\, p(x, y, z)\, dx\, dy\, dz.                         (3.22)
Array Dimensions of Moments. The first moment will be a scalar or a vector,
depending on whether the function f(x, y, z) is scalar or vector valued. Higher order
moments have tensorlike properties, which we can characterize in terms of the
number of subscripts used in defining them as data structures. Vectors are singly
subscripted data structures. The higher order moments of vector-valued variates are
successively higher order data structures. That is, the second moments of vector-
valued RVs are matrices (doubly subscripted data structures), and the third-order
moments will be triply subscripted data structures.
These definitions of a moment apply to discrete-valued random variables if we
simply substitute summations in place of integrations in the definitions.
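For the fair-die score, the summation versions of Equations 3.18–3.20 give the familiar mean and variance. A small sketch (the helper names are our own):

```python
from fractions import Fraction

scores = range(1, 7)
weight = Fraction(1, 6)          # fair die: each score has probability 1/6

def moment(n):
    """nth moment of the die score: the sum of x**n * p(x)."""
    return sum(Fraction(x) ** n * weight for x in scores)

def central_moment(n):
    """nth central moment: E<(x - Ex)**n>, with the integral replaced by a sum."""
    mean = moment(1)
    return sum((Fraction(x) - mean) ** n * weight for x in scores)

print(moment(1))           # mean: 7/2
print(central_moment(2))   # variance: 35/12
print(central_moment(1))   # the first central moment is always 0
```

Exact rationals make the textbook identities visible directly: the second central moment 35/12 equals ζ₂ − ζ₁² = 91/6 − 49/4.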
3.3.2 Functions of Random Variables
A function of RV x is the operation of assigning to each value of x another value, for
example y, according to a rule or function. This is represented by

    y = f(x),                                                                      (3.23)

where x and y are usually called input and output, respectively. The statistical
properties of y in terms of x are, for example,

    E y = \int_{-\infty}^{\infty} f(x)\,p(x)\,dx,
    \;\vdots                                                                       (3.24)
    E y^n = \int_{-\infty}^{\infty} [\,f(x)]^n\,p(x)\,dx

when y is scalar. For vector-valued functions y, similar expressions can be shown.
⁵We here restrict the order of the moment to the positive integers. The zeroth-order moment would
otherwise always evaluate to 1.
The probability density of y can be obtained from the density of x. If Equation
3.23 can be solved for x, yielding the unique solution

    x = g(y),                                                                      (3.25)

then we have

    p_y(y) = \frac{p_x(g(y))}
                  {\left|\dfrac{\partial f(x)}{\partial x}\right|_{x=g(y)}},       (3.26)
where p_y(y) and p_x(x) are the density functions of y and x, respectively. A function of
two RVs x, y is the process of assigning to each pair x, y another value, for
example z, according to the same rule,

    z = f(y, x),                                                                   (3.27)

and similarly for functions of n RVs. When x and y in Equation 3.23 are n-dimensional
vectors and if a unique solution for x in terms of y exists,

    x = g(y),                                                                      (3.28)

Equation 3.26 becomes

    p_y(y) = \frac{p_x(g(y))}{|J|_{x=g(y)}},                                       (3.29)
where the Jacobian |J| is defined as the determinant of the array of partial derivatives
∂f_i/∂x_j:

    |J| = \det \begin{bmatrix}
    \dfrac{\partial f_1}{\partial x_1} & \dfrac{\partial f_1}{\partial x_2} &
    \cdots & \dfrac{\partial f_1}{\partial x_n} \\
    \dfrac{\partial f_2}{\partial x_1} & \dfrac{\partial f_2}{\partial x_2} &
    \cdots & \dfrac{\partial f_2}{\partial x_n} \\
    \vdots & \vdots & \ddots & \vdots \\
    \dfrac{\partial f_n}{\partial x_1} & \dfrac{\partial f_n}{\partial x_2} &
    \cdots & \dfrac{\partial f_n}{\partial x_n}
    \end{bmatrix}.                                                                 (3.30)
3.4 STATISTICAL PROPERTIES OF RANDOM PROCESSES
3.4.1 Random Processes (RPs)
A RV was defined as a function x(s) defined for each outcome of an experiment
identified by s. Now if we assign to each outcome s a time function x(t, s), we obtain
a family of functions called random processes or stochastic processes. A random
process is called discrete if its argument takes values in a discrete set, as in

    x(k, s), \quad k = 1, 2, \ldots.                                               (3.31)

It is clear that the value of a random process x(t) at any particular time t = t_0, namely
x(t_0, s), is a random variable [or a random vector if x(t_0, s) is vector valued].
3.4.2 Mean, Correlation, and Covariance
Let x(t) be an n-vector random process. Its mean is

    E\,x(t) = \int_{-\infty}^{\infty} x(t)\, p(x(t))\, dx(t),                      (3.32)

which can be expressed elementwise as

    E\,x_i(t) = \int_{-\infty}^{\infty} x_i(t)\, p(x_i(t))\, dx_i(t),
    \quad i = 1, \ldots, n.
For a random sequence, the integral is replaced by a sum.
The correlation of the vector-valued process x(t) is defined by

    E\langle x(t_1)\, x^T(t_2)\rangle =
    \begin{bmatrix}
    E\langle x_1(t_1)\,x_1(t_2)\rangle & \cdots & E\langle x_1(t_1)\,x_n(t_2)\rangle \\
    \vdots & \ddots & \vdots \\
    E\langle x_n(t_1)\,x_1(t_2)\rangle & \cdots & E\langle x_n(t_1)\,x_n(t_2)\rangle
    \end{bmatrix},                                                                 (3.33)
where

    E\,x_i(t_1)\,x_j(t_2) = \iint_{-\infty}^{\infty}
    x_i(t_1)\, x_j(t_2)\, p(x_i(t_1), x_j(t_2))\, dx_i(t_1)\, dx_j(t_2).           (3.34)
The covariance of x(t) is defined by

    E\langle [x(t_1) - E\,x(t_1)]\,[x(t_2) - E\,x(t_2)]^T\rangle
    = E\langle x(t_1)\,x^T(t_2)\rangle
    - E\langle x(t_1)\rangle\, E\langle x^T(t_2)\rangle.                           (3.35)
When the process x(t) has zero mean (i.e., Ext0 for all t), its correlation and
covariance are equal.
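The zero-mean case can be checked empirically. This sketch (a synthetic sequence we generate with NumPy; the mixing matrix is arbitrary) estimates the correlation E⟨x xᵀ⟩ and the covariance of Equation 3.35 from samples and confirms they agree up to sampling error:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean 3-vector random sequence: N independent draws,
# correlated across components by an arbitrary mixing matrix.
N = 100_000
mix = np.array([[1.0, 0.5, 0.0],
                [0.0, 1.0, 0.2],
                [0.0, 0.0, 1.0]])
x = rng.standard_normal((N, 3)) @ mix

correlation = (x.T @ x) / N            # sample estimate of E<x x^T>
covariance = np.cov(x.T, bias=True)    # sample estimate per Equation 3.35

# With Ex(t) = 0, correlation and covariance coincide (up to sampling error).
print(np.max(np.abs(correlation - covariance)))
```

The residual difference is exactly the outer product of the sample mean with itself, which shrinks like 1/N as the sequence grows.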
The correlation matrix of two RPs x(t), an n-vector, and y(t), an m-vector, is given
by the n × m matrix

    E\,x(t_1)\, y^T(t_2),                                                          (3.36)
