<a name="docA001">Index beginIndex this file
2009-06-08-19-10 start
This file http://freeman2.com/tute0008.htm
is Freeman's reading notes. Although Freeman
always keep correct view point. But Freeman's
capability is limited, plus no one proofread
this file. Then you can still find wrong view
point. When read, please put question mark as
often as possible. If you suspect any view
point wrong, please ask a math expert near by.
<a name="docA002">,<a name="textbook">
This file is a note for reading inequality
book written by ProfessorJ. MichaelSteeleThe Cauchy-Schwarz Master Class★★★★★
Below use 'textbook' as abbreviation.
Freeman also read web pages online, and will
indicate the source URL at discussion point.
<a name="docA003">
Freeman study mechanical engineering.
Engineering mathematics do not teach
inequality. Above book is first inequality
book. First time read, it was very hard.
Although high school time learned
a*a + b*b >= 2*a*b
But this little knowledge do not help.
<a name="docA004">
This file follow textbook chapter section
order, but not continuous, Freeman skip
those uncertain sections/problems.
This file first function is to learn
inequality. Second function is to learn
how to use html code to write math
equations.
<a name="docA005">Index beginIndex this file
On 2009-01-27-10-08 Freeman accessed
the next page
http://www.sftw.umac.mo/~fstitl/10mmo/inequality.html
save as sftw.umac.mo_text_math_eqn_good.htm
Above page is the main reference for html
math equation.
In order to let reader to build html math
equation, previous file tute0007.htm page
end has math symbol and internal code.
This file third function is to display how
to draw curves in web page. The main engine
is XYGraph v2.3 - Technical Figures
Thank you for read Freeman's inequality page.
2009-06-08-19-47 stop
<a name="docA006">
2009-08-23-15-00 start
□ Exercise 1.14 solution
Exercise 1.14 problem (a) is solved by hint
Exercise 1.14 problem (b) is out of LiuHH's
reach. What is "cardinality of a set B⊂Z3" ?
Without clear understand the definition,
LiuHH is unable to solve problem (b)
LiuHH major in mechanical engineering.
LiuHH is mathematics admirer and outsider.
<a name="docA007">
To solve problems in
The Cauchy-Schwarz Master Class
LiuHH's goal is to solve 50% of the exercise
problems. In reality, solve 40% is good enough.
Please do not be surprise that LiuHH skip some
text section, skip some problems.
2009-08-23-15-12 stop
<a name="ch13b001">Index beginIndex this file
2010-07-01-11-30 start
■ Problem 13.3
[The HLP Representation
α(<)β implies α=Dβ]
jk0jkmbestrngbjktrapHLPDSMprf<a name="ch13b002">
Show that α(<)β implies that
there exists a doubly stochastic
matrix D such that
α=Dβ ---eqn.BO001 //example 12
Hardy, Littlewood and Polya came
to this result because of their
interests in mathematical
inequalities. but, ironically,
<a name="ch13b003">
the concept of majorization was
originally introduced by
economists who were interested
in inequalities of a different
sort -- the inequality of income
which one finds in our society.
<a name="ch13b004">
Today, the role of majorization
in mathematics far outstrips the
role in economics, but consideration
of income distribution can still
add to our intuition.
2010-07-01-11-45 stop
<a name="ch13b005">
2010-07-01-11-47 start
Now consider two sequences
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
They have majorization relation.
<a name="ch13b006">Index beginIndex this file
What is majorization relation?
In simple words:
Every element of α seq is an
average of β seq.
or
Every element of α seq is an
interpolation of β seq.
<a name="ch13b007">
If one element of αv seq need
extrapolation of βv seq.
then αv and βv do NOT have
majorization relation.
<a name="ch13b008">
There are two methods to exam
majorization relation between
α_seq. and β_seq.
<a name="ch13b009">Majorization definition first
version is interpolation eqn.
It is called //2nd version
"Muirhead's condition"
which is //numericalexample
α∈H(β) ---eqn.BN120
or
α=D*β ---eqn.BN121
<a name="ch13b010">Majorization definition second
version is successive inequality.
It is called //1st version
"partial sum inequality"
which is //numerical example
α[1]≦β[1] ---eqn.BO004
α[1]+α[2]≦β[1]+β[2] ---eqn.BO005
α[1]+α[2]+α[3]≦β[1]+β[2]+β[3] ---eqn.BO006
.....
∑[j=1,n-1]α[j]≦∑[j=1,n-1]β[j] ---eqn.BO007
and final equality
∑[j=1,n] α[j]=∑[j=1,n] β[j] ---eqn.BO008
<a name="ch13b011">Index beginIndex this file
Numerical example can help reader
to understand better. Next use
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
to illustrate two Majorization
definitions.
<a name="ch13b012">
First definition "Muirhead's
condition" for α_seq. and β_seq
the definition is
α=D*β ---eqn.BN121
in specific form, it is .....
(prepare example)
2010-07-01-12-14 stop
<a name="ch13b013">
2010-07-01-15-37 start
Please goto the Majorization
transformation page (local)
http://freeman2.com/jsmajor2.htm
Click [α,ß in] link, next
click [17] example button
click [Run Stoch] button
click [box 7] button
<a name="ch13b014">
"Bx7 stochastic mat" has next
matrix M17
[[
0.588628762541806 0.044147157190635465 0 0.08461538461538463 0.2826086956521739
0 0.6571428571428571 0 0.3428571428571429 0
0 0 1 0 0
0.17948717948717951 0.2813186813186814 0 0.5391941391941392 0
0.23188405797101447 0.01739130434782609 0 0.03333333333333334 0.717391304347826
]]
<a name="ch13b015">
Above is matrix in decimal number
Convert decimal number to quotient
number, get
[[
176/299 66/1495 0 11/130 13/46
0 23/35 0 12/35 0
0 0 1 0 0
7/39 128/455 0 736/1365 0
16/69 2/115 0 1/30 33/46
]]
<a name="ch13b016">Index beginIndex this file
Please verify that quotient matrix
each row sum to one
each column sum to one
All matrix elements are
non-negative.
2010-07-01-15-53 stop
<a name="ch13b017">
2010-07-01-18-55 start
For this specific case
α=D*β ---eqn.BN121
become
4 =6*176/299 +4.4*66/1495 +3*0 +0.9*11/130 +0.7*13/46
// above line ---eqn.BO009
<a name="ch13b018">
3.2= 6*0 +4.4*23/35 +3*0 +0.9*12/35 +0.7*0
// above line ---eqn.BO010
3 = 6*0 +4.4*0 +3*1 +0.9*0 +0.7*0
// above line ---eqn.BO011
2.8= 6*7/39 +4.4*128/455 +3*0 +0.9*736/1365+0.7*0
// above line ---eqn.BO012
2 = 6*16/69 +4.4*2/115 +3*0 +0.9*1/30 +0.7*33/46
// above line ---eqn.BO013
<a name="ch13b019">
eqn.BO009 to eqn.BO013 are
Muirhead's condition
Above equations show that each
element of α is an average of
elements of β .
<a name="ch13b020">
recall
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
All coefficient (quotient) are in
the domain [0,1].
Coefficients in each line sum to one.
That is average, that is interpolation.
<a name="ch13b021">Index beginIndex this file
Successive inequality for α seq.
and β seq, eqn.BO004 to eqn.BO008
their numerical values are next
α seq. ≦ β seq.
4 ≦ 6 ---eqn.BO004B
4 + 3.2 ≦ 6 + 4.4 ---eqn.BO005B
4 + 3.2 + 3 ≦ 6 + 4.4 + 3 ---eqn.BO006B
4+3.2+3+2.8 ≦ 6+4.4+3+0.9 ---eqn.BO007B
4+3.2+3+2.8+2=6+4.4+3+0.9+0.7 ---eqn.BO008B
<a name="ch13b022">
that is
α seq. ≦ β seq.
4 ≦ 6 ---eqn.BO004C
7.2 ≦ 10.4 ---eqn.BO005C
10.2 ≦ 13.4 ---eqn.BO006C
13.0 ≦ 14.3 ---eqn.BO007C
15 = 15 ---eqn.BO008C
<a name="ch13b023">
Majorization definition
first version is interpolation, and
second ver. is successive inequality
Are they equal?
Can they imply each other?
<a name="ch13b024">
Previous Problem 13.2 discuss
Muirhead Implies Majorization
that is (given) coeff. matrix
implies successive inequality,
or given interpolation eqn.
ask to prove successive ineq.Numerical example for Problem 13.2
give a solid explanation for one
specific case n=4, k=3.
<a name="ch13b025">
//Be alert about the difference !
Current Problem 13.3 discuss
Reverse case: (given) successive
inequality implies coeff. matrix
or given successive ineq.
ask to prove interpolation eqn.
Problem 13.3: HLP Representation
Problem 13.3: α(<)β implies α=Dβ
<a name="ch13b026">Index beginIndex this file
All above said is ask reader pay
attention to the difference
between Problem 13.2 and
Problem 13.3 They are reverse
to each other.
2010-07-01-19-56 stop
<a name="ch13b027">
2010-07-01-22-31 start
The task is transform from β seq.
to α seq. Both α seq. and β seq
have same total sum 15. The goal
is to find a doubly stochastic
matrix D (M17) such that α=D*β
is true.
<a name="ch13b028">We do average two pair data at
one time.
α seq. is goal and never change.
β seq. β[j] reduce, β[k] increase.
<a name="ch13b029">jk0jkmbestrngbjktrapHLPDSMprf
We use // non-best j,k
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
index 0 1 2 3 4
as numerical example.
Assume j=0 (red) and k=3 (blue)
j=0, k=3 is not the best choice.
Before talk best, we mention the
non-best. //why non-best?
<a name="ch13b030">
Both sequences are descending.
α[0]= 4> α[1]=3.2> ...> α[4]=2
β[0]= 6> β[1]=4.4> ...> β[4]=0.7
<a name="ch13b031">Index beginIndex this file
Please pay attention to index=2
α[2]=3=β[2] both α[2] and β[2]
are equal.
Equal middle elements is NOT
required.<a name="ch13b032">
Index=0 and Index=1 are higher
value side. Here α[j]<β[j]
(example α[0]=4<6=β[0])
Index=3 and Index=4 are lower
value side. Here α[k]>β[k]
(example α[3]=2.8>0.9=β[3])
<a name="ch13b033">define j,k
Use 'j' for higher side index.
Use 'k' for lower side index.
We have
j<k (0 < 3) j,k are index
α[j]>α[k] ( 4 >2.8)
β[j]>β[k] ( 6 >0.9)
α[k]>β[k] (2.8>0.9)
α[j]<β[j] ( 4 < 6 )
<a name="ch13b034">
j,k are index. (j,k are pure number)
α[j] and β[j] are observed data.
(α[j] and β[j] are observed length)
Higher value side determine j only.
Lower value side determine k only.
In eqn.BO002 and eqn.BO003
j has two choices, j=0 or j=1
k has two choices, k=3 or k=4
//colored example<a name="ch13b035">
If no equal middle elements,
what α seq. and β seq look like?
In jsmajor2.htm click example 53
button, get
α53 seq: 5 3 3 2 1 1
β53 seq: 6 2.8 2.8 1.8 1.5 0.1
index 0 1 2 3 4 5
<a name="ch13b036">Index beginIndex this file
There is no equal middle elements
no equal value left end elements
no equal value right end elements
It is still easy to see
j section is index 0 (α[j]<β[j])
k section is 1 to 5 (α[k]>β[k])
<a name="ch13b037">
Equal value left end elements and
equal value right end elements
can be dropped out of analysis.
They contribute nothing. (they
are done) For example, if run
α53 seq and β53 seq
<a name="ch13b038">
First iteration change
β_old 6 2.8 2.8 1.8 1.5 0.1
to
β_new 5.8 3 2.8 1.8 1.5 0.1
compare
α_fix 5 3 3 211<a name="ch13b039">
Fourth iteration
β_new 5 3 3 21.50.5
β_old 5.4 3 3 2 1.5 0.1
Fifth iteration, above purple data
are ignored. We only need average
between red and blue. Leading equal
value terms dropped out of analysis.
(Leading equal numbers are done)
2010-07-01-23-41 stop
<a name="ch13b040">Index beginIndex this file
2010-07-02-12-02 start
■ Index j,k determination
Next nine lines are link.
non-best j,k example
define j-range k-range m-range
define best j,k pair
j,k,m-range examplebest j,k example
trap if use wrong j,k
Hardy, Littlewood, Polya matrix
define doubly stochastic matrix
proof HLP matrix is doubly s. m.
<a name="ch13b041">
Liu,Hsinhan write a program (local)
http://freeman2.com/jsmajor2.htm
to find doubly stochastic matrix
for two given sequences α and β seq
<a name="ch13b042">
Initially LiuHH code j,k value with
the easiest method (wrong), set
j to left end of index range,
k to right end of index range.
This setting work for some α,β seq
but must use permutation matrix to
re-order new β seq to descending
order.
<a name="ch13b043">
Trouble is that some other α,β seq
have majorization relation but get
negative matrix element, which is
an error. Gradually, LiuHH realize
that index j,k determination is
important.
LiuHH spend more time on the
function jkcheck().
More document is here.
<a name="ch13b044">
Back to non-best j,k example
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
index 0 1 2 3 4 Why
j=0, k=3 is not the best choice?
Let us do the calculation.
<a name="ch13b045">Index beginIndex this file
RobinHood transformation average
action let β[0]= 6 reduce 2 to 4
to match α[0]= 4 then
β[3]=0.9 must add 2 to 2.9
Because β seq total sum be 15.
<a name="ch13b046">
The old/new β seq are
β old: 6 4.4 3 0.9 0.7
β new: 4 4.4 3 2.9 0.7
β sort 4.4 4 3 2.9 0.7
α seq: 4 3.2 3 2.8 2
From blue β new to red β sort, we
need sort β seq. Sort dislocate
blue 4 and black 4. Which can be
avoid if we choose best j,k pair.
<a name="ch13b047">jk0jkmbestrngbjktrapHLPDSMprf
α and β are in descending order.
smaller index has higher value.
Define j-range be the index range
where α[j]<β[j] //example
Define k-range be the index range
where α[k]>β[k]. (with surprise
α[k]<β[k] see α[4]=1<1.5=β[4])<a name="ch13b048">
Define m-range be the index range
where α[m]=β[m] for 0≦j<m<k≦n
'm' mean middle. 'n' is dimension.
All three ranges can be undefined.
If α≡β, all three ranges are empty
and problem is done.
<a name="ch13b049">jk0jkmbestrngbjktrapHLPDSMprfWhat is the best j,k pair?
Choose j-range right end for j
Choose k-range left end for k
Let j,k be as close as possible.
No need to sort updated β seq.
That is best j,k pair.//j,k range<a name="ch13b050">Index beginIndex this filejk0jkmbestrngbjktrapHLPDSMprf
j,k,m range example
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
index 0 1 2 3 4
Red columns j-range α[j]<β[j]
Blue columns k-range α[k]>β[k]
Black column m-range α[m]=β[m]
<a name="ch13b051">
Program jsmajor2.htm
function jkcheck() use
jl for j-range left bound (jl=0)
jl=i; //9906232226 ←time stamp
jr for j-range right bound (jr=1)
jr=i; //9906232231
<a name="ch13b052">
kl for k-range left bound (kl=3)
kl=i; //9906232241
kr for k-range right bound (kr=4)
kr=i; //9906232245
<a name="ch13b053">
Four time stamp lines are code
line in jsmajor2.htm
Four (jl=0) etc are values in
example eqn.BO002 and eqn.BO003
<a name="ch13b054">
In this example, determined
j-range be [0,1]
k-range be [3,4] then
Choose j-range right end j=jr=1
Choose k-range left end k=kl=3
The best choice for j,k in the
example eqn.BO002 and eqn.BO003
is
j=1 and k=3 (NOT j=0 and k=3 )
<a name="ch13b055">Index beginIndex this filejk0jkmbestrngbjktrapHLPDSMprf
Back to example for best j,k
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
index 0 1 2 3 4
We average between red and blue.
<a name="ch13b056">
RobinHood transformation average
action β[1]=4.4 reduce 1.2 to 3.2
to match α[1]= 3.2 then
β[3]=0.9 must add 1.2 to 2.1
Because β seq total sum be 15.
<a name="ch13b057">
The old/new β seq are
β old: 6 4.4 3 0.9 0.7
β new: 6 3.2 3 2.1 0.7
β sort 6 3.2 3 2.1 0.7
α seq: 4 3.2 3 2.8 2
<a name="ch13b058">
From blue β new to red β sort,
we create an identity matrix as
permutation matrix. which in
effect is no sort.
<a name="ch13b059">
Next iteration, new problem is
α seq: 4 3.2 3 2.8 2
β sort 6 3.2 3 2.1 0.7
index 0 1 2 3 4
There are two equal value columns
marked purple color.
<a name="ch13b060">Index beginIndex this file
In this updated problem, determined
j-range be [0,0]
k-range be [3,4] then
Choose j-range right end j=0
Choose k-range left end k=3
average between index=0 and 3.
<a name="ch13b061">
Can you fill in the following
detail steps for LiuHH? I need
a break. Thank you very much.
Please goto jsmajor2.htm
click [17], click [run stoch]
then peek [Box 3 β history]
for hint.
2010-07-02-13-08 stop
<a name="ch13b062">jk0jkmbestrngbjktrapHLPDSMprf
2010-07-02-16-57 start
Best j,k pair is defined.
If violate best j,k definition
there is one case it may trap
a programmer.
Please goto jsmajor2.htm
click [52] get the following
α52 seq: 5 3 2 1 1 ---eqn.BO014
β52 seq: 6 2 2 2 0 ---eqn.BO015
index 0 1 2 3 4
<a name="ch13b063">
Here the equal middle element is
index 2 and α52[2]=β52[2]=2. Then
m-range is [2,2], ml=mr=2
Base on m-range, define //error
j=ml-1=2-1=1
k=mr+1=2+1=3
get j,k pair j=1, k=3. //error
<a name="ch13b064">
Average the following problem
α52 seq: 5 3 2 1 1
β52 seq: 6 2 2 2 0
index 0 1 2 3 4
<a name="ch13b065">Index beginIndex this fileEarly α[j]+α[k]=6≠5.3=β[j]+β[k]
Here α[j]+α[k]=3+1=2+2=β[j]+β[k]
Look like one stone two birds.
One average two data equality.
Is it a better situation?
<a name="ch13b066">
LiuHH trapped in example 52 for
a while. Because 'average' β52
middle '2 2 2' to α52 middle
'3 2 1' ? That is not average!!
That is diverse.
<a name="ch13b067">
If average '3 2 1' to '2 2 2'
Drop the middle like number '2'
Average '3 1' to '2 2'
the code generate
[2]=[0.5 0.5]*[3]
[2] [0.5 0.5] [1]
which has no trouble.
<a name="ch13b068">
Now need 'average' [2 2] to [3 1]
Find the inverse matrix
such that
[3]=[ s t ]*[2]
[1] [ u v ] [2]
<a name="ch13b069">Index beginIndex this file
BUT !! this is impossible !!
Because determinant value of
matrix [[0.5 0.5], [0.5 0.5]]
is ZERO !! There is no inverse
matrix. After this difficulty,
<a name="ch13b070">
LiuHH gradually understand that
old j,k determination was wrong.
Start write new code for correct
j,k determination. Please see
comments in program jsmajor2.htm
function jkcheck()
<a name="ch13b071">
Why it is wrong ? For
α52 seq: 5 3 2 1 1
β52 seq: 6 2 2 2 0
index 0 1 2 3 4
j range is [0,0]
k range is [1,4]
set j,k pair j=1, k=3 is wrong
set j,k pair j=0, k=1 is right.
2010-07-02-17-40 stop
<a name="ch13b072">Index beginIndex this file
2010-07-02-18-56 start
■ The simplest case n=2
No matter how complicate the
α,β seq are. Each iteration
average two pairs data.
This section title is
"The simplest case n=2"
which is actually the center
part of Problem 13.3.
<a name="ch13b073">
Now we have
α sequence = [α1, α2] and
β sequence = [β1, β2]
We want to find a 2x2 matrix
such that
[α1]=[m11 m12]*[β1] ---eqn.BO016
[α2] [m21 m22] [β2]
<a name="ch13b074">
Please remember Problem 13.3given successive inequality and
ask to find matrix D such that
α=Dβ ---eqn.BO001 //example
For numerical example, take
best j,k from eqn.BO002.
<a name="ch13b075">
That is
[α1]=[3.2] ---eqn.BO017
[α2] [2.8]
we have α1>α2
and
[β1]=[4.4] ---eqn.BO018
[β2] [0.9]
we have β1>β2
eqn.BO016 become
[3.2]=[m11 m12]*[4.4] ---eqn.BO019
[2.8] [m21 m22] [0.9]
<a name="ch13b076">
Average value of α1 and α2 is
ρα=(3.2+2.8)/2=3 ---eqn.BO020
Average value of β1 and β2 is
ρβ=(4.4+0.9)/2=2.65 ---eqn.BO021
Textbook choose ρβ as parameter
ρ and define rho
ρ=(β1+β2)/2=(4.4+0.9)/2=2.65 ---eqn.BO022
(textbook page 199 line 20)
<a name="ch13b077">Index beginIndex this file
Define tau as radius from ρ
to either β1 or β2
τ=β1-ρ ---eqn.BO023
Numerical value is
τ=4.4-2.65=1.75 ---eqn.BO024
<a name="ch13b078">
ρ definition is based on ρβ
(not base on ρα) then
β1 and β2 are perfect symmetry
from ρ . From eqn.BO023
β1=ρ+τ=2.65+1.75=4.4 ---eqn.BO025
β2=ρ-τ=2.65-1.75=0.9 ---eqn.BO026
Above is β to ρ relation.
<a name="ch13b079">
Next see α to ρ relation.
ρ definition is not based on ρα
ρ is not the average value of
α1, α2. There is no symmetry
of α1 and α2 from ρ. Define
sigma as
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO027
(textbook page 199 line -9)
<a name="ch13b080">
Numerical value is
σ=max(|3.2-2.65|,|2.8-2.65|)
=0.55 ---eqn.BO028
Then
ρ+σ=2.65+0.55=3.2=α1 ---eqn.BO029
ρ-σ=2.65-0.55=2.1≠α2 ---eqn.BO030
<a name="ch13b081">Index beginIndex this file
If we choose one example such
that
α1+α2=β1+β2 ---eqn.BO031
then we will have ρ-σ=α2. One
average operation bring two
numbers to equality.
<a name="ch13b082">
But α1+α2=β1+β2 is not required.
The example
α seq: 4 3.2 3 2.8 2 ---eqn.BO002
β seq: 6 4.4 3 0.9 0.7 ---eqn.BO003
index 0 1 2 3 4
was chosen carefully avoid the
special case α1+α2=β1+β2
2010-07-02-19-54 here
<a name="ch13b083">
Hardy, Littlewood and Polya
found that
D*β = α ---eqn.BO032
has following matrix form
Square D matrix transform old β seq. to new β seq.
Blue is old β more diverse. purple is new β more
average. If α1+α2=β1+β2 then new β seq. is α seq.
---page 199
---line 2
---eqn.13.14
definitionprove
width of above equation
<a name="ch13b085">Index beginIndex this file
2010-07-02-20-38 here
The following is copied from old
work, which is to expand eqn.13.14
verify equation left side equal
right side.
[[
<a name="ch13b086">
2010-06-17-17-43 CSMC page 199 eqn.13.14
(τ+σ)*(ρ+τ)/(2τ) + (τ-σ)*(ρ-τ)/(2τ)
=[(τ+σ)*(ρ+τ) + (τ-σ)*(ρ-τ)]/(2τ)
=[(τρ+σρ+ττ+στ) + (τρ-σρ-ττ+στ)]/(2τ)
=[(τρ+στ) + (τρ+στ)]/(2τ)
=ρ+σ ---eqn.BO033
<a name="ch13b087">
2010-06-17-17-48
(τ-σ)*(ρ+τ)/(2τ) + (τ+σ)*(ρ-τ)/(2τ)
=[(τ-σ)*(ρ+τ) + (τ+σ)*(ρ-τ)]/(2τ)
=[(τρ-σρ+ττ-στ) + (τρ+σρ-ττ-στ)]/(2τ)
=[(τρ-στ) + (τρ-στ)]/(2τ)
=ρ-σ ---eqn.BO034
2010-06-17-17-51
]]
<a name="ch13b088">
2010-07-02-20-46 here
If
α1+α2=β1+β2 ---eqn.BO031
then eqn.13.14 is perfect.
HLP representation eqn.13.14
is one stone two birds. Bring
two β elements = α elements
<a name="ch13b089">Index beginIndex this file
But, α1+α2=β1+β2 is not required.
Many majorization sequences do
not have α1+α2=β1+β2 relation.
In this case, HLP representation
is one stone one bird. Bring
one β element = α element.
<a name="ch13b090">
We find doubly stochastic matrix
for two given sequences, do two
pairs at each iteration. Then
multiply all transformation
matrix to a final matrix. see
eqn.BO181. That is our answer.
<a name="ch13b091">
There is a theorem say that
doubly stochastic matrix A
multiply
doubly stochastic matrix B
the result is still a
doubly stochastic matrix.
This theorem give our process
a solid foundation.
<a name="ch13b092">
Liu,Hsinhan's work is just fill
the details. Most time correct.
Some time error.
Please read textbook.
Professor J.M. Steele give you
better explanation.
You can find figure 13.1 and
matrix equation 13.16 in textbook
not here in this page.
2010-07-02-21-09 stop
<a name="ch13b093">
2010-07-02-21-35 start
Compare eqn.13.14
with
[α1]=[m11 m12]*[β1] ---eqn.BO016
[α2] [m21 m22] [β2]
find
m11=(τ+σ)/(2τ) ---eqn.BO035
m12=(τ-σ)/(2τ) ---eqn.BO036
m21=(τ-σ)/(2τ) ---eqn.BO037
m22=(τ+σ)/(2τ) ---eqn.BO038
<a name="ch13b094">Index beginIndex this file
Question is that eqn.BO016 has
index 1 and 2. If j≠1 and k≠2
what to do?
Actually we should view eqn.BO016
as
[αj]=[mjj mjk]*[βj] ---eqn.BO039
[αk] [mkj mkk] [βk]
here, j and k are index.
<a name="ch13b095">
All other non-j and non-k rows
columns are all one for [m,m]
and all zero for [m,n] m≠n.
Textbook page 200 equation 13.16
illustrate this point.
<a name="ch13b096">
Textbook page 200 figure 13.1 is
a diagram for relative magnitude
of the following numbers.
βj=ρ+τ
βk=ρ-τ
αj ≦ ρ+σ
αk=ρ-σ
Please read textbook for better
explanation.
2010-07-02-21-48 stop
<a name="ch13b097">
2010-07-03-09-48 start
Program jsmajor2.htm output box 7
give us doubly stochastic matrix.
Whether eqn.13.14 is also a doubly
stochastic matrix?
<a name="ch13b098">Index beginIndex this filejk0jkmbestrngbjktrapHLPDSMprfA doubly stochastic matrix satisfy
the following three properties.
(1) all elements be non-negative
(2) all column sum to one
(3) all row sum to one
(textbook page 196 line 1 to 5)
A permutation matrix is a doubly
stochastic matrix.
An identity matrix is a doubly
stochastic matrix.
<a name="ch13b099">
Let us review eqn.13.14 matrix
elements column sum and row sum.
Four elements only two flavors
m0=(τ-σ)/(2τ) ---eqn.BO040
m1=(τ+σ)/(2τ) ---eqn.BO041
m0 is off diagonal [j,k] and [k,j]
m1 is on diagonal [j,j] and [k,k]
<a name="ch13b100">
Adding result is
m0+m1=[(τ-σ)/(2τ)]+[(τ+σ)/(2τ)]
m0+m1=[(τ-σ)+(τ+σ)]/(2τ)
m0+m1=(2τ)/(2τ)=1 ---eqn.BO042
Any row sum or column sum are just
m0+m1. Then row sum to one and
column sum to one.
<a name="ch13b101">
Matrix elements be non-negative
requirement couple with negative
data sequence that is a point
need more time to check.
<a name="ch13b102">Index beginIndex this file
Next check positive/negative
property for m0 and m1.
To check, need two numbers τ
and σ. They are defined next
τ=β1-ρ ---eqn.BO023
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO027
where
ρ=(β1+β2)/2 ---eqn.BO022
<a name="ch13b103">
Majorization theory require
transformation matrix elements
all be non-negative, but do
not require α seq and β seq.
In other words, α seq. β seq
can have negative elements.<a name="ch13b104">
For α=[α1,α2] β=[β1,β2],
α is more average,
β is more diverse.
Assume the following order
β1 ≧ α1 ≧ α2 ≧ β2 ---eqn.BO043
jsmajor2.htm example 14 iter=1
4 ≧ 3 ≧ 2 ≧ 1 //all positive
jsmajor2.htm example 15 iter=1
-1 ≧ -3 ≧ -3 ≧-4 //all negative
Main concern is negative data.
<a name="ch13b105">
For positive 4 ≧ 3 ≧ 2 ≧ 1 ---eqn.BO044
ρ=(β1+β2)/2=(4+1)/2=2.5>0 ---eqn.BO045
τ= β1-ρ = 4-2.5 =1.5>0 ---eqn.BO046
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO047
=max(|3-2.5|,|2-2.5|)
=max(0.5,0.5)
<a name="ch13b106">Index beginIndex this file
σ=0.5>0 ---eqn.BO048
m0=(τ-σ)/(2τ)=(1.5-0.5)/(3)>0 ---eqn.BO049
m1=(τ+σ)/(2τ)=(1.5+0.5)/(3)>0 ---eqn.BO050
For positive α=[α1,α2] β=[β1,β2],
we find m0>0 and m1>0
<a name="ch13b107">For negative -1≧-3 ≧-3 ≧-4 ---eqn.BO051
ρ=(β1+β2)/2=(-1-4)/2=-2.5<0 ---eqn.BO052
τ= β1-ρ =-1-(-2.5) =1.5>0 ---eqn.BO053
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO054
=max(|-3+2.5|,|-3+2.5|)
=max(0.5,0.5)
<a name="ch13b108">
σ=0.5>0 ---eqn.BO055
m0=(τ-σ)/(2τ)=(1.5-0.5)/(3)>0 ---eqn.BO056
m1=(τ+σ)/(2τ)=(1.5+0.5)/(3)>0 ---eqn.BO057
For negative α=[α1,α2] β=[β1,β2],
we find m0>0 and m1>0
<a name="ch13b109">
Numerical result tell us that
matrix elements m0 and m1 are
positive for both
all positive α seq. and β seq
all negative α seq. and β seq
2010-07-03-10-46 stop
<a name="ch13b110">Index beginIndex this filejk0jkmbestrngbjktrapHLPDSMprf
2010-07-03-15-15 start
■ Prove HLP matrix is doubly
stochastic
Numerical check build confidence.
Numerical check is not a proof.
Because there are infinite many
numerical cases.
Symbolic equation is a proof.
Because symbol can be replaced
with any number.
<a name="ch13b111">
ρ is defined to be the average
of β1 and β2.
ρ=(β1+β2)/2 ---eqn.BO022
Assumed the following order
β1 ≧ α1 ≧ α2 ≧ β2 ---eqn.BO043
then
β1 ≧ ρ ≧ β2 ---eqn.BO058
<a name="ch13b112">
τ is defined as next
τ=β1-ρ ---eqn.BO023
because β1 ≧ ρ, then
τ must ≧0 ---eqn.BO059
for any beta sequence.
for negative beta sequence too.
See eqn.BO053<a name="ch13b113">
Next see σ
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO027
Above definition, let
σ≧0 ---eqn.BO060
with the order
β1 ≧ α1 ≧ α2 ≧ β2 ---eqn.BO043
ρ is mid point of β1, β2.
<a name="ch13b114">
Insert ρ to eqn.BO043, it could
be
β1 ≧ ρ ≧ α1 ≧ α2 ≧ β2 ---eqn.BO061
or
β1 ≧ α1 ≧ ρ ≧ α2 ≧ β2 ---eqn.BO062
or
β1 ≧ α1 ≧ α2 ≧ ρ ≧ β2 ---eqn.BO063
<a name="ch13b115">Index beginIndex this file
no matter which one is true,
σ must be ≦ τ ---eqn.BO064
because τ is greater radius
and σ is shorter radius
(ρ is center, α and β are
circumferences)
Combine eqn.BO059,BO060,BO064
we find
0 ≦ σ ≦ τ ---eqn.BO065
<a name="ch13b116">
Finally goto
m0=(τ-σ)/(2τ) ---eqn.BO040
m1=(τ+σ)/(2τ) ---eqn.BO041
With eqn.BO065 in hand,
eqn.BO040 and eqn.BO041 tell
us that matrix element m0,m1
are both non-negative.
<a name="ch13b117">
Even if α seq. and β seq have
negative elements, m0,m1 still
be non-negative.
<a name="ch13b118">
We just proved that matrix D
has all non-negative elements.
Review definition of doubly
stochastic matrix and the easy
condition eqn.BO042
We conclude that D matrix in
eqn.13.14 is a doubly stochastic
matrix.
<a name="ch13b119">Index beginIndex this file
Chicken give birth to chicken,
rabbit give birth to rabbit ,
doubly stochastic give birth to
doubly stochastic , that is a
matter of course. suspect ?
That is why we get doubly
stochastic matrix in
jsmajor2.htm output box 7.
2010-07-03-15-57 stop
2010-07-03-18-18 done first proofread
<a name="ch13b120">
2010-07-03-18-20 start record
following is a small program
help you calculate.
Given α1, α2; β1, β2
find D matrix and new β.
<a name="ch13b121">
Please copy next code, paste to
[Box3, input JS command] of
calculator page, then click
[Test box3 command, output to box4]
Box 4 has output result.
//<a name="ch13b122">Index beginIndex this file
//2010-07-02-23-13
alfa1=3.2
alfa2=2.8
beta1=4.4
beta2=0.9
//you can change value above
//do not change code below.
rho=(beta1+beta2)/2
tau=rho-beta2;
//<a name="ch13b123">
si1=abs(rho-alfa1)
si2=abs(rho-alfa2)
sigma=max(si1,si2)
m0=(tau-sigma)/2/tau
m1=(tau+sigma)/2/tau
mat=m1+', '+m0+'; '+m0+', '+m1
//<a name="ch13b124">
newBeta1=m1*(rho+tau)+m0*(rho-tau)
newBeta2=m0*(rho+tau)+m1*(rho-tau)
newbet12=newBeta1+', '+newBeta2
oldbeta=beta1+', '+beta2
mat //matrix
newbet12 //new beta
oldbeta //old beta
//2010-07-02-23-24
//2010-07-03-18-25 done record
2010-07-03-22-22 done second proofread
2010-07-03-23-05 done spelling check
<a name="ch13b124a">
2010-07-08-11-58 start
α seq: 3 3 0
β seq: 4 1 1
do not have majorization relation.
The program jsmajor2.htm do not
carry out calculation. Why it is
wrong? Above code will give you
answer.
<a name="ch13b124b">
First set
alfa1=3
alfa2=3
beta1=4
beta2=1
get
[[
<a name="ch13b124c">
mat //matrix
0.6666666666666666, 0.3333333333333333; 0.3333333333333333, 0.6666666666666666
newbet12 //new beta
3, 2
oldbeta //old beta
4, 1
]]
which is ok.
<a name="ch13b124d">
New problem become
α seq: 3 3 0
β seq: 3 2 1
Second set
alfa1=3
alfa2=0
beta1=2
beta2=1
get
[[
<a name="ch13b124e">
mat //matrix
2, -1; -1, 2
newbet12 //new beta
3, 0
oldbeta //old beta
2, 1
]]
<a name="ch13b124f">
The matrix [2, -1; -1, 2] or
[ 2 -1]
[ -1 2]
is wrong.
It is extrapolation.
BUT interpolation is required.
2010-07-08-12-09 stop
<a name="ch13b125">Index beginIndex this file
2010-07-05-08-58 start
■ Stand on HLP shoulder
We see eqn.13.14 is the center
equation to Problem 13.3. The
HLP Representation. That is
Hardy, Littlewood and Polya
Representation.
<a name="ch13b126">
Can we ride on HLP shoulder and
find out the doubly stochastic
matrix D ? Let us solve the
following problem.
<a name="ch13b127">
Given α and β two sequences
with majorization relation
α(<)β ---eqn.BO066
Given two index
j<k ---eqn.BO067
for two pairs data
α1=α[j]<β[j]=β1 ---eqn.BO068
α2=α[k]>β[k]=β2 ---eqn.BO069
<a name="ch13b128">
Define
ρ=(β1+β2)/2 ---eqn.BO022
τ=β1-ρ ---eqn.BO023
σ=max(|α1-ρ|,|α2-ρ|) ---eqn.BO027
then we have
β1=ρ+τ ---eqn.BO025
β2=ρ-τ ---eqn.BO026
ρ+σ=α1 ---eqn.BO029
ρ-σ≠α2 ---eqn.BO030
<a name="ch13b129">Index beginIndex this file
Ask to find matrix D in the
following form
D=[w x] ---eqn.BO070
[y z]
which satisfy next equation
Dβ=α ---eqn.BO001
that is
[w x]*[ρ+τ]=[ρ+σ] ---eqn.BO072
[y z] [ρ-τ] [ρ-σ]
Matrix in eqn.BO072 is unknown.
//matrix was given in eqn.13.14<a name="ch13b130">
Expand eqn.BO072, find
w*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO073
y*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO074
There are four unknowns w,x,y,z
two equations. We need two more
constraint to solve this problem.
<a name="ch13b131">
Given
w+x=1 ---eqn.BO075
y+z=1 ---eqn.BO076
Find w,x,y,z.
<a name="ch13b132">
Above is problem statement.
Next solve for w,x,y,z.
eqn.BO072 is re-write eqn.13.14
Only change is put 2x2 matrix
to an unknown state. All other
variables are given.
<a name="ch13b133">Index beginIndex this file
Expansion of eqn.BO072 get
w*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO073
y*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO074
Use eqn.BO075 and eqn.BO076 to
reduce unknown from w,x,y,z
four unknown to x,z two unknown
(1-x)*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO077
(1-z)*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO078
<a name="ch13b134">
Re-group get
(ρ+τ)-x*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO079
(ρ+τ)-z*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO080
or
(ρ+τ)-x*(2τ)=ρ+σ ---eqn.BO081
(ρ+τ)-z*(2τ)=ρ-σ ---eqn.BO082
<a name="ch13b135">
or
(ρ+τ)-(ρ+σ)=x*(2τ) ---eqn.BO083
(ρ+τ)-(ρ-σ)=z*(2τ) ---eqn.BO084
or
(τ-σ)=x*(2τ) ---eqn.BO085
(τ+σ)=z*(2τ) ---eqn.BO086
or
(τ-σ)/(2τ)=x ---eqn.BO087
(τ+σ)/(2τ)=z ---eqn.BO088
Above is solution for x,z.
<a name="ch13b136">
verify
x+z=[(τ-σ)/(2τ)]+[(τ+σ)/(2τ)]
x+z=[(2τ)/(2τ)]
x+z=1 ---eqn.BO089
This is a lucky coincidence.
D matrix is
D=[w x] ---eqn.BO070
[y z]
now become
D=[w (τ-σ)/(2τ)] ---eqn.BO090
[y (τ+σ)/(2τ)]
<a name="ch13b137">
eqn.BO089 tell us that
D matrix column_#2 elements
sum to one.
Constraint
w+x=1 ---eqn.BO075
tell us that
Elements of D matrix row_#1
sum to one.
<a name="ch13b138">Index beginIndex this file
eqn.BO089 and eqn.BO075 has x
as common element.
Both sum to one. Then
w=z=(τ+σ)/(2τ) ---eqn.BO091
Similarly
y=x=(τ-σ)/(2τ) ---eqn.BO092
<a name="ch13b139">
Finally D matrix is
D=[(τ+σ)/(2τ) (τ-σ)/(2τ)] ---eqn.BO093
[(τ-σ)/(2τ) (τ+σ)/(2τ)]
Compare eqn.BO093 with matrix
in eqn.13.14. We found
Hardy, Littlewood, Polya
Representation matrix !!
2010-07-05-09-51 here
<a name="ch13b140">
Above calculation is an easy
job for us. Because we start
on right track.
We start with eqn.BO072 and
eqn.BO075, eqn.BO076.
<a name="ch13b141">
Three masters Hardy,
Littlewood, Polya joint work
for this problem !?
It must NOT be an easy task
for problem developer.
<a name="ch13b142">
Thank you Professor Hardy.
Thank you Professor Littlewood.
Thank you Professor Polya.
Certainly
Thank you Professor J. Michael Steele.
2010-07-05-09-57 stop
<a name="ch13b143">Index beginIndex this file
2010-07-05-10-12 start
■ Majorization ?! Interpolation !!
Can you see what is the meaning
of next two equations?
w*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO073
y*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO074
<a name="ch13b144">
Let us re-write them as next
z*a+x*b=c ---eqn.BO094
x*a+z*b=d ---eqn.BO095
Here (a,b) and (c,d) are two
pairs data. x and z are
interpolation coefficients.
x+z=1 ---eqn.BO089
<a name="ch13b145">
with
0 ≦ σ ≦ τ ---eqn.BO065eqn.BO091 and eqn.BO092 tell us
that both
x≧0 and z≧0 ---eqn.BO096
Then eqn.BO094 and eqn.BO095 are
interpolation of (c,d) on (a,b)
<a name="ch13b146">
For example, assume
a=2 ---eqn.BO097 (four lines)
b=10
x=0.6
z=0.4
we find
c=z*a+x*b=6.8 ---eqn.BO098
d=x*a+z*b=5.2 ---eqn.BO099
<a name="ch13b147">Index beginIndex this file
On the other hand, (a,b) bound
is (2,10) what do we get if
extrapolate 15 on (2,10) ?
Solve for
15=m*2+n*10 ---eqn.BO100
1=m+n ---eqn.BO101
the answer is
m = -5/8 ---eqn.BO102
n = 13/8 ---eqn.BO103
<a name="ch13b148">
extrapolation coefficients sum
to one, because eqn.BO101 require
to be one.
extrapolation coefficients are
out of [0,1] domain !!
<a name="ch13b149">
eqn.BO073, eqn.BO074 with
eqn.BO089 and eqn.BO096 tell
us that
all elements in (ρ+σ, ρ-σ) are
interpolation of (ρ+τ, ρ-τ)<a name="ch13b150">
The fancy word is that
(ρ+τ, ρ-τ) majorize (ρ+σ, ρ-σ)
or
(ρ+σ,ρ-σ) majorized by (ρ+τ,ρ-τ)
Majorization ?! Interpolation !!
2010-07-05-10-53 stop
<a name="ch13b151">Index beginIndex this file
2010-07-05-15-35 start
■ Problem 13.4 (Schur's
Majorization Inequality)
<a name="ch13b152">
Show that if
φ:(a,b)→Real ---eqn.BO104
is a convex function, then the
function
f:(a,b)n→Real ---eqn.BO105
defined by the sum
f(x1,x2,...,xn)
=∑[k=1,n]φ(xk) ---eqn.13.17
is Schur convex.
<a name="ch13b153">
Thus, for α,β∈(a,b)n with
α(<)β ---eqn.BO106
one has
∑[k=1,n]φ(αk)
≦∑[k=1,n]φ(βk) ---eqn.13.18
2010-07-05-15-47 here
<a name="ch13b154">
Please pay attention to the
following points.
(1) φ:(a,b)→Real say
φ(t) is a ONE variable
function. variable t in (a,b)
Function value φ(t) in Real
<a name="ch13b155">
(2) f:(a,b)n→Real say
f(x) is a n variable
function. variable x has
n components. Each component
is in (a,b). Then x∈(a,b)n
f(x) has one output
which is in whole real axis.
<a name="ch13b156">Index beginIndex this file
(3) f(x1,x2,...,xn)=∑[k=1,n]φ(xk)
say that f(x) is a summation
of φ(t).
Alert, not product of φ(t).<a name="ch13b157">
(4) f(x1,x2,...,xn) is defined
for a generic sequence x
When compare magnitude, need
TWO sequences α, β with
majorization relation α(<)β.
Compare f(α) with f(β).
2010-07-05-16-06 here
<a name="ch13b158">
Example from textbook.
Walker's inequalityeqn.13.3
Two sequences are (a,b,c) and
(x,y,z)=(b+c-a,c+a-b,a+b-c) ---eqn.BO107
Require a,b,c,x,y,z all be
positive number.
<a name="ch13b159">
In this example φ(t) is
φ(t)=1/t ---eqn.BO108
and
f(a,b,c)=1/a + 1/b + 1/c ---eqn.BO109
which is same as eqn.BN030
We know that φ(t)=1/t is convex
for t>0 (Curve shape like ╰╯)
tute0046.htm#ch13a076 say that
(a,b,c) (<) (x,y,z) ---eqn.BN026<a name="ch13b160">
Then Problem 13.4 (Schur's
Majorization Inequality)
assure that (see eqn.13.3)
f(a,b,c)≦f(x,y,z) ---eqn.BO110
Up to here, it is just one
example. (not proof)
Prove Schur's Majorization
Inequality below.
2010-07-05-16-28 stop
<a name="ch13b161">Index beginIndex this file
2010-07-05-16-59 start
Schur's Majorization Inequality
cover Jensen's Inequality.
Review Jensen's Inequality
If function domain is convex set
If equation is convex function
Jensen's Inequality says:
AM in domain ≦ AM in range
<a name="ch13b162">
Use Walker's inequalityeqn.13.3
as example. Change a little bit.
Instead of (a,b,c) and
(x,y,z)=(b+c-a,c+a-b,a+b-c)
two sequence.
<a name="ch13b163">
Now use (a,b,c) and
(d,d,d)=((a+b+c)/3, ---eqn.BO111
(a+b+c)/3,
(a+b+c)/3)
(d,d,d) is majorized by (a,b,c)
Please see tute0046.htm#ch13a035
and
φ(t)=1/t ---eqn.BO108
is convex on y in (0, infinity)
<a name="ch13b164">
Schur's Majorization Inequality
tell us that for
f(d,d,d)=1/d + 1/d + 1/d
=3/(a+b+c) + 3/(a+b+c)
+3/(a+b+c)
f(d,d,d)=9/(a+b+c) ---eqn.BO112
is majorized by
f(a,b,c)=1/a + 1/b + 1/c ---eqn.BO113
<a name="ch13b165">Index beginIndex this file
that is
f(d,d,d) ≦ f(a,b,c) ---eqn.BO114
or
9/(a+b+c) ≦ 1/a + 1/b + 1/c ---eqn.BO115
<a name="ch13b166">
On the other hand,
Jensen's Inequality says:
AM in domain ≦ AM in range
For function
φ(t)=1/t ---eqn.BO108
and sequence (a,b,c)
AM of (a,b,c) is
x_AM=(a+b+c)/3 ---eqn.BO116
AM in domain is
f(x_AM)=1/x_AM=3/(a+b+c) ---eqn.BO117
<a name="ch13b167">
AM in range is
(f(a)+f(b)+f(c))/3
that is
(1/a +1/b +1/c)/3 ---eqn.BO118
<a name="ch13b168">
Jensen's Inequality says:
AM in domain ≦ AM in range
give us
3/(a+b+c) ≦ (1/a +1/b +1/c)/3 ---eqn.BO119
Move greater than side
denominator 3 to less than side
get
9/(a+b+c) ---eqn.BO120
≦ (1/a +1/b +1/c)
<a name="ch13b169">Index beginIndex this file
Schur's Majorization Inequality
result eqn.BO115 and
Jensen Inequality result eqn.BO120
are the same !!
2010-07-05-17-29 stop
<a name="ch13b170">
2010-07-05-18-33 start
If a function is convex. If this
function is differentiable. then
convex and Schur convex are the
same. because symmetric analytic
convex function satisfy Schur's
Criterion eqn.13.4<a name="ch13b171">Exercise 6.19 say Convex function
slope monotone increase.
The default not-told part is that
"independent variable increase".
The whole picture is
when independent variable increase
convex function slope monotone
increase.
<a name="ch13b172">
Above is for one variable in n
variables.
Use one example, let
x=[x1,x2,...,xn] ---eqn.BO121
and
φ(t)=1/t ---eqn.BO122
Define
f(x)=∑[i=1,n]φ(xi) ---eqn.BO123
then
f(x)=1/x1 + 1/x2 + ...
+ 1/xn ---eqn.BO124
<a name="ch13b173">Index beginIndex this file
For all n variables, the variable
pairs (xj, xk) and slope pair
(d[f(x)]/d[xj], d[f(x)]/d[xk])
are similar ordered.
<a name="ch13b174">
Schur's differential
(xj-xk)*(d[f(x)]/d[xj]
-d[f(x)]/d[xk]) ●●change
= ---eqn.BO125 //page 202 line 9
(xj-xk)*(φ'(xj)-φ'(xk))
<a name="ch13b175">
For selected example,
Schur's differential
= ---eqn.BO126
(xj-xk)*[-1/(xj2)-(-1/(xk2))]
is non-negative.
2010-07-05-19-27 here
<a name="ch13b176">
xj=1
xk=2
(xj-xk)*(-1/(xj*xj)-(-1/(xk*xk)))
=
(xj-xk)*(-1/(xj*xj)+1/(xk*xk))
=
(xj-xk)*(-xk*xk+xj*xj)/(xj*xj*xk*xk)
=
(xj-xk)*(xj-xk)*(xj+xk)/(xj*xj*xk*xk)
= 0.75
<a name="ch13b177">
2010-07-05-19-33
Schur's differential
is non-negative.
Because
(xj-xk)*[-1/(xj2)-(-1/(xk2))]
= (xj-xk)2*(xj+xk)/(xj2xk2) ---eqn.BO127
are squares and positive term
sum, which is non-negative.
2010-07-05-19-37 stop
<a name="ch13b178">Index beginIndex this file
2010-07-05-21-11 start
■ Schur's Criterion Limitation
Schur's Criterion eqn.13.4
calculate between two independent
variables, that is manipulate on
two different coordinate axis.
How can we draw ineqaulity result
from two axis calculation?<a name="ch13b179">
For example, assume
g(y,z)=y*y+1/z ---eqn.BO128
Schur's Criterion become
(y-z)*{∂[g(y,z)]/∂y - ∂[g(y,z)]/∂z}
=(y-z)*{2*y - (-1)/z/z}
=(y-z)*(2*y + 1/z/z) ---eqn.BO129
There is no rule promise this
production be non-negative.
y=2,z=3; ans=-4.111111111111111
y=5,z=3; ans=20.22222222222222
Can you answer this question?
<a name="ch13b180">
Schur's Criterion Limitation
is a point worth our attention.
No rule promise eqn.BO129
production be non-negative.
That is true.
<a name="ch13b181">
But
g(y,z)=y*y+1/z ---eqn.BO128
is NOT a Schur convex function.
y*y is convex, true.
1/z is convex, true.
g(y,z) is addition of two convex
functions. That is also true.
<a name="ch13b182">Index beginIndex this file
Schur require that ONE function
φ:(a,b)→Real ---eqn.BO104
be convex, and the multi-variable
function
f(x1,x2,...,xn)=∑[k=1,n]φ(xk) ---eqn.13.17
be summation of function φ. Each φ
sit on different coordinate axis.
Measured with this requirement,
g(y,z)=y*y+1/z ---eqn.BO128
is NOT a Schur convex function.
//h1(y,z)=y*y+z*z IS Schur convex
//h2(y,z)=1/y+1/z IS Schur convex
<a name="ch13b183">
Return to Problem 13.1
Schur's Criterion.
Problem 13.1 did not require the
definition of eqn.13.17. If we
put eqn.BO128 into eqn.13.4
there is trouble. Possibly
Problem 13.1 should include
eqn.13.17 as part of definition.
<a name="ch13b184">Schur's Criterion has a relative
small definition domain.
Many equation not qualified and
many sequences not qualified.
When we apply Schur's Criterion
we must be very careful.<a name="ch13b185">
Schur's Criterion Limitation is
(similar notes)
(1) ONE function
φ(t):(a,b)→Real ---eqn.BO104
and the multi-variable
function
f(x1,x2,...,xn)=∑[k=1,n]φ(xk) ---eqn.13.17
is a SUMMATION of φ(t) and
change t to x1,x2,...,xn
then
g(y,z)=y*y+1/z ---eqn.BO128
is not qualified.<a name="ch13b186">
(2) f(x) be symmetry to all
of its variables.
This requirement is actually
a result of (1).
<a name="ch13b187">Index beginIndex this file
(3) One function defined and two
sequences compared. These two
sequences have majorization
relation.
<a name="ch13b188">
Although
g(y,z)=y*y+1/z ---eqn.BO128
is not qualified.
but
h(y,z)=1/y+1/z ---eqn.BO130
is qualified.
Because we have ONE function
φ(t)=1/t and multi-variable
function be summation of φ(t).
<a name="ch13b189">
That is
h(y,z)=φ(y)+φ(z) ---eqn.BO131
If φ(t)=y*y or φ(t)=exp(t) or
whatever, φ(y) and φ(z) are
identical twin. Calculation
between φ(y) and φ(z) is
reasonable.
<a name="ch13b190">
Liu,Hsinhan had question about
Schur's Criterion eqn.13.4.
After think, conclude
g(y,z)=y*y+1/z ---eqn.BO128
is not qualified and
h(y,z)=1/y+1/z ---eqn.BO130
is possible.
<a name="ch13b191">
Do you agree? You can say no,
you can present a better
explanation. This page is just
a study notes. This page may
contain errors!
2010-07-05-22-11 stop
<a name="ch13b192">Index beginIndex this file
2010-07-05-22-36 start
■ Irrational beat rational
LiuHH wrote a small program
fraction to quotient
Click [t] button (test), program
generate a x=0.5945165945165945
and report x box filled 412/693
Click [fill M5] build integers.
Then click [x*M5], box M6 show
<a name="ch13b193">
answer
[[
Find minimum fraction.
Min: i=692,inNumb[i]=693
frac=0
x=0.5945165945165945
x=412/693
]]
<a name="ch13b194">
This program convert fraction to
quotient. During work, try change
from x=0.5945165945165945
to x=0.5945265945165945
Just alter one digit. Program
<a name="ch13b195">
report
[[
Find minimum fraction.
x=0.5945265945165945
x=NaN/undefined
iter=3000,min0=90,min1=90
minAtI=-1; Please extend n2
]]
<a name="ch13b196">
LiuHH think:
There are infinite many rational
There are infinite many irrational
Which one has more count?
What is the count ratio?
<a name="ch13b197">Index beginIndex this file
For example
1/3=
0.333333333333333333333333333...
Alter any '3' to for example '2'
the report is
x=NaN/undefined
<a name="ch13b198">Each altered digit string is
NOT a rational number any more
because rational number has
repeat pattern. I just alter
one digit, there is no repeat,
so the new string is irrational.
x=412/693=0.5945165945165945
repeat "594516" infinity times.
<a name="ch13b199">
One rational number 1/3 has
infinite many '3'. I can alter
infinite many different ways.
The ratio of rational 1/3 to
correspond irrational is one
to infinity !!
Any rational number be the same.
On the real axis, the ratio of
rational to irrational is one
to infinity !!
<a name="ch13b200">
Draw a line between 0 and 1.
Put pencil tip arbitrary on line
The probability of hitting a
rational is nearly zero.
The probability of hitting an
irrational is nearly one.
This is not a proof. Just a
break time think.
2010-07-05-22-55 stop
<a name="ch13b201">Index beginIndex this file
2010-07-06-10-28 start
■ A Direct Approach to Schur's
Majorization Inequality
eqn.BO125 use differentiation
φ'(xj) and φ'(xk) to evaluate
Schur's Majorization Inequality.
The following we do same thing
but not use differentiation.
<a name="ch13b202">
Schur's Majorization Inequality
∑[k=1,n]φ(αk)
≦∑[k=1,n]φ(βk) ---eqn.13.18
Given α, β the next relation
α(<)β ---eqn.BO106
Problem 13.3 promise that
we can find a doubly stochastic
matrix D such that
α=D*β ---eqn.BN121<a name="ch13b203">
Write eqn.BN121 in longhand for
each j,1,2,...,n, we have
αj=∑[k=1,n]djk*βk ---eqn.BO132
where
dj1+dj2+...+djn=1 ---eqn.BO133
(textbook page 202 line -11)
100% numerical example in matrix
form eqn.BN122
100% numerical expanded example
eqn.BO009 to eqn.BO013
50% numerical example eqn.BN148<a name="ch13b204">eqn.BO106 is a given condition.
Problem 13.3 proved result tell
us that α=D*β is also GIVEN.
For example, eqn.BN148 is given.
<a name="ch13b205">
From eqn.BN148 take α3 as example
α3 = β1*1/9+β2*16/90+β3*64/90
+β4*0 ---eqn.BO134
α3 is observed data.
β1 to β4 are also observed data.
α3 is an interpolation of β1 to
β4. (1/9,16/90,64/90 >0, sum=1)
<a name="ch13b206">Index beginIndex this file
Next, apply Jensen's inequality
to eqn.BO134. Jensen take α3 as
an input, as an independent
variable.
<a name="ch13b207">
Jensen's inequality say
For convex function, we have
AM_DOMAIN ≦ AM_RANGE ---eqn.BO135
See eqn.6.2
For current problem, we assume
that φ(t) is a convex function.
<a name="ch13b208">
Jensen's inequality tell us
φ(x_AM) ≦ AM of φ(xi) ---eqn.BO136
Put eqn.BO134 into eqn.BO136 get
φ(β1*1/9+β2*16/90+β3*64/90+0)
≦ φ(β1)*1/9+φ(β2)*16/90
+φ(β3)*64/90+φ(β4)*0 ---eqn.BO137
Red line is AM_DOMAIN
≦
Blue line is AM_RANGE
<a name="ch13b209">Alert, red line is same as φ(α3)
Alert, coefficient sum to one
1/9+16/90+64/90+0 = 1 ---eqn.BO138
all four coefficients≧0
We can call red line and blue
line as "mean" only if coefficient
sum to one and all coefficients≧0.
Both red and blue use SAME
coefficient set.
<a name="ch13b210">
Inequality eqn.BO137 is a
numerical example. Symbolic
expression is textbook page 202
line -9 equation
φ(αj)≦∑[k=1,n]djk*φ(βk) ---eqn.BO139
compare //above φ(data), below data
αj= ∑[k=1,n]djk*βk ---eqn.BO132
Observed data αj and βk have
equality relation.
Function φ(t) is convex. then
φ(αj) and φ(βk) have inequality
relation.
<a name="ch13b211">Index beginIndex this file
Inequality eqn.BO139 is our mid
point. Next, sum j from j=1 to j=n
That is write down n inequalities
like eqn.BO139 and sum them. The
result is
---page 202
---line 26
---eqn.BO140
width of above equation
<a name="ch13b213">
2010-07-06-12-18 here
eqn.BO140 red inequality is
n copy of eqn.BO139, which is
a result of Jensen's inequality.
eqn.BO140 red equality is switch
j,k summation to k,j summation.
eqn.BO140 blue equality use the
property that doubly stochastic
matrix column sum to one.
2010-07-06-12-27 stop
<a name="ch13b214">
2010-07-06-16-10 start
The net result of eqn.BO140 is
∑[j=1,n]φ(αj)
≦∑[k=1,n]φ(βk) ---eqn.BO141
We start from given
α(<)β ---eqn.BO106
and arrived eqn.BO141, it is
target eqn.13.18 exactly.
This Direct Approach to Schur's
Majorization Inequality did not
use function differentiation.
<a name="ch13b215">Index beginIndex this file
2010-07-06-16-18 here
■ Problem 13.5
A Day-to-Day Example
(Textbook page 203 top)
Given that x,y,z in (0,1) such
that
max(x,y,z)≦(x+y+z)/2<1 ---eqn.13.19
Show that one has the bound
---page 203
---line 8
---eqn.13.20
width of above equation
<a name="ch13b217">
2010-07-06-16-30 here
If do not view from majorization
This problem is hard to solve.
eqn.13.19 is given. What it tell
us?
(x,y,z) is a sequence.
(x+y+z)/2 is sequence sum divide
by 2, not divide by 3, then
(x+y+z)/2 is not Arithmetic Mean
<a name="ch13b218">
Can we build second sequence from
the given (x,y,z). How to build?
From the expression (x+y+z)/2 let
us define
s=(x+y+z)/2 ---eqn.BO142
<a name="ch13b219">
Second sequence can be (s,s,0)
Both (x,y,z) and (s,s,0) have
same total sum x+y+z.
Two seq. have same total sum
that is majorization relation
necessary condition.
<a name="ch13b220">Index beginIndex this file
We assume that
x≧y≧z ---eqn.BO143
This assumption still let us
keep problem's generality.
Next exam (x,y,z) and (s,s,0)
for majorization relation.
<a name="ch13b221">
First one term comparison is
x ≦?≧ s ---eqn.BO144
How do we know it ?
Given condition eqn.13.19
max(x,y,z)≦(x+y+z)/2 ---eqn.13.19
let us have
x ≦ s ---eqn.BO145
where s=(x+y+z)/2
//(x+y+z)/2<1 in eqn.13.19 let
//eqn.13.20 denominator be
//non zero.
<a name="ch13b222">
First two term comparison is
x+y ≦?≧ s+s ---eqn.BO146
Since
s+s=2*[(x+y+z)/2]=x+y+z ---eqn.BO147
eqn.BO146 become
x+y ≦?≧ x+y+z ---eqn.BO148
Given x,y,z in (0,1) be positive
x+y ≦ x+y+z ---eqn.BO149
is true.
<a name="ch13b223">
Final equality is already checked
in
[[
Both (x,y,z) and (s,s,0) have
same total sum x+y+z.
]]
<a name="ch13b224">
We conclude that
Majorization relation exist.
it is
(x,y,z) (<) (s,s,0) ---eqn.BO150
<a name="ch13b225">Index beginIndex this file
Schur's Majorization Inequality
∑[k=1,n]φ(αk)
≦∑[k=1,n]φ(βk) ---eqn.13.18
is a equation of summation.
Problem 13.5 target equation
eqn.13.20 is a equation of
multiplication. They are totally
different. What to do?
2010-07-06-16-56 stop
<a name="ch13b226">
2010-07-06-18-27 start
We can modify the target equation
from [(1+x)/(1-x)]*[(1+y)/(1-y)]*[(1+z)/(1-z)]
to [(1+x)/(1-x)]+[(1+y)/(1-y)]+[(1+z)/(1-z)]
<a name="ch13b227">
Certainly, this change is wrong.
We know that
log(ABC)=log(A)+log(B)+log(C) ---eqn.BO151
If we take log of eqn.13.20, then
multiplication ABC become addition
log(A)+log(B)+log(C)
<a name="ch13b228">
Next question is
inequality of ABC and
inequality of log(A)+log(B)+log(C)
are they the same?
Answer is yes, they are the same.
<a name="ch13b229">
The reason is that log() function
is a monotone increase function.
Draw log(x) curve and take two
points x1 and x2. If
x1<x2 ---eqn.BO152
then
log(x1)<log(x2) ---eqn.BO153
We can add log(), we can drop
log(), INEQUALITY is unchanged.<a name="ch13b230">Index beginIndex this file
exp(x) is monotone increase
function. Although exp(x) is
concave. But take exp() for
one expression, the inequality
direction is the same.
<a name="ch13b231">
On the other hand, if define
g(x)=1/x ---eqn.BO154
then
x1<x2 ---eqn.BO152
get reversed inequality
1/x1>1/x2 ---eqn.BO155
Because g(x)=1/x is monotone
DEcrease.
2010-07-06-18-53 here
<a name="ch13b232">
Now define
f(x,y,z)= log{[(1+x)/(1-x)]
*[(1+y)/(1-y)]
*[(1+z)/(1-z)]
} ---eqn.BO156
that is
<a name="ch13b233">
f(x,y,z)= log[(1+x)/(1-x)]
+log[(1+y)/(1-y)]
+log[(1+z)/(1-z)] ---eqn.BO157
<a name="ch13b234">
Compare eqn.BO157 with eqn.13.17
find we need define
φ(t)=log[(1+t)/(1-t)] ---eqn.BO158
Is φ(t) a convex function?
Check.
<a name="ch13b235">Index beginIndex this file
d[φ(t)]/dt
=d{log[(1+t)/(1-t)]}/dt
=[(1-t)/(1+t)]*d[(1+t)/(1-t)]/dt
=[(1-t)/(1+t)]*[1/(1-t)
+ (1+t)*(-1)*(-1)/(1-t)/(1-t)]
=[(1-t)/(1+t)]
*[(1-t)+(1+t)]/(1-t)/(1-t)
=[(1-t)/(1+t)]
*[2]/(1-t)/(1-t)
=[2]/(1-t)/(1+t)
<a name="ch13b236">
First derivative is
d[φ(t)]/dt
=2/(1-t*t) ---eqn.BO158
The expression [*(-1)*(-1)]
one [*(-1)] come from 1/(1-t) has
power -1
Second [*(-1)] come from (1-t)
has coefficient -1.
2010-07-06-19-49 here
<a name="ch13b237">
Continue for second derivative
dd[φ(t)]/dt/dt
=d[2/(1-t*t)]/dt
=2*(-1)*(-2*t)/(1-t*t)/(1-t*t)
<a name="ch13b238">
Second derivative is
dd[φ(t)]/dt/dt
=4*t/(1-t*t)/(1-t*t) ---eqn.BO159
<a name="ch13b239">Problem 13.5 given that x,y,z in (0,1)
then t is in (0,1)
Second derivative is positive in
(0,1). Therefore
φ(t)=log[(1+t)/(1-t)] ---eqn.BO158
is convex on (0,1).
eqn.BO157 f(x,y,z) is Schur convex.
<a name="ch13b240">Index beginIndex this file
We conclude that
(x,y,z) (<) (s,s,0) ---eqn.BO150
is true. Apply
Schur's Majorization Inequality
eqn.13.18 find the next relation.
similar to
---page 203
---line 8
---eqn.BO160
width of above equation
<a name="ch13b242">
2010-07-06-20-19 here
In eqn.BO160, carry out
log(ABC)=log(A)+log(B)+log(C) ---eqn.BO151
and DROP log() from equation
get //how can I drop log()?<a name="ch13b243">
//next equation is ---eqn.BO161
[(1+x)/(1-x)]*[(1+y)/(1-y)]
*[(1+z)/(1-z)] //seq. (x,y,z)≦[(1+s)/(1-s)]*[(1+s)/(1-s)]
*[(1+0)/(1-0)] //seq. (s,s,0)
eqn.BO161 is our target equation
eqn.13.20
Problem 13.5 is done.
2010-07-06-20-48 stop
<a name="ch13b244">Index beginIndex this file
2010-07-07-10-57 start
■ Overdetermined system
In D matrix
D=[w x] ---eqn.BO070
[y z]
we found
x+z=1 ---eqn.BO089
This is a lucky coincidence.
Why 'lucky coincidence' ?
<a name="ch13b245">Textbook page 198, line -1 say
"The system is overdetermined,
but it does have a solution."
Why 'overdetermined' ?
<a name="ch13b246">
Matrix D has four unknowns, w,
x,y,z. We need four constraint
to solve for four unknowns.
Constraints sre
w*(ρ+τ)+x*(ρ-τ)=ρ+σ ---eqn.BO073
y*(ρ+τ)+z*(ρ-τ)=ρ-σ ---eqn.BO074
and
<a name="ch13b247">
w+x=1 ---eqn.BO075
y+z=1 ---eqn.BO076
In above four equations
ρ,τ,σ are given constants
w,x,y,z are unknowns.
<a name="ch13b248">Index beginIndex this file
We solve four unknowns and four
constraint problem, get
(τ-σ)/(2τ)=x ---eqn.BO087
(τ+σ)/(2τ)=z ---eqn.BO088
verify found
x+z=1 ---eqn.BO089<a name="ch13b249">We are given w+x=1 and y+z=1
the result x+z=1 is a surprise
Because x+z is not constrained.
x+z can be ANY value. Now x+z=1
let us have "column sum to one"
doubly stochastic matrix
requirement for free !!<a name="ch13b250">
x+z=1 is equivalent to
FIFTH constraint. We have a
four unknown system. Fifth
constraint is overdetermined.
Fifth is consistent with
previous four, so we are lucky.
<a name="ch13b251">
Prof. J. Michael Steele say
"overdetermined, but ..." and
LiuHH say "lucky coincidence"
Is it necessary to be one ?!
Under what condition column
sum x+z would NOT be one?
<a name="ch13b252">
Let us start from
[w x]*[ρ+τ]=[ρ+σ] ---eqn.BO072
[y z] [ρ-τ] [ρ-σ]
Now insert coefficients a,b,c,d
find out if use non-one a,b,c,d.
what do we get ? Treat a,b,c,d
as adjustable constants.
2010-07-07-11-24 here
<a name="ch13b253">Index beginIndex this file
2010-07-07-11-41 start
Write eqn.BO072 as
[w x]*[aρ+bτ]=[cρ+dσ] ---eqn.BO162
[y z] [aρ-bτ] [cρ-dσ]
If set a=b=c=d=1 then no change.
Expand eqn.BO162, find
w*(aρ+bτ)+x*(aρ-bτ)=cρ+dσ ---eqn.BO163
y*(aρ+bτ)+z*(aρ-bτ)=cρ-dσ ---eqn.BO164
Solve for w,x,y,z.
<a name="ch13b254">
Use eqn.BO075 and eqn.BO076 to
reduce unknown from w,x,y,z
four unknown to x,z two unknown
(1-x)*(aρ+bτ)+x*(aρ-bτ)=cρ+dσ ---eqn.BO165
(1-z)*(aρ+bτ)+z*(aρ-bτ)=cρ-dσ ---eqn.BO166
<a name="ch13b255">
Re-group get
(aρ+bτ)-x*(aρ+bτ)+x*(aρ-bτ)=cρ+dσ ---eqn.BO167
(aρ+bτ)-z*(aρ+bτ)+z*(aρ-bτ)=cρ-dσ ---eqn.BO168
or
(aρ+bτ)-x*(2bτ)=cρ+dσ ---eqn.BO169
(aρ+bτ)-z*(2bτ)=cρ-dσ ---eqn.BO170
<a name="ch13b256">
or
(aρ+bτ)-(cρ+dσ)=x*(2bτ) ---eqn.BO171
(aρ+bτ)-(cρ-dσ)=z*(2bτ) ---eqn.BO172
or
(aρ-cρ)+(bτ-dσ)=x*(2bτ) ---eqn.BO173
(aρ-cρ)+(bτ+dσ)=z*(2bτ) ---eqn.BO174
<a name="ch13b257">Index beginIndex this file
or
[(aρ-cρ)+(bτ-dσ)]/(2bτ)=x ---eqn.BO175
[(aρ-cρ)+(bτ+dσ)]/(2bτ)=z ---eqn.BO176
Above is solution for x,z.
<a name="ch13b258">
verify column sum x+z
x+z={[(aρ-cρ)+(bτ-dσ)]/(2bτ)}
+{[(aρ-cρ)+(bτ+dσ)]/(2bτ)} ---eqn.BO177
x+z={[(aρ-cρ)+(bτ-dσ)] ---eqn.BO178
+ [(aρ-cρ)+(bτ+dσ)]}/(2bτ)
x+z={2*(aρ-cρ)+ 2*bτ}/(2bτ) ---eqn.BO179
<a name="ch13b259">
x+z={aρ-cρ + bτ}/(bτ) ---eqn.BO180
if a≠c, then
x+z={aρ-cρ + bτ}/(bτ)
NOT = 1
if a=c, then x+z=1
<a name="ch13b260">
We started from eqn.BO072
all ρ,τ,σ coefficients be one.
This condition give us column
sum to one result.
"Overdetermined" become
just-right-determined.
2010-07-07-12-01 stop
<a name="ch13b261">Index beginIndex this file
2010-07-07-13-12 start
Textbook page 204 figure 13.2
illustrate the relations between
α∈H(β)α(<)βα=Dβ
α=T1T2...Tnβ ---eqn.BO181
Please see Proof of eqn.BO181
<a name="ch13b262">
//similar equation
//α seq=P2*R2*P1*R1*ß seq ---eqn.maj01
//P2=permutation matrix, second step
//R2=RobinHood average mat. 2nd step
Figure 13.2 is not in this study
note page. Please read textbook.
2010-07-07-13-28 stop
2010-07-07-19-12 done first proofread
2010-07-08-13-59 done second proofread
<a name="ch13b263">Index beginIndex this file
2010-07-08-15-33 start
■ Doubly stochastic matrix
multiply DS get DS
Problem 13.3 given condition is
α and β two sequences have
majorization relation
α(<)β ---eqn.BO066
ask to prove there exists a
doubly stochastic matrix D
such that //example 12
α=Dβ ---eqn.BO001<a name="ch13b264">
The build of matrix D need several
iterations. i-th iteration find a
matrix Di, the final answer is
α=Dβ=DnDn-1...D2D1β ---eqn.BO182
Doubly stochastic matrix Di
multiply
doubly stochastic matrix Dj
get matrix Dk. We need to know
whether matrix Dk is
doubly stochastic or not?
<a name="ch13b265">
Link to symbolic n=3, n=general
Start with 3x3 numerical example.
First
α seq.= 8 6 4 ---eqn.BO183
β seq.= 9 7 2 ---eqn.BO184
get doubly stochastic
matrix m =
[5/6 1/30 2/15] ---eqn.BO185
[ 0 4/5 1/5 ]
[1/6 1/6 2/3 ]
<a name="ch13b266">
Second
α seq.= 0.5 0.5 0 ---eqn.BO186
β seq.= 2 0 -1 ---eqn.BO187
get doubly stochastic
0.45 0.15 0.4
0.25 0.75 0
0.3 0.1 0.6
that is
matrix n =
[9/20 3/20 2/5] ---eqn.BO188
[1/4 3/4 0 ]
[3/10 1/10 3/5]
<a name="ch13b267">
matrix m multiply matrix n =
[5/6 1/30 2/15] [9/20 3/20 2/5]
[ 0 4/5 1/5 ]*[1/4 3/4 0 ]
[1/6 1/6 2/3 ] [3/10 1/10 3/5]
= matrix s =
[s11 s12 s13] ---eqn.BO189
[s21 s22 s23]
[s31 s32 s33]
<a name="ch13b268">Index beginIndex this file
where
s11=5/6*9/20 + 1/30*1/4 + 2/15*3/10
=127/300 ---eqn.BO190
s12=5/6*3/20 + 1/30*3/4 + 2/15*1/10
= 49/300 ---eqn.BO191
s13=5/6*2/5 + 1/30*0 + 2/15*3/5
=31/75 ---eqn.BO192
<a name="ch13b269">
s21= 0 *9/20 + 4/5 *1/4 + 1/5 *3/10
=13/50 ---eqn.BO193
s22= 0 *3/20 + 4/5 *3/4 + 1/5 *1/10
=31/50 ---eqn.BO194
s23= 0 *2/5 + 4/5 *0 + 1/5 *3/5
=3/25 ---eqn.BO195
<a name="ch13b270">
s31=1/6*9/20 + 1/6 *1/4 + 2/3 *3/10
=19/60 ---eqn.BO196
s32=1/6*3/20 + 1/6 *3/4 + 2/3 *1/10
=13/60 ---eqn.BO197
s33=1/6*2/5 + 1/6 *0 + 2/3 *3/5
=7/15 ---eqn.BO198
<a name="ch13b271">
matrix m multiply matrix n =
[5/6 1/30 2/15] [9/20 3/20 2/5]
[ 0 4/5 1/5 ]*[1/4 3/4 0 ]
[1/6 1/6 2/3 ] [3/10 1/10 3/5]
= matrix s = ---eqn.BO199
[127/300 13/50 19/60]
[ 49/300 31/50 13/60]
[ 31/75 3/25 7/15]
<a name="ch13b272">
Now matrix s is a product of two
doubly stochastic matrix m and n
Is matrix s doubly stochastic ?
It is clear that all matrix s
elements are positive. Need to
check matrix s column/row sums.
(define doubly stochastic)
<a name="ch13b273">Index beginIndex this file
matrix s column 1 sum = ---eqn.BO200
127/300 + 49/300 + 31/75 =1
matrix s column 2 sum = ---eqn.BO201
13/50 + 31/50 + 3/25 =1
matrix s column 3 sum = ---eqn.BO202
19/60 + 13/60 + 7/15 =1
<a name="ch13b274">
matrix s row 1 sum = ---eqn.BO203
127/300 + 13/50 + 19/60 =1
matrix s row 2 sum = ---eqn.BO204
49/300 + 31/50 + 13/60 =1
matrix s row 3 sum = ---eqn.BO205
31/75 + 3/25 + 7/15 =1
<a name="ch13b275">
One numerical example confirmed
that
Doubly stochastic matrix m
multiply
doubly stochastic matrix n
get
doubly stochastic matrix s.
<a name="ch13b276">
Link to symbolic n=general, numerical
Next use symbolic equation to
check n=3 special case.
Doubly stochastic matrix m=
[m11 m12 m13] ---eqn.BO206
[m21 m22 m23]
[m31 m32 m33]
<a name="ch13b277">
doubly stochastic matrix n=
[n11 n12 n13] ---eqn.BO207
[n21 n22 n23]
[n31 n32 n33]
<a name="ch13b278">Index beginIndex this file
Both matrix m and n
each row sum to one
each column sum to one
Both matrix elements be
nonnegative.
<a name="ch13b279">
matrix m multiply matrix n =
[m11 m12 m13] [n11 n12 n13]
[m21 m22 m23]*[n21 n22 n23]
[m31 m32 m33] [n31 n32 n33]
<a name="ch13b280">
= matrix s = ---eqn.BO208
[s11 s12 s13]
[s21 s22 s23]
[s31 s32 s33]
<a name="ch13b281">
where
s11=m11*n11+m12*n21+m13*n31 ---eqn.BO209
s12=m11*n12+m12*n22+m13*n32 ---eqn.BO210
s13=m11*n13+m12*n23+m13*n33 ---eqn.BO211
s21=m21*n11+m22*n21+m23*n31 ---eqn.BO212
s22=m21*n12+m22*n22+m23*n32 ---eqn.BO213
s23=m21*n13+m22*n23+m23*n33 ---eqn.BO214
<a name="ch13b282">ATTENTION
s31=m31*n11+m32*n21+m33*n31 ---eqn.BO215
s32=m31*n12+m32*n22+m33*n32 ---eqn.BO216
s33=m31*n13+m32*n23+m33*n33 ---eqn.BO217
matrix s = matrix m * matrix n
Watch and learn the pattern.
sij=mi1*n1j+mi2*n2j+mi3*n3j ---eqn.BO317
sum k=1k=2k=3
sum middle third index k
from k=1 to k=n. All be the same.
Symbolic equation: eqn.BO233<a name="ch13b283">Index beginIndex this file
s12 is selected for color
illustration, help freshman
reader to know the matrix
product rule.
eqn.BO317 is important too.
<a name="ch13b284">
Check matrix s column 1 sum
s11+s21+s31= ---eqn.BO218
m11*n11+m12*n21+m13*n31
+m21*n11+m22*n21+m23*n31
+m31*n11+m32*n21+m33*n31
=
(m11+m21+m31)*n11 ---eqn.BO219
+(m12+m22+m32)*n21
+(m13+m23+m33)*n31
<a name="ch13b285">
The given condition is
matrix m
each row sum to one
each column sum to one
then we have
m11+m21+m31=1 ---eqn.BO220
m12+m22+m32=1 ---eqn.BO221
m13+m23+m33=1 ---eqn.BO222
<a name="ch13b286">
eqn.BO219 become
matrix s column 1 sum
s11+s21+s31= ---eqn.BO223
1*n11+1*n21+1*n31
= n11+ n21+ n31
Still, given matrix n
each row sum to one
each column sum to one
then we have
n11+ n21+ n31 = 1 ---eqn.BO224
matrix s column 1 sum
s11+s21+s31=1 ---eqn.BO225
<a name="ch13b287">
Same reason apply to matrix s
column 2 sum
column 3 sum
row 1 sum
row 2 sum
row 3 sum
therefore matrix s, the result
of matrix m * matrix n, is again
a doubly stochastic matrix.
This is rank=3 special case.
2010-07-08-16-50 here
<a name="ch13b288">Index beginIndex this file
Link to symbolic n=3, numerical
Above is n=3 special case.
Below is n=any general case.
Given rank n
doubly stochastic matrix A
and rank n
doubly stochastic matrix B
Prove that C=AB is also
doubly stochastic matrix
<a name="ch13b289">
Given condition let us have
matrix A column sum be one
∑[i=1,n]aij=1 ---eqn.BO226
for j=1,...,n
matrix A row sum be one
∑[j=1,n]aij=1 ---eqn.BO227
for i=1,...,n
<a name="ch13b290">
matrix B column sum be one
∑[i=1,n]bij=1 ---eqn.BO228
for j=1,...,n
matrix B row sum be one
∑[j=1,n]bij=1 ---eqn.BO229
for i=1,...,n
<a name="ch13b291">
and given
matrix A elements be nonnegative
aij≧0 ---eqn.BO230
for i,j=1,...,n
matrix B elements be nonnegative
bij≧0 ---eqn.BO231
for i,j=1,...,n
<a name="ch13b292">
Matrix production
C=AB ---eqn.BO232
for each elelent cij has
cij=∑[k=1,n]aik*bkj ---eqn.BO233
See ATTENTION for k summation
<a name="ch13b293">Index beginIndex this file
matrix C column sum be
∑[i=1,n]cij ---eqn.BO234
=∑[i=1,n]∑[k=1,n]aik*bkj
= //switch i_sum with k_sum
//bkj no i, sum i first
∑[k=1,n]{∑[i=1,n]aik}*bkj ---eqn.BO235
= // matrix A column sum be one
∑[k=1,n]{1}*bkj
// matrix B column sum be one
= 1 ---eqn.BO236
This is general for j=1,...,n
Above proved
matrix C column sum be ONE.
Next is
<a name="ch13b294">
matrix C row sum be
∑[j=1,n]cij ---eqn.BO237
=∑[j=1,n]∑[k=1,n]aik*bkj
= //switch j_sum with k_sum
//aik no j, sum j first
∑[k=1,n]{∑[j=1,n]bkj}*aik ---eqn.BO238
= // matrix B row sum be one
∑[k=1,n]{1}*aik
// matrix A row sum be one
= 1 ---eqn.BO239
This is general for i=1,...,n
Above proved
matrix C row sum be ONE.
2010-07-08-17-20 here
<a name="ch13b295">
Matrix C elements are defined
in eqn.BO233. Since it is
addition and multiplication
of nonnegative numbers aik
and bkj, then cij must be
nonnegative number.
<a name="ch13b296">
Conclusion:
For doubly stochastic matrix
A and B, their production
matrix C=AB is also a
doubly stochastic matrix.
(how about D=ABC ?)
//doubly stochastic definition
2010-07-08-17-27 stop
<a name="ch13b297">Index beginIndex this file
2010-07-08-18-57 start
■ Convex combination of doubly
stochastic is doubly stochastic
<a name="ch13b298">
Let matrix A,B,C be same rank n
doubly stochastic matrices.
Above discussed that matrix
multiplication
D=ABC ---eqn.BO240
is again doubly stochastic matrix.
<a name="ch13b299">
Below discuss that matrix addition
E=A+B+C ---eqn.BO241
is or is not doubly stochastic.
<a name="ch13b300">
The definition of matrix addition
is termwise addition. That is
[m11 m12 m13] [n11 n12 n13]
[m21 m22 m23]+[n21 n22 n23]
[m31 m32 m33] [n31 n32 n33]
= ---eqn.BO242
[m11+n11 m12+n12 m13+n13]
[m21+n21 m22+n22 m23+n23]
[m31+n31 m32+n32 m33+n33]
<a name="ch13b301">
If matrix M and N are doubly
stochastic, then the addition
result eqn.BO242 can NOT be
doubly stochastic. Because
column_one sum of M+N is
(m11+n11)+(m21+n21)+(m31+n31)
<a name="ch13b302">Index beginIndex this file
which is
(m11+m21+m31)+(n11+n21+n31)
its value is 2. because
(m11+m21+m31) is matrix M
column_one sum, value is one.
(n11+n21+n31) is matrix N
column_one sum, value is one.
Adding two one get two. Then
<a name="ch13b303">
adding three doubly stochastic
E=A+B+C ---eqn.BO241
has column sum three.
has row sum three too.
<a name="ch13b304">
Now assumn
p+q=1 ---eqn.BO243
assume
p≧0 and q≧0 ---eqn.BO244
<a name="ch13b305">
The weighted matrix sum
[m11 m12 m13] [n11 n12 n13]
p*[m21 m22 m23]+q*[n21 n22 n23]
[m31 m32 m33] [n31 n32 n33]
= ---eqn.BO245
[p*m11+q*n11 p*m12+q*n12 p*m13+q*n13]
[p*m21+q*n21 p*m22+q*n22 p*m23+q*n23]
[p*m31+q*n31 p*m32+q*n32 p*m33+q*n33]
IS a doubly stochastic matrix.
<a name="ch13b306">
Because its column_one sum is
(p*m11+q*n11)+(p*m21+q*n21)
+(p*m31+q*n31)
=p*(m11+m21+m31)+q*(n11+n21+n31)
=p*(1)+q*(1)
=p+q = 1 ---eqn.BO246
<a name="ch13b307">Index beginIndex this file
In general case.
If we have n doubly stochastic
matrices
{At: t=1,n} ---eqn.BO247
If we have n non-negative
coefficients
{μt: t=1,n, μt≧0} ---eqn.BO248
such that
∑[t=1,n]μi=1 ---eqn.BO249
<a name="ch13b308">
Then the convex combination of
doubly stochastic matrices
B=∑[t=1,n]μtAt ---eqn.BO250
is doubly stochastic matrix.
<a name="ch13b309">
Ths column sum of
B=∑[t=1,n]μtAt ---eqn.BO251
is
colSumj=∑[i=1,n]Bij ---eqn.BO252
=∑[i=1,n]{∑[t=1,n]μtAt}
=∑[t=1,n]μt{∑[i=1,n]aij,t}
//t-th matrix column sum=1
=∑[t=1,n]μt{1} ---eqn.BO253
//from eqn.BO249
= 1
Similarly, row sum equal to one.
<a name="ch13b310">
Given n doubly stochastic
matrices
{At: t=1,n} ---eqn.BO247
then all of their elements are
nonnegative.
<a name="ch13b311">
The coefficient μt be nonnegative
and sum to one. (eqn.BO248)
The termwise summation
μt{∑[i=1,n]aij,t} ---eqn.BO254
must be nonnegative.
<a name="ch13b312">
Conclude that
Convex combination of doubly
stochastic is doubly stochastic.
//doubly stochastic definition
2010-07-08-19-45 stop
<a name="ch13b313">Index beginIndex this file
2010-07-08-21-48 start
■ What is 'Convex combination'?
Fill with enough air, a basketball
is a full shape ball. No where
indent inward. This shape is
called convex.
<a name="ch13b314">
Another key point is interpolation.
Given supporting points a=2, b=10.
such that other points can be
interpolated relative to [a,b].
For example point
c=5=x*2+y*10=(5/8)*2+(3/8)*10 ---eqn.BO255
<a name="ch13b315">
Interpolation coefficients x
and y have the property
x+y=1 ---eqn.BO256
x*2+y*10=5 ---eqn.BO257
if change 5 to any value in [2,10]
x and y are always between [0,1].
<a name="ch13b316">
If point 12 'interpolate' relative
to [a,b]=[2,10] we find x=-1/4 and
y=5/4. Both x,y are out of [0,1]
range. Actually point 12 relative
to [2,10] has extrapolation,
not 'interpolation'.
<a name="ch13b317">
'Convex combination' is
interpolation under a convex
supporting (convex reference,
convex hull etc.)
<a name="ch13b318">
If a basketball leak air with
indentation. A point inside the
ball interpolate relative to
the indented surface, the
interpolation coefficients
may/may not be in [0,1]
<a name="ch13b319">Index beginIndex this file
What is 'Convex combination'?
Speak mathematically
Given a convex set data sequence
(data can be length/mass/time etc)
d1,d2,...,dn ---eqn.BO258
Given a weight sequence
(weights must be pure numbers)
w1,w2,...,wn ---eqn.BO259
<a name="ch13b320">
weight sequence has the following
property
(1) nonnegativity.
wi≧0 for i=1 to i=n ---eqn.BO260
(2) sum to one
w1+w2+...+wn=1 ---eqn.BO261
//if wi<0 that is extrapolation
//if data di<0 and all wi≧0
//that is still interpolation.
<a name="ch13b321">
then
c=∑[i=1,n]widi ---eqn.BO262
is a Convex combination of
d1,d2,...,dn ---eqn.BO258
<a name="ch13b322">
Key point is that
if wi≧0 for i=1 to i=n
be true, then eqn.BO262 is
a Convex combination.
Any one wk<0 or wk>1 that
is NOT Convex combination.
(eqn.BO261 ∑[i=1,n]wi=1
still necessary)<a name="ch13b323">
On the other hand, if given
'Convex combination', then
that is given eqn.BO260 and
eqn.BO261.
<a name="ch13b324">Index beginIndex this file
Above explanation for
What is 'Convex combination'?
may not be clear (even may not
be correct) Please ask a
mathematics expert for better
explanation.
2010-07-08-22-56 stop
2010-07-09-12-03 done first proofread
2010-07-09-15-15 done second proofread
<a name="Copyright">Index beginIndex this file
2009-06-19-10-48
If you are interested in inequality,
suggest you buy the book
The Cauchy-Schwarz Master Class
written by Prof. J. Michael Steele
The Cauchy-Schwarz Master Class is
this web page's textbook.
To buy textbook, that is to show thanks
to Prof. J.M. Steele's great work.
and it is also respect copyright law.
Thank you. Freeman 2009-06-19
The Cauchy-Schwarz Master Class
J. MichaelSteele★★★★★
ISBN 978-0-521-54677-5
2009-06-19-10-56