线性代数|机器学习-P14随机矩阵乘法

作者 : admin 本文共5643个字,预计阅读时间需要15分钟 发布时间: 2024-06-16 共1人阅读

文章目录

  • 1. 概述
  • 2. 样本均值与方差计算
  • 3. 随机抽样AB
  • 4. 拉格朗日乘子法

1. 概述

  • 单个样本均值和方差
  • 重复n次同一实验的总样本和总方差
  • 拉格朗日乘子法求最大概率
  • AB矩阵通过概率采样得到CR矩阵

    A

    S

    S

    T

    B

    =

    C

    R

    ,

    A

    B

    C

    R

    ASS^TB=CR,AB\approx CR

    ASSTB=CR,ABCR

2. 样本均值与方差计算

  • 假设我们有一个实验盒,里面放着a,b两个球,我们一次只能取一个球,取完后放回原地后重复取第二次,单次中只有50%概率取到

    [

    a

    ,

    0

    ]

    [a,0]

    [a,0],另外50%概率取到

    [

    0

    ,

    b

    ]

    [0,b]

    [0,b],计算单次实验期望,方差

    E

    1

    (

    x

    )

    ,

    D

    1

    (

    x

    )

    E_1(x),D_1(x)

    E1(x),D1(x)

    E

    1

    (

    x

    )

    =

    1

    2

    [

    a

    ,

    0

    ]

    +

    1

    2

    [

    0

    ,

    b

    ]

    =

    [

    1

    2

    a

    ,

    1

    2

    b

    ]

    \begin{equation} E_1(x)=\frac{1}{2}[a,0]+\frac{1}{2}[0,b]=[\frac{1}{2}a,\frac{1}{2}b] \end{equation}

    E1(x)=21[a,0]+21[0,b]=[21a,21b]

    D

    1

    (

    x

    )

    =

    1

    2

    [

    [

    a

    ,

    0

    ]

    [

    1

    2

    a

    ,

    1

    2

    b

    ]

    ]

    2

    +

    1

    2

    [

    [

    0

    ,

    b

    ]

    [

    1

    2

    a

    ,

    1

    2

    b

    ]

    ]

    2

    =

    [

    a

    2

    4

    ,

    b

    2

    4

    ]

    \begin{equation} D_1(x)=\frac{1}{2}[[a,0]-[\frac{1}{2}a,\frac{1}{2}b]]^2+\frac{1}{2}[[0,b]-[\frac{1}{2}a,\frac{1}{2}b]]^2=[\frac{a^2}{4},\frac{b^2}{4}] \end{equation}

    D1(x)=21[[a,0][21a,21b]]2+21[[0,b][21a,21b]]2=[4a2,4b2]

  • 那么重复上面两次的实验的期望,方差

    E

    2

    (

    x

    )

    ,

    D

    2

    (

    x

    )

    E_2(x),D_2(x)

    E2(x),D2(x)

    E

    2

    (

    x

    )

    =

    2

    E

    1

    (

    x

    )

    =

    [

    a

    ,

    b

    ]

    ;

    D

    2

    (

    x

    )

    =

    2

    D

    1

    (

    x

    )

    =

    [

    a

    2

    2

    ,

    b

    2

    2

    ]

    \begin{equation} E_2(x)=2E_1(x)=[a,b];D_2(x)=2D_1(x)=[\frac{a^2}{2},\frac{b^2}{2}] \end{equation}

    E2(x)=2E1(x)=[a,b];D2(x)=2D1(x)=[2a2,2b2]

  • 期望与方差的公式,需要记住,后面会用,后面公式前提是单独重复实验

    D

    (

    x

    )

    =

    E

    (

    x

    2

    )

    E

    2

    (

    x

    )

    ,

    D

    n

    (

    x

    )

    =

    n

    D

    (

    x

    )

    ,

    E

    n

    (

    x

    )

    =

    n

    E

    (

    x

    )

    \begin{equation} D(x)=E(x^2)-E^2(x),D_n(x)=nD(x),E_n(x)=nE(x) \end{equation}

    D(x)=E(x2)E2(x),Dn(x)=nD(x),En(x)=nE(x)

3. 随机抽样AB

假设我们有一个1000000长度的矩阵AB,这样我们对AB单独进行计算是不显示的,我们希望通过概率的方式对矩阵AB进行采样形成新的矩阵CR,具体如下:

A

S

=

[

a

1

a

2

a

3

]

[

s

11

0

0

0

0

s

32

]

=

[

s

11

a

1

s

32

a

3

]

\begin{equation} AS=\begin{bmatrix}a_1&a_2&a_3\end{bmatrix}\begin{bmatrix}s_{11}&0\\0&0\\0&s_{32}\end{bmatrix}=\begin{bmatrix}s_{11}a_1&s_{32}a_3\end{bmatrix} \end{equation}

AS=[a1a2a3]

s110000s32

=[s11a1s32a3]

S

T

B

=

[

s

11

0

0

0

0

s

32

]

[

b

1

T

b

2

T

b

3

T

]

=

[

s

11

b

1

T

s

32

b

3

T

]

\begin{equation} S^TB=\begin{bmatrix}s_{11}&0&0\\0&0&s_{32}\end{bmatrix}\begin{bmatrix}b_1^T\\b_2^T\\b_3^T\end{bmatrix}=\begin{bmatrix}s_{11}b_1^T\\s_{32}b_3^T\end{bmatrix} \end{equation}

STB=

s110000s32

b1Tb2Tb3T

=

s11b1Ts32b3T


A

S

S

T

B

=

[

a

1

a

2

a

3

]

[

s

11

0

0

0

0

s

32

]

[

s

11

0

0

0

0

s

32

]

[

b

1

T

b

2

T

b

3

T

]

=

s

11

2

a

1

b

1

T

+

s

32

2

a

3

b

3

T

\begin{equation} ASS^TB=\begin{bmatrix}a_1&a_2&a_3\end{bmatrix}\begin{bmatrix}s_{11}&0\\0&0\\0&s_{32}\end{bmatrix}\begin{bmatrix}s_{11}&0&0\\0&0&s_{32}\end{bmatrix}\begin{bmatrix}b_1^T\\b_2^T\\b_3^T\end{bmatrix}=s_{11}^2a_1b_1^T+s_{32}^2a_3b_3^T \end{equation}

ASSTB=[a1a2a3]

s110000s32

s110000s32

b1Tb2Tb3T

=s112a1b1T+s322a3b3T

  • 我们定义从矩阵A的列向量中随机抽取到第 j 列的概率为

    q

    j

    q_j

    qj,那么重复进行s次单个实验的概率为

    p

    =

    1

    s

    q

    j

    p=\frac{1}{\sqrt{sq_j}}

    p=sqj

    1,这样做的好处是为了使得最后求单次实验的期望值为AB/s

  • 单次实验的期望值

    E

    1

    (

    x

    )

    E_1(x)

    E1(x)

    E

    1

    (

    x

    )

    =

    j

    =

    1

    n

    q

    j

    1

    s

    q

    j

    a

    j

    1

    s

    q

    j

    b

    j

    T

    =

    j

    =

    1

    n

    1

    s

    a

    j

    b

    j

    T

    =

    j

    =

    1

    n

    a

    j

    b

    j

    T

    s

    =

    A

    B

    s

    \begin{equation} E_1(x)=\sum_{j=1}^nq_j\frac{1}{\sqrt{sq_j}}a_j \cdot\frac{1}{\sqrt{sq_j}}b^T_j=\sum_{j=1}^n\frac{1}{s}a_jb_j^T=\frac{\sum_{j=1}^n a_jb_j^T}{s}=\frac{AB}{s} \end{equation}

    E1(x)=j=1nqjsqj

    1ajsqj

    1bjT=j=1ns1ajbjT=sj=1najbjT=sAB

  • s 次重复实验后的期望

    E

    s

    (

    x

    )

    E_s(x)

    Es(x)

    E

    s

    (

    x

    )

    =

    s

    E

    1

    (

    x

    )

    =

    s

    A

    B

    s

    =

    A

    B

    \begin{equation} E_s(x)=sE_1(x)=s\cdot \frac{AB}{s}=AB \end{equation}

    Es(x)=sE1(x)=ssAB=AB
    是不是很完美,这样的话,我们就能够在随机采样中,虽然得到的每个矩阵都不是对的,但是组合起来的矩阵CR的期望值居然是AB

  • 单次实验的方差

    D

    1

    (

    x

    )

    D_1(x)

    D1(x),现在我们已经知道了

    E

    (

    x

    )

    E(x)

    E(x),现在只要知道

    E

    (

    x

    2

    )

    E(x^2)

    E(x2)就好了

    D

    1

    (

    x

    )

    =

    E

    (

    x

    2

    )

    E

    (

    x

    )

    2

    \begin{equation} D_1(x)=E(x^2)-E(x)^2 \end{equation}

    D1(x)=E(x2)E(x)2

  • 我们知道单个列采样表示的秩为1的矩阵如下

    X

    j

    =

    [

    a

    j

    s

    q

    j

    ,

    b

    j

    T

    s

    q

    j

    ]

    X

    j

    2

    =

    [

    a

    j

    2

    s

    q

    j

    ,

    (

    b

    j

    T

    )

    2

    s

    q

    j

    ]

    \begin{equation} X_j=[\frac{a_j}{\sqrt{sq_j}},\frac{b_j^T}{\sqrt{sq_j}}]\rightarrow X_j^2=[\frac{a_j^2}{sq_j},\frac{{(b_j^T)}^2}{sq_j}] \end{equation}

    Xj=[sqj

    aj,sqj

    bjT]Xj2=[sqjaj2,sqj(bjT)2]

  • 那么可得

    E

    (

    x

    2

    )

    E(x^2)

    E(x2)

    E

    (

    x

    2

    )

    =

    j

    =

    1

    n

    q

    j

    a

    j

    2

    s

    q

    j

    (

    b

    j

    T

    )

    2

    s

    q

    j

    \begin{equation} E(x^2)=\sum_{j=1}^nq_j\frac{a_j^2}{sq_j}\cdot\frac{{(b_j^T)}^2}{sq_j} \end{equation}

    E(x2)=j=1nqjsqjaj2sqj(bjT)2

  • 根据

    D

    (

    x

    )

    =

    E

    (

    x

    2

    )

    E

    2

    (

    x

    )

    D(x)=E(x^2)-E^2(x)

    D(x)=E(x2)E2(x)

    D

    (

    x

    )

    =

    j

    =

    1

    n

    q

    j

    a

    j

    2

    (

    b

    j

    T

    )

    2

    s

    2

    q

    j

    2

    A

    B

    F

    2

    s

    2

    \begin{equation} D(x)=\sum_{j=1}^nq_j\frac{a_j^2{(b_j^T)}^2}{s^2q_j^2}-\frac{||AB||_F^2}{s^2} \end{equation}

    D(x)=j=1nqjs2qj2aj2(bjT)2s2∣∣ABF2

  • 那么 s 次重复实验的

    D

    s

    (

    x

    )

    D_s(x)

    Ds(x)方差如下:

    D

    s

    (

    x

    )

    =

    s

    D

    (

    x

    )

    =

    s

    j

    =

    1

    n

    q

    j

    a

    j

    2

    (

    b

    j

    T

    )

    2

    s

    2

    q

    j

    2

    s

    A

    B

    F

    2

    s

    2

    \begin{equation} D_s(x)=sD(x)=s\sum_{j=1}^nq_j\frac{a_j^2{(b_j^T)}^2}{s^2q_j^2}-s\frac{||AB||_F^2}{s^2} \end{equation}

    Ds(x)=sD(x)=sj=1nqjs2qj2aj2(bjT)2ss2∣∣ABF2

    D

    s

    (

    x

    )

    =

    j

    =

    1

    n

    a

    j

    2

    (

    b

    j

    T

    )

    2

    s

    q

    j

    A

    B

    F

    2

    s

    \begin{equation} D_s(x)=\sum_{j=1}^n\frac{a_j^2{(b_j^T)}^2}{sq_j}-\frac{||AB||_F^2}{s} \end{equation}

    Ds(x)=j=1nsqjaj2(bjT)2s∣∣ABF2

  • 那当

    q

    j

    q_j

    qj

    j

    =

    1

    n

    q

    j

    =

    1

    \sum_{j=1}^nq_j=1

    j=1nqj=1的情况下,满足什么条件的时候使得

    D

    s

    (

    x

    )

    D_s(x)

    Ds(x)最小?
    对于约束条件下求最小值,我们一般会用到拉格朗日乘子法!!!

4. 拉格朗日乘子法

根据条件

j

=

1

n

q

j

=

1

\sum_{j=1}^nq_j=1

j=1nqj=1,求

arg

m

i

n

D

s

(

x

)

\arg\limits_{min}D_s(x)

minargDs(x),构建拉格朗日乘子方程如下

F

(

x

,

q

j

,

λ

)

=

j

=

1

n

a

j

2

(

b

j

T

)

2

s

q

j

A

B

F

2

s

+

λ

(

j

=

1

n

q

j

1

)

\begin{equation} F(x,q_j,\lambda)=\sum_{j=1}^n\frac{a_j^2{(b_j^T)}^2}{sq_j}-\frac{||AB||_F^2}{s}+\lambda(\sum_{j=1}^nq_j-1) \end{equation}

F(x,qj,λ)=j=1nsqjaj2(bjT)2s∣∣ABF2+λ(j=1nqj1)

  • 求偏导可得

    F

    (

    x

    ,

    q

    j

    ,

    λ

    )

    q

    j

    =

    a

    j

    2

    (

    b

    j

    T

    )

    2

    s

    q

    j

    2

    +

    λ

    =

    0

    \begin{equation} \frac{\partial F(x,q_j,\lambda)}{\partial q_j}=-\frac{a_j^2{(b_j^T)}^2}{sq_j^2}+\lambda=0 \end{equation}

    qjF(x,qj,λ)=sqj2aj2(bjT)2+λ=0

  • 整理可得:

    a

    j

    2

    (

    b

    j

    T

    )

    2

    s

    q

    j

    2

    =

    λ

    q

    j

    =

    a

    j

    b

    j

    T

    s

    λ

    ,

    j

    =

    1

    n

    q

    j

    =

    1

    \begin{equation} \frac{a_j^2{(b_j^T)}^2}{sq_j^2}=\lambda\rightarrow q_j=\frac{a_jb_j^T}{\sqrt{s\lambda}},\sum_{j=1}^nq_j=1 \end{equation}

    sqj2aj2(bjT)2=λqj=sλ

    ajbjT,j=1nqj=1

  • 代入可得:

    j

    =

    1

    n

    q

    j

    =

    j

    =

    1

    n

    a

    j

    b

    j

    T

    s

    λ

    =

    1

    \begin{equation} \sum_{j=1}^nq_j=\frac{\sum_{j=1}^na_jb_j^T}{\sqrt{s\lambda}}=1 \end{equation}

    j=1nqj=sλ

    j=1najbjT=1

  • 整理可得:

    s

    λ

    =

    j

    =

    1

    n

    a

    j

    b

    j

    T

    =

    A

    B

    \begin{equation} \sqrt{s\lambda}=\sum_{j=1}^na_jb_j^T=AB \end{equation}

    sλ

    =j=1najbjT=AB

  • 代入公式18,可得:

    q

    j

    =

    a

    j

    b

    j

    T

    s

    λ

    =

    a

    j

    b

    j

    T

    j

    =

    1

    n

    a

    j

    b

    j

    T

    \begin{equation} q_j=\frac{a_jb_j^T}{\sqrt{s\lambda}}=\frac{a_jb_j^T}{\sum_{j=1}^na_jb_j^T} \end{equation}

    qj=sλ

    ajbjT=j=1najbjTajbjT

  • 小结:
    当我们的概率按照

    q

    j

    =

    a

    j

    b

    j

    T

    j

    =

    1

    n

    a

    j

    b

    j

    T

    q_j=\frac{a_jb_j^T}{\sum_{j=1}^na_jb_j^T}

    qj=j=1najbjTajbjT,来进行采样,那么我们得到的矩阵CR的期望值为AB,并且方差最小。完结撒花!!!

本站无任何商业行为
个人在线分享 » 线性代数|机器学习-P14随机矩阵乘法
E-->