
[Probability and Statistics] - (4) Multiple random variables, joint distribution, and marginal distribution

말 랑 2022. 3. 30. 15:03

 

 

 

Multiple Random Variables 

 

So far, we have only computed probabilities for a single random variable (RV).

With a pdf (probability density function), we computed the probability that the value of an RV X falls within a given range.

This time, we look at probabilities that involve two random variables.

 

 

 

 

 

 


 

 

 

 

 

 

Joint distribution

A joint distribution is the distribution of two or more random variables (multiple r.v.s).

 

Given two random variables X and Y, their joint distribution is described by probabilities of the form:

$$P(X = x_1, Y = y_1)$$

 

In this case, the sample space is extended to the Cartesian product (that is, the set of ordered pairs) of the sample space on which X is defined and the sample space on which Y is defined.

Because the sample space is extended, the range of the pair (X, Y) is likewise contained in the Cartesian product of the range of X and the range of Y.
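
As a minimal sketch of this idea, the snippet below builds the set of ordered pairs for two small ranges. The lists `range_x` and `range_y` are illustrative values chosen here, not values taken from the text above.

```python
from itertools import product

# Hypothetical ranges of two random variables (illustrative values only).
range_x = [0, 1]          # e.g. an indicator variable
range_y = [0, 1, 2, 3]    # e.g. a count

# The pair (X, Y) takes its values among the ordered pairs of the
# Cartesian product of the two ranges.
joint_range = list(product(range_x, range_y))

print(len(joint_range))   # 2 * 4 = 8 ordered pairs
print(joint_range[:3])    # [(0, 0), (0, 1), (0, 2)]
```

Not every pair in the product has to occur with positive probability; the product only bounds what (X, Y) can take.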

 

 

First, let us look at the CDF for several random variables.

 

 

 

 

 


 

 

 

 

marginal distribution

When we have a joint c.d.f., p.d.f., or p.m.f. of several random variables, the c.d.f., p.d.f., or p.m.f. of each individual variable is called marginal.

To obtain a marginal distribution, account for every possible value of the other variables. What remains is the distribution of the variable of interest.

 

For example, for a joint c.d.f.:

$$F_x(x) = F_{xy}(x, \infty)$$

$$F_y(y) = F_{xy}(\infty, y)$$

 

 

 

 

 


 

 

 

 

 

Joint Cumulative Distribution Function (Joint CDF)

We define the joint CDF as follows.

$$F_{xy}(x, y)\overset{\underset{\mathrm{def}}{}}{=}P(X\leq x, Y \leq y)$$

 

 

이것(cdf)이 μ˜λ―Έν•˜λŠ” 것은 μ•„λž˜ ν‘œμ‹œλœ λΆ€λΆ„μ˜ μ˜μ—­μ— x, y μ’Œν‘œκ°€ λ“€μ–΄μ˜€κ²Œ 될 ν™•λ₯ μž…λ‹ˆλ‹€.

 

 

 

 

 

 

 

Let us look at the properties of the joint CDF.

 

Properties of Joint CDF

 

1) A CDF value is a probability, so the following holds.

$$ 0\leq F_{xy}(x, y)\leq 1 $$

 

 

 

2) The joint CDF is a non-decreasing surface. Therefore,

$$\text{for} \quad  x_1 < x_2, \; y_1 < y_2$$

$$ F_{xy}(x_1, y_1) \leq F_{xy}(x_1, y_2)  \leq F_{xy}(x_2, y_2)$$

$$ F_{xy}(x_1, y_1) \leq F_{xy}(x_2, y_1)  \leq F_{xy}(x_2, y_2)$$

(I say "surface" rather than "function" because "function" may suggest a curve. With one RV the CDF is drawn as a curve, but with two RVs it is drawn as a surface.)

Below is a figure of a joint CDF to aid understanding.

http://www.columbia.edu/~ad3217/joint_pmf_and_pdf/pdf.html

 

 

 

 

 

$$3) \;\;\; \displaystyle \lim_{x\to \infty,\; y\to \infty}F_{xy}(x, y) = P(X\leq \infty , Y\leq \infty) = 1$$

 

 

 

 

 

$$4) \;\;\;\displaystyle \lim_{x\to -\infty}F_{xy}(x, y) = P(X\leq -\infty , Y\leq y) = 0$$

$$\displaystyle \lim_{y\to -\infty}F_{xy}(x, y) = P(X\leq x , Y\leq -\infty) = 0$$

 

 

 

 

$$5) \;\;\;P(x_1 <  X \leq x_2,\; Y\leq y)$$

The expression above corresponds to the area of the region shown below.

That is, it equals:

$$F_{xy}(x_2, y) - F_{xy}(x_1, y)$$

 

 

 

 

 

 

$$6) \;\;\;P(x_1 <  X \leq x_2,\; y_1<Y\leq y_2)$$

The expression above corresponds to the area of the orange region shown below.

(Strictly speaking this is a volume, but it is drawn in the plane to make it easier to follow.)

That is, it equals:

$$F_{xy}(x_2, y_2) - F_{xy}(x_1, y_2)- F_{xy}(x_2, y_1) + F_{xy}(x_1, y_1)$$
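
The rectangle formula in property 6 can be checked numerically. The sketch below assumes, purely for illustration, that X and Y are independent Exponential(1) variables, so that the joint CDF is (1 - e^{-x})(1 - e^{-y}) for x, y > 0. The function names `joint_cdf` and `rect_prob` are hypothetical helpers.

```python
import math

def joint_cdf(x: float, y: float) -> float:
    """Joint CDF of two independent Exponential(1) variables (assumed example)."""
    if x <= 0 or y <= 0:
        return 0.0
    return (1 - math.exp(-x)) * (1 - math.exp(-y))

def rect_prob(x1: float, x2: float, y1: float, y2: float) -> float:
    """P(x1 < X <= x2, y1 < Y <= y2) using only values of the joint CDF."""
    return (joint_cdf(x2, y2) - joint_cdf(x1, y2)
            - joint_cdf(x2, y1) + joint_cdf(x1, y1))

# For this particular CDF the rectangle probability factorizes,
# which gives an independent value to compare against.
expected = (math.exp(-1) - math.exp(-2)) * (math.exp(-0.5) - math.exp(-1.5))
print(rect_prob(1, 2, 0.5, 1.5), expected)
```

The term F(x_1, y_1) is added back because the lower-left corner region is subtracted twice by the two middle terms.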

 

 

 

 

 

 

 

 

 

marginal CDF

Sending the other variable to infinity in the joint CDF (that is, allowing every value it can take) gives the marginal CDF of the remaining variable.

$$F_x(x) = F_{xy}(x, \infty)$$

$$F_y(y) = F_{xy}(\infty, y)$$
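
A small sketch of this limit, again assuming an illustrative joint CDF of two independent Exponential(1) variables (not an example from the original post). Python's `math.inf` plays the role of the infinite argument, since `math.exp(-math.inf)` evaluates to `0.0`.

```python
import math

def joint_cdf(x: float, y: float) -> float:
    """Assumed example: joint CDF of two independent Exponential(1) variables."""
    if x <= 0 or y <= 0:
        return 0.0
    return (1 - math.exp(-x)) * (1 - math.exp(-y))

def marginal_cdf_x(x: float) -> float:
    """F_x(x) = F_xy(x, infinity)."""
    return joint_cdf(x, math.inf)

def marginal_cdf_y(y: float) -> float:
    """F_y(y) = F_xy(infinity, y)."""
    return joint_cdf(math.inf, y)

print(marginal_cdf_x(1.0))            # equals 1 - exp(-1) for this example
print(joint_cdf(math.inf, math.inf))  # 1.0, matching property 3
print(joint_cdf(-math.inf, 1.0))      # 0.0, matching property 4
```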

 

 

 

 

 

 

 


 

 

 

 

 

 

Joint Probability Mass Function (Joint p.m.f ,   p.f)

 

The probability distribution of two discrete random variables, the joint probability mass function, is defined as follows. It is written f(x, y) or P_{xy}(x, y) below.

$$\forall (x, y) \in \mathbb{R}^{2}, \;\;\; f(x,y) = P_{xy}(x,y) = P(X=x \;\text{and}\; Y = y)$$

 

 

https://www.researchgate.net/figure/A-joint-probability-mass-function-assigning-probabilities-to-vector-observations-x_fig3_239840414

 

 

You might think it is enough to compute P(X = x) and P(Y = y) and multiply them, but that holds only when X and Y are independent. It does not hold in general.

 

 

 

 

 

Properties of Joint PMF

 

1) If (X, Y) cannot take the ordered pair (x, y), the corresponding probability is 0.

$$\text{if } (X, Y) \text{ cannot take the ordered pair } (x,y), \text{ then } f(x, y) = 0$$

 

2) The probabilities of all ordered pairs (x, y) sum to 1.

$$\sum_{x}^{}\sum_{y}^{} P_{xy}(x,y)=1$$

 

3) Each value is a probability, so it lies between 0 and 1.

$$0\leq P_{xy}(x, y)\leq 1$$

 

4) The cumulative distribution function is obtained as follows.

$$F_{xy}(x,y) = P(X \leq x, Y \leq y) = \sum_{u\leq x}\sum_{v\leq y}P_{xy}(u,v)$$
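
The double sum in property 4 can be sketched in a few lines. The table `pmf` below is a made-up joint PMF on {0, 1} x {0, 1} used only for illustration, and `joint_cdf` is a hypothetical helper name.

```python
from fractions import Fraction as F

# Hypothetical joint PMF (illustrative values that sum to 1).
pmf = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(3, 8)}

def joint_cdf(x, y):
    """Sum the PMF over all pairs (u, v) with u <= x and v <= y."""
    return sum((p for (u, v), p in pmf.items() if u <= x and v <= y), F(0))

print(sum(pmf.values()))  # 1    (property 2)
print(joint_cdf(0, 1))    # 1/2  = 1/4 + 1/4
print(joint_cdf(1, 0))    # 3/8  = 1/4 + 1/8
print(joint_cdf(1, 1))    # 1
```

Exact fractions are used so the sums can be compared without floating-point rounding.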

 

 

 

 

 

marginal PMF

$$P_x(x)=\sum_{y}^{}P_{xy}(x,y)$$

$$P_y(y)=\sum_{x}^{}P_{xy}(x,y)$$
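
As a sketch, the two sums above amount to adding up a joint table along its rows and its columns. The table `pmf` is a made-up example and `marginals` is a hypothetical helper name.

```python
from collections import defaultdict
from fractions import Fraction as F

# Hypothetical joint PMF (illustrative values that sum to 1).
pmf = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(3, 8)}

def marginals(joint):
    """Return (P_x, P_y) by summing the joint PMF over the other variable."""
    px, py = defaultdict(F), defaultdict(F)   # missing keys start at Fraction(0)
    for (x, y), p in joint.items():
        px[x] += p   # sum over y for fixed x
        py[y] += p   # sum over x for fixed y
    return dict(px), dict(py)

px, py = marginals(pmf)
print(px[0], px[1])  # 1/2 1/2
print(py[0], py[1])  # 3/8 5/8
```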

 

 

 

 

 

The independent case

$$P_{xy}(x, y)=P_x(x)  P_y(y)$$
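
The product condition has to hold for every pair (x, y), including pairs with joint probability 0. A minimal sketch of such a check follows; the function `is_independent` and both tables are illustrative assumptions.

```python
from fractions import Fraction as F

def is_independent(joint):
    """True if P_xy(x, y) == P_x(x) * P_y(y) for every pair of values."""
    xs = {x for x, _ in joint}
    ys = {y for _, y in joint}
    px = {x: sum(joint.get((x, y), 0) for y in ys) for x in xs}
    py = {y: sum(joint.get((x, y), 0) for x in xs) for y in ys}
    return all(joint.get((x, y), 0) == px[x] * py[y] for x in xs for y in ys)

# Two fair coins tossed separately: every cell is 1/2 * 1/2.
fair = {(x, y): F(1, 4) for x in (0, 1) for y in (0, 1)}

# A made-up dependent table: P(0, 0) = 1/4 but P_x(0) * P_y(0) = 1/2 * 3/8.
dependent = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 0): F(1, 8), (1, 1): F(3, 8)}

print(is_independent(fair))       # True
print(is_independent(dependent))  # False
```

A single failing pair is enough to conclude that the variables are not independent.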

 

 

 

 

 

 

 


 

 

 

 

 

 

Joint Probability Density Function (Joint p.d.f)

 

X and Y are said to have a continuous joint distribution if there is a nonnegative function f such that, for every region C in the plane,

$$P((X,Y) \in C) = \iint_{C} f(x, y)\, dx\, dy, \quad C \subset \mathbb{R}^{2}$$

The function f is then called the joint probability density function (joint p.d.f.).

 

 

 

 

 

 

 

 

 

Recall that a PDF is obtained by differentiating the CDF. Therefore,

$$f_{xy}(x,y) \overset{\underset{\mathrm{def}}{}}{=} \frac{\partial^{2} F_{xy}(x,y)}{\partial x \partial y }$$

 

 

 

Conversely, the CDF is obtained by integrating the PDF, so

$$F_{xy}(x, y) = \int_{-\infty}^{y} \int_{-\infty}^{x} f_{xy}(u ,v)dudv$$
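
As a worked example of both directions, assume (for illustration only) the density of two independent Exponential(1) variables, $f_{xy}(u, v) = e^{-(u+v)}$ for $u, v \geq 0$ and 0 elsewhere. Integrating gives, for $x, y \geq 0$,

$$F_{xy}(x, y) = \int_{0}^{y} \int_{0}^{x} e^{-(u+v)}\,du\,dv = (1 - e^{-x})(1 - e^{-y})$$

and differentiating once in each variable recovers the density:

$$\frac{\partial^{2} F_{xy}(x,y)}{\partial x \partial y } = e^{-x} e^{-y} = f_{xy}(x, y)$$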

 

 

 

 

 

 

 

Properties of Joint PDF

1) The probability of any single point in the xy-plane is 0.

 

2) The probability that (X, Y) lies on a single line or curve, rather than in a region with positive area, is 0.

$$\text{when} \;\; C = \left\{ (x, y)  \mid y = g(x) \right\} \;\; \text{or} \;\;  C = \left\{ (x, y)  \mid x = g(y)\right\}$$

$$\text{then} \;\; \iint_{C} f(x, y)\,dx\,dy = 0$$

 

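3) The density is nonnegative. Unlike a probability, a density value is not bounded above by 1. For example, the uniform density on the square [0, 1/2] x [0, 1/2] equals 4 on that square.

$$f(x, y ) \geq 0$$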

 

4) Integrating the density over all possible values gives 1.

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f_{xy}(x, y)dxdy = 1 = F_{xy}(\infty, \infty)$$

 

5) The probability that (X, Y) lies in a given rectangle is obtained as follows.

$$P(x_1 < X\leq x_2, y_1< Y\leq y_2) = \int_{y_1}^{y_2}\int_{x_1}^{x_2}f_{xy}(x,y)dxdy$$

$$= F_{xy}(x_2, y_2)-F_{xy}(x_1, y_2) - F_{xy}(x_2, y_1) + F_{xy}(x_1, y_1)$$

Here the double integral computes a volume: the volume under the density surface over the rectangle.
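
To make the "volume" reading concrete, the sketch below approximates the double integral with a midpoint Riemann sum. The density (two independent Exponential(1) variables) and the helper name `rect_prob_numeric` are assumptions made for this illustration.

```python
import math

def density(x: float, y: float) -> float:
    """Assumed example density: exp(-(x + y)) on x, y >= 0, zero elsewhere."""
    return math.exp(-(x + y)) if x >= 0 and y >= 0 else 0.0

def rect_prob_numeric(x1, x2, y1, y2, n=200):
    """Midpoint-rule approximation of the double integral over a rectangle."""
    hx, hy = (x2 - x1) / n, (y2 - y1) / n
    total = 0.0
    for i in range(n):
        x = x1 + (i + 0.5) * hx          # midpoint of the i-th x-cell
        for j in range(n):
            y = y1 + (j + 0.5) * hy      # midpoint of the j-th y-cell
            total += density(x, y)       # height of the column over the cell
    return total * hx * hy               # sum of (height * base area)

# Closed form for this density, used only to check the approximation.
exact = (math.exp(-1) - math.exp(-2)) * (math.exp(-0.5) - math.exp(-1.5))
approx = rect_prob_numeric(1, 2, 0.5, 1.5)
print(approx, exact)
```

Each term of the sum is the volume of a thin column, so refining the grid brings the total closer to the exact probability.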

 

 

 

 

 

 

 

marginal PDF

$$f_x(x) = \int_{-\infty}^{\infty} f_{xy}(x, y) dy$$

$$f_y(y) = \int_{-\infty}^{\infty}f_{xy}(x,y)dx$$
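
As a worked example, assume (for illustration only) the joint density $f_{xy}(x, y) = x + y$ on $0 \leq x \leq 1,\ 0 \leq y \leq 1$ and 0 elsewhere. Integrating out the other variable gives

$$f_x(x) = \int_{0}^{1} (x + y)\, dy = x + \frac{1}{2}, \qquad f_y(y) = \int_{0}^{1} (x + y)\, dx = y + \frac{1}{2}$$

and each marginal integrates to 1 over $[0, 1]$:

$$\int_{0}^{1} \left(x + \frac{1}{2}\right) dx = \frac{1}{2} + \frac{1}{2} = 1$$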

 

 

 

 

 

 

 

The independent case

 

$$f_{xy}(x, y) =f_x(x)f_y(y)$$

 

 

Also, when the ranges of x and y form a rectangle, which may be unbounded (for example x >= 0, y >= 0), and the joint density factors as

$$f(x, y)  = g(x)\cdot h(y)$$

where g(x) depends only on x and h(y) depends only on y, then X and Y are independent.
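
As a worked example of such a factorization, assume (for illustration only) the density $f(x, y) = 6e^{-2x-3y}$ for $x \geq 0,\ y \geq 0$. It splits into a factor in x and a factor in y:

$$f(x, y) = \left(2e^{-2x}\right)\cdot\left(3e^{-3y}\right)$$

The marginals are exactly these factors,

$$f_x(x) = \int_{0}^{\infty} 6e^{-2x-3y}\, dy = 2e^{-2x}, \qquad f_y(y) = \int_{0}^{\infty} 6e^{-2x-3y}\, dx = 3e^{-3y}$$

so $f(x, y) = f_x(x) f_y(y)$ and X and Y are independent.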

 

 

 

 

 

 

 


 

 

Examples

Suppose we toss one coin three times.

 

X: 1 if the first toss is H, 0 if it is T

Y: the total number of H

The marginal probabilities are as follows.

$$P_x(0) = \frac{1}{2}, \; P_x(1) = \frac{1}{2}, \;\;P_y(0) = \frac{1}{8}, \; P_y(1) = \frac{3}{8}, \;\;P_y(2) = \frac{3}{8}, \; P_y(3) = \frac{1}{8}$$

 

 

Joint probabilities P(x, y):

| X \ Y | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | P(0, 0) = 1/8 | P(0, 1) = 2/8 | P(0, 2) = 1/8 | 0 |
| 1 | 0 | P(1, 1) = 1/8 | P(1, 2) = 2/8 | P(1, 3) = 1/8 |

 

 

 

$$\sum_{x=0}^{1}P_x(x) = 1$$

$$\sum_{y=0}^{3}P_y(y) = 1$$

$$\sum_{y=0}^{3}\sum_{x=0}^{1}P_{xy}(x,y) = 1$$

 

 

Next, let us compute marginal probabilities from the joint probabilities.

$$P_x(0) = \sum_{y=0}^{3}P_{xy}(0, y ) = \frac{1}{2}$$

$$P_y(2) = \sum_{x=0}^{1}P_{xy}(x, 2 ) = \frac{3}{8}$$

 

Now let us check whether X and Y are independent.

$$P_{xy}(0, 2) = \frac{1}{8}  $$

$$P_x(0)P_y(2) =\frac{1}{2} \cdot \frac{3}{8}  = \frac{3}{16} \neq P_{xy}(0, 2) $$

Therefore X and Y are not independent.
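
The coin example can be reproduced by enumerating the eight equally likely toss sequences. This is a sketch under the assumption of a fair coin; the variable names are chosen here for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 sequences of three tosses, each with probability 1/8 (fair coin).
outcomes = list(product("HT", repeat=3))
p_each = Fraction(1, len(outcomes))

# Joint PMF of X (first toss is H -> 1, else 0) and Y (number of heads).
joint = Counter()
for seq in outcomes:
    x = 1 if seq[0] == "H" else 0
    y = seq.count("H")
    joint[(x, y)] += p_each

# Marginals: sum the joint PMF over the other variable.
px = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in joint.items() if b == y) for y in range(4)}

print(joint[(0, 2)])    # 1/8
print(px[0], py[2])     # 1/2 3/8
print(px[0] * py[2])    # 3/16, which differs from 1/8
```

Because P(0, 2) differs from P_x(0) P_y(2), the enumeration agrees with the conclusion that X and Y are not independent.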

 

 

 

 


 

 

Example 2

Consider

$$P_{xy}(x,y)= k(2x+y)$$

for

x = 1,2

y = 1,2

 

Let us find k, the marginal probabilities, and whether X and Y are independent.

 

$$\sum_{x=1}^{2}\sum_{y=1}^{2}k(2x+y) = 1$$

Summing over the four pairs (1, 1), (1, 2), (2, 1), (2, 2) gives

k(3 + 4 + 5 + 6) = 1, and therefore

$$ k = \frac{1}{18}$$

 

 

 

The marginal probabilities are as follows.

$$P_x(x) = \sum_{y=1}^{2}\frac{1}{18}(2x + y) = \frac{1}{18}(4x+3)$$

$$P_y(y) = \sum_{x=1}^{2}\frac{1}{18}(2x + y) = \frac{1}{18}(2y+6)$$

 

 

 

 

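The independence check is as follows.

$$\frac{1}{18}(2x+y) \neq \frac{1}{18}(4x+3) \cdot \frac{1}{18}(2y+6)$$

For instance, at (x, y) = (1, 1) the left side is 3/18 = 1/6, while the right side is (7/18)(8/18) = 14/81. Therefore X and Y are not independent.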
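
The same steps can be checked with exact fractions. This is a sketch; the names `support`, `pmf`, `px`, and `py` are chosen here for illustration.

```python
from fractions import Fraction

support = [(x, y) for x in (1, 2) for y in (1, 2)]

# k is fixed by the requirement that the probabilities sum to 1.
k = Fraction(1, sum(2 * x + y for x, y in support))
pmf = {(x, y): k * (2 * x + y) for x, y in support}

# Marginals, compared below with the closed forms (4x + 3)/18 and (2y + 6)/18.
px = {x: sum(pmf[(x, y)] for y in (1, 2)) for x in (1, 2)}
py = {y: sum(pmf[(x, y)] for x in (1, 2)) for y in (1, 2)}

independent = all(pmf[(x, y)] == px[x] * py[y] for x, y in support)

print(k)                           # 1/18
print(px[1], px[2])                # 7/18 11/18
print(py[1], py[2])                # 4/9 5/9
print(pmf[(1, 1)], px[1] * py[1])  # 1/6 14/81
print(independent)                 # False
```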

 

 

 
