Three devices are monitored until failure. The observed lifetimes are 1.1, 2.2, and 0.4 years. The lifetimes are modeled as exponentially distributed with rate $\lambda$:
$$T_i \sim \text{Exp}(\lambda), \quad f(t \mid \lambda) = \lambda e^{-\lambda t}, \quad t > 0, \ \lambda > 0.$$
Assume an exponential prior on $\lambda$:
$$\lambda \sim \text{Exp}(2), \quad \pi(\lambda) = 2e^{-2\lambda}, \quad \lambda > 0.$$
(a) Find the posterior distribution of $\lambda$.
(b) Find the Bayes estimator of $\lambda$.
(c) Find the MAP estimator of $\lambda$.
(d) Numerically find both the equi-tailed and the highest posterior density (HPD) credible sets for $\lambda$ at the 95% credibility level.
(e) Find the posterior probability of the hypothesis $H_0: \lambda \leq 1/2$.
Solution (a) Posterior distribution of $\lambda$
By assumption, the device lifetimes satisfy $T_i \sim \text{Exp}(\lambda)$, and the prior on $\lambda$ is the exponential distribution $\text{Exp}(2)$.
Likelihood: with observed lifetimes $T_1 = 1.1$, $T_2 = 2.2$, $T_3 = 0.4$, the likelihood is
$$L(\lambda) = \prod_{i=1}^{3} f(T_i \mid \lambda) = \lambda^3 e^{-\lambda (1.1 + 2.2 + 0.4)} = \lambda^3 e^{-3.7\lambda}.$$
Prior: $\lambda \sim \text{Exp}(2)$, so the prior density is
$$\pi(\lambda) = 2e^{-2\lambda}.$$
Posterior: by Bayes' theorem, the posterior is proportional to the likelihood times the prior:
$$\pi(\lambda \mid T_1, T_2, T_3) \propto L(\lambda)\, \pi(\lambda) = \lambda^3 e^{-3.7\lambda} \cdot 2e^{-2\lambda} = 2\lambda^3 e^{-5.7\lambda} \propto \lambda^{4-1} e^{-5.7\lambda}.$$
This is the kernel of a Gamma distribution, so the posterior of $\lambda$ is
$$\lambda \mid T_1, T_2, T_3 \sim \text{Gamma}(4, 5.7).$$
Here $\text{Gamma}(k, \theta)$ denotes the Gamma distribution with shape $k$ and rate $\theta$ (not scale), whose density is
$$f(\lambda \mid k, \theta) = \frac{\theta^k}{\Gamma(k)} \lambda^{k-1} e^{-\theta \lambda}, \quad \lambda > 0.$$
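The conjugate result can be sanity-checked numerically. The sketch below normalizes likelihood times prior by numerical integration and compares it with the Gamma(4, 5.7) density on a grid. The variable names and the grid are illustrative choices, and SciPy parameterizes the Gamma by `scale = 1 / rate`.

```python
import numpy as np
from scipy import integrate, stats

# Data and prior rate from the problem statement.
lifetimes = np.array([1.1, 2.2, 0.4])
prior_rate = 2.0


def unnormalized_posterior(lam):
    """Likelihood times prior, before normalization."""
    likelihood = lam ** len(lifetimes) * np.exp(-lam * lifetimes.sum())
    prior = prior_rate * np.exp(-prior_rate * lam)
    return likelihood * prior


# Normalizing constant by numerical integration over (0, inf).
norm_const, _ = integrate.quad(unnormalized_posterior, 0, np.inf)

# Conjugate result: shape 4, rate 5.7. SciPy uses scale = 1 / rate.
posterior = stats.gamma(a=len(lifetimes) + 1, scale=1 / (lifetimes.sum() + prior_rate))

grid = np.linspace(0.05, 3.0, 60)  # arbitrary evaluation grid
numeric_pdf = unnormalized_posterior(grid) / norm_const
max_abs_diff = np.max(np.abs(numeric_pdf - posterior.pdf(grid)))
print(f"largest pointwise difference: {max_abs_diff:.2e}")
```

If the derivation is right, the printed difference should be at the level of numerical integration error.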
(b) Bayes estimator
Under squared-error loss, the Bayes estimator is the posterior mean. For $\lambda \mid T_1, T_2, T_3 \sim \text{Gamma}(4, 5.7)$, with shape $k = 4$ and rate $\theta = 5.7$, the mean is
$$E[\lambda \mid T_1, T_2, T_3] = \frac{k}{\theta} = \frac{4}{5.7}.$$
So the Bayes estimator is
$$\hat{\lambda}_{\text{Bayes}} = \frac{4}{5.7} \approx 0.7018.$$
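To illustrate why the posterior mean is the Bayes estimator under squared-error loss, the following sketch draws from the Gamma(4, 5.7) posterior and picks the candidate estimate with the smallest Monte Carlo expected squared loss. The sample size, seed, and candidate grid are arbitrary assumptions made for the illustration.

```python
import numpy as np

shape, rate = 4, 5.7  # posterior Gamma(4, 5.7) in the rate parameterization

rng = np.random.default_rng(0)
# NumPy's gamma sampler takes a scale, so pass 1 / rate.
draws = rng.gamma(shape, 1 / rate, size=20_000)

# Monte Carlo estimate of the posterior expected squared loss for each candidate.
candidates = np.linspace(0.3, 1.2, 181)
expected_loss = ((draws[None, :] - candidates[:, None]) ** 2).mean(axis=1)

best_candidate = candidates[np.argmin(expected_loss)]
posterior_mean = shape / rate
print(f"minimizer on grid: {best_candidate:.3f}, posterior mean: {posterior_mean:.4f}")
```

The grid minimizer should land close to $4/5.7$, up to Monte Carlo error and the grid spacing.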
(c) Maximum a posteriori (MAP) estimator
The MAP estimator is the posterior mode, i.e. the value of $\lambda$ that maximizes the posterior density $\pi(\lambda \mid T_1, T_2, T_3)$. For a Gamma distribution with shape $k \geq 1$ and rate $\theta$, the mode is $(k-1)/\theta$, so
$$\hat{\lambda}_{\text{MAP}} = \frac{k - 1}{\theta} = \frac{4 - 1}{5.7} = \frac{3}{5.7} \approx 0.5263.$$
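As a cross-check on the closed-form mode, one can maximize the posterior log-density numerically. This is a minimal sketch using `scipy.optimize.minimize_scalar`; the search bounds are an arbitrary assumption chosen to contain the mode.

```python
from scipy import optimize, stats

shape, rate = 4, 5.7
posterior = stats.gamma(a=shape, scale=1 / rate)  # scale = 1 / rate

# Minimize the negative log-density over a bounded interval containing the mode.
result = optimize.minimize_scalar(
    lambda lam: -posterior.logpdf(lam),
    bounds=(1e-6, 5.0),
    method="bounded",
)
numerical_map = result.x
closed_form_map = (shape - 1) / rate
print(f"numerical MAP: {numerical_map:.4f}, closed form: {closed_form_map:.4f}")
```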
(d) 95% equi-tailed and highest posterior density (HPD) credible intervals
Equi-tailed credible interval: this interval cuts off 2.5% of the posterior probability in each tail. For $\lambda \mid T_1, T_2, T_3 \sim \text{Gamma}(4, 5.7)$, the endpoints $\lambda_L$ and $\lambda_U$ are the 0.025 and 0.975 quantiles of the posterior, obtained by inverting the Gamma cumulative distribution function (CDF), so that
$$P(\lambda_L \leq \lambda \leq \lambda_U \mid T_1, T_2, T_3) = 0.95.$$
The quantiles can be computed numerically in Python or R. In Python, for example, `scipy.stats.gamma.ppf` returns them.
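A minimal sketch of the equi-tailed computation with `scipy.stats.gamma.ppf` is given here. The only assumption beyond the text is the conversion of the rate 5.7 to SciPy's `scale = 1 / 5.7`.

```python
from scipy import stats

shape, rate = 4, 5.7
posterior = stats.gamma(a=shape, scale=1 / rate)  # scale = 1 / rate

level = 0.95
tail = (1 - level) / 2  # probability cut off in each tail

# Quantile function (inverse CDF) evaluated at both tail probabilities.
lower, upper = posterior.ppf([tail, 1 - tail])
coverage = posterior.cdf(upper) - posterior.cdf(lower)
print(f"equi-tailed 95% interval: ({lower:.4f}, {upper:.4f}), coverage {coverage:.4f}")
```

The printed interval should be close to (0.19, 1.54), which matches the chi-square quantiles with 8 degrees of freedom divided by $2 \times 5.7$.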
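HPD credible interval: the HPD set collects the values of $\lambda$ with the highest posterior density, i.e. $\{\lambda : \pi(\lambda \mid T_1, T_2, T_3) \geq c\}$ with $c$ chosen so that the set has posterior probability 0.95. For a unimodal posterior such as this Gamma, it is the shortest interval with 95% posterior probability, and the density takes the same value at its two endpoints. There is no closed form here, so the interval has to be found numerically.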
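One possible numerical approach, sketched below, uses the fact that for a unimodal posterior the HPD interval is the shortest interval with the required coverage: it searches over the probability left in the lower tail and minimizes the interval width. The optimizer, tolerance, and the small offset `eps` are assumptions of this sketch, not part of the original solution.

```python
from scipy import optimize, stats

shape, rate = 4, 5.7
posterior = stats.gamma(a=shape, scale=1 / rate)  # scale = 1 / rate
level = 0.95


def interval_width(lower_tail):
    """Width of the interval that leaves `lower_tail` mass below it and covers `level`."""
    return posterior.ppf(lower_tail + level) - posterior.ppf(lower_tail)


eps = 1e-9  # keeps the search away from the quantiles at 0 and 1
result = optimize.minimize_scalar(
    interval_width,
    bounds=(eps, 1 - level - eps),
    method="bounded",
    options={"xatol": 1e-10},
)
hpd_lower = posterior.ppf(result.x)
hpd_upper = posterior.ppf(result.x + level)
print(f"95% HPD interval: ({hpd_lower:.4f}, {hpd_upper:.4f})")
print(f"density at endpoints: {posterior.pdf(hpd_lower):.4f}, {posterior.pdf(hpd_upper):.4f}")
```

The two printed densities should agree, and the interval should be shorter than the equi-tailed one because the Gamma posterior is right-skewed.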
(e) Posterior probability $P(H_0: \lambda \leq 1/2)$
The posterior probability of $H_0: \lambda \leq 1/2$ is the posterior CDF evaluated at $\lambda = 0.5$:
$$P(\lambda \leq 1/2 \mid T_1, T_2, T_3) = \int_0^{1/2} f(\lambda \mid 4, 5.7)\, d\lambda.$$
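This can also be evaluated numerically, for example with `scipy.stats.gamma.cdf` in Python:
```python
from scipy.stats import gamma

# Posterior is Gamma(shape=4, rate=5.7); SciPy parameterizes by scale = 1 / rate.
P_lambda_leq_half = gamma.cdf(0.5, a=4, scale=1 / 5.7)
print(P_lambda_leq_half)
```
This gives the posterior probability of $H_0$, which is approximately 0.32.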
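Because the posterior shape is the integer 4, the CDF also has a closed form as one minus a Poisson partial sum, $P(\lambda \leq x) = 1 - \sum_{j=0}^{3} e^{-5.7x} (5.7x)^j / j!$. The sketch below evaluates that sum at $x = 0.5$ and compares it with SciPy; the variable names are illustrative.

```python
import math

from scipy import stats

shape, rate, threshold = 4, 5.7, 0.5
mu = rate * threshold  # 2.85

# For integer shape k: P(Gamma(k, rate) <= x) = 1 - sum_{j<k} exp(-mu) mu^j / j!
poisson_sum = sum(math.exp(-mu) * mu**j / math.factorial(j) for j in range(shape))
closed_form = 1 - poisson_sum

scipy_value = stats.gamma.cdf(threshold, a=shape, scale=1 / rate)
print(f"closed form: {closed_form:.4f}, scipy: {scipy_value:.4f}")
```

Both values should agree to floating-point precision.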
Let
$$y_i \mid \theta_i \overset{\text{ind.}}{\sim} \text{Poisson}(\theta_i),$$
$$\theta_i \overset{\text{iid}}{\sim} \text{Gamma}(2, b),$$
for $i = 1, \dots, n$, where $b$ is unknown. Find the empirical Bayes estimator of $\theta_i$, $i = 1, \dots, n$. (Note: if $X \sim \text{Gamma}(a, b)$, then its pdf is
$$p(x) = \frac{b^a}{\Gamma(a)} x^{a-1} e^{-bx}$$
for $x \geq 0$ and $a, b > 0$.)
Solution
Finding the empirical Bayes estimate of $\theta_i$ takes the following steps.
1. Compute the posterior distribution:
Start from the likelihood and the prior.
Likelihood:
$$P(y_i \mid \theta_i) = \frac{\theta_i^{y_i} e^{-\theta_i}}{y_i!}$$
Prior:
$$p(\theta_i) = \frac{b^2}{\Gamma(2)} \theta_i^{2-1} e^{-b\theta_i} = b^2 \theta_i e^{-b\theta_i}$$
Therefore, for fixed $b$, the posterior satisfies
$$p(\theta_i \mid y_i) \propto P(y_i \mid \theta_i)\, p(\theta_i) \propto \theta_i^{y_i} e^{-\theta_i} \cdot \theta_i e^{-b\theta_i} = \theta_i^{y_i+1} e^{-(b+1)\theta_i}.$$
This is the kernel of a Gamma distribution, so the posterior is again Gamma:
$$\theta_i \mid y_i \sim \text{Gamma}(y_i + 2, b + 1)$$
2. Compute the posterior mean:
The posterior mean (the Bayes estimate under squared-error loss) is
$$E[\theta_i \mid y_i] = \frac{y_i + 2}{b + 1}$$
3. Estimate the hyperparameter $b$:
To apply the empirical Bayes method, the unknown hyperparameter $b$ has to be estimated from the data. First compute the marginal likelihood of $y_i$:
$$P(y_i) = \int_0^\infty P(y_i \mid \theta_i)\, p(\theta_i)\, d\theta_i = \frac{b^2 (y_i + 1)!}{(b + 1)^{y_i + 2}\, y_i!} = \frac{b^2 (y_i + 1)}{(b + 1)^{y_i + 2}}$$
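The marginal above can be checked numerically. The sketch below compares the closed form with direct numerical integration and with a negative binomial pmf with 2 successes and success probability $b/(b+1)$, which has the same algebraic form. The value `b = 1.5` is a hypothetical choice made only for illustration.

```python
import math

import numpy as np
from scipy import integrate, stats

b = 1.5  # hypothetical hyperparameter value, for illustration only


def marginal_closed_form(y, b):
    """Closed-form marginal pmf b^2 (y + 1) / (b + 1)^(y + 2)."""
    return b**2 * (y + 1) / (b + 1) ** (y + 2)


def marginal_by_integration(y, b):
    """Integrate Poisson(y | theta) * Gamma(theta | 2, rate b) over theta."""
    def integrand(theta):
        poisson_pmf = theta**y * np.exp(-theta) / math.factorial(y)
        gamma_pdf = b**2 * theta * np.exp(-b * theta)
        return poisson_pmf * gamma_pdf

    value, _ = integrate.quad(integrand, 0, np.inf)
    return value


for y in range(6):
    closed = marginal_closed_form(y, b)
    numeric = marginal_by_integration(y, b)
    negbin = stats.nbinom.pmf(y, 2, b / (b + 1))
    print(y, round(closed, 6), round(numeric, 6), round(negbin, 6))
```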
Since the $y_i$ are marginally independent, the log marginal likelihood of the sample is
$$\log L(b) = 2n \log b - (S + 2n) \log(b + 1) + \sum_{i=1}^n \log(y_i + 1),$$
where $S = \sum_{i=1}^n y_i$.
Differentiating with respect to $b$ and setting the derivative to zero gives
$$\frac{d}{db} \log L(b) = \frac{2n}{b} - \frac{S + 2n}{b + 1} = 0.$$
Multiplying through by $b(b+1)$ gives $2n(b + 1) = b(S + 2n)$, i.e. $2n = bS$. For $S > 0$, the estimate of $b$ is therefore
$$\hat{b} = \frac{2n}{S}$$
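The closed-form maximizer can be compared with a direct numerical maximization of the log marginal likelihood. The counts below are hypothetical data invented for this sketch, and the search bounds are an arbitrary assumption.

```python
import numpy as np
from scipy import optimize

# Hypothetical counts, invented only to exercise the formula.
counts = np.array([0, 3, 1, 4, 2, 0, 5, 1])
n, S = len(counts), counts.sum()


def neg_log_marginal(b):
    """Negative log marginal likelihood of the sample as a function of b."""
    log_lik = 2 * n * np.log(b) - (S + 2 * n) * np.log(b + 1) + np.sum(np.log(counts + 1))
    return -log_lik


result = optimize.minimize_scalar(neg_log_marginal, bounds=(1e-6, 50.0), method="bounded")
b_hat_numeric = result.x
b_hat_closed = 2 * n / S
print(f"numerical b_hat: {b_hat_numeric:.4f}, closed form 2n/S: {b_hat_closed:.4f}")
```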
4. Compute the empirical Bayes estimate:
Substituting $\hat{b}$ into the posterior mean gives the empirical Bayes estimate:
$$\hat{\theta}_i = E[\theta_i \mid y_i, \hat{b}] = \frac{y_i + 2}{\hat{b} + 1} = (y_i + 2) \frac{S}{2n + S}$$
Final answer:
The empirical Bayes estimator is
$$\hat{\theta}_i = (y_i + 2) \cdot \frac{\sum_{j=1}^n y_j}{2n + \sum_{j=1}^n y_j}$$
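The final formula translates directly into a few lines of code. The function name and the example counts below are hypothetical; the sketch also guards the case $S = 0$, where $\hat{b} = 2n/S$ is undefined.

```python
def empirical_bayes_estimates(counts):
    """Return (y_i + 2) * S / (2n + S) for each observed count y_i."""
    n = len(counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("b_hat = 2n / S is undefined when all counts are zero")
    shrink = total / (2 * n + total)  # equals 1 / (b_hat + 1)
    return [(y + 2) * shrink for y in counts]


# Hypothetical counts: n = 8 and S = 16, so the factor S / (2n + S) is 0.5.
counts = [0, 3, 1, 4, 2, 0, 5, 1]
estimates = empirical_bayes_estimates(counts)
print(estimates)  # [1.0, 2.5, 1.5, 3.0, 2.0, 1.0, 3.5, 1.5]
```

In this example the estimates sum to $S = 16$, which follows from $\sum_i (y_i + 2) = S + 2n$.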
Suppose $y \mid \beta \sim \text{Gamma}(\alpha, \beta)$, where $\alpha$ is known.
(a) Find the Jeffreys prior for $\beta$.
(b) Using the Jeffreys prior from part (a), derive the posterior distribution $p(\beta \mid y_1, \dots, y_n)$ for $n$ i.i.d. observations $y_1, \dots, y_n$.
Solution (a) Find the Jeffreys prior:
We are given $y \mid \beta \sim \text{Gamma}(\alpha, \beta)$ with $\alpha$ known.
First, write the likelihood of a single observation:
$$f(y \mid \beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} y^{\alpha - 1} e^{-\beta y}$$
Take the log-likelihood:
$$\ln f(y \mid \beta) = \alpha \ln \beta - \ln \Gamma(\alpha) + (\alpha - 1) \ln y - \beta y$$
First derivative with respect to $\beta$:
$$\frac{\partial}{\partial \beta} \ln f(y \mid \beta) = \frac{\alpha}{\beta} - y$$
Second derivative with respect to $\beta$:
$$\frac{\partial^2}{\partial \beta^2} \ln f(y \mid \beta) = -\frac{\alpha}{\beta^2}$$
The Fisher information is the negative expectation of the second derivative. Since the second derivative does not depend on $y$,
$$I(\beta) = -E\left[ \frac{\partial^2}{\partial \beta^2} \ln f(y \mid \beta) \right] = \frac{\alpha}{\beta^2}$$
Therefore the Jeffreys prior is
$$\pi(\beta) \propto \sqrt{I(\beta)} = \sqrt{\frac{\alpha}{\beta^2}} \propto \frac{1}{\beta}$$
Answer: the Jeffreys prior is $\pi(\beta) \propto \dfrac{1}{\beta}$.
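The Fisher information can also be computed as the variance of the score, $E[(\alpha/\beta - y)^2]$, which gives a numerical check on the derivation. The sketch below evaluates that expectation by numerical integration for a few values of $\beta$ and confirms that $\beta \sqrt{I(\beta)}$ is constant, i.e. that $\sqrt{I(\beta)} \propto 1/\beta$. The value `alpha = 3.0` and the list of $\beta$ values are hypothetical.

```python
import numpy as np
from scipy import integrate, stats

alpha = 3.0  # hypothetical known shape, for illustration only


def fisher_information(beta):
    """E[score^2] under Gamma(alpha, rate beta), with score = alpha / beta - y."""
    def integrand(y):
        score = alpha / beta - y
        return score**2 * stats.gamma.pdf(y, a=alpha, scale=1 / beta)

    value, _ = integrate.quad(integrand, 0, np.inf)
    return value


betas = [0.5, 1.0, 2.0, 4.0]
products = [beta * np.sqrt(fisher_information(beta)) for beta in betas]
print(products)  # each entry should be close to sqrt(alpha)
```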
(b) Derive the posterior distribution $p(\beta \mid y_1, \dots, y_n)$:
With the Jeffreys prior $\pi(\beta) \propto \dfrac{1}{\beta}$ and i.i.d. observations, the likelihood is
$$L(\beta) = \prod_{i=1}^n f(y_i \mid \beta) = \left( \frac{\beta^\alpha}{\Gamma(\alpha)} \right)^n \prod_{i=1}^n y_i^{\alpha - 1} e^{-\beta y_i}$$
The posterior is proportional to the prior times the likelihood. Dropping factors that do not depend on $\beta$:
$$\begin{aligned} p(\beta \mid y_1, \dots, y_n) &\propto \pi(\beta)\, L(\beta) \\ &\propto \frac{1}{\beta} \cdot \beta^{n\alpha} e^{-\beta \sum_{i=1}^n y_i} \\ &= \beta^{n\alpha - 1} e^{-\beta \sum_{i=1}^n y_i} \end{aligned}$$
This is the kernel of a Gamma distribution with shape $n\alpha$ and rate $\sum_{i=1}^n y_i$, so
$$\beta \mid y_1, \dots, y_n \sim \text{Gamma}\left(n\alpha, \sum_{i=1}^n y_i\right)$$
Answer: the posterior distribution is $\beta \mid y_1, \dots, y_n \sim \text{Gamma}\left(n\alpha, \sum_{i=1}^n y_i\right)$.
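As a numerical check of this posterior, the sketch below simulates data, normalizes $\beta^{n\alpha - 1} e^{-\beta \sum y_i}$ by numerical integration, and compares the result with the Gamma density with shape $n\alpha$ and rate $\sum y_i$. The values of `alpha`, `true_beta`, `n`, and the seed are hypothetical choices for the illustration.

```python
import numpy as np
from scipy import integrate, stats

alpha, true_beta, n = 3.0, 2.0, 10  # hypothetical settings for simulated data
rng = np.random.default_rng(1)
y = rng.gamma(alpha, 1 / true_beta, size=n)  # NumPy takes scale = 1 / rate
total = y.sum()

post_shape = n * alpha
mode = (post_shape - 1) / total


def scaled_kernel(beta):
    """beta^(n*alpha - 1) * exp(-beta * sum(y)), divided by its value at the mode."""
    log_kernel = (post_shape - 1) * np.log(beta) - beta * total
    log_at_mode = (post_shape - 1) * np.log(mode) - mode * total
    return np.exp(log_kernel - log_at_mode)


norm_const, _ = integrate.quad(scaled_kernel, 0, np.inf)

posterior = stats.gamma(a=post_shape, scale=1 / total)  # rate = sum(y)
grid = np.linspace(0.5, 4.0, 50)
max_abs_diff = np.max(np.abs(scaled_kernel(grid) / norm_const - posterior.pdf(grid)))
print(f"largest pointwise difference: {max_abs_diff:.2e}")
```

Dividing by the kernel's value at the mode only rescales the integrand for numerical stability; the normalized density is unchanged.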