- Timestamp:
-
Dec 26, 2010, 12:45:33 AM (14 years ago)
- Author:
-
Víctor de Buen Remiro
- Comment:
-
--
v8
|
v9
|
|
32 | 32 | can be expressed as a constrained uniform distribution. |
33 | 33 | |
34 | | Let [[LatexEquation( \beta )]] a uniform random variable in a region |
| 34 | Let [[LatexEquation( x )]] be a uniform random variable in a region |
35 | 35 | [[LatexEquation(\Omega\subset\mathbb{R}^{n} )]] whose likelihood function is [[BR]] |
36 | 36 | |
37 | | [[LatexEquation(lk\left(\beta\right) \propto 1 )]] |
| 37 | [[LatexEquation(lk\left(x\right) \propto 1 )]] |
38 | 38 | |
39 | 39 | Since the logarithm of the likelihood is zero up to a constant, when |
… |
… |
|
51 | 51 | bounds:[[BR]][[BR]] |
52 | 52 | |
53 | | [[LatexEquation( \beta\in\Omega\Longleftrightarrow l_{k}\leq\beta_{i_k}\leq u_{k}\wedge-\infty\leq l_{k}<u_{k}\leq\infty\forall k=1\ldots r )]] |
| 53 | [[LatexEquation( x\in\Omega\Longleftrightarrow l_{k}\leq x_{i_k}\leq u_{k}\wedge-\infty\leq l_{k}<u_{k}\leq\infty\forall k=1\ldots r )]] |
54 | 54 | |
55 | 55 | If both lower and upper bounds are non finite, then we call it the neutral |
… |
… |
|
63 | 63 | A polytope prior is defined by a system of compatible linear inequalities [[BR]] |
64 | 64 | |
65 | | [[LatexEquation( A\beta\leq a\wedge A\in\mathbb{R}^{r\times n}\wedge a\in\mathbb{R}^{r} )]] |
| 65 | [[LatexEquation( Ax\leq a\wedge A\in\mathbb{R}^{r\times n}\wedge a\in\mathbb{R}^{r} )]] |
66 | 66 | |
67 | 67 | || ''bounded region''[[BR]][[Image(source:/tolp/OfficialTolArchiveNetwork/BysPrior/doc/image003.png)]] || ''unbounded region''[[BR]][[Image(source:/tolp/OfficialTolArchiveNetwork/BysPrior/doc/image002.png)]] || |
… |
… |
|
70 | 70 | A special and common case of a polytope region is the one defined by order relations like |
71 | 71 | |
72 | | [[LatexEquation( \beta_{i}}\leq\beta_{j}})]] |
| 72 | [[LatexEquation( x_{i}\leq x_{j} )]] |
73 | 73 | |
74 | 74 | We can implement this type of prior by means of a set of [[LatexEquation( r )]] |
… |
… |
|
78 | 78 | that is equivalent to the full set of linear inequalities. |
79 | 79 | |
80 | | ||If we define [[BR]][[BR]] [[LatexEquation( d\left(\beta\right)=A\beta-a=\left(d_{k}\left(\beta\right)\right)_{k=1\ldots r} )]] then [[br]][[br]] [[LatexEquation( D_{k}\left(\beta\right)=\begin{cases} 0 & \forall d_{k}\left(\beta\right)\leq0\\ d_{k}\left(\beta\right) & \forall d_{k}\left(\beta\right)>0\end{cases} )]] [[br]][[br]] is a continuous function in [[LatexEquation( \mathbb{R}^{n} )]] and [[br]][[br]] [[LatexEquation( D_{k}^{3}\left(\beta\right)=\begin{cases} 0 & \forall d_{k}\left(\beta\right)\leq0\\ d_{k}^{3}\left(\beta\right) & \forall d_{k}\left(\beta\right)>0\end{cases} )]] [[br]][[br]] is continuous and differentiable in [[LatexEquation( \mathbb{R}^{n} )]] [[br]][[br]] [[LatexEquation( \frac{\partial D_{k}^{3}\left(\beta\right)}{\partial\beta_{i}}=\begin{cases} 0 & \forall d_{k}\left(\beta\right)\leq0\\ 3d_{k}^{2}\left(\beta\right)A_{ki} & \forall d_{k}\left(\beta\right)>0\end{cases} )]] || [[Image(source:/tolp/OfficialTolArchiveNetwork/BysPrior/doc/image004.png)]] || |
| 80 | ||If we define [[BR]][[BR]] [[LatexEquation( d\left(x\right)=Ax-a=\left(d_{k}\left(x\right)\right)_{k=1\ldots r} )]] then [[br]][[br]] [[LatexEquation( D_{k}\left(x\right)=\begin{cases} 0 & \forall d_{k}\left(x\right)\leq0\\ d_{k}\left(x\right) & \forall d_{k}\left(x\right)>0\end{cases} )]] [[br]][[br]] is a continuous function in [[LatexEquation( \mathbb{R}^{n} )]] and [[br]][[br]] [[LatexEquation( D_{k}^{3}\left(x\right)=\begin{cases} 0 & \forall d_{k}\left(x\right)\leq0\\ d_{k}^{3}\left(x\right) & \forall d_{k}\left(x\right)>0\end{cases} )]] [[br]][[br]] is continuous and differentiable in [[LatexEquation( \mathbb{R}^{n} )]] [[br]][[br]] [[LatexEquation( \frac{\partial D_{k}^{3}\left(x\right)}{\partial x_{i}}=\begin{cases} 0 & \forall d_{k}\left(x\right)\leq0\\ 3d_{k}^{2}\left(x\right)A_{ki} & \forall d_{k}\left(x\right)>0\end{cases} )]] || [[Image(source:/tolp/OfficialTolArchiveNetwork/BysPrior/doc/image004.png)]] || |
81 | 81 | |
82 | 82 | The feasibility condition can be defined as a single nonlinear |
83 | 83 | inequality continuous and differentiable everywhere |
84 | 84 | |
85 | | [[LatexEquation( g\left(\beta\right)=\underset{k=1}{\overset{r}{\sum}}D_{k}^{3}\left(\beta\right)\leq0 )]] |
| 85 | [[LatexEquation( g\left(x\right)=\underset{k=1}{\overset{r}{\sum}}D_{k}^{3}\left(x\right)\leq0 )]] |
86 | 86 | |
87 | 87 | The gradient of this function is |
88 | 88 | |
89 | | [[LatexEquation( \frac{\partial g\left(\beta\right)}{\partial\beta_{i}}=3\underset{k=1}{\overset{r}{\sum}}D_{k}^{2}\left(\beta\right)A_{ki} )]] |
| 89 | [[LatexEquation( \frac{\partial g\left(x\right)}{\partial x_{i}}=3\underset{k=1}{\overset{r}{\sum}}D_{k}^{2}\left(x\right)A_{ki} )]] |
90 | 90 | |
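The feasibility function and its gradient can be sketched in a few lines. This is only an illustration in Python, not the BysPrior implementation: the function name `penalty_and_gradient` and the example matrices are hypothetical, and `A`, `a` and `x` are plain lists standing for the matrix, the bound vector and the point in the formulas above.

```python
def penalty_and_gradient(A, a, x):
    """Return g(x) = sum_k D_k(x)^3 and its gradient, where
    D_k(x) = max(0, d_k(x)) and d(x) = A x - a."""
    n = len(x)
    g = 0.0
    grad = [0.0] * n
    for row, a_k in zip(A, a):
        # d_k(x): signed violation of the k-th inequality
        d_k = sum(A_ki * x_i for A_ki, x_i in zip(row, x)) - a_k
        D_k = max(0.0, d_k)
        g += D_k ** 3
        for i in range(n):
            # contribution 3 * D_k^2 * A_ki to the i-th partial derivative
            grad[i] += 3.0 * D_k ** 2 * row[i]
    return g, grad


# Hypothetical example: the order relation x_1 <= x_2 written as x_1 - x_2 <= 0.
A = [[1.0, -1.0]]
a = [0.0]
g_in, grad_in = penalty_and_gradient(A, a, [1.0, 2.0])    # feasible point
g_out, grad_out = penalty_and_gradient(A, a, [3.0, 1.0])  # violated by d = 2
```

At the feasible point both the penalty and the gradient vanish; at the infeasible point the gradient points in the direction that increases the violation, so a solver can move against it to recover feasibility.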
91 | 91 | == Random priors == |
… |
… |
|
101 | 101 | distribution |
102 | 102 | |
103 | | [[LatexEquation( \beta\sim N\left(\mu,\Sigma\right) )]] |
| 103 | [[LatexEquation( x\sim N\left(\mu,\Sigma\right) )]] |
104 | 104 | |
105 | 105 | whose likelihood function is |
106 | 106 | |
107 | | [[LatexEquation( lk\left(\beta\right)=\frac{1}{\left(2\pi\right)^{n}\left|\Sigma\right|^{\frac{1}{2}}}e^{^{-\frac{1}{2}\left(\beta-\mu\right)^{T}\Sigma^{-1}\left(\beta-\mu\right)}} )]] |
| 107 | [[LatexEquation( lk\left(x\right)=\frac{1}{\left(2\pi\right)^{\frac{n}{2}}\left|\Sigma\right|^{\frac{1}{2}}}e^{-\frac{1}{2}\left(x-\mu\right)^{T}\Sigma^{-1}\left(x-\mu\right)} )]] |
108 | 108 | |
109 | 109 | The log-likelihood is |
110 | 110 | |
111 | | [[LatexEquation( L\left(\beta\right)=\ln\left(lk\left(\beta\right)\right)=-\frac{n}{2}\ln\left(2\pi\right)-\frac{1}{2}\ln\left(\left|\Sigma\right|\right)-\frac{1}{2}\left(\beta-\mu\right)^{T}\Sigma^{-1}\left(\beta-\mu\right) )]] |
| 111 | [[LatexEquation( L\left(x\right)=\ln\left(lk\left(x\right)\right)=-\frac{n}{2}\ln\left(2\pi\right)-\frac{1}{2}\ln\left(\left|\Sigma\right|\right)-\frac{1}{2}\left(x-\mu\right)^{T}\Sigma^{-1}\left(x-\mu\right) )]] |
112 | 112 | |
113 | 113 | The gradient is |
114 | 114 | |
115 | | [[LatexEquation( \left(\frac{\partial L\left(\beta\right)}{\partial\beta_{i}}\right)_{i=1\ldots n}=-\Sigma^{-1}\left(\beta-\mu\right) )]] |
| 115 | [[LatexEquation( \left(\frac{\partial L\left(x\right)}{\partial x_{i}}\right)_{i=1\ldots n}=-\Sigma^{-1}\left(x-\mu\right) )]] |
116 | 116 | |
117 | 117 | and the Hessian is |
118 | 118 | |
119 | | [[LatexEquation( \left(\frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}\right)_{i,j=1\ldots n}=-\Sigma^{-1} )]] |
| 119 | [[LatexEquation( \left(\frac{\partial^{2}L\left(x\right)}{\partial x_{i}\partial x_{j}}\right)_{i,j=1\ldots n}=-\Sigma^{-1} )]] |
120 | 120 | |
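As a rough illustration of the three formulas for the multinormal prior, the following Python sketch evaluates the log-likelihood, gradient and Hessian. It assumes the caller supplies the precision matrix (the inverse of Sigma) and the log-determinant of Sigma directly, which avoids a matrix inversion; the function name `normal_log_prior` and the numeric values are hypothetical.

```python
import math


def normal_log_prior(x, mu, precision, log_det_sigma):
    """Log-likelihood, gradient and Hessian of N(mu, Sigma).

    precision     : Sigma^{-1} as a list of rows (assumed given)
    log_det_sigma : ln|Sigma| (assumed given)
    """
    n = len(x)
    r = [x_i - m_i for x_i, m_i in zip(x, mu)]
    # Sigma^{-1} (x - mu)
    Pr = [sum(p_ij * r_j for p_ij, r_j in zip(row, r)) for row in precision]
    quad = sum(r_i * Pr_i for r_i, Pr_i in zip(r, Pr))
    L = -0.5 * n * math.log(2.0 * math.pi) - 0.5 * log_det_sigma - 0.5 * quad
    grad = [-v for v in Pr]                         # -Sigma^{-1} (x - mu)
    hess = [[-p for p in row] for row in precision]  # -Sigma^{-1}
    return L, grad, hess


# Hypothetical example: Sigma = diag(4, 1), so Sigma^{-1} = diag(0.25, 1).
mu = [0.0, 0.0]
precision = [[0.25, 0.0], [0.0, 1.0]]
L, grad, hess = normal_log_prior([2.0, 1.0], mu, precision, math.log(4.0))
```

The Hessian does not depend on the evaluation point, so it can be computed once and reused.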
121 | 121 | |
122 | | == Transformed prior == |
| 122 | === Inverse chi-square prior === |
| 123 | |
| 124 | In a model with normal residuals it is permissible to give the unknown variance an |
| 125 | inverse chi-square distribution whose scale parameter is the mean of the |
| 126 | squared residuals and whose degrees of freedom equal the data length. |
| 127 | |
| 128 | The likelihood is now the scalar function |
| 129 | |
| 130 | [[LatexEquation( lk\left(x\right)=\frac{\left(\frac{\nu}{2}\right)^{\frac{\nu}{2}}}{\Gamma\left(\frac{\nu}{2}\right)}x^{-\frac{\nu}{2}-1}e^{-\frac{\nu}{2x}} )]] |
| 131 | |
| 132 | The log-likelihood is |
| 133 | |
| 134 | [[LatexEquation( L\left(x\right)=\frac{\nu}{2}\ln\left(\frac{\nu}{2}\right)-\ln\left(\Gamma\left(\frac{\nu}{2}\right)\right)-\left(\frac{\nu}{2}+1\right)\ln\left(x\right)-\frac{\nu}{2x} )]] |
| 135 | |
| 136 | The first derivative is |
| 137 | |
| 138 | [[LatexEquation( \frac{dL\left(x\right)}{dx}=-\left(\frac{\nu}{2}+1\right)\frac{1}{x}+\frac{\nu}{2x^{2}} )]] |
| 139 | |
| 140 | The second derivative is |
| 141 | |
| 142 | [[LatexEquation( \frac{d^{2}L\left(x\right)}{dx^{2}}=\left(\frac{\nu}{2}+1\right)\frac{1}{x^{2}}-\frac{\nu}{x^{3}} )]] |
| 143 | |
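The derivatives of this prior are easy to get wrong by hand, so a numerical check is useful. The Python sketch below takes the logarithm of the likelihood stated above, which gives the term -(nu/2+1) ln(x), and compares the analytic derivatives with central finite differences. The function names and the values of `nu` and `x0` are hypothetical.

```python
import math


def inv_chi2_log_prior(x, nu):
    """Log-likelihood and first two derivatives of the density
    lk(x) = (nu/2)^(nu/2) / Gamma(nu/2) * x^(-nu/2-1) * exp(-nu/(2x)), x > 0."""
    h = 0.5 * nu
    L = h * math.log(h) - math.lgamma(h) - (h + 1.0) * math.log(x) - h / x
    dL = -(h + 1.0) / x + h / x ** 2
    d2L = (h + 1.0) / x ** 2 - nu / x ** 3
    return L, dL, d2L


def central_difference(f, x, eps=1e-5):
    """Second-order finite-difference approximation of f'(x)."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


nu, x0 = 10.0, 1.5  # hypothetical degrees of freedom and evaluation point
L0, dL0, d2L0 = inv_chi2_log_prior(x0, nu)
num_dL = central_difference(lambda t: inv_chi2_log_prior(t, nu)[0], x0)
num_d2L = central_difference(lambda t: inv_chi2_log_prior(t, nu)[1], x0)
```

Setting the first derivative to zero gives the mode at x = nu / (nu + 2), which offers a second quick sanity check.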
| 144 | |
| 145 | |
| 146 | === Transformed prior === |
123 | 147 | |
124 | 148 | Sometimes we have an information prior that has a simple distribution over a |
… |
… |
|
127 | 151 | as in the case of latent variables in hierarchical models |
128 | 152 | |
129 | | [[LatexEquation( \beta_{i}\sim N\left(\beta_{1},\sigma\right)\forall i=2\ldots n )]] |
| 153 | [[LatexEquation( x_{i}\sim N\left(x_{1},\sigma\right)\forall i=2\ldots n )]] |
130 | 154 | |
131 | 155 | Then we can define a variable transformation like this |
132 | 156 | |
133 | | [[LatexEquation( \gamma \left(\beta\right)=\left(\begin{array}{c} \beta_{2}-\beta_{1}\\ \vdots\\ \beta_{n}-\beta_{1}\end{array}\right)\in\mathbb{R}^{n-1} )]] |
| 157 | [[LatexEquation( \gamma \left(x\right)=\left(\begin{array}{c} x_{2}-x_{1}\\ \vdots\\ x_{n}-x_{1}\end{array}\right)\in\mathbb{R}^{n-1} )]] |
134 | 158 | |
135 | 159 | and define the simple normal prior |
… |
… |
|
140 | 164 | transformed one as |
141 | 165 | |
142 | | [[LatexEquation( L\left(\beta\right)=L^{*}\left(\gamma\left(\beta\right)\right) )]] |
| 166 | [[LatexEquation( L\left(x\right)=L^{*}\left(\gamma\left(x\right)\right) )]] |
143 | 167 | |
144 | 168 | If we know the first and second derivatives of the transformation |
145 | 169 | |
146 | | [[LatexEquation( \frac{\partial\gamma_{k}}{\partial\beta_{i}} )]] |
| 170 | [[LatexEquation( \frac{\partial\gamma_{k}}{\partial x_{i}} )]] |
147 | 171 | |
148 | | [[LatexEquation( \frac{\partial^{2}\gamma_{k}}{\partial\beta_{i}\partial\beta_{j}} )]] |
| 172 | [[LatexEquation( \frac{\partial^{2}\gamma_{k}}{\partial x_{i}\partial x_{j}} )]] |
149 | 173 | |
150 | 174 | then we can calculate the original gradient and Hessian from the gradient |
151 | 175 | and Hessian of the transformed prior as follows |
152 | 176 | |
153 | | [[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{i}}=\underset{k=1}{\overset{K}{\sum}}\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial\gamma_{k}}{\partial\beta_{i}} )]] |
| 177 | [[LatexEquation( \frac{\partial L\left(x\right)}{\partial x_{i}}=\underset{k=1}{\overset{K}{\sum}}\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial\gamma_{k}}{\partial x_{i}} )]] |
154 | 178 | |
155 | | [[LatexEquation( \frac{\partial L^{2}\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k=1}{\overset{K}{\sum}}\left(\frac{\partial^{2}L^{*}\left(\gamma\right)}{\partial\gamma_{k}\partial\beta_{j}}\frac{\partial\gamma_{k}}{\partial\beta_{i}}+\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial^{2}\gamma_{k}}{\partial\beta_{i}\partial\beta_{j}}\right)=\underset{k=1}{\overset{K}{\sum}}\left(\frac{\partial^{2}L^{*}\left(\gamma\right)}{\partial\gamma_{k}\partial\gamma_{k}}\frac{\partial\gamma_{k}}{\partial\beta_{i}}\frac{\partial\gamma_{k}}{\partial\beta_{j}}+\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial^{2}\gamma_{k}}{\partial\beta_{i}\partial\beta_{j}}\right) )]] |
| 179 | [[LatexEquation( \frac{\partial^{2}L\left(x\right)}{\partial x_{i}\partial x_{j}}=\underset{k=1}{\overset{K}{\sum}}\left(\frac{\partial^{2}L^{*}\left(\gamma\right)}{\partial\gamma_{k}\partial x_{j}}\frac{\partial\gamma_{k}}{\partial x_{i}}+\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial^{2}\gamma_{k}}{\partial x_{i}\partial x_{j}}\right)=\underset{k=1}{\overset{K}{\sum}}\left(\underset{l=1}{\overset{K}{\sum}}\frac{\partial^{2}L^{*}\left(\gamma\right)}{\partial\gamma_{k}\partial\gamma_{l}}\frac{\partial\gamma_{k}}{\partial x_{i}}\frac{\partial\gamma_{l}}{\partial x_{j}}+\frac{\partial L^{*}\left(\gamma\right)}{\partial\gamma_{k}}\frac{\partial^{2}\gamma_{k}}{\partial x_{i}\partial x_{j}}\right) )]] |
156 | 180 | |
| 181 | Thus it is possible to define a variety of priors from a pre-existing |
| 182 | prior instance and a transformation supplied together with its first and |
| 183 | second derivatives. |
157 | 184 | |
| 185 | For example, we can define a log-normal prior without explicitly defining |
| 186 | its log-likelihood, gradient and Hessian. |
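The log-normal case can be sketched in the scalar setting, where the chain-rule sums reduce to single products. This Python illustration composes a normal prior on gamma with the transformation gamma(x) = ln(x), following the definition L(x) = L*(gamma(x)) used above, so no Jacobian term is added. All function names are hypothetical, and `sigma` is assumed to be a standard deviation.

```python
import math


def normal_star(gamma, mu, sigma):
    """L*, dL*/dgamma and d2L*/dgamma2 for a scalar normal prior on gamma.
    sigma is assumed to be the standard deviation."""
    z = (gamma - mu) / sigma
    L = -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * z * z
    return L, -z / sigma, -1.0 / sigma ** 2


def log_transform(x):
    """gamma(x) = ln(x) with its first and second derivatives (x > 0)."""
    return math.log(x), 1.0 / x, -1.0 / x ** 2


def transformed_prior(x, mu, sigma):
    """L(x) = L*(gamma(x)) with derivatives obtained by the chain rule."""
    g, dg, d2g = log_transform(x)
    Ls, dLs, d2Ls = normal_star(g, mu, sigma)
    dL = dLs * dg                        # dL*/dgamma * dgamma/dx
    d2L = d2Ls * dg * dg + dLs * d2g     # curvature term + transformation term
    return Ls, dL, d2L


x0, mu, sigma = 2.0, 0.5, 0.8  # hypothetical values
L0, dL0, d2L0 = transformed_prior(x0, mu, sigma)
```

Only `log_transform` is specific to the log-normal case; swapping in another transformation with its two derivatives yields a different prior without touching `normal_star`.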