
Package QltvRespModel

Maximum-likelihood and Bayesian estimation of qualitative response models.

Weighted Boolean Regressions

The abstract class @WgtBoolReg is the base class from which weighted boolean regressions such as logit, probit or any other are inherited, given just the scalar distribution function  F  and the corresponding density function  f . In a weighted regression each row of input data has a distinct weight in the likelihood function. For example, this can be very useful for handling data extracted from a stratified sample.

This class implements maximum-likelihood estimation by means of the package NonLinGloOpt and Bayesian simulation using BysSampler.
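
The package itself is written in TOL, but the structure is easy to picture. Below is a minimal Python sketch of such a base class (the names WgtBoolReg, F, f and fprime are illustrative assumptions, not the package API): subclasses only have to supply the distribution function, the density and its derivative.

{{{#!python
from abc import ABC, abstractmethod
import numpy as np

class WgtBoolReg(ABC):
    """Weighted boolean regression with a pluggable scalar distribution."""

    def __init__(self, X, y, w):
        self.X = np.asarray(X, dtype=float)  # m x n regression input matrix
        self.y = np.asarray(y, dtype=float)  # m outputs, each 0 or 1
        self.w = np.asarray(w, dtype=float)  # m per-row likelihood weights

    @abstractmethod
    def F(self, z):
        """Scalar distribution function, e.g. logistic or standard normal CDF."""

    @abstractmethod
    def f(self, z):
        """Density function, the derivative of F."""

    @abstractmethod
    def fprime(self, z):
        """Derivative of the density, needed for the hessian."""
}}}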

Let

  •  X\in\mathbb{R}^{m\times n} the regression input matrix
  •  w\in\mathbb{R}^{m} the vector of weights of each register
  •  y\in\left\{0,1\right\}^{m} the regression output vector

The hypothesis is that  \forall i=1 \dots m

 y_{i}\sim Bernoulli\left(\pi_{i}\right),\quad\pi_{i}=Pr\left[y_{i}=1\right]=F\left(X_{i}\beta\right)

The likelihood function is then

 lk\left(\beta\right)=\underset{i}{\prod}\pi_{i}^{w_{i}y_{i}}\left(1-\pi_{i}\right)^{w_{i}\left(1-y_{i}\right)}

and its logarithm

 L\left(\beta\right)=\ln\left(lk\left(\beta\right)\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\pi_{i}\right)+\left(1-y_{i}\right)\ln\left(1-\pi_{i}\right)\right)
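
As an illustration (a Python sketch, not package code), the weighted log-likelihood follows directly from this formula; log1p(-pi) is used for  \ln\left(1-\pi_{i}\right)  to reduce rounding error:

{{{#!python
import numpy as np

def log_likelihood(beta, X, y, w, F):
    """L(beta) = sum_i w_i * (y_i*ln(pi_i) + (1-y_i)*ln(1-pi_i))."""
    pi = F(X @ beta)                 # pi_i = F(X_i beta)
    return np.sum(w * (y * np.log(pi) + (1.0 - y) * np.log1p(-pi)))
}}}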

The gradient of the logarithm of the likelihood function is then

 \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{f\left(x_{i}\beta\right)}{F\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{f\left(x_{i}\beta\right)}{1-F\left(x_{i}\beta\right)}\right)x_{ij}
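
A matching sketch of the gradient, under the same assumptions:

{{{#!python
import numpy as np

def gradient(beta, X, y, w, F, f):
    """Gradient of L(beta); returns an n-vector."""
    z = X @ beta
    Fz, fz = F(z), f(z)
    # row factor: w_i * (y_i*f/F - (1-y_i)*f/(1-F))
    g = w * (y * fz / Fz - (1.0 - y) * fz / (1.0 - Fz))
    return X.T @ g
}}}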

and the hessian is

 \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k}{\sum}w_{k}\left(y_{k}\frac{f'\left(x_{k}\beta\right)F\left(x_{k}\beta\right)-f^{2}\left(x_{k}\beta\right)}{F^{2}\left(x_{k}\beta\right)}-\left(1-y_{k}\right)\frac{f'\left(x_{k}\beta\right)\left(1-F\left(x_{k}\beta\right)\right)+f^{2}\left(x_{k}\beta\right)}{\left(1-F\left(x_{k}\beta\right)\right)^{2}}\right)x_{ki}x_{kj}
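
And the corresponding hessian sketch; comparing the gradient above against finite differences of the log-likelihood at a random  \beta  is a cheap way to validate all three:

{{{#!python
import numpy as np

def hessian(beta, X, y, w, F, f, fprime):
    """Hessian of L(beta); returns an n x n matrix."""
    z = X @ beta
    Fz, fz, dfz = F(z), f(z), fprime(z)
    c = w * (y * (dfz * Fz - fz**2) / Fz**2
             - (1.0 - y) * (dfz * (1.0 - Fz) + fz**2) / (1.0 - Fz)**2)
    return X.T @ (c[:, None] * X)    # sum_k c_k x_k x_k^t
}}}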

The user can and should define scalar truncated normal or uniform priors and bounds for every variable about which he or she has robust knowledge.

 \beta_k \sim N\left(\nu_k, \sigma_k \right)

 l_k \le \beta_k \le u_k \wedge l_k < u_k

When  \sigma_k  is infinite or unknown, this expresses a uniform prior. When  l_k = -\infty  or unknown, the variable has no lower bound; when  u_k = +\infty  or unknown, it has no upper bound.
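
A sketch of the log-prior these conventions produce (illustrative only, and written up to the truncation normalizing constant):

{{{#!python
import numpy as np

def log_prior(beta, nu, sigma, lower, upper):
    """Independent truncated-normal/uniform priors with optional bounds.
    An infinite sigma means a uniform prior for that component; infinite
    bounds mean the corresponding side is unconstrained."""
    if np.any(beta < lower) or np.any(beta > upper):
        return -np.inf               # violates a bound
    g = np.isfinite(sigma)           # Gaussian components
    r = (beta[g] - nu[g]) / sigma[g]
    return -0.5 * np.sum(r**2)       # uniform parts contribute a constant
}}}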

It is also possible to supply any set of linear inequality constraints, provided they are compatible with the lower and upper bounds

 A \beta \le a
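
A feasibility test for a candidate  \beta  is then just (again an illustrative sketch, not the package API):

{{{#!python
import numpy as np

def feasible(beta, A, a, lower, upper):
    """True when beta satisfies the bounds and all rows of A @ beta <= a."""
    return bool(np.all(beta >= lower) and np.all(beta <= upper)
                and np.all(A @ beta <= a))
}}}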

Weighted Logit Regression

Class @WgtLogit is a specialization of class @WgtBoolReg that handles weighted logit regressions.

In this case the scalar distribution is the logistic one.

 F\left(z\right) = \frac{1}{1+e^{-z}}

 f\left(z\right) = \frac{e^{-z}}{\left(1+e^{-z}\right)^2}

 f'\left(z\right) = - f\left(z\right) F\left(z\right) {\left(1-e^{-z}\right)}

 L\left(\beta\right)=\underset{i}{\sum}w_{i}\left(y_{i}x_{i}^{t}\beta-\ln\left(1+e^{x_{i}^{t}\beta}\right)\right)

 \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}x_{ij}-x_{ij}\left(\frac{e^{x_{i}^{t}\beta}}{1+e^{x_{i}^{t}\beta}}\right)\right)=\underset{i}{\sum}w_{i}x_{ij}\left(y_{i}-\left(\frac{e^{x_{i}^{t}\beta}}{1+e^{x_{i}^{t}\beta}}\right)\right)

 \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=-\underset{k}{\sum}w_{k}\frac{\left(1+e^{x_{k}^{t}\beta}\right)e^{x_{k}^{t}\beta}x_{ki}x_{kj}-\left(e^{x_{k}^{t}\beta}\right)^{2}x_{ki}x_{kj}}{\left(1+e^{x_{k}^{t}\beta}\right)^{2}}=-\underset{k}{\sum}x_{ki}x_{kj}w_{k}\pi_{k}\left(1-\pi_{k}\right)
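
Continuing the Python sketch from above, a @WgtLogit-like subclass only needs the three logistic functions, and can override the hessian with the simplified closed form; the comment notes that  f'\left(z\right)=f\left(z\right)\left(1-2F\left(z\right)\right)  is the same expression as the one given earlier:

{{{#!python
import numpy as np
from scipy.special import expit      # numerically stable logistic CDF

class WgtLogit(WgtBoolReg):
    def F(self, z):
        return expit(z)

    def f(self, z):
        p = expit(z)
        return p * (1.0 - p)

    def fprime(self, z):
        p = expit(z)
        # f(z)*(1-2F(z)) == -f(z)*F(z)*(1-exp(-z))
        return p * (1.0 - p) * (1.0 - 2.0 * p)

    def hessian(self, beta):
        pi = expit(self.X @ beta)
        c = self.w * pi * (1.0 - pi)
        return -self.X.T @ (c[:, None] * self.X)
}}}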

Weighted Probit Regression

Class @WgtProbit is a specialization of class @WgtBoolReg that handles weighted probit regressions.

In this case the scalar distribution is the standard normal one.

 F\left(z\right) = \Phi\left(z\right)

 f\left(z\right) = \phi\left(z\right)

 f'\left(z\right) = -z \phi\left(z\right)

 L\left(\beta\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\Phi\left(x_{i}\beta\right)\right)+\left(1-y_{i}\right)\ln\left(\Phi\left(-x_{i}\beta\right)\right)\right)

 \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(-x_{i}\beta\right)}\right)x_{ij}

 \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=-\underset{k}{\sum}w_{k}\phi\left(x_{k}\beta\right)\left(y_{k}\frac{x_{k}\beta\,\Phi\left(x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi^{2}\left(x_{k}\beta\right)}+\left(1-y_{k}\right)\frac{-x_{k}\beta\,\Phi\left(-x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi^{2}\left(-x_{k}\beta\right)}\right)x_{ki}x_{kj}
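
Under the same assumptions, a @WgtProbit-like subclass can rely on SciPy's normal distribution; norm.logcdf evaluates  \ln\Phi  directly, which is far more stable in the tails than taking the logarithm of the CDF:

{{{#!python
import numpy as np
from scipy.stats import norm

class WgtProbit(WgtBoolReg):
    def F(self, z):
        return norm.cdf(z)

    def f(self, z):
        return norm.pdf(z)

    def fprime(self, z):
        return -z * norm.pdf(z)

    def log_likelihood(self, beta):
        z = self.X @ beta
        # y_i*ln(Phi(x_i beta)) + (1-y_i)*ln(Phi(-x_i beta)), row-weighted
        return np.sum(self.w * (self.y * norm.logcdf(z)
                                + (1.0 - self.y) * norm.logcdf(-z)))
}}}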