[[PageOutline]]

= Package QltvRespModel =

Maximum-likelihood and Bayesian estimation of qualitative response models as a special case of [wiki:OfficialTolArchiveNetworkGrzLinModel generalized linear models].

== Weighted Boolean Regressions ==

The abstract class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/GrzLinModel/WgtReg.tol GrzLinModel::@WgtReg] and is the base from which weighted boolean regressions such as logit, probit or any other are inherited, given just the scalar distribution function [[LatexEquation( F )]] and the corresponding density function [[LatexEquation( f )]].

In a weighted regression each row of input data has its own weight in the likelihood function. This is very useful, for example, to handle data extracted from a stratified sample.

Let
 * [[LatexEquation( X\in\mathbb{R}^{m\times n} )]] be the regression input matrix
 * [[LatexEquation( w\in\mathbb{R}^{m} )]] be the vector of weights of each register
 * [[LatexEquation( y\in\mathbb{R}^{m} )]] be the regression output vector

The hypothesis is that [[LatexEquation( \forall i=1 \dots m )]] [[BR]] [[BR]]
[[LatexEquation( y_{i}\sim Bernoulli\left(\pi_{i}\right) )]] [[BR]] [[BR]]
[[LatexEquation( \pi_{i}=Pr\left[y_{i}=1\right] = F\left(X_{i}\beta\right) )]]

The likelihood function is then [[BR]]
[[LatexEquation( lk\left(\beta\right)=\underset{i}{\prod}\pi_{i}^{w_{i}y_{i}}\left(1-\pi_{i}\right)^{w_{i}\left(1-y_{i}\right)} )]] [[BR]] [[BR]]
and its logarithm is [[BR]]
[[LatexEquation( L\left(\beta\right)=\ln\left(lk\left(\beta\right)\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\pi_{i}\right)+\left(1-y_{i}\right)\ln\left(1-\pi_{i}\right)\right) )]] [[BR]] [[BR]]
Since [[LatexEquation( \frac{\partial\pi_{i}}{\partial\beta_{j}}=f\left(x_{i}\beta\right)x_{ij} )]], the gradient of the logarithm of the likelihood function is [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{f\left(x_{i}\beta\right)}{F\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{f\left(x_{i}\beta\right)}{1-F\left(x_{i}\beta\right)}\right)x_{ij} )]] [[BR]] [[BR]]
and the hessian is [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k}{\sum}w_{k}\left(y_{k}\frac{f'\left(x_{k}\beta\right)F\left(x_{k}\beta\right)-f^{2}\left(x_{k}\beta\right)}{F^{2}\left(x_{k}\beta\right)}-\left(1-y_{k}\right)\frac{f'\left(x_{k}\beta\right)\left(1-F\left(x_{k}\beta\right)\right)+f^{2}\left(x_{k}\beta\right)}{\left(1-F\left(x_{k}\beta\right)\right)^{2}}\right)x_{ki}x_{kj} )]] [[BR]] [[BR]]

The user can and should define a scalar truncated normal or uniform prior and bounds for every variable about which he or she has robust knowledge. [[BR]] [[BR]]
[[LatexEquation( \beta_k \sim N\left(\nu_k, \sigma_k \right) )]] [[BR]] [[BR]]
[[LatexEquation( l_k \le \beta_k \le u_k \wedge l_k < u_k )]] [[BR]] [[BR]]
When [[LatexEquation( \sigma_k )]] is infinite or unknown, a uniform prior is expressed. When [[LatexEquation( l_k = -\infty )]] or unknown, the variable has no lower bound. When [[LatexEquation( u_k = +\infty )]] or unknown, the variable has no upper bound.

It is also allowed to give any set of constraining linear inequalities, provided they are compatible with the lower and upper bounds [[BR]] [[BR]]
[[LatexEquation( A \beta \le a )]] [[BR]] [[BR]]

This class implements maximum-likelihood estimation by means of package [wiki:OfficialTolArchiveNetworkNonLinGloOpt NonLinGloOpt] and Bayesian simulation using [wiki:OfficialTolArchiveNetworkBysSampler BysSampler].
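As a reference for the generic formulas above, the following is a minimal illustrative sketch in Python/NumPy (not TOL) of the weighted log-likelihood and its gradient for an arbitrary pair [[LatexEquation( F )]], [[LatexEquation( f )]] passed as callables; all function and variable names are hypothetical.

{{{
#!python
import numpy as np

def weighted_bool_loglik(beta, X, y, w, F):
    """L(beta) = sum_i w_i*( y_i*ln(pi_i) + (1-y_i)*ln(1-pi_i) ), with pi_i = F(x_i beta)."""
    pi = F(X @ beta)
    return np.sum(w * (y * np.log(pi) + (1.0 - y) * np.log(1.0 - pi)))

def weighted_bool_gradient(beta, X, y, w, F, f):
    """dL/dbeta_j = sum_i w_i*( y_i*f/F - (1-y_i)*f/(1-F) )*x_ij."""
    z = X @ beta
    r = w * (y * f(z) / F(z) - (1.0 - y) * f(z) / (1.0 - F(z)))
    return X.T @ r

# Example: the logistic pair (F, f) corresponding to the logit case described below
F_logistic = lambda z: 1.0 / (1.0 + np.exp(-z))
f_logistic = lambda z: np.exp(-z) / (1.0 + np.exp(-z)) ** 2
}}}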
The only mandatory members are the matrices of output and input of the regression:

{{{
#!cpp
//Output vector of 0 or 1 values (mx1)
VMatrix y;
//Input matrix (mxn)
VMatrix X;
}}}

You can also specify these other members:

{{{
#!cpp
//Weights vector (mx1), default values are 1
VMatrix w=Rand(0,0,0,0);
//Name of the output
Text output.name = "";
//Names of the input variables
Set input.name = Copy(Empty);
//Set of GrzLinModel::@PsbTrnNrmUnfSclDst
Set prior = Copy(Empty);
//Constraining inequalities A*beta<=a
//Constraining coefficient matrix
VMatrix A=Rand(0,0,0,0);
//Constraining border vector
VMatrix a=Rand(0,0,0,0);
}}}

=== Weighted Logit Regression ===

Class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtLogit.tol @WgtLogit] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] that handles weighted logit regressions. See [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/test/test_0003/test.tol test_0003] for an example of using this class. In this case the scalar distribution is the logistic one.

''Scalar cumulant'': [[BR]]
[[LatexEquation( F\left(z\right) = \frac{1}{1+e^{-z}} )]] [[BR]] [[BR]]
''Scalar density'': [[BR]]
[[LatexEquation( f\left(z\right) = \frac{e^{-z}}{\left(1+e^{-z}\right)^2} )]] [[BR]] [[BR]]
''Scalar density derivative'': [[BR]]
[[LatexEquation( f'\left(z\right) = - f\left(z\right) F\left(z\right) \left(1-e^{-z}\right) )]] [[BR]] [[BR]]
''Logarithm of likelihood'': [[BR]]
[[LatexEquation( L\left(\beta\right)=-\underset{i}{\sum}w_{i}\left(\ln\left(1+e^{-x_{i}^{t}\beta}\right)+\left(1-y_{i}\right)x_{i}^{t}\beta\right) )]] [[BR]] [[BR]]
''Gradient'': [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(\frac{e^{-x_{i}^{t}\beta}}{1+e^{-x_{i}^{t}\beta}}-\left(1-y_{i}\right)\right)x_{ij}=\underset{i}{\sum}w_{i}\left(\left(1-\pi_{i}\right)-\left(1-y_{i}\right)\right)x_{ij}=\underset{i}{\sum}w_{i}\left(y_{i}-\pi_{i}\right)x_{ij} )]] [[BR]] [[BR]]
''Hessian'': [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k}{\sum}w_{k}\frac{-e^{-x_{k}^{t}\beta}}{\left(1+e^{-x_{k}^{t}\beta}\right)^{2}}x_{ki}x_{kj}=-\underset{k}{\sum}x_{ki}x_{kj}w_{k}\pi_{k}\left(1-\pi_{k}\right) )]] [[BR]] [[BR]]

From the standpoint of finite-precision floating-point arithmetic, the numerical calculation must take into account that in double precision [[BR]]
[[LatexEquation( e^{710}=\infty )]] [[BR]] [[BR]]
For this reason the exponential expressions must be carefully contained. In this case the following asymptotic equalities are used: [[BR]]
[[LatexEquation( \ln\left(1+e^{-z}\right)\;\overset{z\rightarrow-\infty}{\longrightarrow}\;-z )]] [[BR]] [[BR]]
[[LatexEquation( \frac{e^{-z}}{1+e^{-z}}\;\overset{z\rightarrow-\infty}{\longrightarrow}\;1 )]] [[BR]] [[BR]]
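A minimal sketch in Python/SciPy (again outside TOL) of how the logit log-likelihood, gradient and hessian above can be evaluated without overflow: `np.logaddexp(0, -z)` computes [[LatexEquation( \ln\left(1+e^{-z}\right) )]] and reproduces the asymptotic value [[LatexEquation( -z )]] for large negative [[LatexEquation( z )]], while `scipy.special.expit` gives [[LatexEquation( \pi )]] without forming [[LatexEquation( e^{-z} )]] explicitly. All names are hypothetical.

{{{
#!python
import numpy as np
from scipy.special import expit   # expit(z) = 1/(1+exp(-z)), overflow-safe

def wgt_logit_loglik(beta, X, y, w):
    """L(beta) = -sum_i w_i*( ln(1+exp(-x_i beta)) + (1-y_i)*x_i beta )."""
    z = X @ beta
    # logaddexp(0,-z) = ln(1+exp(-z)); tends to -z as z -> -inf and never overflows
    return -np.sum(w * (np.logaddexp(0.0, -z) + (1.0 - y) * z))

def wgt_logit_gradient(beta, X, y, w):
    """dL/dbeta_j = sum_i w_i*(y_i - pi_i)*x_ij."""
    pi = expit(X @ beta)
    return X.T @ (w * (y - pi))

def wgt_logit_hessian(beta, X, w):
    """d2L/dbeta_i dbeta_j = -sum_k w_k*pi_k*(1-pi_k)*x_ki*x_kj."""
    pi = expit(X @ beta)
    return -(X.T * (w * pi * (1.0 - pi))) @ X
}}}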
=== Weighted Probit Regression ===

Class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtProbit.tol @WgtProbit] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] that handles weighted probit regressions. See [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/test/test_0004/test.tol test_0004] for an example of using this class. In this case the scalar distribution is the standard normal one.

''Scalar cumulant'': [[BR]]
[[LatexEquation( F\left(z\right) = \Phi\left(z\right) )]] [[BR]] [[BR]]
''Scalar density'': [[BR]]
[[LatexEquation( f\left(z\right) = \phi\left(z\right) )]] [[BR]] [[BR]]
''Scalar density derivative'': [[BR]]
[[LatexEquation( f'\left(z\right) = -z \phi\left(z\right) )]] [[BR]] [[BR]]
''Logarithm of likelihood'': [[BR]]
[[LatexEquation( L\left(\beta\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\Phi\left(x_{i}\beta\right)\right)+\left(1-y_{i}\right)\ln\left(\Phi\left(-x_{i}\beta\right)\right)\right) )]] [[BR]] [[BR]]
''Gradient'': [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(-x_{i}\beta\right)}\right)x_{ij} )]] [[BR]] [[BR]]
''Hessian'': [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=-\underset{k}{\sum}w_{k}\phi\left(x_{k}\beta\right)\left(y_{k}\frac{x_{k}\beta\,\Phi\left(x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi\left(x_{k}\beta\right)^{2}}+\left(1-y_{k}\right)\frac{-x_{k}\beta\,\Phi\left(-x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi\left(-x_{k}\beta\right)^{2}}\right)x_{ki}x_{kj} )]] [[BR]] [[BR]]

To avoid numerical problems the following equality is used: [[BR]]
[[LatexEquation( \ln\left(\Phi\left(z\right)\right)=\ln\left(1-erf\left(\frac{-z}{\sqrt{2}}\right)\right)-\ln2 )]] [[BR]] [[BR]]
The logarithm of the complementary error function, [[LatexEquation( \ln\left(1-erf\left(u\right)\right) )]], is implemented as [http://www.gnu.org/software/gsl/manual/html_node/Log-Complementary-Error-Function.html gsl_sf_log_erfc], which is available in TOL.

The hazard function of the standard normal distribution appears twice in the gradient: [[BR]] [[BR]]
[[LatexEquation( h\left(z\right)=\frac{\phi\left(z\right)}{1-\Phi\left(z\right)}=\frac{\phi\left(z\right)}{\Phi\left(-z\right)} )]] [[BR]] [[BR]]
[[LatexEquation( \frac{\phi\left(z\right)}{\Phi\left(z\right)}=\frac{\phi\left(-z\right)}{\Phi\left(z\right)}=h\left(-z\right) )]] [[BR]] [[BR]]
The hazard function decreases rapidly as [[LatexEquation(z)]] approaches [[LatexEquation(-\infty)]] and asymptotes to [[LatexEquation(h\left(z\right) \sim z )]] as [[LatexEquation(z)]] approaches [[LatexEquation(+\infty)]]. It is implemented as [http://www.gnu.org/software/gsl/manual/html_node/Probability-functions.html gsl_sf_hazard], which is also available in TOL.
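As with the logit case, here is a minimal illustrative sketch in Python/SciPy (not TOL) of the probit log-likelihood and gradient: `scipy.special.log_ndtr` computes [[LatexEquation( \ln\left(\Phi\left(z\right)\right) )]] stably and plays the same role as the gsl_sf_log_erfc-based formula, and the ratios [[LatexEquation( \phi/\Phi )]] correspond to the hazard values [[LatexEquation( h\left(-z\right) )]] and [[LatexEquation( h\left(z\right) )]]. All names are hypothetical.

{{{
#!python
import numpy as np
from scipy.special import log_ndtr   # log(Phi(z)) computed without underflow
from scipy.stats import norm

def wgt_probit_loglik(beta, X, y, w):
    """L(beta) = sum_i w_i*( y_i*ln Phi(x_i beta) + (1-y_i)*ln Phi(-x_i beta) )."""
    z = X @ beta
    return np.sum(w * (y * log_ndtr(z) + (1.0 - y) * log_ndtr(-z)))

def wgt_probit_gradient(beta, X, y, w):
    """dL/dbeta_j = sum_i w_i*( y_i*h(-x_i beta) - (1-y_i)*h(x_i beta) )*x_ij."""
    z = X @ beta
    h_neg = np.exp(norm.logpdf(z) - log_ndtr(z))    # phi(z)/Phi(z)  = h(-z)
    h_pos = np.exp(norm.logpdf(z) - log_ndtr(-z))   # phi(z)/Phi(-z) = h(z)
    return X.T @ (w * (y * h_neg - (1.0 - y) * h_pos))
}}}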