[[PageOutline]]

= Package QltvRespModel =

Maximum-likelihood and Bayesian estimation of qualitative response models as a special case of [wiki:OfficialTolArchiveNetworkGrzLinModel generalized linear models].

== Weighted Boolean Regressions ==

The abstract class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/GrzLinModel/WgtReg.tol GrzLinModel::@WgtReg] and is the base from which weighted boolean regressions such as logit, probit or any other are inherited, given just the scalar distribution function [[LatexEquation( F )]] and the corresponding density function [[LatexEquation( f )]].

In a weighted regression each row of input data has its own weight in the likelihood function. This is very useful, for example, to handle data extracted from a stratified sample.

Let
 * [[LatexEquation( X\in\mathbb{R}^{m\times n} )]] be the regression input matrix
 * [[LatexEquation( w\in\mathbb{R}^{m} )]] be the vector of weights of each register
 * [[LatexEquation( y\in\mathbb{R}^{m} )]] be the regression output vector

The hypothesis is that [[LatexEquation( \forall i=1 \dots m )]] [[BR]] [[BR]]
[[LatexEquation( y_{i}\sim Bernoulli\left(\pi_{i}\right) )]] [[BR]] [[BR]]
[[LatexEquation( \pi_{i}=Pr\left[y_{i}=1\right] = F\left(X_{i}\beta\right) )]]

The likelihood function is then [[BR]]
[[LatexEquation( lk\left(\beta\right)=\underset{i}{\prod}\pi_{i}^{w_{i}y_{i}}\left(1-\pi_{i}\right)^{w_{i}\left(1-y_{i}\right)} )]] [[BR]] [[BR]]
and its logarithm is [[BR]]
[[LatexEquation( L\left(\beta\right)=\ln\left(lk\left(\beta\right)\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\pi_{i}\right)+\left(1-y_{i}\right)\ln\left(1-\pi_{i}\right)\right) )]] [[BR]] [[BR]]
Since [[LatexEquation( \frac{\partial\pi_{i}}{\partial\beta_{j}}=f\left(x_{i}\beta\right)x_{ij} )]], the gradient of the logarithm of the likelihood function is [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{f\left(x_{i}\beta\right)}{F\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{f\left(x_{i}\beta\right)}{1-F\left(x_{i}\beta\right)}\right)x_{ij} )]] [[BR]] [[BR]]
and the hessian is [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k}{\sum}w_{k}\left(y_{k}\frac{f'\left(x_{k}\beta\right)F\left(x_{k}\beta\right)-f^{2}\left(x_{k}\beta\right)}{F^{2}\left(x_{k}\beta\right)}-\left(1-y_{k}\right)\frac{f'\left(x_{k}\beta\right)\left(1-F\left(x_{k}\beta\right)\right)+f^{2}\left(x_{k}\beta\right)}{\left(1-F\left(x_{k}\beta\right)\right)^{2}}\right)x_{ki}x_{kj} )]] [[BR]] [[BR]]

The user can and should define a scalar truncated normal or uniform prior and bounds for every variable about which he or she has robust knowledge. [[BR]] [[BR]]
[[LatexEquation( \beta_k \sim N\left(\nu_k, \sigma_k \right) )]] [[BR]] [[BR]]
[[LatexEquation( l_k \le \beta_k \le u_k \wedge l_k < u_k )]] [[BR]] [[BR]]
When [[LatexEquation( \sigma_k )]] is infinite or unknown, a uniform prior is expressed. When [[LatexEquation( l_k = -\infty )]] or unknown, the variable has no lower bound. When [[LatexEquation( u_k = +\infty )]] or unknown, the variable has no upper bound.

It is also allowed to give any set of constraining linear inequalities, provided they are compatible with the lower and upper bounds [[BR]] [[BR]]
[[LatexEquation( A \beta \le a )]] [[BR]] [[BR]]

This class implements maximum-likelihood estimation by means of package [wiki:OfficialTolArchiveNetworkNonLinGloOpt NonLinGloOpt] and Bayesian simulation using [wiki:OfficialTolArchiveNetworkBysSampler BysSampler].
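As a reference for the generic formulas above, the following is a minimal illustrative sketch in Python/NumPy (not TOL) of the weighted log-likelihood and its gradient for an arbitrary pair [[LatexEquation( F )]], [[LatexEquation( f )]] passed as callables; all function and variable names are hypothetical.

{{{
#!python
import numpy as np

def weighted_bool_loglik(beta, X, y, w, F):
    """L(beta) = sum_i w_i*( y_i*ln(pi_i) + (1-y_i)*ln(1-pi_i) ), with pi_i = F(x_i beta)."""
    pi = F(X @ beta)
    return np.sum(w * (y * np.log(pi) + (1.0 - y) * np.log(1.0 - pi)))

def weighted_bool_gradient(beta, X, y, w, F, f):
    """dL/dbeta_j = sum_i w_i*( y_i*f/F - (1-y_i)*f/(1-F) )*x_ij."""
    z = X @ beta
    r = w * (y * f(z) / F(z) - (1.0 - y) * f(z) / (1.0 - F(z)))
    return X.T @ r

# Example: the logistic pair (F, f) corresponding to the logit case described below
F_logistic = lambda z: 1.0 / (1.0 + np.exp(-z))
f_logistic = lambda z: np.exp(-z) / (1.0 + np.exp(-z)) ** 2
}}}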
The only mandatory members are the matrices of output and input of the regression:

{{{
#!cpp
//Output vector of 0 or 1 values (mx1)
VMatrix y;
//Input matrix (mxn)
VMatrix X;
}}}

You can also specify these other members:

{{{
#!cpp
//Weights vector (mx1), default values are 1
VMatrix w=Rand(0,0,0,0);
//Name of the output
Text output.name = "";
//Names of the input variables
Set input.name = Copy(Empty);
//Set of GrzLinModel::@PsbTrnNrmUnfSclDst
Set prior = Copy(Empty);
//Constraining inequalities A*beta<=a
//Constraining coefficient matrix
VMatrix A=Rand(0,0,0,0);
//Constraining border vector
VMatrix a=Rand(0,0,0,0);
}}}

=== Weighted Logit Regression ===

Class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtLogit.tol @WgtLogit] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] that handles weighted logit regressions. See [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/test/test_0003/test.tol test_0003] for an example of using this class. In this case the scalar distribution is the logistic one.

''Scalar cumulant'': [[BR]]
[[LatexEquation( F\left(z\right) = \frac{1}{1+e^{-z}} )]] [[BR]] [[BR]]
''Scalar density'': [[BR]]
[[LatexEquation( f\left(z\right) = \frac{e^{-z}}{\left(1+e^{-z}\right)^2} )]] [[BR]] [[BR]]
''Scalar density derivative'': [[BR]]
[[LatexEquation( f'\left(z\right) = - f\left(z\right) F\left(z\right) \left(1-e^{-z}\right) )]] [[BR]] [[BR]]
''Logarithm of likelihood'': [[BR]]
[[LatexEquation( L\left(\beta\right)=-\underset{i}{\sum}w_{i}\left(\ln\left(1+e^{-x_{i}^{t}\beta}\right)+\left(1-y_{i}\right)x_{i}^{t}\beta\right) )]] [[BR]] [[BR]]
''Gradient'': [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(\frac{e^{-x_{i}^{t}\beta}}{1+e^{-x_{i}^{t}\beta}}-\left(1-y_{i}\right)\right)x_{ij}=\underset{i}{\sum}w_{i}\left(\left(1-\pi_{i}\right)-\left(1-y_{i}\right)\right)x_{ij}=\underset{i}{\sum}w_{i}\left(y_{i}-\pi_{i}\right)x_{ij} )]] [[BR]] [[BR]]
''Hessian'': [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=\underset{k}{\sum}w_{k}\frac{-e^{-x_{k}^{t}\beta}}{\left(1+e^{-x_{k}^{t}\beta}\right)^{2}}x_{ki}x_{kj}=-\underset{k}{\sum}x_{ki}x_{kj}w_{k}\pi_{k}\left(1-\pi_{k}\right) )]] [[BR]] [[BR]]

From the standpoint of finite-precision floating-point arithmetic, the numerical calculation must take into account that in double precision [[BR]]
[[LatexEquation( e^{710}=\infty )]] [[BR]] [[BR]]
For this reason the exponential expressions must be carefully contained. In this case the following asymptotic equalities are used: [[BR]]
[[LatexEquation( \ln\left(1+e^{-z}\right)\;\overset{z\rightarrow-\infty}{\longrightarrow}\;-z )]] [[BR]] [[BR]]
[[LatexEquation( \frac{e^{-z}}{1+e^{-z}}\;\overset{z\rightarrow-\infty}{\longrightarrow}\;1 )]] [[BR]] [[BR]]
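A minimal sketch in Python/SciPy (again outside TOL) of how the logit log-likelihood, gradient and hessian above can be evaluated without overflow: `np.logaddexp(0, -z)` computes [[LatexEquation( \ln\left(1+e^{-z}\right) )]] and reproduces the asymptotic value [[LatexEquation( -z )]] for large negative [[LatexEquation( z )]], while `scipy.special.expit` gives [[LatexEquation( \pi )]] without forming [[LatexEquation( e^{-z} )]] explicitly. All names are hypothetical.

{{{
#!python
import numpy as np
from scipy.special import expit   # expit(z) = 1/(1+exp(-z)), overflow-safe

def wgt_logit_loglik(beta, X, y, w):
    """L(beta) = -sum_i w_i*( ln(1+exp(-x_i beta)) + (1-y_i)*x_i beta )."""
    z = X @ beta
    # logaddexp(0,-z) = ln(1+exp(-z)); tends to -z as z -> -inf and never overflows
    return -np.sum(w * (np.logaddexp(0.0, -z) + (1.0 - y) * z))

def wgt_logit_gradient(beta, X, y, w):
    """dL/dbeta_j = sum_i w_i*(y_i - pi_i)*x_ij."""
    pi = expit(X @ beta)
    return X.T @ (w * (y - pi))

def wgt_logit_hessian(beta, X, w):
    """d2L/dbeta_i dbeta_j = -sum_k w_k*pi_k*(1-pi_k)*x_ki*x_kj."""
    pi = expit(X @ beta)
    return -(X.T * (w * pi * (1.0 - pi))) @ X
}}}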
=== Weighted Probit Regression ===

Class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtProbit.tol @WgtProbit] is a specialization of class [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/WgtBoolReg.tol @WgtBoolReg] that handles weighted probit regressions. See [source:/tolp/OfficialTolArchiveNetwork/QltvRespModel/test/test_0004/test.tol test_0004] for an example of using this class. In this case the scalar distribution is the standard normal one.

''Scalar cumulant'': [[BR]]
[[LatexEquation( F\left(z\right) = \Phi\left(z\right) )]] [[BR]] [[BR]]
''Scalar density'': [[BR]]
[[LatexEquation( f\left(z\right) = \phi\left(z\right) )]] [[BR]] [[BR]]
''Scalar density derivative'': [[BR]]
[[LatexEquation( f'\left(z\right) = -z \phi\left(z\right) )]] [[BR]] [[BR]]
''Logarithm of likelihood'': [[BR]]
[[LatexEquation( L\left(\beta\right)=\underset{i}{\sum}w_{i}\left(y_{i}\ln\left(\Phi\left(x_{i}\beta\right)\right)+\left(1-y_{i}\right)\ln\left(\Phi\left(-x_{i}\beta\right)\right)\right) )]] [[BR]] [[BR]]
''Gradient'': [[BR]]
[[LatexEquation( \frac{\partial L\left(\beta\right)}{\partial\beta_{j}}=\underset{i}{\sum}w_{i}\left(y_{i}\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(x_{i}\beta\right)}-\left(1-y_{i}\right)\frac{\phi\left(x_{i}\beta\right)}{\Phi\left(-x_{i}\beta\right)}\right)x_{ij} )]] [[BR]] [[BR]]
''Hessian'': [[BR]]
[[LatexEquation( \frac{\partial^{2}L\left(\beta\right)}{\partial\beta_{i}\partial\beta_{j}}=-\underset{k}{\sum}w_{k}\phi\left(x_{k}\beta\right)\left(y_{k}\frac{x_{k}\beta\,\Phi\left(x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi\left(x_{k}\beta\right)^{2}}+\left(1-y_{k}\right)\frac{-x_{k}\beta\,\Phi\left(-x_{k}\beta\right)+\phi\left(x_{k}\beta\right)}{\Phi\left(-x_{k}\beta\right)^{2}}\right)x_{ki}x_{kj} )]] [[BR]] [[BR]]

To avoid numerical problems the following equality is used: [[BR]]
[[LatexEquation( \ln\left(\Phi\left(z\right)\right)=\ln\left(1-erf\left(\frac{-z}{\sqrt{2}}\right)\right)-\ln2 )]] [[BR]] [[BR]]
The logarithm of the complementary error function, [[LatexEquation( \ln\left(1-erf\left(u\right)\right) )]], is implemented as [http://www.gnu.org/software/gsl/manual/html_node/Log-Complementary-Error-Function.html gsl_sf_log_erfc], which is available in TOL.

The hazard function of the standard normal distribution appears twice in the gradient: [[BR]] [[BR]]
[[LatexEquation( h\left(z\right)=\frac{\phi\left(z\right)}{1-\Phi\left(z\right)}=\frac{\phi\left(z\right)}{\Phi\left(-z\right)} )]] [[BR]] [[BR]]
[[LatexEquation( \frac{\phi\left(z\right)}{\Phi\left(z\right)}=\frac{\phi\left(-z\right)}{\Phi\left(z\right)}=h\left(-z\right) )]] [[BR]] [[BR]]
The hazard function decreases rapidly as [[LatexEquation(z)]] approaches [[LatexEquation(-\infty)]] and asymptotes to [[LatexEquation(h\left(z\right) \sim z )]] as [[LatexEquation(z)]] approaches [[LatexEquation(+\infty)]]. It is implemented as [http://www.gnu.org/software/gsl/manual/html_node/Probability-functions.html gsl_sf_hazard], which is also available in TOL.
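As with the logit case, here is a minimal illustrative sketch in Python/SciPy (not TOL) of the probit log-likelihood and gradient: `scipy.special.log_ndtr` computes [[LatexEquation( \ln\left(\Phi\left(z\right)\right) )]] stably and plays the same role as the gsl_sf_log_erfc-based formula, and the ratios [[LatexEquation( \phi/\Phi )]] correspond to the hazard values [[LatexEquation( h\left(-z\right) )]] and [[LatexEquation( h\left(z\right) )]]. All names are hypothetical.

{{{
#!python
import numpy as np
from scipy.special import log_ndtr   # log(Phi(z)) computed without underflow
from scipy.stats import norm

def wgt_probit_loglik(beta, X, y, w):
    """L(beta) = sum_i w_i*( y_i*ln Phi(x_i beta) + (1-y_i)*ln Phi(-x_i beta) )."""
    z = X @ beta
    return np.sum(w * (y * log_ndtr(z) + (1.0 - y) * log_ndtr(-z)))

def wgt_probit_gradient(beta, X, y, w):
    """dL/dbeta_j = sum_i w_i*( y_i*h(-x_i beta) - (1-y_i)*h(x_i beta) )*x_ij."""
    z = X @ beta
    h_neg = np.exp(norm.logpdf(z) - log_ndtr(z))    # phi(z)/Phi(z)  = h(-z)
    h_pos = np.exp(norm.logpdf(z) - log_ndtr(-z))   # phi(z)/Phi(-z) = h(z)
    return X.T @ (w * (y * h_neg - (1.0 - y) * h_pos))
}}}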