Title: | Variable Selection using Simulated Annealing |
---|---|
Description: | Highest posterior model is widely accepted as a good model among available models. In terms of variable selection highest posterior model is often the true model. Our stochastic search process SAHPM based on simulated annealing maximization method tries to find the highest posterior model by maximizing the model space with respect to the posterior probabilities of the models. This package currently contains the SAHPM method only for linear models. The codes for GLM will be added in future. |
Authors: | Arnab Maity [aut, cre], Sanjib Basu [ctb] |
Maintainer: | Arnab Maity <[email protected]> |
License: | GPL-2 |
Version: | 1.0.1 |
Built: | 2025-02-02 04:10:15 UTC |
Source: | https://github.com/cran/sahpm |
Highest posterior model is widely accepted as a good model among available models. In terms of variable selection highest posterior model is often the true model. Our stochastic search process SAHPM based on simulated annealing maximization method tries to find the highest posterior model by maximizing the model space with respect to the posterior probabilities of the models. This function currently contains the SAHPM method only for linear models. The codes for GLM will be added in future.
sahpmlm(formula, data, na.action, g = n, nstep = 200, abstol = 1e-07, replace = FALSE, burnin = FALSE, nburnin = 50)
sahpmlm(formula, data, na.action, g = n, nstep = 200, abstol = 1e-07, replace = FALSE, burnin = FALSE, nburnin = 50)
formula |
an object of class |
data |
an optional data frame, list or environment (or object coercible
by |
na.action |
a function which indicates what should happen when the data contain
|
g |
value of |
nstep |
maximum number of steps for simulated annealing search. |
abstol |
desired level of difference of marginal likelihoods between two steps. |
replace |
logical. If |
burnin |
logical. If |
nburnin |
Number of burnin (required if burnin = TRUE). Default is 50. |
The model is:
The Zellner's prior is used with default
.
final.model |
A column vector which corresponds to the original variable indices. |
history |
A history of the search process. By columns: Step number, temperature, current objective function value, current minimal objective function value, current model, posterior probability of current model. |
Maity, A., K., and Basu, S. Highest Posterior Model Computation and Variable Selection via the Simulated Annealing
require(mvtnorm) # for multivariate normal distribution n <- 100 # sample size k <- 40 # number of variables z <- as.vector(rmvnorm(1, mean = rep(0, n), sigma = diag(n))) x <- matrix(NA, nrow = n, ncol = k) for(i in 1:k) { x[, i] <- as.vector(rmvnorm(1, mean = rep(0, n), sigma = diag(n))) + z } # this induce 0.5 correlation among the variables beta <- c(rep(0, 10), rep(2, 10), rep(0, 10), rep(2, 10)) # vector of coefficients sigma <- 1 sigma.square <- sigma^2 linear.pred <- x %*% beta y <- as.numeric(t(rmvnorm(1, mean = linear.pred, sigma = diag(sigma.square, n)))) # response answer <- sahpmlm(formula = y ~ x) answer$final.model answer$history ## Not run: # With small effect size beta <- c(rep(0, 10), rep(1, 10), rep(0, 10), rep(1, 10)) # vector of coefficients linear.pred <- x %*% beta y <- as.numeric(t(rmvnorm(1, mean = linear.pred, sigma = diag(sigma.square, n)))) # response answer <- sahpmlm(formula = y ~ x) answer$final.model # Might miss some of the true predictors answer$history # Able to recover all the predictors with 50 burnin answer <- sahpmlm(formula = y ~ x, burnin = TRUE, nburnin = 50) answer$final.model # Misses some of the true predictors answer$history ## End(Not run)
require(mvtnorm) # for multivariate normal distribution n <- 100 # sample size k <- 40 # number of variables z <- as.vector(rmvnorm(1, mean = rep(0, n), sigma = diag(n))) x <- matrix(NA, nrow = n, ncol = k) for(i in 1:k) { x[, i] <- as.vector(rmvnorm(1, mean = rep(0, n), sigma = diag(n))) + z } # this induce 0.5 correlation among the variables beta <- c(rep(0, 10), rep(2, 10), rep(0, 10), rep(2, 10)) # vector of coefficients sigma <- 1 sigma.square <- sigma^2 linear.pred <- x %*% beta y <- as.numeric(t(rmvnorm(1, mean = linear.pred, sigma = diag(sigma.square, n)))) # response answer <- sahpmlm(formula = y ~ x) answer$final.model answer$history ## Not run: # With small effect size beta <- c(rep(0, 10), rep(1, 10), rep(0, 10), rep(1, 10)) # vector of coefficients linear.pred <- x %*% beta y <- as.numeric(t(rmvnorm(1, mean = linear.pred, sigma = diag(sigma.square, n)))) # response answer <- sahpmlm(formula = y ~ x) answer$final.model # Might miss some of the true predictors answer$history # Able to recover all the predictors with 50 burnin answer <- sahpmlm(formula = y ~ x, burnin = TRUE, nburnin = 50) answer$final.model # Misses some of the true predictors answer$history ## End(Not run)