VAST
Question about model error structure
This is less an issue than a request for clarification. When I use e to specify subsets of the data that should be fitted with separate error distributions (which are then specified in ObsModel), how does VAST handle this? Does it split the dataset and fit multiple instances of the model, or does it fit a single model whose likelihood function takes a different form depending on the data subset?
Hello,
When you fit a VAST model to multiple data types and therefore need to specify multiple distributions in VAST, you need to:
(1) Include a “Data_type” column in your dataset.
(2) Modify the “ObsModel” object in VAST so that it has several rows (one per data type) instead of just one.
(3) Include a data-type catchability factor in the first linear predictor of your VAST model (preferably specified as a fixed effect):
catchability_data = my_dataset[,'Data_type',drop = FALSE]
Q1_formula = ~ factor( Data_type )
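To see what this formula encodes, here is a minimal base-R sketch. The contents of `my_dataset` are made up for illustration. It only shows how R expands `~ factor( Data_type )` into dummy columns under the default treatment contrasts; how VAST processes the resulting design matrix internally is not shown and should be checked against the VAST documentation.

```r
# Hypothetical toy dataset; only the Data_type column matters here.
my_dataset <- data.frame(
  Data_type = factor(c("Encounter", "Count", "Biomass", "Count"),
                     levels = c("Encounter", "Count", "Biomass"))
)

# The same two lines as in the answer.
catchability_data <- my_dataset[, "Data_type", drop = FALSE]
Q1_formula <- ~ factor(Data_type)

# Base R expands the factor into an intercept plus one dummy column per
# non-reference level; the first level ("Encounter") is the reference.
X <- model.matrix(Q1_formula, data = catchability_data)
print(colnames(X))
```

Because the first factor level acts as the reference, the order of the levels in “Data_type” determines which data type the other catchability effects are measured against.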
Let us see how things work in practice:
(1) If you work only with biomass-sampling data (or any data type that can take any non-negative real value), you do not need a “Data_type” column in your dataset, and you set ObsModel to c( 2, 1 ).
(2) If you work with biomass-sampling data and count data (or any data type that can take any non-negative integer value), you need to include a “Data_type” column in your dataset, with levels “Count” and “Biomass”, in this order, and set ObsModel to cbind( c( 14, 2 ), 1 ).
(3) If you work with biomass-sampling data and encounter/non-encounter data, you need to include a “Data_type” column in your dataset, with levels “Encounter” and “Biomass”, in this order, and set ObsModel to cbind( c( 13, 2 ), 1 ).
(4) If you work with count data and encounter/non-encounter data, you need to include a “Data_type” column in your dataset, with levels “Encounter” and “Count”, in this order, and set ObsModel to cbind( c( 13, 14 ), 1 ).
(5) If you work with biomass-sampling data, count data and encounter/non-encounter data, you need to include a “Data_type” column in your dataset, with levels “Encounter”, “Count” and “Biomass”, in this order, and set ObsModel to cbind( c( 13, 14, 2 ), 1 ).
In every case, the order of the “Data_type” levels must match the order of the rows of ObsModel.
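As a rough illustration of case (5), the following base-R sketch fixes the level order of “Data_type” explicitly and builds the matching ObsModel matrix. The observation values are hypothetical, and the zero-based index `e_i` is an assumption about how each observation is linked to a row of ObsModel; check it against the VAST documentation before relying on it.

```r
# Hypothetical data types for five observations.
data_type <- c("Biomass", "Encounter", "Count", "Biomass", "Encounter")

# Fix the level order explicitly so that it matches the rows of ObsModel
# (by default, factor() would sort the levels alphabetically).
levels_in_order <- c("Encounter", "Count", "Biomass")
Data_type <- factor(data_type, levels = levels_in_order)

# One row per data type: column 1 holds the distribution code from case (5),
# column 2 is 1 (Poisson-link delta model) for every row.
ObsModel <- cbind(c(13, 14, 2), 1)

# ASSUMPTION: zero-based index of the ObsModel row used by each observation.
e_i <- as.integer(Data_type) - 1L

# Show which distribution code each observation would be assigned.
print(data.frame(Data_type, e_i, code = ObsModel[e_i + 1L, 1]))
```

Setting the levels by hand avoids the alphabetical default, which would otherwise put “Biomass” first and silently mismatch the rows of ObsModel.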
When your VAST model is fitted to multiple data types, the likelihoods for the different data types (e.g., encounter/non-encounter, count and biomass-sampling data) share parameters, because you are using a Poisson-link delta model (you set ObsModel[,2] to 1). Consequently, VAST does not split the dataset: a single model is fitted to all the data, and its likelihood is the product of the likelihoods for the individual data types.
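The idea of one model with a shared parameter and a likelihood that branches by data type can be sketched with a toy example in base R. This is not VAST's code: the data are made up, there is a single shared parameter `n` (a hypothetical expected density), encounters are assumed to follow a Bernoulli distribution with probability 1 - exp(-n), and counts a Poisson distribution with mean `n`.

```r
# Toy data (hypothetical): two data types that share one parameter, n.
encounters <- c(1, 0, 1, 1, 0, 1)  # encounter / non-encounter records
counts     <- c(2, 0, 1, 3, 1)     # count records

# Negative log-likelihood of each data type, both functions of the same n.
nll_encounter <- function(n) {
  -sum(dbinom(encounters, size = 1, prob = 1 - exp(-n), log = TRUE))
}
nll_count <- function(n) {
  -sum(dpois(counts, lambda = n, log = TRUE))
}

# A product of likelihoods is a sum of log-likelihoods, so the joint
# objective simply adds the two contributions.
nll_joint <- function(n) nll_encounter(n) + nll_count(n)

# One optimisation over the shared parameter, using all the data at once.
fit <- optimize(nll_joint, interval = c(0.01, 10))
n_hat <- fit$minimum
print(n_hat)
```

Both data types inform the same estimate `n_hat`, which is the practical difference from splitting the dataset and fitting separate models.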