Package: surveyCV 0.2.0.9003

Jerzy Wieczorek

surveyCV: Cross Validation Based on Survey Design

Functions to generate K-fold cross-validation (CV) folds and CV test error estimates that account for how a survey dataset's sampling design was constructed: simple random sampling (SRS), clustering, stratification, and/or unequal sampling weights. You supply linear or logistic regression models, the data, and a description of the survey design; the output is a CV test error estimate for each model, which helps you determine which model best fits the data. The paper "K-Fold Cross-Validation for Complex Sample Surveys" by Wieczorek, Guerin, and McMahon (2022) <doi:10.1002/sta4.454> explains why it is useful to form the folds differently depending on the survey design.
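
To illustrate what "folds that take the sampling design into account" can mean for a clustered sample, here is a minimal base-R sketch in which every row from the same cluster lands in the same fold. The helper make_cluster_folds() and the toy data are hypothetical and written only for this illustration; they are not part of surveyCV and do not reproduce its implementation.

```r
# Hypothetical helper (not a surveyCV function): assign CV folds at the
# cluster level, so that a cluster is never split across folds.
make_cluster_folds <- function(cluster_ids, nfolds) {
  clusters <- unique(cluster_ids)
  stopifnot(nfolds >= 1, nfolds <= length(clusters))
  # Shuffle the clusters, then deal them into folds round-robin
  shuffled <- clusters[sample.int(length(clusters))]
  fold_of_cluster <- rep_len(seq_len(nfolds), length(shuffled))
  # Each row inherits the fold of its cluster
  fold_of_cluster[match(cluster_ids, shuffled)]
}

set.seed(1)
# Toy data (an assumption for illustration): 6 clusters of 4 rows each
toy <- data.frame(cluster = rep(1:6, each = 4), y = rnorm(24))
toy$fold <- make_cluster_folds(toy$cluster, nfolds = 3)

# Cross-tabulate clusters against folds to inspect the assignment
print(table(cluster = toy$cluster, fold = toy$fold))
```

In the printed table, each cluster's rows should fall in a single fold column. For real analyses, the package's exported functions (for example folds.svy and cv.svy) are the intended tools; consult the package manual for their arguments.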

Authors: Cole Guerin [aut], Thomas McMahon [aut], Jerzy Wieczorek [cre, aut], Ben Schneider [ctb], Hunter Ratliff [ctb]

# Install 'surveyCV' in R:
install.packages('surveyCV', repos = c('https://colbystatsvyrsch.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/colbystatsvyrsch/surveycv/issues

Datasets:
  • NSFG_data - Subset of the 2015-2017 National Survey of Family Growth (NSFG): one birth per respondent.
  • NSFG_data_everypreg - Subset of the 2015-2017 National Survey of Family Growth (NSFG): all live births per respondent.

6.10 score, 8 stars, 35 scripts, 366 downloads, 6 exports, 11 dependencies

Last updated from: c14d5b5192. Checks: 9 OK. Indexed: yes.

Target                 Result   Time
linux-devel-x86_64     OK       162
source / vignettes     OK       198
linux-release-x86_64   OK       201
macos-release-arm64    OK       129
macos-oldrel-arm64     OK       205
windows-devel          OK       98
windows-release        OK       102
windows-oldrel         OK       95
wasm-release           OK       157

Exports: %>%, cv.svy, cv.svydesign, cv.svyglm, folds.svy, folds.svydesign

Dependencies: DBI, lattice, magrittr, Matrix, minqa, mitools, numDeriv, Rcpp, RcppArmadillo, survey, survival

surveyCV: Cross Validation Based on Survey Design
Introduction to surveyCV | Setup | Linear and logistic regression with cv.svy, cv.svydesign, and cv.svyglm | Direct control with cv.svy | Using a survey design object with cv.svydesign | Using a survey GLM object with cv.svyglm | Fitting a logistic model instead of linear | Other models with folds.svy and folds.svydesign

Last update: 2022-03-15
Started: 2021-04-13

Plots for our surveyCV Stat paper based on SDSS presentation
Generate the artificial population | Sims: Use of surveyCV folds with SRS, clustered, or stratified samples | Sims: Use of sampling weights (in training model-fits vs in test loss-estimates)

Last update: 2022-01-11
Started: 2021-07-17

Informal tests of surveyCV using the Auto dataset
cv.svy() | cv.svydesign() | cv.svyglm() | Test for equivalence of the 3 functions at same random seed | Test of logistic regression | Tests that we expect should fail

Last update: 2021-07-12
Started: 2021-07-08