weightflow 0.1.0
First release.
A dependency-free, pipeable API to compute survey weights from design base weights through a chain of hierarchical adjustment stages. Build a recipe lazily, estimate it with prep(), and extract the weights with collect_weights(). Separating the definition of the recipe from its application makes the whole process reproducible and auditable, and lets the bootstrap re-run the entire cascade on each replicate.
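The define-then-apply idea can be illustrated in a few lines of base R. This is only a conceptual sketch and not weightflow's API: the `recipe` list, `apply_recipe()` and the toy columns `w` and `eligible` are hypothetical names chosen for the example.

```r
# A "recipe" here is just an ordered list of functions (hypothetical names);
# nothing is computed until apply_recipe() is called on some data.
recipe <- list(
  drop_ineligible = function(d) { d$w[!d$eligible] <- 0; d },
  rescale         = function(d) { d$w <- d$w * 1000 / sum(d$w); d }
)

# Apply every stored step in order, feeding each result to the next step.
apply_recipe <- function(recipe, data) {
  Reduce(function(d, step) step(d), recipe, data)
}

toy <- data.frame(
  w        = c(10, 10, 20, 20),
  eligible = c(TRUE, TRUE, FALSE, TRUE)
)

# The same stored recipe can be re-applied to any data set,
# for example to each bootstrap replicate.
out <- apply_recipe(recipe, toy)
out$w
```

Because the steps are stored rather than executed immediately, the same sequence can be replayed unchanged on resampled data.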
Adjustment steps
- step_unknown_eligibility(): redistribute the weight of unknown-eligibility cases to the known ones (person- or household-level via cluster).
- step_drop_ineligible(): zero out out-of-scope units.
- step_select_within(): within-household selection (unequal prob or equal n_eligible).
- step_nonresponse(): weighting-class or propensity adjustment, at the person or household level (cluster).
- step_calibrate(): raking, post-stratification and linear/GREG calibration, with bounded (Deville-Särndal) and integrative (one weight per household) cluster options.
- step_model_calibration(): Wu-Sitter model calibration.
- step_trim(), step_trim_weights(), step_round(), step_rescale(): trimming, rounding and rescaling.
- step_assert(): quality checkpoint (deff, weight ratio, effective n).
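To make two of the adjustments above concrete, here is a minimal base R sketch of a weighting-class nonresponse adjustment and of raking to two margins. It does not call weightflow and is not its implementation; `nr_adjust()`, `rake()` and the toy data are hypothetical names used only for illustration.

```r
# Weighting-class nonresponse adjustment: within each class, respondents
# absorb the weight of nonrespondents; nonrespondents get weight 0.
nr_adjust <- function(w, wclass, responded) {
  total_all  <- ave(w, wclass, FUN = sum)              # class total, all units
  total_resp <- ave(w * responded, wclass, FUN = sum)  # class total, respondents
  ifelse(responded, w * total_all / total_resp, 0)
}

d <- data.frame(
  w_base    = c(10, 10, 10, 10, 20, 20, 20, 20),
  wclass    = c("a", "a", "a", "a", "b", "b", "b", "b"),
  responded = c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, FALSE)
)
d$w_nr <- nr_adjust(d$w_base, d$wclass, d$responded)

# Raking (iterative proportional fitting) to two known margins.
# t1 and t2 are named vectors of population totals.
rake <- function(w, f1, f2, t1, t2, tol = 1e-8, max_iter = 200) {
  f1 <- as.character(f1)
  f2 <- as.character(f2)
  for (i in seq_len(max_iter)) {
    w <- w * unname(t1[f1]) / ave(w, f1, FUN = sum)  # match first margin
    w <- w * unname(t2[f2]) / ave(w, f2, FUN = sum)  # match second margin
    # Stop once the first margin is still matched after fitting the second.
    if (max(abs(tapply(w, f1, sum)[names(t1)] - t1)) < tol) break
  }
  w
}

r <- data.frame(
  w   = c(20, 20, 30, 30, 40, 40),
  sex = c("f", "m", "f", "m", "f", "m"),
  age = c("young", "young", "old", "old", "old", "young")
)
sex_totals <- c(f = 100, m = 80)
age_totals <- c(young = 70, old = 110)
r$w_raked <- rake(r$w, r$sex, r$age, sex_totals, age_totals)
```

In the first example the adjusted weights keep each class total unchanged; in the second, the raked weights reproduce both sets of margin totals up to the tolerance.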
Inspection and reporting
- summary(), plot() and weight_factors() for per-stage diagnostics.
- design_effect() for the Kish design effect and effective sample size.
- report_weighting() builds a self-contained HTML report with a pipeline diagram, the variables used, per-stage summaries and per-step visuals.
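For reference, the Kish design effect due to unequal weighting is n * sum(w^2) / sum(w)^2, and the effective sample size is n divided by that value. The base R sketch below computes both by hand; `kish()` is a hypothetical helper for illustration and is not claimed to match the output format of design_effect().

```r
# Kish design effect from unequal weighting and the matching effective n.
# Assumes strictly positive weights (drop zero-weight units first).
kish <- function(w) {
  n    <- length(w)
  deff <- n * sum(w^2) / sum(w)^2
  list(deff = deff, n_eff = n / deff)
}

res <- kish(c(1, 1, 2, 2, 4))
res$deff   # 1.3
res$n_eff  # about 3.85

# Equal weights give a design effect of exactly 1.
kish(rep(2, 10))$deff
```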
Variance estimation
- bootstrap_weights() resamples PSUs within strata (Rao-Wu rescaling) and re-applies the whole recipe on each replicate, so the replicate weights carry the variability of every adjustment.
- boot_mean() and boot_total() return the estimate, standard error and CI.
- as_svydesign(), as_svrepdesign() and collect_replicate_weights() bridge to the survey and srvyr packages for design-based inference.
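The Rao-Wu rescaling bootstrap mentioned above can be sketched in base R as follows. This assumes the common choice of drawing n_h - 1 PSUs with replacement in a stratum with n_h PSUs, so each unit's weight is multiplied by n_h / (n_h - 1) times the number of times its PSU was drawn. `rao_wu_replicate()` and the toy data are hypothetical, and the sketch does not re-apply any adjustment steps the way bootstrap_weights() is described as doing.

```r
# One Rao-Wu replicate: resample n_h - 1 PSUs with replacement per stratum
# and rescale the weights accordingly. Requires at least 2 PSUs per stratum.
rao_wu_replicate <- function(w, stratum, psu) {
  w_rep <- numeric(length(w))
  for (h in unique(stratum)) {
    in_h <- stratum == h
    psus <- unique(psu[in_h])
    n_h  <- length(psus)
    stopifnot(n_h >= 2)
    idx  <- sample.int(n_h, size = n_h - 1, replace = TRUE)
    hits <- tabulate(idx, nbins = n_h)          # times each PSU was drawn
    mult <- hits[match(psu[in_h], psus)]        # per-unit multiplicity
    w_rep[in_h] <- w[in_h] * n_h / (n_h - 1) * mult
  }
  w_rep
}

set.seed(1)
d <- data.frame(
  stratum = rep(c("s1", "s2"), each = 6),
  psu     = rep(1:6, each = 2),
  w       = rep(c(10, 20), each = 6),
  y       = c(3, 5, 4, 6, 2, 7, 8, 9, 7, 10, 9, 11)
)

# Replicate estimates of the weighted mean; a full pipeline would re-run
# every weighting adjustment on w_rep before computing the estimate.
boot_est <- replicate(200, {
  w_rep <- rao_wu_replicate(d$w, d$stratum, d$psu)
  weighted.mean(d$y, w_rep)
})
se <- sd(boot_est)  # bootstrap standard error of the weighted mean
```

The standard error is taken as the standard deviation of the replicate estimates, which is one common convention.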
Data
- Bundled example datasets population, sample_survey (take-all roster) and sample_one (multistage select-one design), all with stratum, PSU and design weight.
Development version
The following are available in the development version on GitHub and are planned for a future CRAN release:
- Machine-learning response propensities (CART, random forest and gradient boosting via xgboost) for step_nonresponse() and step_model_calibration().
- k-fold cross-fitting (crossfit) to estimate each unit out-of-sample, with folds formed by cluster to avoid leakage.
- Ridge (penalized) calibration (penalty) to keep weights stable with many auxiliaries.
- Potter MSE-optimal trimming (method = "potter"), a data-driven cutoff.
Install with remotes::install_github("jpferreira33/weightflow") to use them today.
