Package: subsampling 0.4.0

subsampling: Optimal Subsampling Methods for Statistical Models

Balancing computational and statistical efficiency, subsampling techniques offer a practical solution for large-scale data analysis. Subsampling methods make statistical modeling of massive datasets feasible by drawing representative subsamples from the full dataset according to tailored sampling probabilities. These probabilities are optimized for specific goals, such as minimizing the variance of coefficient estimates or reducing prediction error. Based on the specified modeling assumptions and subsampling techniques, the package provides functions to draw subsamples from the full data, fit the model on the subsamples, and perform statistical inference.

Authors: Qingkai Dong [aut, cre, cph], Yaqiong Yao [aut], Haiying Wang [aut], Qiang Zhang [ctb], Jun Yan [ctb]

# Install 'subsampling' in R:
install.packages('subsampling', repos = c('https://dqksnow.r-universe.dev', 'https://cloud.r-project.org'))
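
Once the package is installed, a call to `ssp.glm()` might look like the sketch below. The values passed to `criterion`, `sampling.method`, and `likelihood` are the ones named in the article outlines on this page; the `formula`, `data`, `n.plt` (pilot subsample size), `n.ssp` (second-step subsample size), and `family` arguments are assumptions based on earlier releases of the package and should be checked against the 0.4.0 manual. The simulated data are purely illustrative.

```r
# Assumes the 'subsampling' package is installed.
library(subsampling)

set.seed(1)
N <- 10000
X <- matrix(rnorm(N * 3), N, 3, dimnames = list(NULL, c("x1", "x2", "x3")))
# Simulated binary response from a logistic model (illustrative only)
y <- rbinom(N, 1, plogis(drop(-0.5 + X %*% c(0.5, -0.5, 0.5))))
dat <- data.frame(y = y, X)

# Argument names n.plt, n.ssp, and family are assumed, not confirmed by this page.
fit <- ssp.glm(
  formula = y ~ .,
  data = dat,
  n.plt = 200,                          # pilot subsample size (assumed name)
  n.ssp = 600,                          # second-step subsample size (assumed name)
  family = "quasibinomial",
  criterion = "optL",
  sampling.method = "withReplacement",
  likelihood = "weighted"
)
summary(fit)
```

Swapping `criterion = "uniform"` into the same call gives a uniform-sampling baseline to compare against.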

Bug tracker: https://github.com/dqksnow/subsampling/issues

Pkgdown/docs site: https://dqksnow.github.io

Uses libs:
  • openblas – Optimized BLAS
  • c++ – GNU Standard C++ Library v3

6.02 score | 7 stars | 7 scripts | 565 downloads | 5 exports | 16 dependencies

Last updated from: d5a31ed1a8. Checks: 13 OK. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | OK | 147
linux-devel-x86_64 | OK | 156
source / vignettes | OK | 195
linux-release-arm64 | OK | 145
linux-release-x86_64 | OK | 152
macos-release-arm64 | OK | 142
macos-release-x86_64 | OK | 521
macos-oldrel-arm64 | OK | 129
macos-oldrel-x86_64 | OK | 436
windows-devel | OK | 162
windows-release | OK | 199
windows-oldrel | OK | 159
wasm-release | OK | 110

Exports: ssp.glm, ssp.glm.rF, ssp.quantreg, ssp.relogit, ssp.softmax

Dependencies: DBI, expm, lattice, MASS, Matrix, MatrixModels, minqa, mitools, nnet, numDeriv, quantreg, Rcpp, RcppArmadillo, SparseM, survey, survival

ssp.glm.rF: Balanced Subsampling for Preserving Rare Features in Generalized Linear Models
Setup | Simulated Logistic Regression Example | One-Step Balanced Sampling | Two-Step Rareness-Aware Optimal Subsampling | Automatically Account for Rarity if rareFeature.index = NULL | Balancing the Outcome for Logistic Regression | Objective Weights | Control Options | Non-Binomial Families

Last update: 2026-06-21
Started: 2025-12-04

ssp.glm: Optimal Subsampling for Generalized Linear Models
Model and Objective | Two Ways ssp.glm() Works | Pilot Estimator | Choosing criterion | criterion = "optA" | criterion = "optL" | criterion = "LCC" | criterion = "uniform" | Choosing sampling.method | sampling.method = "withReplacement" | sampling.method = "poisson" | Choosing likelihood | likelihood = "weighted" | likelihood = "logOddsCorrection" | $P(Y_i = 1 \mid x_i, i \in S)$ | Method Summary | Example Data | Example 1: Default Optimal Subsampling | Example 2: Logistic Poisson Correction | Example 3: Uniform Baseline | Returned Object | Other Families | References

Last update: 2026-06-21
Started: 2024-08-16

ssp.quantreg: Subsampling for Quantile Regression
Terminology | Example | Key Arguments | criterion | sampling.method | likelihood | boot and B | Outputs | Returned object | References

Last update: 2026-06-21
Started: 2024-10-14

ssp.relogit: Optimal Subsampling for Logistic Regression with Rare Events
Model and Rare-Event Setting | How ssp.relogit() Works | Pilot Estimator | Choosing criterion | criterion = "optA" | criterion = "optL" | criterion = "LCC" | criterion = "uniform" | Negative-Sampling Probabilities | Choosing likelihood | likelihood = "weighted" | likelihood = "logOddsCorrection" | $P(Y_i = 1 \mid x_i, i \in S)$ | Method Summary | Example Data | Example 1: Default Rare-Event Fit | Example 2: A-Optimal Criterion | Example 3: Uniform Baseline | Returned Object | References

Last update: 2026-06-21
Started: 2024-10-14

ssp.softmax: Subsampling for Softmax (Multinomial) Regression Model
Terminology | Example | Key Arguments | criterion | sampling.method | likelihood | constraint | Outputs | References

Last update: 2026-06-21
Started: 2024-08-27

Algorithmic Structure of Optimal Subsampling
Full-Data Problem | Step 1: Pilot Estimation | Step 2: Calculate Raw Importance Scores | Step 3: Construct Probabilities and Draw the Second-Step Subsample | Step 4: Fit a Corrected Subsample Objective Function | Step 5: Optional Combination | Function Map | Checking Points

Last update: 2026-06-21
Started: 2026-06-21
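
The step names listed for the "Algorithmic Structure of Optimal Subsampling" article can be illustrated with a short base-R sketch for logistic regression. This is not the package's implementation: it assumes the common L-optimality-style score |y_i - p_i| * ||x_i||, sampling with replacement, and an inverse-probability-weighted fit, and it leaves out the optional combination step. All variable names are hypothetical.

```r
set.seed(1)

# Full-data problem: simulated logistic regression (illustrative only)
N <- 20000
X <- cbind(1, matrix(rnorm(N * 2), N, 2))
beta <- c(-1, 0.5, -0.5)
y <- rbinom(N, 1, plogis(drop(X %*% beta)))

# Step 1: pilot estimate from a small uniform subsample
n_plt <- 500
idx_plt <- sample(N, n_plt)
beta_plt <- glm.fit(X[idx_plt, ], y[idx_plt], family = binomial())$coefficients

# Step 2: raw importance scores |y_i - p_i| * ||x_i|| (assumed criterion)
p_plt <- plogis(drop(X %*% beta_plt))
score <- abs(y - p_plt) * sqrt(rowSums(X^2))

# Step 3: normalize to probabilities and draw the second-step subsample
n_ssp <- 2000
prob <- score / sum(score)
idx_ssp <- sample(N, n_ssp, replace = TRUE, prob = prob)

# Step 4: fit with inverse-probability weights to correct for unequal sampling
w <- 1 / prob[idx_ssp]
fit_ssp <- glm.fit(X[idx_ssp, ], y[idx_ssp],
                   weights = w / mean(w), family = quasibinomial())
beta_ssp <- fit_ssp$coefficients

print(rbind(true = beta, pilot = beta_plt, subsample = beta_ssp))
```

The weights are rescaled to mean one only for numerical convenience; the rescaling does not change the point estimate. `quasibinomial()` is used in the weighted fit so that non-integer weights do not trigger binomial warnings.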