An empirical analysis of various methods for making inferences with non-probability samples / Lisa Braito; Emilia Rocco. - ELECTRONIC. - (2025), pp. 85-90. (ASA 2024, Roma, 18-20 September) [10.26398/asaproc.0084].
An empirical analysis of various methods for making inferences with non-probability samples
Lisa Braito;Emilia Rocco
2025
Abstract
Non-probability sampling has become increasingly prevalent across various fields, especially in social studies and policy research, owing to rapid access to numerous low-cost non-probability data sources such as web surveys and social media. Unlike in probability sampling, however, the selection probabilities are unknown and there is no controlled random selection. This makes the traditional design-based methods of inference commonly used in probability surveys inadequate and raises concerns about bias due to self-selection. Various methods have been developed over time to support valid inference from non-probability samples; Wu (2022) provides a review. Some of these methods adapt techniques used in probability sampling to address non-response. Others align with methods from causal inference, where data from randomized controlled trials and observational studies are combined to enhance the generalizability of the former by leveraging the representativeness of the latter. Despite their differences, all the proposed methods rely on auxiliary information available at the population level or, in most cases, from a probability sample drawn from the same study population. They can be categorized according to whether the auxiliary variables are used to build predictive models for the response variable, models for the propensity to participate in the survey, or both, using techniques such as doubly robust estimation or calibration. For both the response variable and the propensity score, a range of parametric and non-parametric models, explicit or implicit, is possible.
The goal of this contribution is to empirically compare some of these methods in contexts involving multiple auxiliary variables with diverse characteristics and relationships, both among themselves and with the study variable, while considering various self-selection processes.
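
To make the three families of estimators mentioned in the abstract concrete, the following is a minimal sketch on simulated data, not the authors' simulation design. Everything in it is an assumption chosen for illustration: the population model, the logistic self-selection mechanism, the single auxiliary variable `x`, the simple random reference sample, and the pseudo-likelihood used to fit the participation propensity. It contrasts the naive sample mean with an inverse propensity weighted estimator, a prediction-based (mass imputation) estimator, and a doubly robust estimator that combines the two.

```python
import math
import random

random.seed(2024)

# Hypothetical finite population: one auxiliary variable x, study variable y.
N = 20000
x = [random.gauss(0.0, 1.0) for _ in range(N)]
y = [2.0 + 1.5 * xi + random.gauss(0.0, 1.0) for xi in x]
true_mean = sum(y) / N

def expit(t):
    return 1.0 / (1.0 + math.exp(-t))

# Non-probability sample A: units self-select with probability depending on x.
A = [i for i in range(N) if random.random() < expit(-2.0 + 1.0 * x[i])]
n_A = len(A)
# Reference probability sample B (x observed, y not): simple random sample
# with design weight d = N / n_B for every unit.
n_B = 1000
B = random.sample(range(N), n_B)
d = N / n_B

# Propensity model pi(x) = expit(a + b*x), fitted by Newton steps on the
# pseudo-log-likelihood  sum_A (a + b*x_i) - sum_B d * log(1 + exp(a + b*x_i)).
a, b = math.log(n_A / (N - n_A)), 0.0
for _ in range(50):
    g0, g1 = float(n_A), sum(x[i] for i in A)   # gradient, part from A
    h00 = h01 = h11 = 0.0                        # minus the Hessian
    for i in B:
        p = expit(a + b * x[i])
        g0 -= d * p
        g1 -= d * p * x[i]
        w = d * p * (1.0 - p)
        h00 += w
        h01 += w * x[i]
        h11 += w * x[i] ** 2
    det = h00 * h11 - h01 ** 2
    a += (h11 * g0 - h01 * g1) / det
    b += (h00 * g1 - h01 * g0) / det
pi_A = {i: expit(a + b * x[i]) for i in A}

# Outcome model m(x) = intercept + slope*x, fitted by least squares on A.
xbar = sum(x[i] for i in A) / n_A
ybar = sum(y[i] for i in A) / n_A
slope = (sum((x[i] - xbar) * (y[i] - ybar) for i in A)
         / sum((x[i] - xbar) ** 2 for i in A))
intercept = ybar - slope * xbar

def m(xi):
    return intercept + slope * xi

# Estimators of the population mean of y.
naive = ybar                                           # ignores self-selection
N_hat_A = sum(1.0 / pi_A[i] for i in A)
ipw = sum(y[i] / pi_A[i] for i in A) / N_hat_A         # propensity model only
mi = sum(m(x[i]) for i in B) / n_B                     # outcome model only
dr = sum((y[i] - m(x[i])) / pi_A[i] for i in A) / N_hat_A + mi  # both models

print("true mean:", round(true_mean, 3))
for name, est in [("naive", naive), ("IPW", ipw), ("MI", mi), ("DR", dr)]:
    print(name, round(est, 3))
```

Because both working models are correctly specified in this toy setting, all three adjusted estimators should land close to the true mean while the naive mean is pulled upward by self-selection. The comparison becomes informative only when one of the two models is misspecified or when several auxiliary variables interact, which is the kind of scenario the contribution sets out to study.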



