A social-media company wants to learn whether displaying a new visual badge next to selected user comments actually causes those comments to receive more "likes." The team must collect generated data that lets them defend a causal conclusion while minimizing the influence of confounders such as posting time or user popularity. Which generated-data strategy best fulfills these requirements, and why?
Pull existing administrative records of user engagement metrics and account attributes; their consistent structure over time supports longitudinal analysis of badge exposure.
Run a randomized A/B experiment in which comments are randomly assigned to show the badge or not; random assignment isolates the badge's causal effect on likes.
Analyze post-deployment transactional click-stream logs to compare likes on comments with and without the badge; the high-volume behavioral data reveals real-world usage patterns.
Issue an online survey asking users whether the badge influences their liking behavior; self-reported opinions directly capture perceived impact.
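Only experimental data collected through a randomized A/B test can support a strong causal claim in this scenario. Randomly assigning comments (or users) to "badge" and "no-badge" conditions balances both observed and unobserved confounders in expectation, because assignment is independent of factors such as posting time or user popularity. With a large enough sample, any difference in average likes between the two groups can therefore be attributed to the badge itself.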
Transactional click-stream logs are generated passively, but because the badge would already be deployed, badge presence could be correlated with many other uncontrolled factors; the data would be observational, not experimental.
Survey data captures user opinions but is self-reported, subject to recall and desirability bias, and does not measure actual behavioral change.
Administrative records (e.g., account settings, historical engagement) are structured and longitudinal, yet they were collected for operational purposes; they contain no randomized intervention and thus cannot isolate the badge's effect.
Therefore, the randomized A/B test is the only option that both belongs to the "generated data" category and satisfies the causal-inference requirement.
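The contrast between randomized and observational data can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the company's actual pipeline: the true badge effect (`TRUE_EFFECT`), the likes model in `simulate_likes`, and the rule that popular authors are more likely to receive the badge in the observational case are all hypothetical values chosen for illustration.

```python
import random
from statistics import mean

TRUE_EFFECT = 2.0  # hypothetical number of extra likes caused by the badge


def simulate_likes(popularity, has_badge, rng):
    # Assumed model: likes depend on a confounder (author popularity),
    # on the badge itself, and on random noise.
    return 10.0 * popularity + TRUE_EFFECT * has_badge + rng.gauss(0.0, 1.0)


def difference_in_means(rows):
    # rows is a list of (has_badge, likes) pairs.
    treated = [likes for badge, likes in rows if badge]
    control = [likes for badge, likes in rows if not badge]
    return mean(treated) - mean(control)


def run(n=20000, seed=0):
    rng = random.Random(seed)
    randomized, observational = [], []
    for _ in range(n):
        popularity = rng.random()  # confounder, uniform on [0, 1)

        # A/B test: a coin flip decides the badge, independent of popularity.
        badge_ab = rng.random() < 0.5
        randomized.append((badge_ab, simulate_likes(popularity, badge_ab, rng)))

        # Observational logs: popular authors are more likely to get the badge,
        # so badge presence is correlated with the confounder.
        badge_obs = rng.random() < popularity
        observational.append((badge_obs, simulate_likes(popularity, badge_obs, rng)))

    return difference_in_means(randomized), difference_in_means(observational)


if __name__ == "__main__":
    ab_estimate, obs_estimate = run()
    print(f"true effect:            {TRUE_EFFECT:.2f}")
    print(f"randomized estimate:    {ab_estimate:.2f}")
    print(f"observational estimate: {obs_estimate:.2f}")
```

Under these assumptions the randomized estimate should land close to the true effect, while the observational comparison overstates it, because badged comments in the logs also come from more popular authors.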