Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
SR2 Chapter 3 Hard
Posted on 5 April, 2020 by Brian Tags: statistical rethinking, solutions, grid approximation, posterior predictive check, posterior predictive distribution, map, binomial, hpdi Category: statistical-rethinking-2
Here’s my solutions to the hard exercises in chapter 3 of McElreath’s Statistical Rethinking, 2nd edition.
\(\DeclareMathOperator{\dbinomial}{Binomial} \DeclareMathOperator{\dbernoulli}{Bernoulli} \DeclareMathOperator{\dpoisson}{Poisson} \DeclareMathOperator{\dnormal}{Normal} \DeclareMathOperator{\dt}{t} \DeclareMathOperator{\dcauchy}{Cauchy} \DeclareMathOperator{\dexponential}{Exp} \DeclareMathOperator{\duniform}{Uniform} \DeclareMathOperator{\dgamma}{Gamma} \DeclareMathOperator{\dinvpamma}{Invpamma} \DeclareMathOperator{\invlogit}{InvLogit} \DeclareMathOperator{\logit}{Logit} \DeclareMathOperator{\ddirichlet}{Dirichlet} \DeclareMathOperator{\dbeta}{Beta}\)
Let’s first put the data into a tibble for easier manipulation later.
data(homeworkch3)df <-tibble(birth1 = birth1, birth2 = birth2) %>%mutate(birth =row_number())
birth1 | birth2 | birth |
---|---|---|
1 | 0 | 1 |
0 | 1 | 2 |
0 | 0 | 3 |
0 | 1 | 4 |
1 | 0 | 5 |
1 | 1 | 6 |
3H1
Let’s check we have the correct total cound and the correct number of boys.
h1_counts <-df %>%gather(order, gender, -birth) %>%summarise(boys =sum(gender), births =n())
Now we can grid approximate the posterior as before.
granularity <-1000h1_grid <-tibble(p =seq(0, 1, length.out = granularity)) %>%mutate(prior =1)h1_posterior <-h1_grid %>%mutate( likelihood =dbinom(h1_counts$boys, h1_counts$births, p), posterior = prior *likelihood, posterior = posterior /sum(posterior) )
The maximum a posteriori (MAP) value is the value of p that maximises the posterior.
h1_map <-h1_posterior %>%slice(which.max(posterior)) %>%pull(p)h1_map
[1] 0.5545546
3H2
We draw samples with weight equalt to the posterior. We then apply the HPDI
function to these samples, each time with a different width.
h2_samples <-h1_posterior %>%sample_n(10000, replace =TRUE, weight = posterior) %>%pull(p)h2_hpdi <-h2_samples %>%crossing(prob =c(0.5, 0.89, 0.97)) %>%group_by(prob) %>%group_map(HPDI) h2_hpdi
[[1]] |0.5 0.5| 0.4574575 0.5735736 [[2]] |0.89 0.89| 0.4534535 0.6606607 [[3]] |0.97 0.97| 0.4294294 0.6616617
3H3
The posterior predictive samples are possible observations according to our posterior.
h3_posterior_predictive <-rbinom(10000, 200, h2_samples)
The number of observed births is very close to the MAP of the posterior predictive distribution, suggesting we have a decent fit.
3H4
Our data are from birth pairs and so far we didn’t make any distinction between the first and second births. To test this assumption, we can perform a posterior predictive check as in 3H3, but this time for first births.
h4_posterior_predictive <-rbinom(10000, 100, h2_samples)
The fit doesn’t look quite as good for first births as it did for all births together. It also doesn’t look bad since there is still a fair bit of probability mass around the observed number of first birth boys.
3H5
As the final posterior predictive check, let’s check the number of boys born after a girl.
h5_counts <-df %>%filter(birth1 ==0) %>%summarise(boys =sum(birth2), births =n())h5_posterior_predictive <-rbinom(10000, h5_counts$births, h2_samples)
The fit here looks bad, since the observed number of boys is higher than the bulk of the model’s expectations.
/** * RECOMMENDED CONFIGURATION VARIABLES: EDIT AND UNCOMMENT THE SECTION BELOW TO INSERT DYNAMIC VALUES FROM YOUR PLATFORM OR CMS. * LEARN WHY DEFINING THESE VARIABLES IS IMPORTANT: https://disqus.com/admin/universalcode/#configuration-variables *//* var disqus_config = function () { this.page.url = PAGE_URL; // Replace PAGE_URL with your page's canonical URL variable this.page.identifier = PAGE_IDENTIFIER; // Replace PAGE_IDENTIFIER with your page's unique identifier variable }; */(function() { // DON'T EDIT BELOW THIS LINE var d = document, s = d.createElement('script');</p><p> s.src = '//stappit-github-io.disqus.com/embed.js';</p><p> s.setAttribute('data-timestamp', +new Date()); (d.head || d.body).appendChild(s); })();
Please enable JavaScript to view the comments powered by Disqus.
var vglnk = { key: '949efb41171ac6ec1bf7f206d57e90b8' }; (function(d, t) {var s = d.createElement(t); s.type = 'text/javascript'; s.async = true;s.src = '//cdn.viglink.com/api/vglnk.js';var r = d.getElementsByTagName(t)[0]; r.parentNode.insertBefore(s, r); }(document, 'script'));R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.