---
Now that you've constructed a prior model of your support in the upcoming election, let's turn to the next important piece of a Bayesian analysis - the data!
In your quest for election to public office, recall that parameter p denotes the underlying proportion of voters that support you.
To gain insight into p, your campaign conducted a small poll and found that X = 6 of n = 10 (or 60% of) voters support you. These data provide some evidence about p.
For example, you're more likely to observe such poll results if your underlying support were also around 0.6 than, say, if it were below the 0.5 winning threshold. Of course, to rigorously quantify the likelihood of the poll results under different election scenarios, we must understand how polling data X depend on your underlying support p.
To this end, you can make two reasonable assumptions about the polling data. First, voters respond independently of one another. Second, the probability that any given voter supports you is p, your underlying support in the population.
In turn, you can view X, the number of the n polled voters that support you, as a count of successes in n independent trials, each with probability of success p.
This might sound familiar! Under these settings, the conditional dependence of X on p is modeled by the Binomial distribution with parameters n and p, written in mathematical notation as X ~ Bin(n, p).
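As a concrete sketch of this model (written here in Python with scipy.stats, which is an assumption since this section doesn't fix a language), the Binomial probabilities can be computed directly:

```python
from scipy.stats import binom

# Binomial model of the poll: X ~ Bin(n, p), where
# P(X = x | p) = C(n, x) * p^x * (1 - p)^(n - x)
n = 10   # number of polled voters
x = 6    # observed number of supporters

# Probability of the observed poll result if underlying support were p = 0.6
print(binom.pmf(x, n, 0.6))   # roughly 0.251
```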
The Binomial model provides the tools needed to quantify the probability of observing your poll result under different election scenarios. This result is represented by the red dot: X = 6 of n = 10 (or 60% of) voters support you.
If your underlying support p were only 50%, there's a roughly 20% chance that a poll of 10 voters would produce X = 6.
You're less likely to observe such a relatively low poll result if your underlying support p were as high as 80%.
Further, it’s possible though unlikely that you would observe such a relatively high poll result if your underlying support were only 30%.
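These three scenario checks can be reproduced with the same Binomial tools (again a Python/scipy sketch, not necessarily the course's own code):

```python
from scipy.stats import binom

n, x = 10, 6

# Probability of observing X = 6 of 10 supporters under different values of p
for p in (0.5, 0.8, 0.3):
    print(f"p = {p}: P(X = 6 | p) = {binom.pmf(x, n, p):.3f}")

# p = 0.5: P(X = 6 | p) = 0.205   (the "roughly 20% chance")
# p = 0.8: P(X = 6 | p) = 0.088   (less likely)
# p = 0.3: P(X = 6 | p) = 0.037   (possible, though unlikely)
```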
Similarly, we can calculate the likelihood of these poll results for any level of underlying support p between 0 and 1. Connecting the dots, the resulting curve represents the likelihood function.
The likelihood function summarizes the likelihood of observing polling data X under different values of the underlying support parameter p. Thus, the likelihood is a function of p that depends upon the observed data X. In turn, it provides insight into which parameter values are most compatible with the poll.
Here we see that the likelihood function is highest for values of support p between 0.4 and 0.8. Thus these values are the most compatible with the poll.
In contrast, small values of support below 0.4 and large values of support above 0.8 have low likelihoods, and thus are not very compatible with the poll.
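A short sketch of this grid calculation, under the same Python/scipy assumption:

```python
import numpy as np
from scipy.stats import binom

n, x = 10, 6

# Evaluate the likelihood L(p) = P(X = 6 | p) on a fine grid of p values
p_grid = np.linspace(0, 1, 101)
likelihood = binom.pmf(x, n, p_grid)

# The likelihood peaks at the observed proportion, p = 0.6,
# and is relatively high for p between roughly 0.4 and 0.8
print(p_grid[np.argmax(likelihood)])   # 0.6
```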
To conclude, the likelihood function plays an important role in quantifying the insights from our data. Though it's possible to calculate the exact Binomial likelihood function, you'll use simulation techniques to approximate and build intuition for the likelihood in the following exercises.
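As a hedged preview of that simulation idea (one possible sketch, not necessarily the exercises' exact method), we can simulate many polls under randomly drawn values of p and keep the values that reproduce the observed result:

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n, x_obs = 10, 6

# Simulate many (p, X) pairs: draw p uniformly, then a Binomial poll given p
n_sim = 100_000
p_sim = rng.uniform(0, 1, n_sim)
x_sim = rng.binomial(n, p_sim)

# The p values that reproduced the observed poll (X = 6) trace out the shape
# of the likelihood function when histogrammed
p_match = p_sim[x_sim == x_obs]
print(len(p_match), p_match.mean())   # mean near 0.58
```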