• (population) The sample space $\Omega$ (the entire collection of elements under study).

  • (population size) $N$ (finite) or $\infty$.

  • (population random variable) $X$, a random variable defined over the population.

  • (population distribution) The distribution of $X$, given by its PMF $p_X$ (or PDF $f_X$).

  • (parameter $\theta$) A fixed (usually unknown) value on which the population distribution depends.

    • (population mean) $\mu = E(X)$
    • (population variance) $\sigma^2 = V(X)$
  • (sample) $X_1, X_2, \ldots, X_n$, where each $X_i$ is a random variable.

    • (9.1) For each $i$, $X_i \sim X$, i.e. each $X_i$ has the population distribution (and the $X_i$ are independent).
    • (observed data values, realizations) $x_1, x_2, \ldots, x_n$
      • The joint PMF $p_{X_1,\ldots,X_n}$ is called the sample distribution of $X_1, \ldots, X_n$.
        • (9.3) $p_{X_1,\ldots,X_n}(x_1,\ldots,x_n) = p_X(x_1)\,p_X(x_2)\cdots p_X(x_n)$.
        • (Similar holds for the continuous case, with PDF $f_X$; see the worked Bernoulli example below.)
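  • A worked example (not from the source; it assumes a Bernoulli($p$) population): by (9.3) the sample distribution takes the following product form.

```latex
% Illustration: sample distribution of a Bernoulli(p) random sample,
% where p_X(x) = p^x (1-p)^{1-x} for x in {0, 1}.
\[
p_{X_1,\dots,X_n}(x_1,\dots,x_n)
  = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i}
  = p^{\sum_{i=1}^{n} x_i}\,(1-p)^{\,n-\sum_{i=1}^{n} x_i}
\]
```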
  • A random variable $T$ is called a statistic if:

    • $T$ is a function of $X_1, \ldots, X_n$, and
    • $T$ does not depend on any unknown parameters (see the example below).
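  • Example (not from the source; it assumes $\mu$ and $\sigma$ are unknown): the sample mean is a statistic, while a standardized version that involves the unknown parameters is not.

```latex
% Illustration: a statistic vs. a non-statistic (mu, sigma unknown).
\[
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \quad \text{is a statistic,}
\qquad
\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \quad \text{is not (it depends on the unknown } \mu, \sigma\text{).}
\]
```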
  • An estimator (אומד) (of an unknown parameter $\theta$) is a statistic, denoted by $\hat{\Theta}$ or $\hat{\theta}$, that is used to estimate $\theta$.

    • (For any given observed values $x_1, \ldots, x_n$) $\hat{\theta} = \hat{\Theta}(x_1, \ldots, x_n)$ is called an estimate (אומדן) of the parameter $\theta$.
    • $\hat{\theta} - \theta$ is called the error of the estimate.
    • The bias of an estimator $\hat{\Theta}$ is defined as $B(\hat{\Theta}) = E(\hat{\Theta}) - \theta$.
      • An estimator $\hat{\Theta}$ is called an unbiased estimator of $\theta$ if $E(\hat{\Theta}) = \theta$.
        • If $\hat{\Theta}$ is an unbiased estimator of $\theta$, then $g(\hat{\Theta})$ is an unbiased estimator of $g(\theta)$ for any linear function $g(\theta) = a\theta + b$.
      • An estimator $\hat{\Theta}$ is called a biased estimator of $\theta$ if $E(\hat{\Theta}) \neq \theta$.
    • The mean squared error (MSE) of an estimator $\hat{\Theta}$ of $\theta$ is defined as $\mathrm{MSE}(\hat{\Theta}) = E\big[(\hat{\Theta} - \theta)^2\big]$.
      • An estimator $\hat{\Theta}_1$ is better than $\hat{\Theta}_2$ (of the same parameter $\theta$) if $\mathrm{MSE}(\hat{\Theta}_1) < \mathrm{MSE}(\hat{\Theta}_2)$.
      • If $\hat{\Theta}_1$ and $\hat{\Theta}_2$ are unbiased estimators of $\theta$, then:
        • $\mathrm{MSE}(\hat{\Theta}_i) = V(\hat{\Theta}_i)$, so the better estimator is the one with the smaller variance (see the simulation sketch below).
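  • A minimal simulation sketch (not from the source; it assumes a normal population with $\mu = 5$, $\sigma = 2$): the sample mean and a single observation $X_1$ are both unbiased estimators of $\mu$, so their MSEs equal their variances, and the sample mean comes out better.

```python
# Sketch: Monte Carlo bias and MSE of two unbiased estimators of mu.
# Assumed setup (not from the source): normal population, mu=5, sigma=2, n=20.
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 5.0, 2.0, 20, 100_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)   # estimator 1: the sample mean
x1 = samples[:, 0]            # estimator 2: a single observation

for name, est in [("sample mean", xbar), ("X_1 alone", x1)]:
    bias = est.mean() - mu               # approximates E(est) - mu
    mse = np.mean((est - mu) ** 2)       # approximates E[(est - mu)^2]
    print(f"{name:12s}  bias ~ {bias:+.4f}   MSE ~ {mse:.4f}")

# Expected: both biases ~ 0, MSE(sample mean) ~ sigma^2/n = 0.2, MSE(X_1) ~ sigma^2 = 4.
```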
  • Let $X_1, \ldots, X_n$ be a random sample from a population with PMF/PDF $f(x; \theta)$ with unknown parameter $\theta$.

    • The likelihood function (of $\theta$, given realizations $x_1, \ldots, x_n$) is a function of $\theta$ defined by $L(\theta) = L(\theta; x_1, \ldots, x_n) = \prod_{i=1}^{n} f(x_i; \theta)$.
    • $\ell(\theta) = \ln L(\theta)$ is the log-likelihood function.
    • An estimator $\hat{\theta}$ is called a maximum likelihood estimator (MLE) of $\theta$ if $L(\hat{\theta}) = \max_{\theta} L(\theta)$, or equivalently, if $L(\hat{\theta}; x_1, \ldots, x_n) \geq L(\theta; x_1, \ldots, x_n)$ for all $\theta$, for all $x_1, \ldots, x_n$ (see the worked Bernoulli MLE below).
      • (9.20) If $\hat{\theta}$ is an MLE of $\theta$, then $g(\hat{\theta})$ is an MLE of $g(\theta)$ for any one-to-one function $g$.
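  • A worked MLE derivation (not from the source; it continues the Bernoulli($p$) example above): maximizing the log-likelihood gives $\hat{p} = \bar{x}$.

```latex
% Illustration: MLE of p for a Bernoulli(p) random sample x_1, ..., x_n.
\[
L(p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i},
\qquad
\ell(p) = \ln L(p)
        = \Big(\sum_{i=1}^{n} x_i\Big)\ln p + \Big(n - \sum_{i=1}^{n} x_i\Big)\ln(1-p)
\]
% Setting the derivative of the log-likelihood to zero:
\[
\ell'(p) = \frac{\sum_{i} x_i}{p} - \frac{n - \sum_{i} x_i}{1-p} = 0
\;\Longrightarrow\;
\hat{p} = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}
\]
```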
  • A confidence interval (for an unknown parameter $\theta$) with confidence level $1 - \alpha$ is an interval $(\hat{\Theta}_L, \hat{\Theta}_U)$ such that $P(\hat{\Theta}_L < \theta < \hat{\Theta}_U) = 1 - \alpha$.

    • Example (normal population with known $\sigma$):
      • (confidence level) $1 - \alpha$, e.g. $0.95$
      • parameter: $\mu$
      • estimator: $\bar{X}$ (sample mean)
      • $\left(\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$ (confidence interval for $\mu$ with confidence level $1 - \alpha$).
        • If the sampling is repeated many times, in about $(1 - \alpha) \cdot 100\%$ of the cases the confidence interval will contain the true value of the parameter $\mu$ (see the coverage simulation below).
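  • A coverage simulation sketch (not from the source; it assumes a normal population with known $\sigma$): repeating the sampling many times, roughly $(1 - \alpha) \cdot 100\%$ of the computed intervals contain the true $\mu$.

```python
# Sketch: empirical coverage of the z-interval  xbar +- z_{alpha/2} * sigma / sqrt(n).
# Assumed setup (not from the source): normal population, mu=10, sigma=3, n=25, alpha=0.05.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu, sigma, n, alpha, reps = 10.0, 3.0, 25, 0.05, 50_000
z = norm.ppf(1 - alpha / 2)                 # z_{alpha/2} ~ 1.96 for alpha = 0.05

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
half_width = z * sigma / np.sqrt(n)
covered = (xbar - half_width < mu) & (mu < xbar + half_width)

print(f"empirical coverage ~ {covered.mean():.3f}   (nominal {1 - alpha:.2f})")
```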

Examples

  • The statistic $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is called the sample mean of $X_1, \ldots, X_n$.
    • (9.8) $E(\bar{X}) = \mu = E(X_i)$ for all $i$ (i.e. the expectation of the sample mean is equal to the population mean and to the expectation of any $X_i$).
    • The sample mean $\bar{X}$ is an unbiased estimator of $\mu$ (the population mean).
    • $V(\bar{X}) = \frac{\sigma^2}{n}$ (i.e. the variance of the sample mean is equal to the population variance divided by the sample size $n$).
  • The statistic $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is called the sample variance of $X_1, \ldots, X_n$, and $S = \sqrt{S^2}$ is called the sample standard deviation.
    • (9.10) $S^2$ is an unbiased estimator of the population variance $\sigma^2$ (unknown mean $\mu$).
    • (9.11) $S^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right)$
  • (9.9) If $\mu$ is known, then the statistic $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$ (denoted also by $\hat{\sigma}^2$) is an unbiased estimator of the population variance $\sigma^2$ (these facts are checked numerically in the sketch below).
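  • A simulation sketch (not from the source; it assumes a normal population): it checks (9.8), $V(\bar{X}) = \sigma^2/n$, and (9.10), and shows that dividing by $n$ instead of $n-1$ underestimates $\sigma^2$.

```python
# Sketch: Monte Carlo check of the sample-mean and sample-variance properties.
# Assumed setup (not from the source): normal population, mu=4, sigma=1.5, n=10.
import numpy as np

rng = np.random.default_rng(2)
mu, sigma, n, reps = 4.0, 1.5, 10, 200_000

samples = rng.normal(mu, sigma, size=(reps, n))
xbar = samples.mean(axis=1)
s2 = samples.var(axis=1, ddof=1)         # divides by n-1: the sample variance S^2
s2_biased = samples.var(axis=1, ddof=0)  # divides by n: biased when mu is unknown

print("E(Xbar) ~", round(xbar.mean(), 3), "  (mu =", mu, ")")
print("V(Xbar) ~", round(xbar.var(), 3), "  (sigma^2/n =", sigma**2 / n, ")")
print("E(S^2)  ~", round(s2.mean(), 3), "  (sigma^2 =", sigma**2, ")")
print("E(n-divisor version) ~", round(s2_biased.mean(), 3), "  (underestimates sigma^2)")
```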

Hypothesis Testing

  • A pivotal quantity is a function $Q = Q(X_1, \ldots, X_n; \theta)$ such that the distribution of $Q$ does not depend on the unknown parameter $\theta$.

    • (Example: $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$ for a normal population with known $\sigma$; see the sketch below.)
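  • A simulation sketch (not from the source; it assumes a normal population with known $\sigma$): the distribution of $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ stays (approximately) $N(0, 1)$ no matter what $\mu$ is, which is what makes $Z$ pivotal.

```python
# Sketch: the pivotal quantity Z = (Xbar - mu) / (sigma / sqrt(n)) has the same
# N(0,1) distribution for every value of the unknown mean mu.
# Assumed setup (not from the source): normal population, sigma=2, n=16.
import numpy as np

rng = np.random.default_rng(3)
sigma, n, reps = 2.0, 16, 100_000

for mu in (0.0, 7.0, -3.0):
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    z = (xbar - mu) / (sigma / np.sqrt(n))
    print(f"mu = {mu:+.1f}:  mean(Z) ~ {z.mean():+.3f}   var(Z) ~ {z.var():.3f}")
# The mean ~ 0 and variance ~ 1 in every case, independent of mu.
```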
  • The null hypothesis $H_0$

  • The alternative hypothesis $H_1$

  • The test statistic is a statistic used to test the null hypothesis $H_0$.

    • The rejection region (or critical region) is the set of values of the test statistic for which the null hypothesis is rejected.
    • The acceptance region is the set of values of the test statistic for which the null hypothesis is accepted.
    • The boundary value separating the rejection region from the acceptance region is called a critical value.
  • Type I error: Rejecting $H_0$ when $H_0$ is true.

    • (significance level) $\alpha = P(\text{type I error})$
  • Type II error: Accepting $H_0$ when $H_1$ is true.

    • $\beta = P(\text{type II error})$; if $H_1$ is composite, then $\beta$ is a function of $\theta$.
  • (power) $1 - \beta$, the probability of rejecting $H_0$ when $H_1$ is true.

  • $\Lambda = \frac{L(\theta_0)}{L(\theta_1)}$ is the likelihood ratio (of $\theta_0$ and $\theta_1$).

  • $\Lambda = \frac{\prod_{i=1}^{n} f(x_i; H_0)}{\prod_{i=1}^{n} f(x_i; H_1)}$ is the likelihood ratio (of hypotheses $H_0$ and $H_1$, where $f$ is the population PMF/PDF).

  • (Neyman-Pearson lemma) If $C = \{(x_1, \ldots, x_n) : \Lambda \leq k\}$ and $P\big((X_1, \ldots, X_n) \in C \mid H_0\big) = \alpha$, then the test with rejection region $C$ is the most powerful test of significance level at most $\alpha$ for testing $H_0: \theta = \theta_0$ against $H_1: \theta = \theta_1$ (a worked example follows below).
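  • A worked example (not from the source; it assumes a $N(\mu, 1)$ population with simple hypotheses $H_0: \mu = 0$ vs $H_1: \mu = 1$): here $\Lambda$ is a decreasing function of $\bar{x}$, so $\{\Lambda \leq k\}$ is equivalent to $\{\bar{x} \geq c\}$, and the most powerful level-$\alpha$ test rejects $H_0$ for large $\bar{x}$. The sketch computes the critical value and the resulting power.

```python
# Sketch: critical value and power of the Neyman-Pearson most powerful test
# for H0: mu = 0 vs H1: mu = 1 in a N(mu, 1) population (assumed setup, not from the source).
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n, alpha = 0.0, 1.0, 1.0, 9, 0.05
se = sigma / np.sqrt(n)                      # standard deviation of Xbar

# Critical value c with P(Xbar >= c | H0) = alpha.
c = norm.ppf(1 - alpha, loc=mu0, scale=se)

# Power of the test: P(Xbar >= c | H1) = 1 - beta.
power = 1 - norm.cdf(c, loc=mu1, scale=se)

print(f"reject H0 when xbar >= c = {c:.3f}")
print(f"significance level alpha = {alpha},  power 1 - beta ~ {power:.3f}")
```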