
Normal Distributions

Kirina 2022. 11. 25. 04:13


    1. Normal Distribution

    • has 2 parameters (= numbers describing the whole population): mean (mu; µ) and variance (sigma squared; σ^2)
      • they are constants (same value for all observations in the population)
      • X ~ N(µ, σ^2)
      • the probability that X equals any single point is 0; probability is the area under the curve over an interval
    • Standard Normal Random Variable
      • variable X is transformed to "Z score"
      • mean= 0 and variance = 1
      • Z ~ N (0,1)
    • X (RV) → Z → probability 
      • z= (X-µ) / σ 
    • probability → Z → X (RV) 
      • X = (z * σ)  + µ
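
The two conversions above (X → Z → probability, and probability → Z → X) can be sketched in Python with the standard library's `statistics.NormalDist`. The population values (µ = 100, σ = 15) and the helper names `x_to_z` and `z_to_x` are made up for illustration; they are not from the notes.

```python
from statistics import NormalDist

def x_to_z(x, mu, sigma):
    """Standardize: z = (X - mu) / sigma."""
    return (x - mu) / sigma

def z_to_x(z, mu, sigma):
    """Undo the standardization: X = z * sigma + mu."""
    return z * sigma + mu

# Hypothetical population: X ~ N(mu = 100, sigma^2 = 15^2)
mu, sigma = 100.0, 15.0
std_normal = NormalDist(mu=0.0, sigma=1.0)  # Z ~ N(0, 1)

# X -> Z -> probability: P(X <= 130) is the area to the left of z
z = x_to_z(130.0, mu, sigma)
p_below = std_normal.cdf(z)

# probability -> Z -> X: the value below which 97.5% of the population falls
z_975 = std_normal.inv_cdf(0.975)
x_975 = z_to_x(z_975, mu, sigma)

print(f"z = {z:.2f}, P(X <= 130) = {p_below:.4f}")
print(f"z at 0.975 = {z_975:.3f}, x = {x_975:.1f}")
```

`cdf` plays the role of the z table here, and `inv_cdf` is the same table read backwards.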

    2. Normal Approximation to the Binomial (NAB)

    • when n is large, we can approximate binomial probabilities with a normal distribution
    • De Moivre's Theorem
      • µ = np, σ^2 = npq
        • (q = 1-p)
      • works best when n is large and p is close to .5; rule of thumb: np ≥ 5 AND nq ≥ 5
      • Y ~ N(µ= np, σ^2= npq)
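
To get a feel for how close the approximation is, the sketch below compares an exact binomial probability with the value from Y ~ N(np, npq). The numbers (n = 100, p = 0.5, k = 55) are hypothetical. The +0.5 continuity correction is not in the notes; it is a common refinement, included here as an assumption and switchable off.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_cdf(k, n, p):
    """Exact P(Y <= k) for Y ~ Binomial(n, p)."""
    q = 1.0 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p, continuity=True):
    """Approximate P(Y <= k) using Y ~ N(mu = np, sigma^2 = npq).

    The +0.5 continuity correction is an assumption added for this sketch.
    """
    q = 1.0 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("rule of thumb np >= 5 and nq >= 5 is not met")
    approx = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    return approx.cdf(k + 0.5 if continuity else k)

# Hypothetical example: 100 fair coin flips, at most 55 heads
n, p, k = 100, 0.5, 55
exact = binom_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
approx_raw = normal_approx_cdf(k, n, p, continuity=False)

print(f"exact binomial:            {exact:.4f}")
print(f"normal, with correction:   {approx:.4f}")
print(f"normal, no correction:     {approx_raw:.4f}")
```

With these inputs the corrected approximation lands much closer to the exact value than the uncorrected one, which is why the correction is usually applied when a discrete count is approximated by a continuous curve.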

    3. Chi-Square Distribution (χ^2)

    • used in tests on categorical data; χ^2 itself is a continuous random variable; not symmetric (skewed to the right)
    • one-parameter distribution: degrees of freedom (nu; ν)
      • ν = n-1 (for the variance test on a sample of size n)
    • mean = E(χ^2) = ν
    • variance = Var(χ^2) = 2ν
    • three applications of χ^2 (Chi-Square)
      • 1) Testing hypotheses about the value of the variance
        • for variance from the sample data, use s^2 (estimate of σ^2)
        • a. set null and alternative hypotheses
        • b. choose α level (two-tailed test: if α=0.05 then α/2=0.025 in each tail)
        • c. set up rejection regions using chi-square table, find critical values of χ^2
          • Critical upper and lower value
        • d. calculate value of the test statistic
          • χ^2 observed value = [(n-1) * s^2] / σ0^2, where σ0^2 is the value of σ^2 stated in the null hypothesis
        • e. compare the test statistic to the critical values
      • 2) Chi-Square Goodness of Fit Test
      • 3) Chi-Square Test of Independence (of 2 categorical RV's)
        • test of "homogeneity" (= equality) of two proportions
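
Steps a to e of the variance test (application 1 above) can be sketched as follows. The sample and the hypothesized variance (0.04) are invented for illustration. The critical values 2.700 and 19.023 are the standard chi-square table entries for 9 degrees of freedom with 0.025 in each tail; the function takes them as arguments rather than computing them, which mirrors the table lookup in step c.

```python
from statistics import variance

def chi_square_variance_test(sample, sigma0_sq, crit_lower, crit_upper):
    """Two-tailed chi-square test of H0: sigma^2 = sigma0_sq.

    crit_lower and crit_upper come from a chi-square table
    for n - 1 degrees of freedom (step c).
    """
    n = len(sample)
    s_sq = variance(sample)                 # sample variance s^2 (divides by n - 1)
    chi2_obs = (n - 1) * s_sq / sigma0_sq   # step d: observed test statistic
    # step e: reject H0 if the statistic falls in either tail
    reject = chi2_obs < crit_lower or chi2_obs > crit_upper
    return chi2_obs, reject

# Hypothetical sample of n = 10 measurements
sample = [9.8, 10.2, 10.4, 9.9, 10.0, 10.1, 9.7, 10.3, 10.0, 9.6]

# Step a: H0: sigma^2 = 0.04 versus H1: sigma^2 != 0.04
# Step b: alpha = 0.05, so 0.025 in each tail
# Step c: table values for 9 degrees of freedom
chi2_obs, reject = chi_square_variance_test(
    sample, sigma0_sq=0.04, crit_lower=2.700, crit_upper=19.023
)

print(f"chi-square observed = {chi2_obs:.2f}")
print("reject H0" if reject else "fail to reject H0")
```

For this made-up sample the observed statistic falls between the two critical values, so the null hypothesis is not rejected.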

    4. Sampling and Estimation

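    • Why is Xbar (the sample mean) a better estimator of µ than the sample median?
      • 1) unbiased: the expected value of the estimator equals the parameter, E(Xbar) = µ
      • 2) consistent: as n goes to infinity, the value of the estimator gets closer to the true parameter value (Law of Large Numbers)
      • 3) efficient: it has the smaller variance of the two (for normally distributed data, Xbar varies less from sample to sample than the sample median)
    • Mean of Xbar: µ
    • Variance of Xbar: σ^2/n
    • Standard Error of the Mean (standard deviation of Xbar): σ/Sqrt(n)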
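
A small simulation can make the properties above concrete. The sketch draws many samples from a hypothetical normal population (µ = 50, σ = 10, n = 25; all values assumed for illustration), then looks at how the sample means and sample medians spread out. The function name `simulate_estimators` is likewise invented.

```python
import random
from statistics import mean, median, stdev

def simulate_estimators(mu, sigma, n, reps, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2).

    Returns the list of sample means and the list of sample medians.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(mean(sample))
        medians.append(median(sample))
    return means, medians

# Hypothetical population and sample size
mu, sigma, n = 50.0, 10.0, 25
means, medians = simulate_estimators(mu, sigma, n, reps=2000)

se_theory = sigma / n ** 0.5   # sigma / sqrt(n)
se_means = stdev(means)        # spread of Xbar across the simulated samples
se_medians = stdev(medians)    # spread of the sample median

print(f"average of the sample means: {mean(means):.2f} (mu = {mu})")
print(f"theoretical standard error:  {se_theory:.2f}")
print(f"simulated sd of the means:   {se_means:.2f}")
print(f"simulated sd of the medians: {se_medians:.2f}")
```

The average of the sample means should sit near µ (unbiased), their spread should sit near σ/Sqrt(n), and the medians should spread out more than the means (efficiency, for normal data).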

    5. Confidence Interval

    • When we have a large sample, the Central Limit Theorem applies: the distribution of Xbar is approximately normal
    • When σ is known, use the z distribution (µ is the unknown parameter being estimated). When n is large, s^2 is close to σ^2 (Law of Large Numbers), so s can be used in place of σ
    • CI: (Xbar - z_(α/2) * σxbar, Xbar + z_(α/2) * σxbar)
      • σxbar = σ/Sqrt(n)
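
The interval formula can be sketched as a short function. The inputs (Xbar = 50, σ = 10, n = 25, 95% confidence) are hypothetical, and `z_confidence_interval` is a name chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Confidence interval for mu when sigma is known (or n is large).

    Returns (lower, upper) = xbar -/+ z_(alpha/2) * sigma / sqrt(n).
    """
    alpha = 1.0 - confidence
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_(alpha/2)
    se = sigma / sqrt(n)                              # standard error of Xbar
    margin = z_crit * se
    return xbar - margin, xbar + margin

# Hypothetical sample summary
lower, upper = z_confidence_interval(xbar=50.0, sigma=10.0, n=25)
print(f"95% CI for mu: ({lower:.2f}, {upper:.2f})")
```

Here the standard error is 10/Sqrt(25) = 2 and z_(0.025) is about 1.96, so the margin is roughly 3.92 on each side of Xbar.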