Hypothesis Testing (Paired and Unpaired T-tests)
Barbara Yam
4 May 2020
# Q1
test1 <- read.csv("bp1.csv")
head(test1)
## Sample CoA CoB
## 1 1 -11.0 -32.0
## 2 2 -4.8 7.5
## 3 3 -2.5 1.8
## 4 4 -25.2 -14.0
## 5 5 -4.6 11.2
## 6 6 -8.5 -30.2
summary(test1)
## Sample CoA CoB
## Min. : 1.00 Min. :-25.200 Min. :-32.000
## 1st Qu.: 6.75 1st Qu.:-10.850 1st Qu.:-14.825
## Median :12.50 Median : -8.650 Median : -5.100
## Mean :12.50 Mean : -9.229 Mean : -6.283
## 3rd Qu.:18.25 3rd Qu.: -5.575 3rd Qu.: 3.125
## Max. :24.00 Max. : -0.300 Max. : 18.900
Comparing the means, CoA (-9.229) is lower than CoB (-6.283). Because the direction of any difference is not assumed in advance, a two-tailed test is used for the hypothesis testing.
Q1i. H0 : Drug A and Drug B have similar effects in reducing blood pressure. Mean of Drug A results - Mean of Drug B results = 0
H1: One drug is better than the other drug in reducing blood pressure. Mean of Drug A results - Mean of Drug B results is not 0.
Q1ii.
diff1 <- test1$CoA - test1$CoB
shapiro.test(diff1)
##
## Shapiro-Wilk normality test
##
## data: diff1
## W = 0.93116, p-value = 0.1035
var(test1$CoA)
## [1] 29.73433
var(test1$CoB)
## [1] 183.8545
var.test(test1$CoA,test1$CoB)
##
## F test to compare two variances
##
## data: test1$CoA and test1$CoB
## F = 0.16173, num df = 23, denom df = 23, p-value = 4.683e-05
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.06996222 0.37385588
## sample estimates:
## ratio of variances
## 0.1617275
The unpaired t-test is used because the following criteria were checked:
i. Drug A and Drug B are administered to two independent sample groups.
ii. The Shapiro-Wilk normality test on diff1 gives p = 0.1035 > 0.05, so the data do not differ significantly from a normal distribution. In other words, normality can be assumed.
iii. The variances of CoA and CoB are 29.73433 and 183.8545, which suggests that the variances are not equal. The F test confirms this (p = 4.683e-05 < 0.05). The Welch two-sample t-test, which does not assume equal variances, is therefore used.
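For two independent groups, normality is more commonly checked within each group than on element-wise differences. The following is a minimal sketch of such a check, not part of the original analysis. The helper name `shapiro_by_group` is hypothetical, and the data are simulated stand-ins rather than the values in bp1.csv.

```r
# Hypothetical helper: run Shapiro-Wilk separately on each named column
# and return the p-values as a named numeric vector.
shapiro_by_group <- function(df, cols) {
  vapply(cols, function(col) shapiro.test(df[[col]])$p.value, numeric(1))
}

# Simulated stand-in data (NOT the values in bp1.csv), 24 rows per group.
set.seed(1)
demo <- data.frame(CoA = rnorm(24, mean = -9, sd = 5),
                   CoB = rnorm(24, mean = -6, sd = 13))

# One p-value per group; each is compared with 0.05 separately.
p_values <- shapiro_by_group(demo, c("CoA", "CoB"))
print(p_values)
```

With the real data, the same call would be `shapiro_by_group(test1, c("CoA", "CoB"))`.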
Q1iii.
t.test(test1$CoA,test1$CoB, alternative="two.sided", var.equal = FALSE)
##
## Welch Two Sample t-test
##
## data: test1$CoA and test1$CoB
## t = -0.98747, df = 30.25, p-value = 0.3312
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -9.036249 3.144582
## sample estimates:
## mean of x mean of y
## -9.229167 -6.283333
Since p = 0.3312 > 0.05, H0 cannot be rejected at the 5% significance level (95% confidence level). The 95% confidence interval for the difference in means (-9.036, 3.145) also contains 0. There is no conclusive evidence that one drug is better than the other.
Neither company can claim that its drug is better than the other.
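As a cross-check, the Welch statistic can be recomputed by hand from the summary statistics printed in the output above. This is a sketch under one assumption: the group sizes of 24 are inferred from the F test's degrees of freedom (23 and 23), not read from the file.

```r
# Summary statistics copied from the t.test() and var() output above.
mean_a <- -9.229167
mean_b <- -6.283333
var_a  <- 29.73433
var_b  <- 183.8545
n_a <- 24  # assumed: F test reports num df = 23
n_b <- 24  # assumed: F test reports denom df = 23

# Squared standard error contributed by each group.
se2_a <- var_a / n_a
se2_b <- var_b / n_b

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_stat   <- (mean_a - mean_b) / sqrt(se2_a + se2_b)
df_welch <- (se2_a + se2_b)^2 /
  (se2_a^2 / (n_a - 1) + se2_b^2 / (n_b - 1))

# Two-sided p-value, to be compared directly with 0.05.
p_value <- 2 * pt(-abs(t_stat), df = df_welch)

print(c(t = t_stat, df = df_welch, p = p_value))
```

Up to rounding of the inputs, the three values should agree with the reported t = -0.98747, df = 30.25 and p-value = 0.3312.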
Q2.
test2 <- read.csv("bp2.csv")
head(test2)
## CoA CoB
## 1 168.2 139.2
## 2 158.5 150.3
## 3 131.2 116.8
## 4 140.2 98.2
## 5 128.5 125.9
## 6 125.6 146.8
H0: Drug A and Drug B have similar effects in reducing blood pressure. Mean of Drug A results - Mean of Drug B results = 0
H1: One drug is better than the other drug in reducing blood pressure. Mean of Drug A results - Mean of Drug B results is not 0.
summary(test2)
## CoA CoB
## Min. :104.0 Min. : 94.8
## 1st Qu.:135.0 1st Qu.:126.4
## Median :145.2 Median :140.2
## Mean :144.4 Mean :139.1
## 3rd Qu.:155.3 3rd Qu.:155.6
## Max. :168.2 Max. :193.8
diff2 <- test2$CoA - test2$CoB
shapiro.test(diff2)
##
## Shapiro-Wilk normality test
##
## data: diff2
## W = 0.96491, p-value = 0.5445
OutVals <- boxplot(diff2)$out
OutVals
## numeric(0)
var(test2$CoA)
## [1] 209.9891
var(test2$CoB)
## [1] 540.713
var.test(test2$CoA,test2$CoB)
##
## F test to compare two variances
##
## data: test2$CoA and test2$CoB
## F = 0.38836, num df = 23, denom df = 23, p-value = 0.02752
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
## 0.1680002 0.8977394
## sample estimates:
## ratio of variances
## 0.388356
The paired t-test is used because the following criteria were checked:
i. Drug A and Drug B are administered to the same individuals over different periods, so the same target group is used.
ii. The Shapiro-Wilk normality test on the differences gives p = 0.5445 > 0.05, so the differences do not differ significantly from a normal distribution. In other words, normality can be assumed.
iii. OutVals is empty and the boxplot shows no outliers in the differences between the two related groups.
iv. The variances of CoA and CoB are 209.9891 and 540.713, which suggests that the variances are not equal. The F test confirms this (p = 0.02752 < 0.05). The paired t-test works on the differences and does not require equal variances, so this does not change the choice of test.
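The outlier check above relies on the 1.5 x IQR whisker rule that boxplot() applies. As a sketch, the same rule can be applied without drawing a plot by calling boxplot.stats() from grDevices. Both vectors below are hypothetical illustrations, not the values in bp2.csv.

```r
# Hypothetical differences (NOT the values from bp2.csv).
demo_diff <- c(-12.5, -3.0, 1.2, 4.8, 7.5, 9.1, 15.0)

# No point lies beyond the whiskers, so $out is empty.
out_none <- boxplot.stats(demo_diff)$out
print(out_none)

# Appending one extreme value makes it show up in $out.
demo_with_outlier <- c(demo_diff, 80)
out_one <- boxplot.stats(demo_with_outlier)$out
print(out_one)
```

With the real data, `boxplot.stats(diff2)$out` would give the same values as OutVals.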
Q2iii.
t.test(test2$CoA,test2$CoB, alternative="two.sided",paired = TRUE)
##
## Paired t-test
##
## data: test2$CoA and test2$CoB
## t = 1.1762, df = 23, p-value = 0.2516
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
## -4.012238 14.587238
## sample estimates:
## mean of the differences
## 5.2875
Since the p-value 0.2516 > 0.05, the null hypothesis cannot be rejected at the 5% significance level (95% confidence level).
Therefore, there is no conclusive evidence that either drug is better than the other.
Either more samples need to be gathered, or there is simply no difference between the drugs that can be detected at the 95% confidence level.
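The paired t-test is a one-sample t-test on the differences, so its p-value and confidence interval can be recovered from the reported mean difference, t statistic and degrees of freedom. The sketch below uses only the numbers printed in the paired t-test output. Because the t statistic is rounded, small discrepancies in the last digits are expected.

```r
# Values copied from the paired t-test output.
mean_diff <- 5.2875
t_stat    <- 1.1762
df        <- 23

# Standard error of the mean difference, implied by t = mean_diff / se.
se_diff <- mean_diff / t_stat

# Two-sided p-value, to be compared directly with 0.05.
p_value <- 2 * pt(-abs(t_stat), df = df)

# 95% confidence interval for the mean difference.
ci <- mean_diff + c(-1, 1) * qt(0.975, df = df) * se_diff

print(p_value)
print(ci)
```

The interval contains 0, which is consistent with failing to reject the null hypothesis at the 5% significance level.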