Week 2 RQ4

R version 4.0.2 (2020-06-22) -- "Taking Off Again"
## example dataset
> cnrl = read.table("http://web.stanford.edu/~rag/stat209/cnrldata", header = T)
> cnrl
      Y    X G
01 2.23 0.28 1
02 4.99 0.97 1
03 3.37 1.25 1
04 8.54 2.46 1
05 8.40 2.51 1
06 3.70 1.17 1
07 7.93 1.78 1
08 2.43 1.21 1
09 5.40 1.63 1
10 8.44 1.98 1
11 3.25 2.36 0
12 5.30 2.11 0
13 1.39 0.45 0
14 4.69 1.76 0
15 6.56 2.09 0
16 3.00 1.50 0
17 5.85 1.25 0
18 1.90 0.72 0
19 3.85 0.42 0
20 2.95 1.53 0

> # try interactions package
> library(interactions)
Warning message:
package ‘interactions’ was built under R version 4.0.3 
# start with the fit of two regression lines
> # matches what we usually get; G*X is a more compact way of writing the model (G + X + G:X) for lm
> cnrlmod2 = lm(Y ~ G*X, data = cnrl)
> summary(cnrlmod2)

Call:
lm(formula = Y ~ G * X, data = cnrl)

Residuals:
    Min      1Q  Median      3Q     Max 
-2.0734 -1.0594 -0.2548  1.2830  2.1980 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   2.0103     1.0501   1.914   0.0736 .
G            -1.5132     1.5403  -0.982   0.3405  
X             1.3134     0.6704   1.959   0.0677 .
G:X           1.9975     0.9544   2.093   0.0527 .
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 1.407 on 16 degrees of freedom
Multiple R-squared:  0.684,     Adjusted R-squared:  0.6247 
F-statistic: 11.54 on 3 and 16 DF,  p-value: 0.0002816
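
## Sketch (not part of the original session): in this fit the difference between the
## two group lines at a given X is b_G + b_GX * X, with variance
## Var(b_G) + 2 X Cov(b_G, b_GX) + X^2 Var(b_GX). The helper name cond_effect is
## hypothetical; the data are retyped from the listing above so no URL is needed.

```r
# Data copied from the listing above so the sketch runs without the URL.
cnrl <- data.frame(
  Y = c(2.23, 4.99, 3.37, 8.54, 8.40, 3.70, 7.93, 2.43, 5.40, 8.44,
        3.25, 5.30, 1.39, 4.69, 6.56, 3.00, 5.85, 1.90, 3.85, 2.95),
  X = c(0.28, 0.97, 1.25, 2.46, 2.51, 1.17, 1.78, 1.21, 1.63, 1.98,
        2.36, 2.11, 0.45, 1.76, 2.09, 1.50, 1.25, 0.72, 0.42, 1.53),
  G = rep(c(1, 0), each = 10)
)
fit <- lm(Y ~ G * X, data = cnrl)
b <- coef(fit)   # named vector: (Intercept), G, X, G:X
V <- vcov(fit)   # covariance matrix of the coefficient estimates

# Hypothetical helper: group difference (G = 1 minus G = 0) at X = x,
# with its standard error and pointwise t statistic.
cond_effect <- function(x) {
  est <- unname(b["G"] + b["G:X"] * x)
  se  <- sqrt(V["G", "G"] + 2 * x * V["G", "G:X"] + x^2 * V["G:X", "G:X"])
  data.frame(X = x, effect = est, se = se, t = est / se)
}

print(cond_effect(c(1, 2, 3)))
```

## These t statistics are pointwise: each one is valid for a single pre-chosen X,
## which is why a region built from them is non-simultaneous.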

## The johnson_neyman command takes that basic fit as its model argument and provides ggplot graphics
> johnson_neyman(model = cnrlmod2, pred = G, modx = X)
JOHNSON-NEYMAN INTERVAL 

When X is INSIDE the interval [1.43, 56.53], the slope of G is p < .05.

Note: The range of observed values of X is [0.28, 2.51]
## Sadly, this is not the simultaneous region of significance;
## it likely matches the non-simultaneous region, which Potthoff (1966) showed was undesirable.
## The interactions package does provide a 'false discovery rate' adjustment, which moves the region towards R' but not far enough:
## it still leaves too large a region within the data.
> johnson_neyman(model = cnrlmod2, pred = G, modx = X, control.fdr = TRUE)
JOHNSON-NEYMAN INTERVAL 

When X is INSIDE the interval [1.58, 4.24], the slope of G is p < .05.

Note: The range of observed values of X is [0.28, 2.51]

Interval calculated using false discovery rate adjusted t = 2.56 
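
## Sketch (not part of the original session): one way to get a simultaneous region by
## hand, ASSUMING it is obtained by replacing the pointwise t critical value with the
## Scheffe-type value sqrt(2 * F(0.95; 2, residual df)). The helper name jn_bounds is
## hypothetical; the data are retyped from the listing above.

```r
cnrl <- data.frame(
  Y = c(2.23, 4.99, 3.37, 8.54, 8.40, 3.70, 7.93, 2.43, 5.40, 8.44,
        3.25, 5.30, 1.39, 4.69, 6.56, 3.00, 5.85, 1.90, 3.85, 2.95),
  X = c(0.28, 0.97, 1.25, 2.46, 2.51, 1.17, 1.78, 1.21, 1.63, 1.98,
        2.36, 2.11, 0.45, 1.76, 2.09, 1.50, 1.25, 0.72, 0.42, 1.53),
  G = rep(c(1, 0), each = 10)
)
fit <- lm(Y ~ G * X, data = cnrl)
b <- coef(fit)
V <- vcov(fit)

# Hypothetical helper: X values where |effect / se| equals crit, i.e. the roots of
# (b_G + b_GX x)^2 - crit^2 * Var(b_G + b_GX x) = 0, returned in increasing order.
jn_bounds <- function(crit) {
  A <- b["G:X"]^2 - crit^2 * V["G:X", "G:X"]
  B <- 2 * (b["G"] * b["G:X"] - crit^2 * V["G", "G:X"])
  C <- b["G"]^2 - crit^2 * V["G", "G"]
  disc <- B^2 - 4 * A * C
  if (disc < 0) return(c(NA_real_, NA_real_))   # no real boundary exists
  sort(as.numeric((-B + c(-1, 1) * sqrt(disc)) / (2 * A)))
}

# Assumed simultaneous critical value (Scheffe-type, 2 numerator df).
crit_sim <- sqrt(2 * qf(0.95, 2, df.residual(fit)))
sim_bounds <- jn_bounds(crit_sim)
print(crit_sim)
print(sim_bounds)
```

## Under that assumption the critical value is about 2.70, larger than the FDR-adjusted
## t = 2.56 reported above, so the resulting interval is narrower than the FDR interval.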

> 

## probemod is less successful
> install.packages("probemod")


> #setup for probemod functions
> #jn function on p.2 of manual
> #run a model, following the manual's convoluted syntax

> cnrlmod = lm('Y ~ G + X', data = cnrl)
> jncnrl = jn(cnrlmod, 'Y', 'G' , 'X')
Error in cov[interactionterm, interactionterm] : subscript out of bounds

## ignore the manual's example and include the interaction term in the model
> cnrlmod = lm('Y ~ G*X', data = cnrl)
> jncnrl = jn(cnrlmod, 'Y', 'G' , 'X')
## now with my fix you get a result, but sadly it looks like the non-simultaneous region; you do get a plot, but that also appears to be non-simultaneous

> print(jncnrl)
Call:
jn(model = cnrlmod, dv = "Y", iv = "G", mod = "X")

Conditional effects of  G  on  Y  at values of  X 
 X Effect     se      t      p    llci   ulci
 1 0.4844 0.7755 0.6246 0.5416 -1.1685 2.1372
 2 2.4819 0.8074 3.0738 0.0077  0.7609 4.2030
 3 4.4795 1.5889 2.8193 0.0129  1.0929 7.8661
> plot(jncnrl)
Values of X indicated by the shaded region
                     x        y         se       t    p          llci       ulci
Lower Bound:  1.432086  1.34749  0.6321944 2.13145 0.05 -2.442491e-15   2.694981
Upper Bound: 39.950965 78.29072 36.7312081 2.13145 0.05  0.000000e+00 156.581434
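
## Sketch (not part of the original session): the two packages agree on the lower bound
## (about 1.43) but not the upper one (56.53 vs 39.95). probemod prints t = 2.13145,
## which is qt(0.975, 15), while the model has 16 residual df. ASSUMING both packages
## solve the same pointwise quadratic, the code below checks whether the choice of
## df alone accounts for the gap. The helper name jn_bounds is hypothetical.

```r
cnrl <- data.frame(
  Y = c(2.23, 4.99, 3.37, 8.54, 8.40, 3.70, 7.93, 2.43, 5.40, 8.44,
        3.25, 5.30, 1.39, 4.69, 6.56, 3.00, 5.85, 1.90, 3.85, 2.95),
  X = c(0.28, 0.97, 1.25, 2.46, 2.51, 1.17, 1.78, 1.21, 1.63, 1.98,
        2.36, 2.11, 0.45, 1.76, 2.09, 1.50, 1.25, 0.72, 0.42, 1.53),
  G = rep(c(1, 0), each = 10)
)
fit <- lm(Y ~ G * X, data = cnrl)
b <- coef(fit)
V <- vcov(fit)

# Hypothetical helper: roots of (b_G + b_GX x)^2 = crit^2 * Var(b_G + b_GX x).
jn_bounds <- function(crit) {
  A <- b["G:X"]^2 - crit^2 * V["G:X", "G:X"]
  B <- 2 * (b["G"] * b["G:X"] - crit^2 * V["G", "G:X"])
  C <- b["G"]^2 - crit^2 * V["G", "G"]
  disc <- B^2 - 4 * A * C
  if (disc < 0) return(c(NA_real_, NA_real_))   # no real boundary exists
  sort(as.numeric((-B + c(-1, 1) * sqrt(disc)) / (2 * A)))
}

bounds16 <- jn_bounds(qt(0.975, df.residual(fit)))  # residual df of the fit (16)
bounds15 <- jn_bounds(qt(0.975, 15))                # df implied by t = 2.13145
print(rbind(df16 = bounds16, df15 = bounds15))
```

## The upper bound is very sensitive to the critical value because the interaction
## t statistic (2.093) sits just below it; both intervals are still pointwise.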