MplusAutomation:R的Mplus自動化套件

MplusAutomation這個套件讓我們可以在R的環境裡面使用Mplus，當你需要一次建立很多個模型、在模型(參數和適配度)之間中做比較、或是對估計值做進一步的分析的時候，在R的環境裡面會方便很多。以下我一樣借用Prof. Gorden Cheung的例題為大家簡單做個示範，我使用的Mplus語法(ex5a.inp)和模擬資料(Exercise5.dat)是取自 Prof. Cheung在中山大學的工作坊(2017年12月)第二天的第五個例題。

1. 安裝、載入套件

MplusAutomation是標準的CRAN套件，直接使用install.packages()就可以安裝：

# install.packages('MplusAutomation')

除了MplusAutomation之外，我們同時也載入幾個常會用到的套件：

library(MplusAutomation)
library(MASS);library(Matrix);library(reshape2)
library(magrittr);library(ggplot2)

2. (編輯)執行Mplus的指令檔

Mplus的輸入檔(.inp)其實只是一般的文字檔，直接在RStudio裡面就可以進行編輯。這裡我們直接把已經預先備好的輸入檔(ex5a.inp)和資料檔(Exercise5.dar)放在工作目錄裡面，然後執行MplusAutomation::runModels()：

runModels("ex5a.inp")

## 
## Running model: ex5a.inp 
## System command: C:\WINDOWS\system32\cmd.exe /c cd "." && "Mplus" "ex5a.inp"

它會呼叫Mplus，然後把輸出擺在ex5a.out這個檔案裏面。

3. 讀取Mplus的輸出檔

Mplus的輸出檔(.out)也只是一般的文字檔，所以在RStudio裡面也可以直接打開、進行檢視或編輯。重要的是，MplusAutomation讓我們可以一次把很多個.out檔裡面的資料讀到R的環境裡面，讓後續的分析工作變得方便。好比說 …

M = readModels('ex5a.out')

## Reading model:  ex5a.out

我們把ex5a.out讀進M之後，用MplusAutomation::coef() …

coef(M)

##                  Label    est    se    pval
## 25                M<-X  0.132 0.073   0.070
## 26                M<-W -0.491 0.089   0.000
## 27               M<-XW -0.102 0.039   0.008
## 28                Y<-M -0.158 0.044   0.000
## 29            SPROM<-Y  0.386 0.091   0.000
## 30           PERFEV<-Y  0.117 0.026   0.000
## 31              SAL<-Y  0.774 0.243   0.001
## 1        SPROM1<-SPROM  1.000 0.000 999.000
## 2        SPROM2<-SPROM  1.745 0.307   0.000
## 3        SPROM3<-SPROM  2.071 0.335   0.000
## 4                X1<-X  1.000 0.000 999.000
## 5                X2<-X  1.004 0.042   0.000
## 6                X3<-X  0.946 0.037   0.000
## 7                M1<-M  1.000 0.000 999.000
## 8                M2<-M  0.977 0.027   0.000
## 9                M3<-M  0.837 0.044   0.000
## 10               W1<-W  1.000 0.000 999.000
## 11               W2<-W  0.906 0.025   0.000
## 12               W3<-W  0.958 0.024   0.000
## 13              Y1<-YV  1.000 0.000 999.000
## 14              Y2<-YV  0.979 0.034   0.000
## 15              Y3<-YV  0.917 0.056   0.000
## 16              Y4<-YD  1.000 0.000 999.000
## 17              Y5<-YD  1.113 0.065   0.000
## 18              Y6<-YD  1.177 0.083   0.000
## 19              Y7<-YA  1.000 0.000 999.000
## 20              Y8<-YA  0.944 0.152   0.000
## 21              Y9<-YA  0.671 0.125   0.000
## 22               YV<-Y  1.000 0.000 999.000
## 23               YD<-Y  0.968 0.103   0.000
## 24               YA<-Y  0.873 0.120   0.000
## 32               W<->X -0.798 0.310   0.010
## 33         SAL<->SPROM  0.205 0.163   0.208
## 34      PERFEV<->SPROM  0.005 0.015   0.745
## 35        PERFEV<->SAL  0.036 0.084   0.665
## 36      X1<-Intercepts  3.450 0.137   0.000
## 37      X2<-Intercepts  3.466 0.133   0.000
## 38      X3<-Intercepts  3.581 0.136   0.000
## 39      W1<-Intercepts  5.209 0.136   0.000
## 40      W2<-Intercepts  5.199 0.128   0.000
## 41      W3<-Intercepts  5.329 0.128   0.000
## 42      M1<-Intercepts  3.479 0.143   0.000
## 43      M2<-Intercepts  3.558 0.145   0.000
## 44      M3<-Intercepts  3.122 0.123   0.000
## 45      Y1<-Intercepts  5.674 0.078   0.000
## 46      Y2<-Intercepts  5.710 0.078   0.000
## 47      Y3<-Intercepts  5.402 0.083   0.000
## 48      Y4<-Intercepts  5.679 0.079   0.000
## 49      Y5<-Intercepts  5.706 0.079   0.000
## 50      Y6<-Intercepts  5.254 0.092   0.000
## 51      Y7<-Intercepts  5.329 0.094   0.000
## 52      Y8<-Intercepts  5.562 0.083   0.000
## 53      Y9<-Intercepts  4.023 0.106   0.000
## 54     SAL<-Intercepts  1.315 0.246   0.000
## 55  SPROM1<-Intercepts  5.937 0.078   0.000
## 56  SPROM2<-Intercepts  3.269 0.136   0.000
## 57  SPROM3<-Intercepts  5.208 0.107   0.000
## 58  PERFEV<-Intercepts  2.132 0.027   0.000
## 59               X<->X  3.427 0.270   0.000
## 60               W<->W  3.415 0.363   0.000
## 61             X1<->X1  0.467 0.152   0.002
## 62             X2<->X2  0.316 0.155   0.042
## 63             X3<->X3  0.727 0.129   0.000
## 64             W1<->W1  0.389 0.078   0.000
## 65             W2<->W2  0.369 0.099   0.000
## 66             W3<->W3  0.171 0.056   0.002
## 67             M1<->M1  0.246 0.099   0.013
## 68             M2<->M2  0.466 0.116   0.000
## 69             M3<->M3  0.830 0.133   0.000
## 70             Y1<->Y1  0.162 0.029   0.000
## 71             Y2<->Y2  0.153 0.044   0.000
## 72             Y3<->Y3  0.496 0.078   0.000
## 73             Y4<->Y4  0.264 0.049   0.000
## 74             Y5<->Y5  0.186 0.037   0.000
## 75             Y6<->Y6  0.383 0.076   0.000
## 76             Y7<->Y7  0.591 0.116   0.000
## 77             Y8<->Y8  0.457 0.137   0.001
## 78             Y9<->Y9  1.751 0.194   0.000
## 79           SAL<->SAL 11.626 2.489   0.000
## 80     SPROM1<->SPROM1  0.782 0.185   0.000
## 81     SPROM2<->SPROM2  1.880 0.177   0.000
## 82     SPROM3<->SPROM3  0.393 0.170   0.020
## 83     PERFEV<->PERFEV  0.120 0.017   0.000
## 84       SPROM<->SPROM  0.341 0.090   0.000
## 85               M<->M  1.973 0.243   0.000
## 86             YV<->YV  0.356 0.101   0.000
## 87             YD<->YD  0.054 0.057   0.341
## 88             YA<->YA  0.219 0.067   0.001
## 89               Y<->Y  0.915 0.192   0.000

就可以看到模型係數了。

4. 蒙地卡羅模擬

資料進入R的環境之後我們就可以使用R的基本語法和各種套件，來做進一步的分析。好比說，如果我們要做工作坊裡面第二天第五個例題裡面的蒙地卡羅，這個模擬分析的目的是要：使用路徑係數(\(b_1\)、\(b_3\)、\(b_4\))和潛在調節變數的變異量(\(w_1\))，來檢視間接效果(indirect effect)的區間估計(95% confidence interval)會如何隨調節變數的值而改變。這個模擬需到用到\(b_1\)、\(b_3\)、\(b_4\)和\(w_1\)這四個模型參數(model parameters)的估計值和共變矩陣；首先我們把它們的估計值讀進mu …

cx = coef(M)                     # extract the coef we need
mu = cx$est[cx$Label %in% c(' M<-X',' M<-XW',' Y<-M',' W<->W')]
names(mu) = c('b1','b3','b4','w1'); mu

##     b1     b3     b4     w1 
##  0.132 -0.102 -0.158  3.415

然後把共變矩陣讀進Sg …

spec = M$tech1$parameterSpecification
i = c(spec$beta["M","X"],        # extract index of paramenter from TECH1
      spec$beta["M","XW"], 
      spec$beta["Y","M"], 
      spec$psi["W","W"])         # 
Sg = M$tech3$paramCov[i,i]       # extract COV matrix from TECH3
Sg = forceSymmetric(Sg,"L"); Sg

## 4 x 4 Matrix of class "dsyMatrix"
##              60           62           65           72
## 60  0.005319930 -0.000701271 -0.000415602  0.006057040
## 62 -0.000701271  0.001488860  0.000142171 -0.000248988
## 65 -0.000415602  0.000142171  0.001934840 -0.001249790
## 72  0.006057040 -0.000248988 -0.001249790  0.131426000

蒙地卡羅模擬基本上就是從一個多變量常態分佈(multivariate Normal distribution)裡面做隨機抽樣，有了估計值和共變矩陣之後，只要下一行指令：

mx = mvrnorm(100000, mu, Sg) %>% data.frame

我們把抽樣的結果擺在mx裡面之後，計算每一組模擬值的間接效果，和找到它在每一個數值之下的信心區間，分別也都只需要一行指令 …

df = sapply(seq(-2,2,0.25), function(i){    # for every (21) values of W:
  x = with(mx, b1*b4 + b3*b4*i*sqrt(w1))       # calculate indirect effect
  c(i, quantile(x, c(0.025,0.500,0.975)))      # estimate 95% CI
  }) %>% t %>% data.frame %>%               # simple transformation
  'names<-'(c('sd', 'Low', 'Est.', 'High')) # assign colnames

然後就可以畫出跟例題裡面一樣的圖了：

df %>% melt('sd') %>% 
  ggplot(aes(x=sd, y=value, color=variable)) +
  geom_line(size=1.2, alpha=0.5) + geom_point() +
  labs(title = 'Confidence Interval of Moderated Indirect Effect', 
       x = 'Moderator W (in standard errors)',
       y = 'Indirect Effects of X on Y through M',
       color = '95% CI:')

在原本例題的做法裡面，我們先是要到輸出檔去找到這些模型參數的平均值和共變矩陣(總共有有14個數值)，把散落在四處的數值手動抄出來，然後用這一些數值，另外再寫一支Mplus的程式來做蒙地卡羅模擬；之後再把模擬的結果在在讀進EXCEL，在調節變數的五個不同數值之下，分別計算每一組模擬值的間接效果，然後又要手動去找到間接效果在每一個數值之下的信心區間，之後再做圖。這樣做需要在兩個工具之間跳來跳去，手動抄數值很容易出錯，也沒辦法模擬太多個調節變數的數值(例題裡面只做5點，我們的程式是做20點。)各位可以想像一下，如果你同時要試好幾個不同的模型，試好幾組間接效果的時候(做論文的時候通常就是這樣)，使用MplusAutomation可以為我們節省多少的精力和時間。

MplusAutomation:R的Mplus自動化套件

Tony Chuo 2017/12/21

1. 安裝、載入套件

2. (編輯)執行Mplus的指令檔

3. 讀取Mplus的輸出檔

4. 蒙地卡羅模擬