(4A) Unit summary: practice using the data from the Week 3 individual assignment


Load packages

pacman::p_load(dplyr,ggplot2,plotly,gridExtra)

Load data: U.S. county-level demographic data


【A】 Describing the data (descriptive statistics)

Summary statistics

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   0.00    0.60    2.10    8.88   10.18   85.90 
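
The six values above have the shape of what R's `summary()` prints for a numeric vector. A minimal sketch on a made-up vector (the values in `x` are illustrative and are not taken from the county data):

```r
# Hypothetical toy vector; not taken from the county dataset
x <- c(0, 0.6, 2.1, 10, 30)

# summary() reports the minimum, quartiles, median, mean, and maximum
s <- summary(x)
print(s)

# The quartiles and median can also be computed directly
q <- quantile(x, probs = c(0.25, 0.5, 0.75))
print(q)
```

A mean well above the median, as in the output above (8.88 vs. 2.10), usually indicates a right-skewed variable.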

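💡 Key concept: distributions
  ■ A distribution is one way of describing a variable
  ■ Distribution: the frequency with which each value of the variable occurs
  ■ It can be presented as counts or as proportions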
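
The two presentations of a distribution (counts and proportions) can be sketched in base R as follows. The vector `region` here is a hypothetical toy example, not the course data:

```r
# Hypothetical categorical variable
region <- c("South", "West", "South", "Northeast", "South", "West")

# Frequency expressed as counts
counts <- table(region)

# Frequency expressed as proportions (these sum to 1)
props <- prop.table(counts)

print(counts)
print(round(props, 2))
```

Counts are easier to read when group sizes matter; proportions are easier to compare across data sets of different sizes.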


Distribution of a numeric variable

Distribution of a categorical variable


       Alabama         Alaska        Arizona       Arkansas     California 
            67             28             15             75             58 
      Colorado    Connecticut       Delaware        Florida        Georgia 
            64              8              3             67            159 
        Hawaii          Idaho       Illinois        Indiana           Iowa 
             5             44            102             92             99 
        Kansas       Kentucky      Louisiana          Maine       Maryland 
           105            120             64             16             24 
 Massachusetts       Michigan      Minnesota    Mississippi       Missouri 
            14             83             87             82            115 
       Montana       Nebraska         Nevada  New Hampshire     New Jersey 
            56             93             17             10             21 
    New Mexico       New York North Carolina   North Dakota           Ohio 
            33             62            100             53             88 
      Oklahoma         Oregon   Pennsylvania   Rhode Island South Carolina 
            77             36             67              5             46 
  South Dakota      Tennessee          Texas           Utah        Vermont 
            65             95            253             29             14 
      Virginia     Washington  West Virginia      Wisconsin        Wyoming 
           133             39             55             72             23 


【B】 Simple data exploration (group comparison)


Grouped statistics:

North Central     Northeast         South          West 
       710.08        746.14        611.04       3879.13 
                Metro Nonmetro
North Central  604.64   752.43
Northeast      571.08  1007.72
South          561.92   646.06
West          2741.58  4408.75
`summarise()` has grouped output by 'region'. You can override using the `.groups` argument.
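
The outputs above have the shape of a statistic computed by one grouping variable (a named vector) and by two grouping variables (a matrix). A minimal sketch of both forms, assuming a toy data frame `toy` whose columns `metro` and `value` are hypothetical names chosen for illustration:

```r
library(dplyr)

# Hypothetical toy data; column names other than `region` are illustrative
toy <- data.frame(
  region = c("South", "South", "West", "West", "West", "South"),
  metro  = c("Metro", "Nonmetro", "Metro", "Nonmetro", "Metro", "Metro"),
  value  = c(10, 20, 30, 40, 50, 60)
)

# Mean of value by one grouping variable (named vector)
by_region <- tapply(toy$value, toy$region, mean)

# Mean of value by two grouping variables (matrix: region x metro)
by_region_metro <- tapply(toy$value, list(toy$region, toy$metro), mean)

# dplyr equivalent; .groups = "drop" returns an ungrouped result
# and suppresses the "has grouped output" message
by_dplyr <- toy %>%
  group_by(region, metro) %>%
  summarise(mean_value = mean(value), .groups = "drop")

print(by_region)
print(by_region_metro)
print(by_dplyr)
```

Setting `.groups` explicitly is what the `summarise()` message suggests; `"drop"` is only one of the available choices.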


Relationships by group:

Warning: Use of `d$black` is discouraged. Use `black` instead.
Warning: Use of `d$income_per_cap` is discouraged. Use `income_per_cap` instead.
`geom_smooth()` using formula 'y ~ x'
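
The warnings above are raised when a column is written as `d$black` inside `aes()`. A minimal sketch of the form ggplot2 recommends, using bare column names. The toy data, the `colour = region` mapping, and `method = "lm"` are assumptions for illustration; the original plot code is not shown in this document:

```r
library(ggplot2)

# Hypothetical toy data reusing the column names from the warnings
d <- data.frame(
  black          = c(1, 5, 10, 20, 40, 60),
  income_per_cap = c(30, 28, 27, 24, 22, 19),
  region         = c("South", "West", "South", "West", "South", "West")
)

# Refer to columns by bare name inside aes(); ggplot2 looks them up in `d`
p <- ggplot(d, aes(x = black, y = income_per_cap, colour = region)) +
  geom_point() +
  # method = "lm" is an assumption; the explicit formula avoids the
  # "using formula" message
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

print(p)
```

Bare names also keep the mapping correct when the plot is faceted or the data is filtered, which is the reason ggplot2 discourages the `d$` form.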