# Introducing the R Programming Language to a Non-Programmer in One Hour


(1) Download R and RStudio

(2) The console and scripts

```
> x = 7
> x + 9
[1] 16
```

(3) Comments

`# Comments are especially important, so we learned about them`

(4) Graphics

```
x = rnorm(1000, mean = 100, sd = 3)
hist(x)
```

(5) Getting help

```
# If you know the function name but not how to use it
?chisq.test
# If you know what you want to do but not the function name
??chisquare
```

(6) Data types

```
# Character vector
> y = c("apple", "apple", "banana", "kiwi", "bear", "strawberry", "strawberry")
> length(y)
[1] 7
# Numeric vector
> numbers = rep(3, 99)
> numbers
 [1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[39] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
[77] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
```

```
> mymatrix = matrix(c(10, 15, 3, 29), nrow = 2, byrow = TRUE)
> mymatrix
     [,1] [,2]
[1,]   10   15
[2,]    3   29
> t(mymatrix)
     [,1] [,2]
[1,]   10    3
[2,]   15   29
> solve(mymatrix)
           [,1]        [,2]
[1,]  0.1183673 -0.06122449
[2,] -0.0122449  0.04081633
> mymatrix %*% solve(mymatrix)
     [,1] [,2]
[1,]    1    0
[2,]    0    1
> chisq.test(mymatrix)

	Pearson's Chi-squared test with Yates' continuity correction

data:  mymatrix
X-squared = 5.8385, df = 1, p-value = 0.01568
```

```
# Set the working directory
setwd("~/Documents/R_intro")

# Read in a dataset
```

(7) Exploratory data analysis

```
> names(wages)
[1] "edlevel" "south"   "sex"     "workyr"  "union"   "wage"    "age"
[8] "race"    "marital"
> class(wages$marital)
[1] "integer"
> table(wages$union)

not union member     union member
             438               96
> summary(wages$workyr)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   0.00    8.00   15.00   17.82   26.00   55.00
> nrow(wages)
[1] 534
> length(which(is.na(wages$sex)))
[1] 0
> linmod = lm(workyr ~ age, data = wages)
> summary(linmod)
```

```
hist(wages$wage, xlab = "hourly wage", main = "wages in our dataset", col = "purple")
plot(wages$age, wages$workyr, xlab = "age", ylab = "years worked", main = "age vs. years worked")
abline(lm(wages$workyr ~ wages$age), col = "red", lwd = 2)
```

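• Subsetting with []. This is a key topic. It applies to every data type I introduced, and it is extremely useful. I really wish there had been time to have my sister make, say, a histogram of wages for women only.
• Programming-related topics: loops, if statements, user-defined functions, and so on. I think it is fine to leave these out, though. Given the audience, I was teaching R as a data analysis environment rather than as a programming language.
• Saving .rda files and/or the workspace
• Installing and loading packages
• Other data classes (such as lists)
• Other (better?) help resources, tips, and tricks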
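
As a rough illustration of the first point, here is a minimal sketch of subsetting with [] on a vector, a matrix, and a data frame. The `wages` dataset itself is not reproduced in this post, so the `toy` data frame and its values are made up for the example; only the column names `sex` and `wage` are borrowed from the dataset above, and the "female" label is an assumption about how that column is coded.

```r
# Vector: pick elements by position or by a logical condition
fruit <- c("apple", "apple", "banana", "kiwi")
fruit[2]                   # the second element
fruit[fruit == "apple"]    # every element equal to "apple"

# Matrix: [row, column]; leaving one side empty means "all of them"
m <- matrix(c(10, 15, 3, 29), nrow = 2, byrow = TRUE)
m[1, 2]                    # row 1, column 2
m[, 1]                     # the whole first column

# Data frame: hypothetical stand-in for the wages data (values are invented)
toy <- data.frame(
  sex  = c("female", "male", "female", "male", "female"),
  wage = c(12.5, 9.0, 15.25, 7.75, 10.0)
)

# Keep only the rows where sex is "female", and all columns
women <- toy[toy$sex == "female", ]

# Histogram of wages for women only
hist(women$wage, xlab = "hourly wage", main = "wages, women only (toy data)")
```

With the real data, the same pattern would presumably be `wages[wages$sex == ..., ]`, with the condition adjusted to however `sex` is actually coded in that file.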