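1. Decision Making
1.1 The if statement
In many cases, you need to make a decision based on a condition.
You can use an if statement for this.
For example:
x <- 24
if (x > 10) {
  print("x is greater than 10")
}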
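Hello, I'm 悦创.
R is one of the most widely used statistical programming languages.
It is also a first choice for data scientists and analysts.
In this course, we will learn the basics of R: how to write programs that store and manipulate data, how to perform data analysis tasks on a variety of datasets, and how to visualize results with graphs and charts.
The skills learned in this course apply to any data-related field, including finance, data science, and machine learning.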
Assignment 1 is designed to test your understanding of the skills practiced in weeks 1 through 4:
What kind of data structure is user_data in the following declaration?
user_data = ("TJ", 24, "artLover123")
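One way to check a question like this is to ask the interpreter directly. The sketch below assumes the declaration is Python code; the variable names kind and first_item are illustrative only.

```python
# The declaration from the question above.
user_data = ("TJ", 24, "artLover123")

# type() reports the data structure that the parentheses and commas created.
kind = type(user_data).__name__
print(kind)

# The elements can be read by index, starting at 0.
first_item = user_data[0]
print(first_item)
```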
Netiquette is essentially rules and norms for interacting with others on the Internet.
We’re here to help with nine simple guidelines for how to be on your best behavior in an online classroom.
In order to participate in this course within Canvas, you should make sure that your technologies meet the minimum computer specifications.
Due Tuesday, October 10 at 11:55pm
This assignment gives you practice working with pointers, trees, and dynamic allocation. Your task is to implement the “Animal Guessing Game”. This game is a version of “20 Questions” where the only category is animals. You think of an animal, and your program tries to guess it by asking you a series of “yes or no” questions.
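The game maps naturally onto a binary tree: internal nodes hold questions and leaves hold animal guesses. The following is a minimal Python sketch of that idea, not the assignment's required implementation (which, given the mention of pointers and dynamic allocation, is presumably in another language). The Node class, the guess function, the sample tree, and the scripted answers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    """A question (internal node) or an animal guess (leaf)."""
    text: str
    yes: Optional["Node"] = None
    no: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.yes is None and self.no is None


def guess(root: Node, answer: Callable[[str], bool]) -> str:
    """Walk from the root to a leaf, following the yes/no answers."""
    node = root
    while not node.is_leaf():
        node = node.yes if answer(node.text) else node.no
    return node.text


# Hypothetical starter tree; a full game would grow it as it learns new animals.
tree = Node(
    "Does it live in water?",
    yes=Node("shark"),
    no=Node("Does it fly?", yes=Node("eagle"), no=Node("dog")),
)

# Scripted answers stand in for a player who is thinking of an eagle.
scripted = {"Does it live in water?": False, "Does it fly?": True}
result = guess(tree, lambda question: scripted[question])
print(result)
```

Passing the answers in as a function keeps the tree walk separate from keyboard input, which makes the logic easy to test before adding the interactive part.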
(0 points) Please type your code and answers into a Jupyter notebook. All visualizations should be properly labelled. Submit the notebook as a PDF.
(3.5 points) Use the bwght dataset from the Wooldridge Python module to answer the following questions. You can find the documentation for the data online here. Import this data into your notebook.
(a) (1.5 points) How many women are in the sample? What proportion of women with a family income higher than $50,000 are smokers? What proportion of women with a family income less than $20,000 are smokers?
(b) (1 point) Generate a table of summary statistics for the dataframe. What is the average number of cigarettes smoked in a day? Is the mean a good measure of the typical woman’s smoking habits? If not, explain why and whether there is a better measure.
(c) (1 point) Find the mode of fatheduc in the sample. Why are only 1,192 observations used to compute this statistic?
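The calculations in parts (a) through (c) can be sketched on toy data with the standard library alone. Everything below is hypothetical: the rows are invented, the column names faminc and cigs are assumed (only fatheduc is named above), income is assumed to be recorded in thousands of dollars, and a smoker is assumed to be anyone with cigs > 0. The sketch also assumes that missing fatheduc values are what shrink the number of usable observations.

```python
from statistics import mean, median, mode

# Hypothetical toy rows; the real values come from the bwght data.
rows = [
    {"faminc": 65.0, "cigs": 0, "fatheduc": 16},
    {"faminc": 55.0, "cigs": 10, "fatheduc": 12},
    {"faminc": 13.5, "cigs": 20, "fatheduc": None},  # missing value
    {"faminc": 18.5, "cigs": 0, "fatheduc": 12},
    {"faminc": 27.5, "cigs": 0, "fatheduc": 14},
]


def smoker_share(subset):
    """Proportion of rows in the subset with cigs > 0."""
    return sum(r["cigs"] > 0 for r in subset) / len(subset)


# (a) Condition on income first, then compute the share of smokers.
share_high = smoker_share([r for r in rows if r["faminc"] > 50])
share_low = smoker_share([r for r in rows if r["faminc"] < 20])

# (b) With many zeros and a few heavy smokers, mean and median diverge.
cigs = [r["cigs"] for r in rows]
mean_cigs = mean(cigs)
median_cigs = median(cigs)

# (c) Rows with a missing fatheduc cannot contribute to the mode.
observed = [r["fatheduc"] for r in rows if r["fatheduc"] is not None]
mode_fatheduc = mode(observed)

print(share_high, share_low, mean_cigs, median_cigs, len(observed), mode_fatheduc)
```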
(5.5 points) Use the bwght dataset from the Wooldridge Python module to answer the following questions.
(a) (1 point) Generate two different histograms of bwght using the Sturges and Freedman-Diaconis (FD) binning methods. Explain the strengths and weaknesses of each method when applied to bwght.
(b) (1 point) Create a histogram of bwght using either sturges or fd to choose the number of bins. Overlay a density curve.
(c) (2 points) Using a q-q plot, do you believe bwght is approximately normally distributed? Why or why not? What about family income?
(d) (1.5 points) Create a boxplot conditioning on whether or not the mother was a smoker. Do you observe any differences? If so, what are they?
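To see why the two binning rules can disagree, it helps to compute them by hand. The sketch below assumes the common textbook forms: Sturges uses ceil(log2(n)) + 1 bins, and Freedman-Diaconis uses a bin width of 2 * IQR / n^(1/3). The function names and the sample are hypothetical, and plotting libraries may compute quartiles slightly differently.

```python
import math
from statistics import quantiles


def sturges_bins(n: int) -> int:
    """Sturges' rule: the bin count depends only on the sample size."""
    return math.ceil(math.log2(n)) + 1


def fd_bin_width(data) -> float:
    """Freedman-Diaconis rule: the width depends on spread (IQR) and size."""
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    return 2 * (q3 - q1) / len(data) ** (1 / 3)


def fd_bins(data) -> int:
    """Number of FD bins needed to cover the range of the data."""
    return math.ceil((max(data) - min(data)) / fd_bin_width(data))


# Hypothetical sample: 64 evenly spaced values from 1 to 64.
sample = [float(v) for v in range(1, 65)]
k_sturges = sturges_bins(len(sample))
k_fd = fd_bins(sample)
print(k_sturges, k_fd)
```

Because Sturges ignores the spread of the data and FD relies on the IQR, the two rules tend to differ most on skewed or heavy-tailed variables.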
(6 points) Use the bwght dataset from the Wooldridge Python module to answer the following questions.
(a) (2 points) Estimate the parameters for the following simple regression:
$\widehat{\text{bwght}} = \hat{\beta}_0 + \hat{\beta}_1 \times \text{packs}$
Report the intercept and slope. What do these tell you about the association between cigarette use and birth weight?
(b) (2 points) What is the predicted value of birth weight when packs = 0? When packs = 2? What is the interpretation of the intercept?
(c) (1 point) Verify that the residuals of this regression sum (approximately) to zero.
(d) (1 point) Using a scatter plot, show the observed values against the values predicted by the regression.
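The mechanics behind parts (a) through (c) can be sketched without any libraries by using the closed-form OLS estimates for a simple regression. The numbers below are invented toy data, not values from bwght, and the helper name predict is hypothetical.

```python
# Hypothetical toy data; replace with the packs and bwght columns.
packs = [0.0, 0.0, 0.5, 1.0, 1.0, 2.0]
bwght = [120.0, 118.0, 115.0, 110.0, 112.0, 100.0]

n = len(packs)
mean_x = sum(packs) / n
mean_y = sum(bwght) / n

# OLS slope: sum of cross-deviations over sum of squared x-deviations.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(packs, bwght))
sxx = sum((x - mean_x) ** 2 for x in packs)
beta1 = sxy / sxx

# OLS intercept: the fitted line passes through the point of means.
beta0 = mean_y - beta1 * mean_x


def predict(p: float) -> float:
    """Predicted birth weight for a given number of packs."""
    return beta0 + beta1 * p


# Residuals are observed minus fitted values; with an intercept they sum to ~0.
residuals = [y - predict(x) for x, y in zip(packs, bwght)]
residual_sum = sum(residuals)

print(beta0, beta1, predict(0.0), predict(2.0), residual_sum)
```

Note that predict(0.0) equals the intercept, which is why the intercept is read as the predicted value when packs = 0.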
Create a list of your favorite 4 movies. Print out the 0th element in the list. Now set the 0th element to be “Star Wars” and try printing it out again.
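# Create a list of four favorite movies (placeholder titles)
favorite_movies = ["Movie1", "Movie2", "Movie3", "Movie4"]
# Print the 0th element of the list
print(favorite_movies[0])
# Set the 0th element to "Star Wars"
favorite_movies[0] = "Star Wars"
# Print the 0th element again
print(favorite_movies[0])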
The steps for launching Jupyter Notebook from the command line are as follows:
jupyter notebook
In the following code:「D」
size = 20
x = 100
y = 200
ball = Circle(size)
circle.set_position(x, y)
Write a function that takes one parameter - a float which represents a temperature in Celsius - and returns a float which represents that temperature in Fahrenheit.
Then, write a function that does the opposite conversion.
Here are the formulas for temperature conversion:
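Assuming the standard conversion formulas, F = C × 9/5 + 32 and C = (F − 32) × 5/9, a minimal Python sketch of both functions might look like this. The function names are illustrative, not prescribed by the exercise.

```python
def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert a temperature from Celsius to Fahrenheit."""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    """Convert a temperature from Fahrenheit to Celsius."""
    return (fahrenheit - 32) * 5 / 9


# Water boils at 100 °C and freezes at 32 °F.
boiling_f = celsius_to_fahrenheit(100.0)
freezing_c = fahrenheit_to_celsius(32.0)
print(boiling_f, freezing_c)
```

A quick sanity check is that the two scales meet at -40, so converting -40 in either direction should return -40.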
Which of the following are spreadsheet programs?「A, B, C, D」
A. Apple Numbers
B. Google Sheets
C. LibreOffice Calc
D. Microsoft Excel
E. Visual Studio Code
Which spreadsheet program would be best for a very large data set with a million records?「B」
A. Google Sheets
B. Microsoft Excel
Do spreadsheet applications generally allow users to write custom functions to analyze or modify the data in the spreadsheet?「A」
A. Yes
B. No
Data in a spreadsheet generally has a fixed schema.「B」
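In this assignment, you will work with a real data source: NASA's historical measurements of the Earth's surface temperature. Some preparatory work is needed before the data can be analyzed. In particular, you will:
Teams can choose among several different workflows. Real-time collaboration is strongly recommended, as it is usually more effective than asynchronous collaboration. However, when real-time collaboration is not possible, there are specific workflows that can help asynchronous teams work.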
Database Design
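In this assignment, you will:
Further requirements for how you should carry out this assignment are detailed below.
This assignment is given to you in the form of a repository on GitHub.com, a website used for sharing code. A repository is essentially another name for a project.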