pandas exploration
In this assignment you will select a data set and do some munging and analysis of it using pandas, Jupyter Notebooks, and associated Python-centric data science tools.
Import the core data science libraries:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Import the raw data
In this section, you will import the raw data into a pandas DataFrame.
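As a minimal sketch of this step, the cell below loads CSV text into a DataFrame with `pd.read_csv`. The column names and values are hypothetical, and the in-memory `io.StringIO` buffer only stands in for a real file so the example runs on its own; with your own data set you would pass its path or URL instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for a raw data file. In your notebook, pass the
# path to your own file instead, e.g. pd.read_csv("data/my_data.csv").
raw_csv = io.StringIO(
    "city,year,population,area_km2\n"
    "Springfield,2020,116000,85.5\n"
    "Shelbyville,2020,,72.1\n"
    "Ogdenville,2021,43000,30.0\n"
)

# Load the raw data; the empty population field is read as a missing value.
df = pd.read_csv(raw_csv)

# Quick check that the load produced the expected number of rows and columns.
print(df.shape)
```

Other formats have matching readers, such as `pd.read_json` and `pd.read_excel`.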
Data inspection
In this section, you will show enough of your data for a viewer to get a general sense of how the data is structured and any unique features of it. Complete each of the indicated tasks in a Code cell, making sure to include a Markdown cell above each Code cell that explains what is being shown by the code.
- Show 5 rows, selected at random, from the data set.
- Show each of the column names and their data types.
- Show any unique features of your chosen data set.
Feel free to add as many additional cells as you need to help explain the raw data.
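One possible way to cover the three inspection tasks is sketched below. The small DataFrame is hypothetical and stands in for your chosen data set, and counting missing and distinct values is only one assumed interpretation of "unique features". In the notebook, each task would sit in its own Code cell under its own Markdown cell.

```python
import numpy as np
import pandas as pd

# Hypothetical data standing in for your chosen data set.
df = pd.DataFrame({
    "species": ["cat", "dog", "dog", "bird", "cat", "dog", "bird", "cat"],
    "age_years": [2.0, 5.0, np.nan, 1.0, 7.0, 3.0, 4.0, np.nan],
    "weight_kg": [4.1, 20.3, 18.7, 0.3, 5.2, 25.0, 0.4, 3.9],
})

# Task 1: five rows chosen at random; random_state makes the sample repeatable.
sample_rows = df.sample(n=5, random_state=0)
print(sample_rows)

# Task 2: each column name paired with its data type.
column_types = df.dtypes
print(column_types)

# Task 3: two candidate "unique features" of the data:
# how many values are missing and how many distinct values each column holds.
missing_counts = df.isna().sum()
distinct_counts = df.nunique()
print(missing_counts)
print(distinct_counts)
```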
Data munging
Place your data munging code and documentation within this section.
- Keep each of your Code cells short and focused on a single task.
- Include a Markdown cell above each Code cell that describes what task the code within the Code cell is performing.
- Make as many code cells as you need to complete the munging - a few have been created for you to start with.
- Display 5 sample rows of the modified data after each transformation so a viewer can see how the data has changed.
Note: If you believe that your data set does not require any munging, please explain in detail. Create Markdown cells that explain your thinking and create Code cells that show any specific structures of the data you refer to in your explanation.
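The sketch below illustrates what a sequence of short, single-task munging steps might look like. The data and the particular problems in it (stray whitespace, dates stored as text, a "n/a" placeholder) are hypothetical; your own data set will call for different transformations. Each numbered task would be a separate Code cell in the notebook, followed by a display of sample rows.

```python
import pandas as pd

# Hypothetical raw data with typical problems that need munging.
df = pd.DataFrame({
    "name": [" Alice", "Bob ", "carol", "Dave", "  eve", "Frank", "grace ", "Heidi"],
    "joined": ["2021-01-15", "2021-02-20", "2021-03-05", "2021-03-05",
               "2021-04-11", "2021-05-30", "2021-06-02", "2021-06-18"],
    "score": ["10", "n/a", "7", "12", "n/a", "9", "15", "8"],
})

# Task 1: strip surrounding whitespace and normalize capitalization.
df["name"] = df["name"].str.strip().str.title()

# Task 2: parse the date strings into datetime values.
df["joined"] = pd.to_datetime(df["joined"])

# Task 3: convert scores to numbers; the non-numeric "n/a" becomes NaN.
df["score"] = pd.to_numeric(df["score"], errors="coerce")

# Task 4: drop rows that have no score and renumber the index.
cleaned = df.dropna(subset=["score"]).reset_index(drop=True)

# Show 5 sample rows so a viewer can see how the data has changed.
print(cleaned.sample(n=5, random_state=0))
```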
Data analysis
Place your data analysis code and documentation within this section.
- Perform at least 5 different statistical or other analyses of different aspects of the data.
- Your analyses must be specific and relevant to your chosen data set and show interesting aspects of it.
- Include at least one analysis that includes grouping rows by a shared attribute and performing some kind of statistical analysis on each group.
- Sort the data in at least 1 of your analyses. Note that sorting on its own does not constitute an analysis.
- Keep each of your Code cells short and focused on a single task.
- Include a Markdown cell above each Code cell that describes what task the code within the Code cell is performing.
- Make as many code cells as you need to complete the analysis - a few have been created for you to start with.
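The grouping and sorting requirements above could be met along these lines. This is a sketch on hypothetical sales data, not a prescribed analysis: it groups rows by a shared attribute (`region`), computes several statistics per group, and then sorts the result by the group mean.

```python
import pandas as pd

# Hypothetical data standing in for your chosen data set.
df = pd.DataFrame({
    "region": ["north", "south", "north", "east", "south", "east", "north"],
    "sales": [120.0, 80.0, 150.0, 60.0, 100.0, 90.0, 90.0],
})

# Group rows by region, compute per-group statistics,
# then sort the groups from highest to lowest mean.
summary = (
    df.groupby("region")["sales"]
    .agg(["count", "mean", "max"])
    .sort_values("mean", ascending=False)
)
print(summary)
```

Here the sort is part of a larger analysis (ranking groups by their mean), which is the kind of use the sorting requirement asks for.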
Data visualization
In this section, you will create a few visualizations that show some of the insights you have gathered from this data.
- Create at least 5 different visualizations, where each visualization shows different insights into the data.
- Use at least 3 different visualization types (e.g. bar charts, line charts, stacked area charts, pie charts).
- Create a Markdown cell and a Code cell for each, where you explain and show the visualizations, respectively.
- Create as many additional cells as you need to prepare the data for the visualizations.
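As a rough illustration of three different visualization types, the sketch below prepares two aggregates from hypothetical sales data and draws a bar chart, a line chart, and a pie chart with pandas' matplotlib-backed `plot` method. The three charts share one figure only to keep the example compact; in the notebook, each visualization would get its own Markdown cell and Code cell.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for your chosen data set.
df = pd.DataFrame({
    "month": [1, 2, 3, 1, 2, 3],
    "region": ["north", "north", "north", "south", "south", "south"],
    "sales": [100, 120, 140, 80, 70, 95],
})

# Prepare the data for the visualizations.
total_by_region = df.groupby("region")["sales"].sum()
total_by_month = df.groupby("month")["sales"].sum()

fig, (ax_bar, ax_line, ax_pie) = plt.subplots(1, 3, figsize=(12, 3.5))

# Bar chart: compare total sales across regions.
total_by_region.plot(kind="bar", ax=ax_bar, title="Total sales by region")
ax_bar.set_ylabel("sales")

# Line chart: show how total sales change from month to month.
total_by_month.plot(kind="line", marker="o", ax=ax_line, title="Total sales by month")
ax_line.set_ylabel("sales")

# Pie chart: show each region's share of total sales.
total_by_region.plot(kind="pie", ax=ax_pie, autopct="%1.0f%%", title="Share of sales by region")
ax_pie.set_ylabel("")

fig.tight_layout()
```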