CISC-235 Data Structures W23 Assignment 3

AI悦创原创2023年4月10日大约 7 分钟...约 2245 字

March 25, 2023

General Instructions

一般的用法说明

Write your own program(s) using Python. Once you complete your assignment, place all Python files in a zip file and name it according to the same method, i.e., “235-1234-Assn3.zip”. Unzip this file should get all your Python file(s).

使用Python编写自己的程序。完成作业后，将所有Python文件放在一个zip文件中，并按照相同的方法命名，即“235-1234-Assn3.zip”。解压缩这个文件应该会得到所有的Python文件。

Then upload 235-1234-Assn3.zip into Assignment 3’s entry on onQ. You may upload several times if you wish. However, onQ keeps only the last uploaded file. The newly uploaded file will overwrite the old file. Please check your files after uploading. We will check the latest submission you made following the required naming.

然后上传235-1234-Assn3.zip到onQ上作业3的条目。如果你愿意，你可以上传几次。但是，onQ只保留最后上传的文件。新上传的文件将覆盖旧文件。上传后请检查文件。我们将检查您在要求命名后提交的最新文件。

You must ensure your code is executable and document your code to help TA mark your solution. We suggest you follow PEP81 style to improve the readability of your code.

你必须确保你的代码是可执行的，并记录你的代码，以帮助TA标记你的解决方案。我们建议您遵循PEP81风格来提高代码的可读性。

All data structures involved must be implemented by yourself, except for the built-in data types, i.e., List in Python.

所有涉及的数据结构必须由自己实现，除了内置的数据类型，如Python中的List。

An “I uploaded the wrong file” excuse will result in a mark of zero.

一个“我上传了错误的文件”的借口会导致零分。

1. Word-Frequency Hash Table (40 points)

词频哈希表(40点)

In Assignment 2, we created a modified version of AVL tree for saving word frequency (word is the key, frequency is the value associated with key) infor- mation in a given document. In this assignment, we will redesign the solution leveraging two types of hash table, i.e., closed hash table (with linear probing) and open hash table (with lists). You can not use any built-in hash-based data structure offered by Python. You should consider the “deletion” flag for the closed hash table.

在作业2中，我们创建了一个AVL树的修改版本，用于保存给定文档中的词频(词是键，频率是与键相关的值)信息。在本次作业中，我们将利用两种类型的哈希表重新设计解决方案，即封闭哈希表(带有线性探测)和开放哈希表(带有列表)。不能使用Python提供的任何内置的基于哈希的数据结构。您应该考虑关闭哈希表的“删除”标志。

Specifically, Your tasks are:

具体来说，你的任务是:

Implement two classes named ClosedHashTable and OpenHashTable. Both classes should have a load from file(self, file path) function. This function aims to read content from a file and save word-frequency infor- mation for all words appearing in the file in the hash table. Note that you cannot calculate the word frequency outside your hash table. In other words, when you scan the word sequence, you should insert the key to the hash table and update its frequency when you see another occurrence of the word. You can reuse your A2’s code to parse a document and generate word/token sequences.

实现两个类ClosedHashTable和opendhashtable。这两个类都应该有一个load from file(self, file path)函数。此函数旨在从文件中读取内容，并将文件中出现的所有单词的词频信息保存在哈希表中。注意，您不能在哈希表之外计算词频。换句话说，在扫描单词序列时，应该将键插入到哈希表中，并在看到单词再次出现时更新其频率。您可以重用A2的代码来解析文档并生成单词/令牌序列。

Test the performance of your ClosedHashTable and OpenHashTable by searching a list S containing K words, where half of the K words are in the hash table. You need first to call your load from file(self, file path) function to initialize your hash table using a given sample test file (named A3test.txt). Varying K by setting it to be 10, 20, 30, 40, 50. Print out the number of steps required for each search. For instance, K=10, means you should create a word list containing 10 words, 5 of them exist in the A3test.txt, and 5 of them do not exist. Then you perform 10 searches on the hash table for the 10 target words. Your search function should return a value indicating whether the search target exists in the hash table and, if it exists, how many steps have been taken to find the key. Hint: you can return -1, indicating the key does not exist in the hash table.

通过搜索包含K个单词的列表S来测试ClosedHashTable和OpenHashTable的性能，其中K个单词中有一半在哈希表中。首先需要调用load from file(self, file path)函数，使用给定的示例测试文件(名为A3test.txt)初始化哈希表。通过设置K为10,20,30,40,50来改变K。打印出每次搜索所需的步骤数。例如，K=10，意味着您应该创建一个包含10个单词的单词列表，其中5个单词存在于A3test.txt中，另外5个单词不存在。然后在哈希表上为这10个目标单词执行10次搜索。搜索函数应该返回一个值，该值指示搜索目标是否存在于哈希表中，如果存在，则表示已经执行了多少步来查找该键。提示:您可以返回-1，表示该键在哈希表中不存在。

To implement the above two functionalities, you need to implement at least an insert and search function for your hash table. You also need to design your hash function and determine the size of your hash table following the tips introduced in class. You are encouraged to explain your design in comments so TAs can quickly get your idea. If you decide to follow an existing hash function, explain that in comments as well.

要实现上述两个功能，您至少需要为哈希表实现一个插入和搜索函数。您还需要根据课堂上介绍的技巧设计哈希函数并确定哈希表的大小。我们鼓励你在评论中解释你的设计，这样助教就能很快理解你的想法。如果你决定使用一个现有的哈希函数，也请在评论中解释一下。

2. Graph (60 points)

2.1 Random Graph Generation: 10 points

随机图形生成:10分

Create a Graph class (and a vertex class if needed). Your graph class should have an initialization function to create a random connected graph with a specified number of vertices, and random weights on all the edges, such that the edge- weights all come from a specified range of integers.

创建一个Graph类(如果需要，还可以创建顶点类)。你的图类应该有一个初始化函数来创建一个随机连接图，它具有指定数量的顶点，并且在所有边上都有随机权重，这样边的权重都来自指定的整数范围。

The construction of random graphs is a huge topic and many approaches have been developed. For the purposes of this assignment, I suggest the following very simple method:

随机图的构造是一个巨大的课题，许多方法已经被开发出来。为了完成这项作业，我建议使用以下非常简单的方法:

Let the set of vertices be {1, 2, ..., n}

设顶点集合为{1,2，…n}

for i=2..n:

x = randint(1, i−1) # x is a random integer in the range [1 .. i−1]

let S be a randomly selected sample of x values from the set {1, ..., i−1}

for each s in S:

w = random(10,100)

add an edge between vertex i and vertex s, with weight w

2.2 Compare BFS with Prim’s algorithm: 50 points

Implement two functions within the graph class you created in 2.1. One for Breadth-First Search(BFS) (10 points) and the other for Prim’s Minimum Span- ning Tree algorithm (20 points).

Compare the efficiency of them by performing the below experiments (20 points):

forn=20,40,60:
    repeat k times: #k is an input parameter
        generate a random graph with n vertices
        use BFS to find a spanning tree ( let its total weight be B)
        use Prim to find a spanning tree ( let its total weight be P) compute Diff #Diff is the percentage by which B is larger than P
    compute the average of the values of Diff for this value of n

Report the average value of Diff for each value of n.

For example, if B=20, P=10, Diff = (20 − 10)/10 = 100%

需要答案找我付费获取，也可以私人定制重新编写，无查重的烦恼。💢

公众号：AI悦创【二维码】

AI悦创·编程一对一

AI悦创·推出辅导班啦，包括「Python 语言辅导班、C++ 辅导班、java 辅导班、算法/数据结构辅导班、少儿编程、pygame 游戏开发、Web、Linux」，全部都是一对一教学：一对一辅导 + 一对一答疑 + 布置作业 + 项目实践等。当然，还有线下线上摄影课程、Photoshop、Premiere 一对一教学、QQ、微信在线，随时响应！微信：Jiabcdefh

C++ 信息奥赛题解，长期更新！长期招收一对一中小学信息奥赛集训，莆田、厦门地区有机会线下上门，其他地区线上。微信：Jiabcdefh

方法一：QQ

方法二：微信：Jiabcdefh