CS101 PA [2]

注意:本学期 CS101 PA 的所有 C++ 代码都采用 C++20 标准。如果您使用 GCC 或 Clang,您需要在编译时设置 -std=c++20

如果您看到类似这样的报错

1
g++-9: error: unrecognized command line option ‘-std=c++20’; did you mean ‘-std=c++2a’?

说明您的编译器版本过低,请安装更高版本的编译器。目前 GCC 13 几乎已经将 C++20 除 modules 外的全部新特性实现完毕,我们推荐使用 GCC 13。

Problem 1. Multiplication of polynomials

In this problem, you are required to implement the multiplication of polynomials.

Formally, let A(x)A(x) and B(x)B(x) be two polynomials

A(x)=k=0n1akxk=a0+a1x++an1xn1,B(x)=k=0m1bkxk=b0+b1x++bn1xn1.\begin{aligned} A(x)&=\sum_{k=0}^{n-1}a_kx^k=a_0+a_1x+\cdots+a_{n-1}x^{n-1},\\ B(x)&=\sum_{k=0}^{m-1}b_kx^k=b_0+b_1x+\cdots+b_{n-1}x^{n-1}. \end{aligned}

Let C(x)=A(x)B(x)C(x)=A(x)B(x) be their product, i.e. C(x)=k=0n+m2ckxkC(x)=\sum_{k=0}^{n+m-2}c_kx^k where

ck=i+j=kaibj.c_k=\sum_{i+j=k}a_ib_j.

Given the coefficients (a0,a1,,an1)\left(a_0,a_1,\cdots,a_{n-1}\right) and (b0,b1,,bm1)\left(b_0,b_1,\cdots,b_{m-1}\right), your task is to compute (c0,c1,,cn+m2)\left(c_0,c_1,\cdots,c_{n+m-2}\right).

Input format

On the first line, two positive integers nn and mm, separated by a space.

On the second line, nn integers a0,a1,,an1a_0,a_1,\cdots,a_{n-1}, separated by spaces.

On the third line, mm integers b0,b1,,bm1b_0,b_1,\cdots,b_{m-1}, separated by spaces.

Output format

One line, consisting of n+m1n+m-1 integers c0,c1,,cn+m2c_0,c_1,\cdots,c_{n+m-2}, separated by a space.

Testing on OJ

There are 20 testcases on OJ. For the first 10 testcases, your program (assuming it is named foo.cc) is compiled with

1
g++ foo.cc -o foo -std=c++20 -Wall -Wpedantic -Wextra -Werror -DONLINE_JUDGE -fmax-errors=3 -fdiagnostics-color=always -fsanitize=undefined -fsanitize=address -g

For testcases 11 ~ 20, your program is compiled with

1
g++ foo.cc -o foo -std=c++20 -Wall -Wpedantic -Wextra -Werror -DONLINE_JUDGE -fmax-errors=3 -fdiagnostics-color=always -O2

Due to the OJ settings, there is no way of running a single testcase in pretest (运行自测). You can enter your input data there and see the output. For pretesting, the program will be compiled with the same command as for the first 10 testcases.

This OJ does not support testing of multithreaded programs. The time cost will be the sum of the time cost of each thread.

Data constraints

  • 1n,m3×1051\leqslant n,m\leqslant 3\times 10^5.
  • 0ai,bj90\leqslant a_i,b_j\leqslant 9 for every i{0,1,,n1}i\in\{0,1,\cdots,n-1\} and j{0,1,,m1}j\in\{0,1,\cdots,m-1\}.

Hint

It is possible that an1,bm1a_{n-1},b_{m-1} and cn+m2c_{n+m-2} are zero, but it does not matter. Just treat them as a normal coefficient.

Anything from the C++20 standard library is allowed. Do not waste your effort on manual memory management unless you really want to pursue extremely high efficiency. Simple arrays and std::vector are enough for passing all the tests. Do not reinvent wheels like std::complex.

A fast algorithm is needed. You may refer to FFT Algorithm or Karatsuba Algorithm.

You need to understand your implementation fully, instead of just copy-and-pasting a piece of code from some external source. There will be an offline check where you need to explain your code to TAs.

Examples

Input 1

1
2
3
2 2
1 1
2 3

Output 1

1
2 5 3

Explanation 1

(1+x)(2+3x)=2+5x+3x2.(1+x)(2+3x)=2+5x+3x^2.

Input 2

1
2
3
3 5
1 3 6
1 2 5 1 3

Output 2

1
1 5 17 28 36 15 18

Input 3

1
2
3
2 3
0 1
1 2 0

Output 3

1
0 1 2 0

Problem 2. Balanced Search Tree

In this problem, you are required to implement a data structure to support the following operations.

The files attached to this problem are shown as follows. We provide some tests to help you check your code.

1
2
3
4
5
6
7
8
9
10
.
└── attachments/
├── tests/
│ ├── insert.cpp
│ ├── erase.cpp
│ ├── querykth.cpp
│ ├── queryksum.cpp
│ └── rebuild.cpp
├── scapegoat.pdf
└── bst.hpp

NOTE: You can not use STL data structures, such as std::unordered_map, std::unordered_set, std::map and std::set, in this problem. All of your data structure should be designed by yourself.

Requirements:

Description:

You've got a set, consisting of integers. Assume before each operation, the set contains mm integers, which are s1s2s3...sms_1\le s_2\le s_3\le ... \le s_m. You are allowed to perform five operations on this set:

  1. Given xx, insert xx into the set
  2. Given xx, erase xx from the set
  3. Given xx, query whether xx is in the set
  4. Given kk, query the k-th element of set i.e. sks_k.
  5. Given kk, you need to find a position tt such that (ki=1tsi)i=t+1msi0\left (k\sum_{i=1}^ t s_i\right ) - \sum_{i=t+1}^m s_i\ge 0 , meanwhile minimize (ki=1tsi)i=t+1msi\left (k\sum_{i=1}^ t s_i\right ) - \sum_{i=t+1}^m s_i. You should return the minimum ansans. In other words:

ans=mint[1,m]i=1tksii=t+1msis.t.i=1tksii=t+1msi0ans = \min_{t\in [1,m]} \sum_{i=1}^t k\cdot s_i - \sum_{i=t+1}^m s_i \qquad \text{s.t.} \sum_{i=1}^t k\cdot s_i - \sum_{i=t+1}^m s_i \ge 0

Test inference introduction

Write your code in bst.hpp.

Our checker would first run the function void init(std::size_t n). Here, std::size_t n is the number of operations we would do in this test case. In this function, you can do anything to initialize your data structure. We would guarantee that this function would be called first and only once in each test case.

Next, our checker would call long long SetOperation(std::size_t opt, long long x) exactly nn times sequentially. You need to return answer values through this function call.
After each call, we would check your answer. Only if your ii-th answer of SetOperation is correct, could you receive the i+1i+1-th function call. This means you cannot implement any off-line algorithm method in this problem.

In SetOperation, you should do operations on your data structure according to (opt,x)(opt,x). Assume in the ii-th call, there are m(0mi1)m(0\le m\le i-1) elements in the set, which are s1s2...sms_1\le s_2\le ...\le s_m.

There are only 5 kinds of operations:

  • if opt=1opt=1, you need to insert exactly one x(1x1011)x(1\le x\le 10^{11}) into your data structure and return 00.
  • if opt=2opt=2, you need to erase exactly one xx from your data structure and return 00. It is guaranteed that there exists 1jm1\le j\le m such that sj=xs_j=x.
  • if opt=3opt=3, you need to answer whether x(1x1011)x(1\le x\le 10^{11}) is in the data structure. If xx is in the set, return 11. Otherwise, return 00.
  • if opt=4opt=4, it is guaranteed that 1xm1\le x\le m, and you should return sxs_x.
  • if opt=5opt=5, it is guaranteed that 1x1001\le x\le 100 and m1m\ge 1, and you need to return the answer of query, where ansans is

ans=mint[1,m]i=1txsii=t+1msis.t.i=1txsii=t+1msi0ans = \min_{t\in [1,m]} \sum_{i=1}^t x\cdot s_i - \sum_{i=t+1}^m s_i \qquad \text{s.t.} \sum_{i=1}^t x\cdot s_i - \sum_{i=t+1}^m s_i \ge 0

If you successfully pass all the set operations, our checker would call void clear(), where you MUST deconstruct your data structure and release your resources.

Data constraints

In all test cases, it is guaranteed that 1n400000,1opt5,1x10111\le n \le 400000, 1\le opt\le 5, 1\le x\le 10^{11}.

Please pay attention to the special time limit and memory limit.

Time complexity

If you implement your algorithm in Θ(n2)\Theta(n^2) time, you will receive about 40 pts.

If you implement your algorithm in Θ(nlog2n)\Theta(n\log ^2n) or Θ(nlogx)\Theta(n\log x)time, you will receive about 70 pts.

If you implement your algorithm in O(nlogn)O(n\log n) time, you will receive 100 pts.

Space complexity

If you implement your algorithm in Θ(nlogn)\Theta(n\log n) space, you will receive at most 70 pts.

To fully pass this problem, you should implement your algorithm in O(n)O(n) space.

Hint

You can use any self-balancing binary search tree to solve this problem, e.g. Scapegoat tree, AVL tree, Red-black tree, Splay tree, Treap and B-tree.

We recommend you implement Scapegoat tree. Scapegoat tree is a non-rotating binary tree. Its balancing is achieved by completely rebuild the "unbalanced" subtree into a complete binary tree. This makes Scapegoat tree easier to implement than AVL tree and Red-black tree.

Please see more information in Scapegoat Tree.

Problem 2. Calculate Huffman Coding Length

In this problem, you are required to solve a problem related to huffman coding.

Recall: Huffman coding is a lossless data compression algorithm that assigns variable-length codes to characters based on their frequency of occurrence in the input. An efficient algorithm for
generating it is related to the priority queue and covered in course slides.

The files attached to this problem are shown as follows. We provide a test to help you check your code.

1
2
3
4
5
.
└── attachments/
├── tests/
│ └── test_huffman.cpp
└── huffman_calculator.hpp

NOTE:

  1. You should NEVER use std::priority_queue, std::map, std::set, std::multimap, std::multiset or call heap related functions like std::make_heap().
  2. In the question, you should write a more optimal algorithm that brute-force Huffman tree building to pass all tests.

Requirements:

Your task is to complete the function size_t get_huffman_length(const std::vector<std::pair<size_t, size_t>> &data) in huffman_calculator.hpp.

Description

Assume that we have a sequence:

1
i a u e i y a a e a o o a a i e a

Count the occurrence of each character:

character occurrence
a 7
e 3
i 3
o 2
u 1
y 1

Then we generate the huffman tree (a possible solution):

Huffman Tree

Hence, we can shorten the sequence into the following format (ignore the space of storing the huffman tree):

1
110010011111101000001110101101001101110

With Huffman Coding, the sequence can be shortened to length 39 (binary).

Parameter data

In this question, we help you to ignore the process of dealing the bunch of text.
Instead, we directly give you some pairs.

The first number (of type size_t, named occurrence) indicate a possible number of occurrence of the character.
Occurrence will always be a positive number.

The second number (of type size_t, named amount) indicate the number of characters satisfies the number of occurrences given by the first number.
Amount will always be a positive number.

In the example above, we have characters with occurrences (7, 3, 3, 2, 1, 1).

So the data we pass to you in this example will be: [(7, 1), (3, 2), (2, 1), (1, 2)]

Return value

You need to return the length of the Huffman encoded sequence.

In this example, the return value is 39.

Misc.

For your convenience, all the pairs we provide are guaranteed to have different occurrences.

We also guarantee that the sum of amount is larger than 1.

The data passed to your function get_huffman_length is not guaranteed to be sorted.

Data constraints

We use mm to denote the number of pairs, KiK_i to denote the amount in the i-th pair.

In all test cases, it is guaranteed that 1m4000,1Ki1500001\le m \le 4000, 1\le K_i \le 150000.

Time complexity

If your algorithm takes Θ(nlogn)\Theta(n\log n) time, where n=Kin=\sum K_i, you will receive about 40 pts.

If your algorithm takes Θ(mlogmlogKi)\Theta(m\log m \log K_i) time, you will receive 100 pts.

Hint

The pair (10, 10) is somehow "equivalent" to (20, 5).

You can calculate the return value in the process of building the huffman tree.

A heap should be most conveniently implemented using an array (or std::vector), on which the node indexed ii has a left child indexed 2i2i and right child indexed 2i+12i+1. Do not waste your effort on memory management of linked-list-like tree structures.

评分

评分由 OJ 分数(60%)和线下 check (40%)两部分构成。线下 check 会在此次作业结束时间之后进行。

注:线下 check 也带有检查学术诚信的含义,当然这不是唯一的手段。如果被认定为抄袭, OJ 的分数也会作废,并且会有惩罚。特别强调,抄袭来自 generative AI 的代码和抄袭网上的代码是同等处理的,我们建议您在写作业时关闭一切 generative AI 工具。