GDB 配置

在 Lab 0 里我们不需要使用 GDB debug，但是在 Lab 1 里我们需要，这里是我配置 GDB 的办法，来源wine99 的博客，结合自己的系统情况修改了一下。首先将 Lab 0 的分支代码 merge 到 Lab 1 的对应分支里。然后当前 sponge 文件夹内，新建一个名为 .vscode 的文件夹，其中里面有四个 json 文件，分别是：

c_cpp_properties:

{
    "configurations": [
        {
            "name": "Linux",
            "includePath": [
                "${workspaceFolder}/**",
                "${workspaceFolder}/libsponge/**"
            ],
            "defines": [],
            "compilerPath": "/usr/bin/g++-8",
            "cStandard": "c11",
            "intelliSenseMode": "linux-clang-x64",
            "compileCommands": "${workspaceFolder}/build/compile_commands.json"
        }
    ],
    "version": 4
}

launch.json:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "debug lab test",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/build/tests/${fileBasenameNoExtension}",
            "args": [],
            "stopAtEntry": false,
            "cwd": "${workspaceFolder}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ],
            "miDebuggerPath": "/usr/bin/gdb"
        },
        {
            "name": "debug webget",
            "type": "cppdbg",
            "request": "launch",
            "program": "${workspaceFolder}/build/apps/webget",
            "args": ["cs144.keithw.org", "/hello"],
            "stopAtEntry": false,
            "cwd": "${workspaceFolder}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ],
            // "preLaunchTask": "C/C++: g++ build active file",
            "preLaunchTask": "build project",
            "miDebuggerPath": "/usr/bin/gdb"
        },
        {
            "name": "debug current file",
            "type": "cppdbg",
            "request": "launch",
            "program": "${fileDirname}/${fileBasenameNoExtension}",
            "args": [],
            "stopAtEntry": false,
            "cwd": "${workspaceFolder}",
            "environment": [],
            "externalConsole": false,
            "MIMode": "gdb",
            "setupCommands": [
                {
                    "description": "Enable pretty-printing for gdb",
                    "text": "-enable-pretty-printing",
                    "ignoreFailures": true
                }
            ],
            "preLaunchTask": "C/C++: g++ build active file",
            "miDebuggerPath": "/usr/bin/gdb"
        }
    ]
}

settings.json:

{
    "debug.allowBreakpointsEverywhere": true,
    "files.associations": {
        "chrono": "cpp",
        "random": "cpp",
        "limits": "cpp",
        "algorithm": "cpp"
    }
}

tasks.json:

{
    "tasks": [
        {
            "type": "shell",
            "label": "C/C++: g++ build active file",
            "command": "/usr/bin/g++-8",
            "args": [
                "-g",
                "${file}",
                "-o",
                "${fileDirname}/${fileBasenameNoExtension}"
            ],
            "options": {
                "cwd": "${workspaceFolder}"
            },
            "problemMatcher": [
                "$gcc"
            ],
            "group": {
                "kind": "build",
                "isDefault": true
            }
        },
        {
            "type": "shell",
            "label": "build project",
            "command": "cd build && make -j8",
            "args": [],
        },
    ],
    "version": "2.0.0"
}

在配置好这些文件之后，我们就可以愉快地（并不）使用 GDB debug 了。

Lab 1

这个 Lab 的目标是写一个 StreamReassembler，它接受一些字符串和对应的 index，这里的 index 是字符串的第一个字符在 TCP 字节流的位置（起始值为0），按顺序拼好并传递给 Lab 0 实现的 ByteStream。

Getting started

在开始写代码前，我们需要定义清楚一些问题，可以避免一些踩坑（我一开始没这么做，导致第一次快写完的时候整个重写了）。常见的需要定义清楚的问题有这么几个：

Q1：unassembled bytes 是什么？

A：是还未传递但是要传递给 ByteStream 的储存在 StreamReassembler 的字符数，其中如果不同的字符串有重叠部分的话，重叠部分只算一次；

举个例子，假设我们有两个字符串，一个是 Happy，它的 index 是 0，另一个是 py anniversary，它的 index 是 3，那么可想而知，整个算下来应该是 Happy anniversary，此时 unassembled bytes 值为17。

再举个例子，假设我们还是有两个字符串，一个是 Ha，它的 index 是 0，另一个是 y anniversary, 它的 index 是 3，此时的 unassembled bytes 值为 15，因为整个算下来是 Ha??y anniversary，我们不知道也没有存 ?? 位置的字符，所以这两个位置是不计数的。

Q2：capacity 是对什么容量的限制？

A：文档里的图解释的挺清楚的，我这里再复述一遍，是 StreamReassembler 里储存的 index 最大的字符的 index 减去在 ByteStream 里还没有被读取的 index 最小的字符的 index 的上限加一，说起来有点弯弯绕，结合下面这个例子会比较好。

假设 ByteStream 里还未读取的字符是 Happy，StreamReassembler 里只有一个字符串，ver，它的index 是 10，那么此时它占据的容量是 13，因为组合到一起，是 Happy?????ver，尽管我们不知道 ????? 的内容，但是我们要为之预留空间。推广一下，如果读取的字符串有一部分超过了我们能接受的最大容量，即使我们有足够的内存，也要舍弃它，因为超出了 capacity 的限制。

Design the core data structures

就这个 Lab 来讲，我们需要一个数据结构来维护没有读取的字符串，并且这些字符串是有序的（index），其实我一开始的想法是用堆，完成并 debug 后过了绝大部分的测试例，只有关于 unassembled bytes 的测试有问题，仔细研究定义（如上）后，决定推倒重新写。

我们需要一个容器，可以容纳来自同一个字节流的字符串，并且我们需要跟踪字符数目，权衡之后我选择用 std::deque<char> _unassembled_bytes 来维护所有未传输给 ByteStream 的字符。如果这个队列空间不够存下下一个字符串的话，我们就扩大它，在整体不超过 capacity 的限制下存下尽量多的字符。这里问题来了，我们还需要思考，这个队列里存放的字符是不是已经读取的字符，比如我们读了 Happy?????ver，这些 ? 要如何表示，一开始我的想法是用 '\0' 表示，后来觉得如果字节流里有 '\0' 就不行，因此选用了另一个 std::deque<bool> _pos_checker 来记录 _unassembled_bytes 对应位置的字节是否是已经读取的字符，他们被同步更新，并且 _pos_checker 里 true 的数量就是 unassembled bytes 的数值。

以下是一个更新字符串的实现：

void StreamReassembler::update_unassembled_bytes(const string &data, const size_t index) {
    size_t start_data_idx = max(0, static_cast<int>(_idx) - static_cast<int>(index));
    _max_idx = max(_max_idx, data.size() ? index + data.size() : 0); // fixed from 12-17
    size_t target_right_bound = min(index + data.size(), _capacity - _output.buffer_size() + _idx);
    if (target_right_bound > _right_bound) {
        // no space, need to expand it
        for (size_t i = 0; i < target_right_bound - _right_bound; i++) {
            _unassembled_bytes.push_back('\0');
            _pos_checker.push_back(false);
        }
        _right_bound = target_right_bound;
    }
    size_t start_deque_idx = start_data_idx + index - _idx;
    for (size_t i = 0; i + start_data_idx < data.size() && i + start_deque_idx < _pos_checker.size(); i++) {
        _unassembled_bytes[i + start_deque_idx] = data[i + start_data_idx];
        if (!_pos_checker[i + start_deque_idx])
            _unassembled_bytes_cnt++;
        _pos_checker[i + start_deque_idx] = true;
    }
}

这里有一些变量需要解释：

_idx：deque 里第一个位置对应的 index

_unassembled_bytes_cnt：unassembled bytes 的数值

_right_bound：deque 能储存的字符的最大 index 加一（恰好不能储存的序数）

_max_idx：已经得知存在的字符串最后一个字符的最大 index 加一，用来确定是否已经结束输入。假设我们的 StreamReassembler 要读取 Happy anniversary，且 index 为 0，可是由于容量限制，我们只能读取 Happy anniversar，由于最后一个 y 是存在的，我们只是暂时存不下，_max_idx 就是这个 y 对应的的 index 值，这个可以用来判断是否要给 ByteStream 传递输入结束的信号，不太好理解的话可以先跳过，看后续代码就清楚了。

Implement the whole class

由于我们的逻辑是尽可能地将字符传递给 ByteStream _output，所以每次更新后都要尽量 write。

参考实现如下：

void StreamReassembler::push_substring(const string &data, const size_t index, const bool eof) {
    update_unassembled_bytes(data, index);
    // read
    while (_output.remaining_capacity() && !_unassembled_bytes.empty() && _pos_checker.front()) {
        char curr = _unassembled_bytes.front();
        string to_write(1, curr);
        _unassembled_bytes.pop_front();
        _pos_checker.pop_front();
        _output.write(to_write);
        _idx++;
        _unassembled_bytes_cnt --;
    }
    if (eof) {
        _eof = true;
    }
    if (_eof && _idx == _max_idx) {
        _output.end_input();
    }
}

其他函数代码：

size_t StreamReassembler::unassembled_bytes() const { return _unassembled_bytes_cnt; }

bool StreamReassembler::empty() const { return unassembled_bytes() == 0; }

完成后我们在 build 目录下编译并测试，和之前类似：

$ CXX=g++-8 cmake ..
$ make format
$ make -j4
$ make check_lab1

通过的话会得到类似字样：

100% tests passed, 0 tests failed out of 16

Total Test time (real) =   0.66 sec

下一个实验：Lab 2

CS144, computer networking

I feedback.
Let me know what you think of this article on twitter @CreamEggMeat or leave a comment below!

CS144 Lab 1

GDB 配置

Lab 1

Getting started

Design the core data structures

Implement the whole class