
Copy-on-Write

September 20, 2013 ⁄ General

Original source: http://blog.sina.com.cn/s/blog_76fbd24d0100zdgz.html

The copy-on-write page protection mechanism is an optimization that the memory manager uses to conserve memory.
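    When a process maps a copy-on-write view of a section object that contains read/write pages, the memory manager does not create a process-private copy at the time the view is mapped (as the Hewlett-Packard OpenVMS operating system does). Instead, it defers copying each page until the page is written to. All modern UNIX systems use this technique as well. For example, suppose two processes are sharing three pages, each marked copy-on-write, and neither process has yet tried to modify any data on those pages.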
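    If a thread in either process writes to one of the pages, a memory management fault is generated. The memory manager sees that the write targets a copy-on-write page, so instead of reporting the fault as an access violation, it allocates a new read/write page in physical memory, copies the contents of the original page into it, updates the corresponding page-mapping information in that process to point to the new location, and dismisses the exception so that the faulting instruction is re-executed. This time the write succeeds. The newly copied page, however, is now private to the process that did the writing and is not visible to the other processes still sharing the copy-on-write page. Every process that writes to the shared page gets its own private copy.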
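The fault-then-copy sequence described above can be observed from user space. The following is a minimal sketch, not the Windows mechanism itself: it assumes a POSIX system, uses `mmap` with `MAP_PRIVATE` as the analogue of a copy-on-write view, and assumes `/tmp` is writable. The function name `private_write_leaves_file_unchanged` is hypothetical.

```cpp
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper (POSIX only): returns true if a write through a
// private, copy-on-write mapping leaves the underlying file unchanged.
bool private_write_leaves_file_unchanged() {
    char path[] = "/tmp/cow_demo_XXXXXX";    // assumption: /tmp is writable
    int fd = mkstemp(path);
    if (fd == -1) return false;
    unlink(path);                            // the file lives until fd is closed

    const char original[] = "hello";
    if (write(fd, original, sizeof original) != (ssize_t)sizeof original) {
        close(fd);
        return false;
    }

    // MAP_PRIVATE asks for a copy-on-write view of the file's pages.
    void *view = mmap(nullptr, sizeof original, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE, fd, 0);
    if (view == MAP_FAILED) {
        close(fd);
        return false;
    }

    // The first write faults; the kernel gives this process a private page.
    static_cast<char *>(view)[0] = 'J';

    char on_disk = 0;
    bool ok = pread(fd, &on_disk, 1, 0) == 1
              && on_disk == 'h'                          // file is untouched
              && static_cast<char *>(view)[0] == 'J';    // view sees the write

    munmap(view, sizeof original);
    close(fd);
    return ok;
}
```

Calling the function from `main` and checking that it returns `true` shows that the write went to a private copy of the page rather than to the shared file contents.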
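    One application of copy-on-write is breakpoint support in debuggers. By default, code pages start out as execute-only (that is, effectively read-only). If a programmer sets a breakpoint while debugging a program, however, the debugger must insert a breakpoint instruction into the code. It does this by first changing the protection of the page to PAGE_EXECUTE_READWRITE and then modifying the instruction stream. Because the code page is part of a mapped section, the memory manager creates a private copy for the process in which the breakpoint was set, while other processes continue to use the original, unmodified code page.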
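    Copy-on-write is one example of an evaluation technique called lazy evaluation, which the memory manager uses extensively. With lazy evaluation, an expensive operation is performed only when it is absolutely required; if the operation is never needed, no time is wasted on it.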
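As a rough illustration of lazy evaluation in general (not of the memory manager's internals), the sketch below defers a computation until its result is first requested. The class name `Lazy` and its members are hypothetical, and the code assumes C++17 for `std::optional`.

```cpp
#include <functional>
#include <optional>
#include <utility>

// Hypothetical helper: computes a value at most once, on first use.
template <typename T>
class Lazy {
public:
    explicit Lazy(std::function<T()> compute) : compute_(std::move(compute)) {}

    // Runs the stored computation on the first call and caches the result.
    const T &get() {
        if (!value_) {
            value_ = compute_();   // the expensive work happens only here
        }
        return *value_;
    }

    // True once get() has been called at least once.
    bool evaluated() const { return value_.has_value(); }

private:
    std::function<T()> compute_;
    std::optional<T> value_;
};
```

If `get()` is never called, the stored function never runs, which is the same trade copy-on-write makes for page copies.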
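    The POSIX subsystem uses copy-on-write to implement the fork function. When a UNIX application calls fork to create another process, the first thing the new process typically does is call exec to reinitialize its address space with an executable program. Instead of copying the entire address space during fork, the new process shares the parent's pages by marking them copy-on-write. If the child writes to one of these pages, a process-private copy is made. If there are no writes, the two processes keep sharing the pages and no copying takes place. Either way, the memory manager copies only the pages that a process actually tries to write to, not the entire address space.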
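The visible effect of this behavior can be sketched with a real `fork` call. This is a minimal example that assumes a POSIX system; the names `shared_value` and `parent_value_after_child_write` are hypothetical. Note that it only demonstrates the observable semantics (the child's write is private); whether the kernel copies pages lazily is an implementation detail that the program cannot see directly.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int shared_value = 1;   // lives in a data page shared after fork

// Hypothetical helper: returns the parent's view of shared_value after the
// child has overwritten its own copy, or -1 on error.
int parent_value_after_child_write() {
    pid_t pid = fork();
    if (pid == -1) return -1;

    if (pid == 0) {
        shared_value = 2;                    // the write lands in the child's private page
        _exit(shared_value == 2 ? 0 : 1);    // exit without running the parent's cleanup
    }

    int status = 0;
    if (waitpid(pid, &status, 0) == -1) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;
    return shared_value;                     // the parent never sees the child's write
}
```

The function returns 1 in the parent: the child changed only its own copy of the page that holds `shared_value`.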

 

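Copy On Write is a fairly common technique in programming, and it occasionally comes up in interviews (Java written tests apparently often include questions about string copy-on-write). Today I came across it while reading the reference counting chapter of More Effective C++. Below is a brief introduction to Copy On Write. We assume that the STL string supports copy-on-write (this is only an assumption that I have not verified in general; Microsoft Visual Studio 6.0 is used as the example here, and if you are interested you can read the source code yourself).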

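What is the principle behind Copy On Write?
Experienced programmers probably know that Copy On Write relies on reference counting: a variable stores the number of references. When the first string object is constructed, the string constructor allocates memory from the heap according to the arguments passed in. Whenever another object needs to share that memory, the count is incremented automatically; whenever an object is destroyed, the count is decremented. Only when the last object is destroyed, at which point the count is 1 or 0 (depending on the implementation), does the program actually free the heap memory.
Reference counting is the mechanism behind the string class's copy-on-write!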

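When is Copy On Write triggered?
Obviously, Copy On Write happens only when one of the objects sharing the same memory changes its content, for example through the string class's [], =, +=, and + operators, or through member functions such as insert, replace, and append. Destruction of an object is also included, since it updates the shared reference count.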
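To make the reference counting and the write trigger concrete, here is a minimal sketch of a copy-on-write string. It is not how VC6 or any particular standard library implements `std::string`; the class `CowString` and its members are hypothetical, and `std::shared_ptr` stands in for the hand-written reference count. The sketch is not thread-safe.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical CowString: copies share one buffer until one of them writes.
class CowString {
public:
    explicit CowString(std::string text)
        : data_(std::make_shared<std::string>(std::move(text))) {}

    // Copying only bumps the reference count held by shared_ptr.
    CowString(const CowString &) = default;
    CowString &operator=(const CowString &) = default;

    // Read access never copies.
    char at(std::size_t i) const { return data_->at(i); }

    // Write access makes a private copy first if the buffer is shared.
    void set(std::size_t i, char c) {
        detach();
        data_->at(i) = c;
    }

    const char *buffer() const { return data_->c_str(); }
    long owners() const { return data_.use_count(); }

private:
    void detach() {
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);   // private copy
        }
    }

    std::shared_ptr<std::string> data_;
};
```

After `CowString b = a;` both objects report the same `buffer()` address and `owners()` is 2. The first `b.set(...)` gives `b` its own buffer and drops the count on `a`'s buffer back to 1; the last owner to be destroyed frees the memory.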

Sample code:

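// Author: 代码疯子
// Blog: http://www.programlife.net/
// Reference counting & copy-on-write
#include <cstdio>
#include <string>
using namespace std;

int main(int argc, char **argv)
{
	// sb and sc are copies of sa; with a copy-on-write string they share sa's buffer.
	string sa = "Copy on write";
	string sb = sa;
	string sc = sb;
	// Print pointers with %p; passing a pointer to %X is undefined behavior.
	printf("sa char buffer address: %p\n", static_cast<const void *>(sa.c_str()));
	printf("sb char buffer address: %p\n", static_cast<const void *>(sb.c_str()));
	printf("sc char buffer address: %p\n", static_cast<const void *>(sc.c_str()));

	// Writing to sc forces it to get its own buffer; sa and sb are unaffected.
	sc = "Now writing...";
	printf("After writing sc:\n");
	printf("sa char buffer address: %p\n", static_cast<const void *>(sa.c_str()));
	printf("sb char buffer address: %p\n", static_cast<const void *>(sb.c_str()));
	printf("sc char buffer address: %p\n", static_cast<const void *>(sc.c_str()));

	return 0;
}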
Copyed From 程序人生 
Home Page:http://www.programlife.net 
Source URL:http://www.programlife.net/copy-on-write.html 

The output is as follows (VC 6.0):

[Image: Copy On Write]

As you can see, the string in VC6 supports copy-on-write, but my Visual Studio 2008 does not support this feature (in both Debug and Release builds):

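[Image: Visual Studio 2008 does not support Copy On Write]
Further reading (excerpted from Windows via C/C++, 5th Edition; readers who prefer Chinese can find the same passage on page 442 of the Chinese edition PDF):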
Static Data Is Not Shared by Multiple Instances of an Executable or a DLL

When you create a new process for an application that is already running, the system simply opens another memory-mapped view of the file-mapping object that identifies the executable file’s image and creates a new process object and a new thread object (for the primary thread). The system also assigns new process and thread IDs to these objects. By using memory-mapped files, multiple running instances of the same application can share the same code and data in RAM.

Note one small problem here. Processes use a flat address space. When you compile and link your program, all the code and data are thrown together as one large entity. The data is separated from the code but only to the extent that it follows the code in the .exe file. (See the following note for more detail.) The following illustration shows a simplified view of how the code and data for an application are loaded into virtual memory and then mapped into an application’s address space.

[Illustration: Copy On Write, from Windows via C/C++]
As an example, let’s say that a second instance of an application is run. The system simply maps the pages of virtual memory containing the file’s code and data into the second application’s address space, as shown next.

[Illustration: Copy On Write, from Windows via C/C++]
If one instance of the application alters some global variables residing in a data page, the memory contents for all instances of the application change. This type of change could cause disastrous effects and must not be allowed.

The system prohibits this by using the copy-on-write feature of the memory management system. Any time an application attempts to write to its memory-mapped file, the system catches the attempt, allocates a new block of memory for the page containing the memory the application is trying to write to, copies the contents of the page, and allows the application to write to this newly allocated memory block. As a result, no other instances of the same application are affected. The following illustration shows what happens when the first instance of an application attempts to change a global variable in data page 2:

[Illustration: Copy On Write, from Windows via C/C++]
The system allocated a new page of virtual memory (labeled as “New page” in the image above) and copied the contents of data page 2 into it. The first instance’s address space is changed so that the new data page is mapped into the address space at the same location as the original address page. Now the system can let the process alter the global variable without fear of altering the data for another instance of the same application.

A similar sequence of events occurs when an application is being debugged. Let’s say that you’re running multiple instances of an application and want to debug only one instance. You access your debugger and set a breakpoint in a line of source code. The debugger modifies your code by changing one of your assembly language instructions to an instruction that causes the debugger to activate itself. So you have the same problem again. When the debugger modifies the code, it causes all instances of the application to activate the debugger when the changed assembly instruction is executed. To fix this situation, the system again uses copy-on-write memory. When the system senses that the debugger is attempting to change the code, it allocates a new block of memory, copies the page containing the instruction into the new page, and allows the debugger to modify the code in the page copy.

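(If this is an original article) Please credit the source when reposting (this notice is added automatically by the system and applies only to original articles):
This article is from 程序人生 >> Copy On Write
Author: 代码疯子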


 
