
A Strange Problem When Using Semaphores to Synchronize a Producer and a Consumer Process

October 7, 2013

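After reading the semaphore section of APUE, I decided to use System V semaphores to implement the producer/consumer problem. The finished program kept failing: after 32,767 calls in the producer, semop returned -1 with errno set to ERANGE, and I could not tell why. I then found a thread describing the same problem. Its answer: once the SEM_UNDO flag is set, the P and V operations on a semaphore must be "symmetric" within each process. The producer and the consumer are separate processes, so their P and V operations on the full-slot and empty-slot counts are not symmetric. The per-process undo adjustment therefore keeps growing until it exceeds the range of the variable that holds it (typically a short, hence the limit of 32,767). The thread follows: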
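The accumulation described above can be illustrated with a toy model that makes no system calls. This is only a sketch under stated assumptions: `struct toy_undo`, `toy_semop_undo`, `toy_calls_until_erange`, and `TOY_ADJ_MAX` are hypothetical names, and the limit of 32,767 is assumed from the behaviour reported in the thread rather than read from any particular kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Assumed limit on the undo adjustment: the range of a signed short. */
#define TOY_ADJ_MAX 32767

/* Hypothetical stand-in for the kernel's per-process undo record. */
struct toy_undo {
    int semadj;   /* value that would be added back at process exit */
};

/* Model one semop() carrying SEM_UNDO: the adjustment moves opposite
 * to sem_op. Returns 0 on success, or -1 with errno = ERANGE if the
 * adjustment would leave the assumed range. */
int toy_semop_undo(struct toy_undo *u, int sem_op)
{
    int adj;

    assert(u != NULL);
    adj = u->semadj - sem_op;
    if (adj > TOY_ADJ_MAX || adj < -TOY_ADJ_MAX - 1) {
        errno = ERANGE;
        return -1;
    }
    u->semadj = adj;
    return 0;
}

/* Count how many identical operations succeed before the first failure. */
int toy_calls_until_erange(int sem_op, int max_calls)
{
    struct toy_undo u = { 0 };
    int n = 0;

    while (n < max_calls && toy_semop_undo(&u, sem_op) == 0)
        n++;
    return n;
}
```

In this model a process that only ever decrements gets 32,767 successful calls before the next one is rejected, while a process that pairs every -1 with a +1 keeps its adjustment at 0 or 1 and never hits the limit.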

 

Michael B Allen <mba2000@ioplex.com> writes: 

>Hi, 

>I'm seeing some strange behavior with the semop(2) SEM_UNDO flag. I have 
>a function like this: 

> int 
> svsem_wait(int semid) 
> { 
> struct sembuf wait; 

> wait.sem_num = 0; 
> wait.sem_op = -1; 
> wait.sem_flg = SEM_UNDO; 

> return semop(semid, &wait, 1); 
> } 

>In a producer/consumer test program, after precisely 32,767 calls to this 
>function it stops decrementing the semaphore value which permits the 
>producer to run uncontrolled. 

>If I remove the SEM_UNDO flag the problem does not occur and the test 
>program completes successfully. 

>Any idea what the problem might be? Considering 32768 is a power of 2 I 
>suspect I'm doing something wrong with SEM_UNDO that's causing a limit 
>to be exceeded. 

Are you doing the semop(-1) in one process and the +1 in another or 
are you doing the -1 with UNDO and the +1 w/o? 

Note then that the undo counts accumulate over a process; all operations
are undone when the process exits. A short is the typical type used to
hold the undo aggregate.

Typically, using SEM_UNDO is only correct when the application using
it must have a 0 net effect from main() to exit(). It's nearly always 
wrong when used in a producer/consumer situation. 

Casper 
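One way to act on this advice is to leave SEM_UNDO off for counting semaphores whose -1 and +1 happen in different processes. The following is a minimal sketch, not code from the thread: `svsem_op_noundo`, `svsem_wait_noundo`, and `svsem_post_noundo` are hypothetical names. The trade-off is that the kernel no longer reverts these operations if a process dies unexpectedly.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Hypothetical helper: apply op (+1 or -1) to semaphore 0 of the set
 * without SEM_UNDO, so no per-process undo adjustment is recorded. */
static int svsem_op_noundo(int semid, short op)
{
    struct sembuf sb;

    assert(op == 1 || op == -1);
    sb.sem_num = 0;
    sb.sem_op = op;
    sb.sem_flg = 0;   /* deliberately no SEM_UNDO */
    return semop(semid, &sb, 1);
}

/* P operation: blocks until the semaphore value is at least 1. */
int svsem_wait_noundo(int semid)
{
    return svsem_op_noundo(semid, -1);
}

/* V operation: increments the semaphore value. */
int svsem_post_noundo(int semid)
{
    return svsem_op_noundo(semid, 1);
}
```

Both functions return whatever semop returns, so callers should still check for -1 and inspect errno.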

 

 

 

On Sun, 16 Nov 2003 16:41:49 -0500, Casper H.S. *** wrote: 

> Michael B Allen <mba2000@ioplex.com> writes: 

>>Hi, 

>>I'm seeing some strange behavior with the semop(2) SEM_UNDO flag. I have 
>>a function like this: 

>> int 
>> svsem_wait(int semid) 
>> { 
>> struct sembuf wait; 
>> 
>> wait.sem_num = 0; 
>> wait.sem_op = -1; 
>> wait.sem_flg = SEM_UNDO; 
>> 
>> return semop(semid, &wait, 1); 
>> } 

>>In a producer/consumer test program, after precisely 32,767 calls to 
>>this function it stops decrementing the semaphore value which permits 
>>the producer to run uncontrolled. 

>>If I remove the SEM_UNDO flag the problem does not occur and the test 
>>program completes successfully. 

>>Any idea what the problem might be? Considering 32768 is a power of 2 I 
>>suspect I'm doing something wrong with SEM_UNDO that's causing a limit 
>>to be exceeded. 

> Are you doing the semop(-1) in one process and the +1 in another 

Yes. The calls are not symmetric. 

> or are 
> you doing the -1 with UNDO and the +1 w/o? 

> Note then that the undo counts accumulate over a process; all operations
> are undone when the process exits. A short is the typical type used to
> hold the undo aggregate.

> Typically, using SEM_UNDO is only correct when the application using it
> must have a 0 net effect from main() to exit(). It's nearly always 
> wrong when used in a producer/consumer situation. 

Right. Kurtis' explanation was right on. I didn't get the significance
of per-process undo state after reading the Stevens books.

The code I am referring to is just a test program that looks roughly like 
the following but one process calls produce() and the other consume(): 

int
produce(struct linkedlist *l, int mutex, int empty, int full)
{
    for ( ;; ) {
        svsem_wait(empty);   /* wait for a free slot */
        svsem_wait(mutex);

        /* put something into l */

        svsem_post(mutex);
        svsem_post(full);    /* announce a filled slot */
    }

    return 0;
}

int
consume(struct linkedlist *l, int n, int mutex, int empty, int full)
{
    for ( ;; ) {
        svsem_wait(full);    /* wait for a filled slot */
        svsem_wait(mutex);

        /* remove something from l */

        svsem_post(mutex);
        svsem_post(empty);   /* announce a free slot */
    }

    return 0;
}

I think this counting semaphore producer/consumer example was straight 
out of the Tanenbaum book on operating systems. As Kurtis suggested 
I should not use the semaphore as the counter but use a mutex to lock, 
change a separate counter, and unlock in each process separately so that 
each calls wait an equal number of times over the lifetime of the program. 

Mike 

 
