三种线性时间O(n)排序算法 – 计数-基数-桶 – C++实现

现在的位置: 首页 > 综合 > 正文

RSS

三种线性时间O(n)排序算法 – 计数-基数-桶 – C++实现

2013年02月06日 ⁄ 综合 ⁄ 共 4946字 ⁄ 字号小中大 ⁄ 评论关闭

引言

注：由于没有启用任何公式编辑器，为表示方便：以下涉及时间复杂度表示时，其渐近符号用以下符号代替：

先来看一个定理：任意一个比较排序算法在最坏情况下，都需要做 $(nlgn)次的比较。其可用决策树模型加以证明，详见：《算法导论》第8章8.1节。

该定理指出了比较排序的时间复杂度下界，即没有比较更少的了。

故以下介绍的三种算法均不是基于比较的排序算法，其均对输入数据作了某种假设，即利用了具体数据的特点。

一、计数排序

1、问题描述：假设数组中的n个元素中的每一个都是介于0到k之间的整数，对其进行排序。此处k为某个正整数。

2、基本思想：对每一个输入元素x，利用它的所属范围大小为[0, k] ,确定出数组中小于x 的元素个数。由于x为正整数，则可以直接把x直接放到它在最终输出数组中的位置上。

3、算法描述：

输入：A[1...n], 输入数组；C[0...k]，提供临时存储区，初值：0 ---O(k)

输出：B[1...n]，存放排序结果

（1）求出输入数组A[1...n]中，其值等于A[j]的元素个数，存放于对应的C[1...k]中：C[A[i]] = C[A[i]] + 1 ---O(n)

（2）求出输入数组A[1...n]中，其值小于或等于A[j]的元素个数，迭代地存放于对应的C[1...k]中：C[j] = C[j-1] + C[j] ---O(K)

（3）经过前两步之后，C[i]中的值表示为：小于等于 i 的元素个数，即为 i 在A[1...n]中的最终位置，将其存放于B[1...n]中：B[C[A[j]]] = A[j], C[A[j]] = C[A[j]] - 1 （数组中有可能存在相等的元素） --- O(n)

4、时间复杂度：O(k) + O(n) + O(k) + O(n)，则总的运行时间为：@(k+n)，在实践中，当 k=O(n)时，其运行时间即为：@(n).

5、算法实现：

#include <stdio.h>
const int K = 5;
const int N = 6;
int input[N+1] = {-1, 2, 3, 4, 2 ,1, 5};
int output[N+1];
int count[K]={0};

void print(int array[], int n)
{
	for(int i = 1; i <=n; i++)
	{
		printf("%d ", array[i]);
	}
	printf("\n");
}
void countSort(int input[], int output[], int n)
{
	int i;
	for(i = 1; i <= n; i++)
	{//equal to input[i]
		count[input[i]] = count[input[i]] +1;
	}
	for(i = 1; i <= K; i++)
	{//less ro equal to i
		count[i] = count[i-1] + count[i];
	}
	for(i = n; i >= 1; i--) //form large to small to make sorting stable
	{
		output[count[input[i]]] = input[i];
		count[input[i]]--;
	}
}

int main()  
{
	countSort(input, output, N);
	print(output, N);
    return 0;  
}

二、基数排序

1、问题描述：给定n个d位数，每一位可能的取值有k中，对其进行排序。如，4个三位数：123，367,124,756，此时n=4, d=3，k值为10.

2、基本思想：首先按最低有效位数字进行排序，收集结果；然后按次低位有效位数字排序，收集其结果，依次类推，重复这个过程，直到对所有的d位数字都进行了排序。

看个例子（来自算法导论8.3的示例,P100）：

注意：对每一位的排序必须是一个稳定的排序，否则排序可能不正确。如上图在对最高位排序时，起初436在457的前面，由于最高位都是4，故此次排序两个数的最高位相等，如果不是稳定的排序，结果457可能会排到436的前面，这样结果就不对了。而稳定的排序则可以保证排完后，436仍然在457的前面。

3、算法描述：

输入数组：A[1...n]

RADIX-SORT(A, d)

for i <- 1 to d

do use a stable sort to sort array A on digit i //可以选择计数排序

4、时间复杂度：由一中的计数排序可知，对每一位的排序时间为：@(k+n)，此处共执行d遍，故时间复杂度：@(d*(k+n))，当d为常数、k=O(n)时，基数排序有线性运行时间：@(n)。更一般且更具体的分析可以参加《算法导论》的引理8.3和8.4，P101，其详细分析了：如何将每个关键字分解成若干位以及那些情况下时间复杂度最佳。此处只介绍一些结论与如何实现之。

5、算法实现：

#include <stdio.h>
#include <math.h>
const int N = 7;
const int D = 3;
const int K = 10;
int count[K] = {0};
//int input[N+1] = {-1, 329, 457, 657, 839, 436, 720, 355};
int output[D+1][N+1]={{-1, 329, 457, 657, 839, 436, 720, 355}};

void print(int array[], int n)
{
	for(int i = 1; i <=n; i++)
	{
		printf("%d ", array[i]);
	}
	printf("\n");
}
int getDigit(int number, int d)
{
	if(d > D)return -1;
	return number%(int)pow(10,d) / (int)pow(10,d-1); 
}
void countSort(int input[], int output[], int n, int d)
{
	int i;
	int digit;
	for(i = 1; i <= n; i++)
	{
		digit = getDigit(input[i],d);
		count[digit] = count[digit] +1;
	}
	for(i = 1; i <= K; i++)
	{
		count[i] = count[i-1] + count[i];
	}
	for(i = n; i >= 1; i--) //form large to small to make sorting stable
	{
		digit = getDigit(input[i],d);
		output[count[digit]] = input[i];
		count[digit]--;
	}
}
void initDataStruct(int count[])
{
	for(int i= 0; i < K; i++)
	{
		count[i] = 0;
	}
}
void radixSort(int output[][N+1], int n, int d)
{
	for(int i = 1; i <= d; i++)
	{
		countSort(output[i-1], output[i], n, i);
		initDataStruct(count);
	}
}

int main()  
{
	radixSort(output, N, D);
	print(output[D], N);
        return 0;  
}

三、桶排序

1、问题描述：假设输入数组有一个随机过程产生，该过程将元素均匀地分布在区间 [0, 1)上，对这个输入数组进行排序

2、基本思想：类似于散列表中解决散列冲突的拉链法，一个桶本质就是一个链表，将相同关键字的元素会被放入同一个桶中。由于输入数组中的元素均匀且独立均匀分布在 [0, 1)上，所以一般不会有很多个元素被分到同一个桶中去。散列完成后，先对各个桶中的元素进行排序，然后按次序把各桶中的元素收集出来即得到一个有序的数组。

3、算法描述：

输入：A[1...n]；B[0...n-1]：辅助数组链表，用于散列元素

输出：排序好的A[1...n]

（1）把区间 [0, 1) 划分成 n个相同大小的子区间，或称为桶（可用辅助数组链表B[0...n-1) 实现之）。

（2）已知，散列函数为：f(x) = floor(n*x) ，向下取整，则数组元素A[i] 分配至桶（链表）B[floor(n*A[i])]中 ---@(n)

（3）分别对每个桶B[i]中的元素进行排序（可用插入排序实现之） ---n*O(2-1/n)

（4）按次序收集各桶中的结果。

4、时间复杂度：根据3中的算法描述部分可知，除了第（3）步外，其他各部分在最坏情况下的时间都是O(n)。运用数理统计的知识可以求得，第（3）步中的对每个桶进行排序，运行时间的期望值为：2-1/n（详见：《算法导论》8.4节,P103），则第（3）步的总时间：n*(2-1/n)，故桶排序的期望运行时间：@(n) + n*(2-1/n) =@(n)

5、算法实现：

#include <stdio.h>
#include <math.h>

const int N =10;
double input[N+1] = {-1, 0.78, 0.17, 0.39, 0.26, 0.72, 0.94, 0.21, 0.12, 0.23, 0.68};
typedef struct BucketNode
{
	double data;
	struct BucketNode* next;
}* bucketList, bucketNode;
bucketList bucketArr[N];

void print(bucketList bucketArr[])
{	
	for(int i = 0; i < N; i++)
	{
		bucketNode* pb = bucketArr[i];
		while(pb)
		{
			printf("%e ", pb->data);
			pb = pb->next;
		}
		printf("\n");
		
	}
	printf("\n");
}

void initBucketList(bucketList bucketArr[])
{
	for(int i = 0; i < N; i++)
	{
		bucketArr[i] = NULL;
	}	
}
bool insertBucketWithSorting(bucketList& bucket, double data)
{//sorting while inserting
	bucketNode* pb = new bucketNode;
	if(NULL == pb)
		return false;
	pb->data = data;
	if(NULL == bucket || pb->data < bucket->data)
	{//insert before the first element
		pb->next = bucket;
		bucket = pb;		
		return true;
	}

	bucketNode* ptemp = bucket;
	bucketNode* ptempn = bucket->next;
	while(ptempn)
	{//insert after the first element(that is in the middle)
		if(pb->data < ptempn->data)
		{
			pb->next = ptempn;
			ptemp->next = pb;
			break;
		}
		ptemp = ptempn;
		ptempn = ptempn->next;
	}
	if(NULL == ptempn)
	{//insert after the last element
		ptemp->next = pb;
		pb->next = NULL;
	}

	return true;
}

void destroyBucketList(bucketList bucketArr[])
{
	for(int i = 0; i < N; i++)
	{
		while(bucketArr[i]!= NULL)
		{
			bucketNode* pb = bucketArr[i];
			bucketArr[i] = bucketArr[i]->next ;
			delete pb;
			pb = NULL;
		}
	}
	
}
void bucketSort(double input[], bucketList bucketArr[],int n)
{	
	for(int i = 1; i <= n; i++)
	{
		int index = (int)floor(input[i]*n);
		insertBucketWithSorting(bucketArr[index], input[i]);
	}	

  }

int main()  
{
	initBucketList(bucketArr);
	bucketSort(input, bucketArr, N);
	print(bucketArr);

	destroyBucketList(bucketArr);

    return 0;  
}

【上篇】关于网站可用性的一点思考
【下篇】ASP 动态数组

作者: yiwugongyipin

该日志由 yiwugongyipin 于11年前发表在综合分类下，最后更新于 2013年02月06日.
转载请注明: 三种线性时间O(n)排序算法 – 计数-基数-桶 – C++实现 | 学步园 +复制链接

抱歉!评论已关闭.

学步园

三种线性时间O(n)排序算法 – 计数-基数-桶 – C++实现

作者: yiwugongyipin

书签

最新文章New

本站推荐

返回首页