WAV Encoding Conversion

September 11, 2013 / General

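First, a note on scope: this post is not about audio signal-processing algorithms. It explains the header information in a WAV file and walks through some simple handling of it.

If you already know all of this, feel free to skip it.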

---------------------------------------------------------------------

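From Baidu Baike:

WAV is an audio file format developed by Microsoft. It is one of the waveform file formats used in multimedia and follows the RIFF (Resource Interchange File Format) standard. The first four bytes of every WAV file are "RIFF". A WAV file consists of two main parts, a file header and a data body. The header is further divided into the RIFF/WAV file identification segment and the audio format description segment, which holds the encoding parameters of the audio stream.

WAV places no hard requirement on how the audio stream is encoded. Besides PCM (Pulse Code Modulation), almost any codec that conforms to the ACM specification can encode a WAV audio stream; MP3 encoding, for example, can also be used inside WAV. As long as the corresponding decoder is installed, such WAV files can be played.

On Windows, PCM-encoded WAV is the best-supported audio format, and virtually all audio software can handle it. Because it can meet fairly high quality requirements, WAV is also a preferred format for music editing and production and is well suited to storing audio source material. PCM-encoded WAV is therefore often used as an intermediate format when converting between other encodings, for example from MP3 to WMA.

WAV files can store data in many formats, but the usual audio encoding is PCM. Because the WAV format originated in the Windows/Intel environment, it stores multi-byte values in little-endian byte order (least significant byte first).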
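To make the byte order concrete, here is a minimal sketch of reading little-endian header fields from raw bytes without depending on the host CPU's byte order. The helper names `read_le16` and `read_le32` are my own for illustration; they are not part of any library.

```c
#include <stdint.h>

/* Read a 16-bit unsigned value stored least-significant byte first. */
uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Read a 32-bit unsigned value stored least-significant byte first. */
uint32_t read_le32(const unsigned char *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

For example, the four bytes 44 AC 00 00 found in a sample-rate field decode to 44100.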

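After reading the encyclopedia entry, you should have a basic idea of what a WAV file is.

Now for the main topic.

1. Converting between WAV encodings

If you use XP or some other version of Windows, the system most likely ships with Sound Recorder. It makes converting the encoding of a WAV file easy and offers many choices of sample rate and of 8-bit or 16-bit samples.

That only helps if a manual conversion is enough, or if you are on a Windows PC, where the conversion can generally also be done in code. On other platforms, such as phones or cameras, you have to implement the conversion yourself.

With a little effort you can find standard conversion code through Google or Baidu.

Here I will take the simplest example: converting a u-law encoded WAV file into a PCM encoded WAV file. The encoding is defined by the G.711 standard, and searching Google Code for g711 turns up source code for the conversion.

Below is the g711 source code. There is similar code for other encodings, such as G.72x.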

/*
 * This source code is a product of Sun Microsystems, Inc. and is provided
 * for unrestricted use.  Users may copy or modify this source code without
 * charge.
 *
 * SUN SOURCE CODE IS PROVIDED AS IS WITH NO WARRANTIES OF ANY KIND INCLUDING
 * THE WARRANTIES OF DESIGN, MERCHANTIBILITY AND FITNESS FOR A PARTICULAR
 * PURPOSE, OR ARISING FROM A COURSE OF DEALING, USAGE OR TRADE PRACTICE.
 *
 * Sun source code is provided with no support and without any obligation on
 * the part of Sun Microsystems, Inc. to assist in its use, correction,
 * modification or enhancement.
 *
 * SUN MICROSYSTEMS, INC. SHALL HAVE NO LIABILITY WITH RESPECT TO THE
 * INFRINGEMENT OF COPYRIGHTS, TRADE SECRETS OR ANY PATENTS BY THIS SOFTWARE
 * OR ANY PART THEREOF.
 *
 * In no event will Sun Microsystems, Inc. be liable for any lost revenue
 * or profits or other special, indirect and consequential damages, even if
 * Sun has been advised of the possibility of such damages.
 *
 * Sun Microsystems, Inc.
 * 2550 Garcia Avenue
 * Mountain View, California  94043
 */

/*
 * g711.c
 *
 * u-law, A-law and linear PCM conversions.
 */
#define SIGN_BIT        (0x80)          /* Sign bit for an A-law byte. */
#define QUANT_MASK      (0xf)           /* Quantization field mask. */
#define NSEGS           (8)             /* Number of A-law segments. */
#define SEG_SHIFT       (4)             /* Left shift for segment number. */
#define SEG_MASK        (0x70)          /* Segment field mask. */

static short seg_end[8] = {0xFF, 0x1FF, 0x3FF, 0x7FF,
                            0xFFF, 0x1FFF, 0x3FFF, 0x7FFF};

/* copy from CCITT G.711 specifications */
unsigned char _u2a[128] = {                     /* u- to A-law conversions */
        1,      1,      2,      2,      3,      3,      4,      4,
        5,      5,      6,      6,      7,      7,      8,      8,
        9,      10,     11,     12,     13,     14,     15,     16,
        17,     18,     19,     20,     21,     22,     23,     24,
        25,     27,     29,     31,     33,     34,     35,     36,
        37,     38,     39,     40,     41,     42,     43,     44,
        46,     48,     49,     50,     51,     52,     53,     54,
        55,     56,     57,     58,     59,     60,     61,     62,
        64,     65,     66,     67,     68,     69,     70,     71,
        72,     73,     74,     75,     76,     77,     78,     79,
        81,     82,     83,     84,     85,     86,     87,     88,
        89,     90,     91,     92,     93,     94,     95,     96,
        97,     98,     99,     100,    101,    102,    103,    104,
        105,    106,    107,    108,    109,    110,    111,    112,
        113,    114,    115,    116,    117,    118,    119,    120,
        121,    122,    123,    124,    125,    126,    127,    128};

unsigned char _a2u[128] = {                     /* A- to u-law conversions */
        1,      3,      5,      7,      9,      11,     13,     15,
        16,     17,     18,     19,     20,     21,     22,     23,
        24,     25,     26,     27,     28,     29,     30,     31,
        32,     32,     33,     33,     34,     34,     35,     35,
        36,     37,     38,     39,     40,     41,     42,     43,
        44,     45,     46,     47,     48,     48,     49,     49,
        50,     51,     52,     53,     54,     55,     56,     57,
        58,     59,     60,     61,     62,     63,     64,     64,
        65,     66,     67,     68,     69,     70,     71,     72,
        73,     74,     75,     76,     77,     78,     79,     79,
        80,     81,     82,     83,     84,     85,     86,     87,
        88,     89,     90,     91,     92,     93,     94,     95,
        96,     97,     98,     99,     100,    101,    102,    103,
        104,    105,    106,    107,    108,    109,    110,    111,
        112,    113,    114,    115,    116,    117,    118,    119,
        120,    121,    122,    123,    124,    125,    126,    127};

static int
search(
        int             val,
        short           *table,
        int             size)
{
        int             i;

        for (i = 0; i < size; i++) {
                if (val <= *table++)
                        return (i);
        }
        return (size);
}

/*
 * linear2alaw() - Convert a 16-bit linear PCM value to 8-bit A-law
 *
 * linear2alaw() accepts an 16-bit integer and encodes it as A-law data.
 *
 *              Linear Input Code       Compressed Code
 *      ------------------------        ---------------
 *      0000000wxyza                    000wxyz
 *      0000001wxyza                    001wxyz
 *      000001wxyzab                    010wxyz
 *      00001wxyzabc                    011wxyz
 *      0001wxyzabcd                    100wxyz
 *      001wxyzabcde                    101wxyz
 *      01wxyzabcdef                    110wxyz
 *      1wxyzabcdefg                    111wxyz
 *
 * For further information see John C. Bellamy's Digital Telephony, 1982,
 * John Wiley & Sons, pps 98-111 and 472-476.
 */
unsigned char
linear2alaw(
        int             pcm_val)        /* 2's complement (16-bit range) */
{
        int             mask;
        int             seg;
        unsigned char   aval;

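        if (pcm_val >= 0) {
                mask = 0xD5;            /* sign (7th) bit = 1 */
        } else {
                mask = 0x55;            /* sign bit = 0 */
                pcm_val = -pcm_val - 8;
                if (pcm_val < 0)        /* inputs -1..-7 would go negative and */
                        pcm_val = 0;    /* corrupt the quantization bits; clamp */
        }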

        /* Convert the scaled magnitude to segment number. */
        seg = search(pcm_val, seg_end, 8);

        /* Combine the sign, segment, and quantization bits. */

        if (seg >= 8)           /* out of range, return maximum value. */
                return (0x7F ^ mask);
        else {
                aval = seg << SEG_SHIFT;
                if (seg < 2)
                        aval |= (pcm_val >> 4) & QUANT_MASK;
                else
                        aval |= (pcm_val >> (seg + 3)) & QUANT_MASK;
                return (aval ^ mask);
        }
}

/*
 * alaw2linear() - Convert an A-law value to 16-bit linear PCM
 *
 */
int
alaw2linear(
        unsigned char   a_val)
{
        int             t;
        int             seg;

        a_val ^= 0x55;

        t = (a_val & QUANT_MASK) << 4;
        seg = ((unsigned)a_val & SEG_MASK) >> SEG_SHIFT;
        switch (seg) {
        case 0:
                t += 8;
                break;
        case 1:
                t += 0x108;
                break;
        default:
                t += 0x108;
                t <<= seg - 1;
        }
        return ((a_val & SIGN_BIT) ? t : -t);
}

#define BIAS            (0x84)          /* Bias for linear code. */

/*
 * linear2ulaw() - Convert a linear PCM value to u-law
 *
 * In order to simplify the encoding process, the original linear magnitude
 * is biased by adding 33 which shifts the encoding range from (0 - 8158) to
 * (33 - 8191). The result can be seen in the following encoding table:
 *
 *      Biased Linear Input Code        Compressed Code
 *      ------------------------        ---------------
 *      00000001wxyza                   000wxyz
 *      0000001wxyzab                   001wxyz
 *      000001wxyzabc                   010wxyz
 *      00001wxyzabcd                   011wxyz
 *      0001wxyzabcde                   100wxyz
 *      001wxyzabcdef                   101wxyz
 *      01wxyzabcdefg                   110wxyz
 *      1wxyzabcdefgh                   111wxyz
 *
 * Each biased linear code has a leading 1 which identifies the segment
 * number. The value of the segment number is equal to 7 minus the number
 * of leading 0's. The quantization interval is directly available as the
 * four bits wxyz. The trailing bits (a - h) are ignored.
 *
 * Ordinarily the complement of the resulting code word is used for
 * transmission, and so the code word is complemented before it is returned.
 *
 * For further information see John C. Bellamy's Digital Telephony, 1982,
 * John Wiley & Sons, pps 98-111 and 472-476.
 */
unsigned char
linear2ulaw(
        int             pcm_val)        /* 2's complement (16-bit range) */
{
        int             mask;
        int             seg;
        unsigned char   uval;

        /* Get the sign and the magnitude of the value. */
        if (pcm_val < 0) {
                pcm_val = BIAS - pcm_val;
                mask = 0x7F;
        } else {
                pcm_val += BIAS;
                mask = 0xFF;
        }

        /* Convert the scaled magnitude to segment number. */
        seg = search(pcm_val, seg_end, 8);

        /*
         * Combine the sign, segment, quantization bits;
         * and complement the code word.
         */
        if (seg >= 8)           /* out of range, return maximum value. */
                return (0x7F ^ mask);
        else {
                uval = (seg << 4) | ((pcm_val >> (seg + 3)) & 0xF);
                return (uval ^ mask);
        }

}

/*
 * ulaw2linear() - Convert a u-law value to 16-bit linear PCM
 *
 * First, a biased linear code is derived from the code word. An unbiased
 * output can then be obtained by subtracting 33 from the biased code.
 *
 * Note that this function expects to be passed the complement of the
 * original code word. This is in keeping with ISDN conventions.
 */
int
ulaw2linear(
        unsigned char   u_val)
{
        int             t;

        /* Complement to obtain normal u-law value. */
        u_val = ~u_val;

        /*
         * Extract and bias the quantization bits. Then
         * shift up by the segment number and subtract out the bias.
         */
        t = ((u_val & QUANT_MASK) << 3) + BIAS;
        t <<= ((unsigned)u_val & SEG_MASK) >> SEG_SHIFT;

        return ((u_val & SIGN_BIT) ? (BIAS - t) : (t - BIAS));
}

/* A-law to u-law conversion */
unsigned char
alaw2ulaw(
        unsigned char   aval)
{
        aval &= 0xff;
        return ((aval & 0x80) ? (0xFF ^ _a2u[aval ^ 0xD5]) :
            (0x7F ^ _a2u[aval ^ 0x55]));
}

/* u-law to A-law conversion */
unsigned char
ulaw2alaw(
        unsigned char   uval)
{
        uval &= 0xff;
        return ((uval & 0x80) ? (0xD5 ^ (_u2a[0xFF ^ uval] - 1)) :
            (0x55 ^ (_u2a[0x7F ^ uval] - 1)));
}

With this file, the sample-data side of the conversion is covered; it can handle all of it.
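Since a u-law byte has only 256 possible values, one common variation is to precompute every decoded value once and then convert whole buffers by table lookup. The sketch below is my own illustration, not part of g711.c: it repeats the arithmetic of `ulaw2linear()` from the listing above so that it compiles on its own, and the names `mulaw_table`, `mulaw_table_init`, and `mulaw_decode_buffer` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static int16_t mulaw_table[256];

/* Same arithmetic as ulaw2linear(): complement, rebuild the biased
 * magnitude, shift by the segment number, then remove the bias. */
static int16_t mulaw_decode(unsigned char u)
{
    int t;

    u = (unsigned char)~u;
    t = ((u & 0x0F) << 3) + 0x84;
    t <<= (u & 0x70) >> 4;
    return (int16_t)((u & 0x80) ? (0x84 - t) : (t - 0x84));
}

/* Fill the table once, before any decoding. */
void mulaw_table_init(void)
{
    for (int i = 0; i < 256; i++)
        mulaw_table[i] = mulaw_decode((unsigned char)i);
}

/* Decode n u-law bytes into n 16-bit linear samples. */
void mulaw_decode_buffer(const unsigned char *in, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = mulaw_table[in[i]];
}
```

Call `mulaw_table_init()` once at startup; after that, decoding a block read from the data chunk is a single call to `mulaw_decode_buffer()`.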

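Next comes the header. With the raw data alone, no player can tell which encoding or sample rate this pile of data uses, so it cannot play the file.

Descriptions of the WAV file header are also easy to find online; one fairly detailed write-up can be found on Baidu Wenku.

Comparing the various code samples and header documents online, I found that they differ from one another in all sorts of ways.

If you are interested, search for the WAV header format and see for yourself.

According to the official description, WAV files follow the RIFF file format standard:

Below is a diagram I drew from my own understanding.

[Figure: the author's diagram of the RIFF/WAVE chunk layout]

Following that diagram, the WAV header can be laid out as follows: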

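RootID: 4 bytes, "RIFF"

Size: 4 bytes, the number of bytes that follow this field (file size minus 8)

Data: everything that follows, that is, Format plus all of the chunks below.

Format: 4 bytes, "WAVE"

ChunkID: 4 bytes, "fmt " (with a trailing space)

ChunkSize: 4 bytes, size of this chunk's Data part

ChunkData:

audioFormat: 2 bytes, form of compression (0x01 is PCM, 0x07 is u-law)

numChannels: 2 bytes, number of channels

sampleRate: 4 bytes, sampling frequency: 8000, 44100, etc.

byteRate: 4 bytes, value = SampleRate * NumChannels * BitsPerSample / 8

blockAlign: 2 bytes, value = NumChannels * BitsPerSample / 8

bitsPerSample: 2 bytes. For PCM the fmt chunk ends here (16 bytes); for u-law a further 2-byte extension-size field (cbSize) follows, so these last fields occupy 4 bytes and the fmt chunk is 18 bytes.

ChunkID: 4 bytes, "data"

ChunkSize: 4 bytes, size of this chunk's Data part

ChunkData: the audio data.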
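The layout above can be sketched as a function that fills in the simplest 44-byte PCM header (RIFF, fmt, and data chunks only). This is a minimal illustration under my own assumptions: the function and helper names are hypothetical, and every field is written byte by byte so the result does not depend on struct padding or host byte order.

```c
#include <stdint.h>
#include <string.h>

static void put_le16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)(v >> 8);
}

static void put_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Build a minimal PCM header for data_bytes bytes of sample data. */
void build_pcm_header(unsigned char h[44], uint32_t sample_rate,
                      uint16_t channels, uint16_t bits, uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(channels * bits / 8);

    memcpy(h, "RIFF", 4);
    put_le32(h + 4, 36 + data_bytes);      /* bytes after this field */
    memcpy(h + 8, "WAVE", 4);
    memcpy(h + 12, "fmt ", 4);
    put_le32(h + 16, 16);                  /* fmt payload size for PCM */
    put_le16(h + 20, 1);                   /* audioFormat: PCM */
    put_le16(h + 22, channels);
    put_le32(h + 24, sample_rate);
    put_le32(h + 28, sample_rate * block_align);   /* byteRate */
    put_le16(h + 32, block_align);
    put_le16(h + 34, bits);
    memcpy(h + 36, "data", 4);
    put_le32(h + 40, data_bytes);
}
```

For 8000 Hz mono 16-bit audio with 1000 bytes of samples, this yields a RIFF size of 1036, a byteRate of 16000, and a blockAlign of 2.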

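This is the simplest form of WAV header. Some files carry additional chunks between fmt and data; a u-law file, for example, has a "fact" chunk, which records the number of samples in the file (for 8-bit mono u-law this equals the size of the data chunk in bytes).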

The same is true of PCM-encoded files, with slight differences.
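Because extra chunks may sit between fmt and data, a reader should not assume the data chunk starts at a fixed offset. The sketch below walks the chunk list in a file image held in memory and returns the offset of a chunk's payload. It is an illustration under my own assumptions (the name `find_chunk` and its interface are hypothetical), and it relies on the RIFF rule that an odd-sized payload is followed by one pad byte.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Search buf (a whole WAV file image of len bytes) for the chunk named id.
 * On success store the payload size in *size and return the payload offset;
 * return -1 if the chunk is missing or the image is truncated. */
long find_chunk(const unsigned char *buf, size_t len,
                const char *id, uint32_t *size)
{
    size_t pos = 12;                       /* skip "RIFF", size, "WAVE" */

    while (pos + 8 <= len) {
        const unsigned char *h = buf + pos;
        uint32_t n = (uint32_t)h[4] | ((uint32_t)h[5] << 8)
                   | ((uint32_t)h[6] << 16) | ((uint32_t)h[7] << 24);

        if (n > len - pos - 8)             /* payload runs past the end */
            return -1;
        if (memcmp(h, id, 4) == 0) {
            *size = n;
            return (long)(pos + 8);
        }
        pos += 8 + (size_t)n + (n & 1);    /* skip payload and pad byte */
    }
    return -1;
}
```

Calling `find_chunk(image, image_len, "data", &n)` then locates the sample data whether or not a fact chunk or other chunks come first.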

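So when converting between encodings, pay attention to converting the header information as well. For example, an 8-bit 44.1 kHz u-law WAV file becomes a 16-bit 44.1 kHz PCM WAV file after conversion.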
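The header arithmetic for this case is small enough to write out. The sketch below assumes 8-bit u-law input and 16-bit PCM output, with any chunks other than fmt and data copied through unchanged; the struct and function names are hypothetical. Each sample grows from one byte to two, so the data chunk doubles, and the fmt chunk shrinks to the 16-byte PCM form.

```c
#include <stdint.h>

struct wav_sizes {
    uint32_t riff_size;    /* value of the Size field after "RIFF" */
    uint32_t fmt_size;     /* payload size of the fmt chunk */
    uint32_t data_size;    /* payload size of the data chunk */
    uint32_t byte_rate;
    uint16_t block_align;
};

/* Derive the PCM16 header values from the values read out of a u-law file. */
struct wav_sizes pcm16_sizes_from_mulaw(uint32_t riff_size, uint32_t fmt_size,
                                        uint32_t data_size,
                                        uint32_t sample_rate, uint16_t channels)
{
    struct wav_sizes out;

    out.fmt_size = 16;                              /* PCM has no cbSize field */
    out.data_size = data_size * 2;                  /* 1 byte -> 2 bytes per sample */
    out.riff_size = riff_size - fmt_size + out.fmt_size + data_size;
    out.block_align = (uint16_t)(channels * 2);     /* 16 bits per sample */
    out.byte_rate = sample_rate * out.block_align;
    return out;
}
```

As a check, a mono 8000 Hz u-law file with an 18-byte fmt chunk, a 4-byte fact chunk, and 1000 bytes of data has a RIFF size of 1050; after conversion the RIFF size is 2048 and the data chunk holds 2000 bytes.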

Here is one way to do it, for reference.
/* 1 - 3 */
static int ConvertFromULAW2PCM ( FILE * input, FILE * output) {
	int Ret = Ret_OK;
	int i = 0, j = 0;

	WavHeader_ULAW sULAWHead;
	WavHeader_PCM sPCMHead;
	unsigned char tULaw;
	short tPcm;
	//Extend wav format
	unsigned char *pBuffer = NULL; 
	WavHead_ChunkHead * pSChunkHead = NULL;
	int iBufferSize = 0;
	int iChunkCount = 0;

	//malloc the memory to save the extend info
	pSChunkHead = (WavHead_ChunkHead *) malloc ( sizeof(WavHead_ChunkHead) * ExtendChunkSize );
	pBuffer = (unsigned char *) malloc ( sizeof(unsigned char) * ExtendChunkDataSize);
	//malloc check
	if ( NULL != pBuffer && NULL != pSChunkHead) {
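		// read the fixed part of the header: RIFF, fmt, and the first chunk head after fmt
		if (1 != fread(&sULAWHead, sizeof(WavHeader_ULAW), 1, input)) {
			printf(" the file is too short to hold a wav header --->fail!\n");
			Ret = Ret_Fail;
		}
#if (Debug_Show <= Debug_Level5)
		printf("-----------Read File's Stand info : \n");
		showWavHeadRiff( &sULAWHead.mWavRiff );
		showWavHeadFmt( (WavHead_PCM_Fmt *)&sULAWHead.mWavFmt );
		printf("-----------Read File's Extend info : \n");
#endif
		// collect every chunk that sits between fmt and data
		for (i = 0; Ret_OK == Ret && i < ExtendChunkSize && sULAWHead.mWavData.subchunk2ID != WAV_DATA; i++) {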
			// copy the chunk info
			pSChunkHead[i].chunkID = sULAWHead.mWavData.subchunk2ID;
			pSChunkHead[i].chunkSize = sULAWHead.mWavData.subchunk2Size;
			// copy the chunk data
			if ( (iBufferSize + pSChunkHead[i].chunkSize) <= ExtendChunkDataSize) {
				fread(pBuffer + iBufferSize, sizeof(char) * pSChunkHead[i].chunkSize, 1, input);
			} else {
				printf(" the extend info out of memory --->fail!\n");
				Ret = Ret_Fail;
				break;
			}
#if (Debug_Show <= Debug_Level5)	
			showWavHeadChunk(&pSChunkHead[i], &pBuffer[iBufferSize]);
#endif
			iBufferSize += pSChunkHead[i].chunkSize;
			iChunkCount++;
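			// read next chunk head; stop if the file ends before a data chunk is found
			if (1 != fread(&sULAWHead.mWavData, sizeof(WavHead_Data), 1, input)) {
				printf(" no data chunk found --->fail!\n");
				Ret = Ret_Fail;
				break;
			}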
		}
#if (Debug_Show <= Debug_Level1)
		showWavHeadChunkData(pBuffer, iBufferSize);
#endif
		//---------------file head convert-------------------
		if (Ret_OK == Ret) {
			sPCMHead.mWavRiff.chunkID = sULAWHead.mWavRiff.chunkID;
			sPCMHead.mWavRiff.format = sULAWHead.mWavRiff.format;
			sPCMHead.mWavRiff.chunkSize = sULAWHead.mWavRiff.chunkSize + sULAWHead.mWavData.subchunk2Size - 2;
			//copy the common Fmt info
			copyWavHeadFmt(&sPCMHead.mWavFmt.commFmt, &sULAWHead.mWavFmt.commFmt);
			sPCMHead.mWavFmt.commFmt.subchunk1Size = 0x10;	// -2BYTE
			sPCMHead.mWavFmt.commFmt.audioFormat = 0x01;	// pcm code
			sPCMHead.mWavFmt.bitsPerSample = 16;			// 16 bit pcm
			sPCMHead.mWavFmt.commFmt.byteRate = sPCMHead.mWavFmt.commFmt.sampleRate * sPCMHead.mWavFmt.bitsPerSample * sPCMHead.mWavFmt.commFmt.numChannels / 8;
			sPCMHead.mWavFmt.commFmt.blockAlign = sPCMHead.mWavFmt.bitsPerSample * sPCMHead.mWavFmt.commFmt.numChannels / 8;
			sPCMHead.mWavData.subchunk2ID = sULAWHead.mWavData.subchunk2ID;
			sPCMHead.mWavData.subchunk2Size = sULAWHead.mWavData.subchunk2Size + sULAWHead.mWavData.subchunk2Size;
#if (Debug_Show <= Debug_Level1)
			printf("-----------Write File's Stand info : \n");
			showWavHeadRiff( &sPCMHead.mWavRiff );
			showWavHeadFmt( &sPCMHead.mWavFmt );
#endif
			//---------------file head write-------------------
			fwrite(&sPCMHead.mWavRiff, sizeof(WavHead_Riff), 1, output);
			fwrite(&sPCMHead.mWavFmt, sizeof(WavHead_PCM_Fmt), 1, output);
			for (i = 0, j = 0; i < iChunkCount && j < iBufferSize; i++) {
				fwrite(&pSChunkHead[i], sizeof(WavHead_ChunkHead), 1, output);
				fwrite(pBuffer + j, sizeof(unsigned char) * pSChunkHead[i].chunkSize, 1, output);
				j += pSChunkHead[i].chunkSize;
			}
			fwrite(&sPCMHead.mWavData, sizeof(WavHead_Data), 1, output);
			printf(" write extend info --->OK!\n");
			
			//---------------convert the data------------------
			// one u-law byte in, one 16-bit PCM sample out
			while ( 0 != fread(&tULaw, sizeof(unsigned char), 1, input) ) {
				tPcm = ulaw2linear(tULaw);
				// note: this writes tPcm in host byte order, which matches WAV only on little-endian machines
				fwrite(&tPcm, sizeof(short), 1, output);
			}
		} else {
			// to return the value.
		}
	} else {
		Ret = Ret_Fail;
	}

	//---------------free the memory------------------
	free(pSChunkHead);
	free(pBuffer);
	return Ret;
}

Finally, one point to watch: when converting the encoded data, be sure to read and write strictly byte by byte.
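One way to follow that advice on the output side is to pack each 16-bit sample into two bytes explicitly and then write the byte buffer, rather than writing a `short` directly. This is a minimal sketch; the name `pack_pcm16_le` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack n 16-bit samples into 2*n bytes, low byte first, as WAV expects. */
void pack_pcm16_le(const int16_t *in, size_t n, unsigned char *out)
{
    for (size_t i = 0; i < n; i++) {
        uint16_t u = (uint16_t)in[i];      /* two's-complement bit pattern */

        out[2 * i]     = (unsigned char)(u & 0xFF);
        out[2 * i + 1] = (unsigned char)(u >> 8);
    }
}
```

The packed buffer can then be written with `fwrite(out, 1, 2 * n, fp)`, which produces the same file on little-endian and big-endian machines.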

Here is also a list of encoding format tags:

#define wave_format_g723_adpcm               0x0014         /* antex electronics corporation */
#define wave_format_antex_adpcme              0x0033         /* antex electronics corporation */
#define wave_format_g721_adpcm               0x0040         /* antex electronics corporation */
#define wave_format_aptx                0x0025         /* audio processing technology */
#define wave_format_audiofile_af36              0x0024         /* audiofile, inc. */
#define wave_format_audiofile_af10              0x0026         /* audiofile, inc. */
#define wave_format_control_res_vqlpc             0x0034         /* control resources limited */
#define wave_format_control_res_cr10             0x0037         /* control resources limited */
#define wave_format_creative_adpcm              0x0200         /* creative labs, inc */
#define wave_format_dolby_ac2               0x0030         /* dolby laboratories */
#define wave_format_dspgroup_truespeech             0x0022         /* dsp group, inc */
#define wave_format_digistd                0x0015         /* dsp solutions, inc. */
#define wave_format_digifix                0x0016         /* dsp solutions, inc. */
#define wave_format_digireal               0x0035         /* dsp solutions, inc. */
#define wave_format_digiadpcm               0x0036         /* dsp solutions, inc. */
#define wave_format_echosc1                0x0023         /* echo speech corporation */
#define wave_format_fm_towns_snd              0x0300         /* fujitsu corp. */
#define wave_format_ibm_cvsd               0x0005         /* ibm corporation */
#define wave_format_oligsm                0x1000         /* ing c. olivetti & c., s.p.a. */
#define wave_format_oliadpcm               0x1001         /* ing c. olivetti & c., s.p.a. */
#define wave_format_olicelp                0x1002         /* ing c. olivetti & c., s.p.a. */
#define wave_format_olisbc                0x1003         /* ing c. olivetti & c., s.p.a. */
#define wave_format_oliopr                0x1004         /* ing c. olivetti & c., s.p.a. */
#define wave_format_ima_adpcm               (wave_format_dvi_adpcm)     /* intel corporation */
#define wave_format_dvi_adpcm               0x0011         /* intel corporation */
#define wave_format_unknown                0x0000         /* microsoft corporation */
#define wave_format_pcm                 0x0001         /* microsoft corporation */
#define wave_format_adpcm                0x0002         /* microsoft corporation */
#define wave_format_alaw                0x0006         /* microsoft corporation */
#define wave_format_mulaw                0x0007         /* microsoft corporation */
#define wave_format_gsm610                0x0031         /* microsoft corporation */
#define wave_format_mpeg                0x0050         /* microsoft corporation */
#define wave_format_nms_vbxadpcm              0x0038         /* natural microsystems */
#define wave_format_oki_adpcm               0x0010         /* oki */
#define wave_format_sierra_adpcm              0x0013         /* sierra semiconductor corp */
#define wave_format_sonarc                0x0021         /* speech compression */
#define wave_format_mediaspace_adpcm             0x0012         /* videologic */
#define wave_format_yamaha_adpcm              0x0020         /* yamaha corporation of america */

Below is the test code I wrote, which converts WAV files between u-law and PCM encoding in both directions.

http://download.csdn.net/detail/gqjjqg/3685230


Author: yushegen

Posted by yushegen under General; last updated September 11, 2013.