Compressing a WAV file to MP3 (code implementation)

March 3, 2013

1. Introduction
First, I don't mean to explain how the mp3 algorithm works. My goal is to show how to use an
already existing encoder with BCB (Borland C++ Builder).

2. Choosing the mp3 encoder
There are tons of mp3 encoders. Some are free, others are not. Some are fast but produce an
awful result. Others are slow but give excellent audio quality. The ideal would be a free,
reasonably fast encoder that also gives high audio quality.
Good news: this pearl exists, but we have to look for it in the GNU world. There is a GNU
project called LAME, for "LAME Ain't an MP3 Encoder", released under the GPL license.
The official web site of the LAME project is http://www.mp3dev.org/mp3/
Moreover, as it is a GNU project, we have access to the source, and there is a version compiled
for Win32 as a DLL.
Among all the other encoders, I want to mention two. The first is FRAUNHOFER, because it is a
fast and excellent encoder: http://www.iis.fhg.de/ but it is not free.
The second is the encoder from Xing Tech: http://www.xingtech.com/ It is very fast, but the
audio result is awful, so don't use it unless speed is all that matters to you.
Note: The LAME encoder has a limitation. The sample rate must be 32000, 44100 or 48000 Hz.

3. Some information about the WAV format
A wav file is just a collection of chunks. There is a format chunk, which contains all the
information about the samples: for instance the sample rate, the number of channels (mono or
stereo) and the number of bits per sample. There is also a chunk containing the data, in other
words all the samples. At the start of the file, there are 12 characters indicating that the
file is a wav file.
The two chunks described above must be present in the file.
There may be other chunks, but we simply ignore them; they are not needed for our purpose. If
you want to know more about wav files, take a look at http://www.wotsit.org/ for a complete
description.
The format chunk:

struct FormatChunk
{
    char            chunkID[4];
    long            chunkSize;
    short           wFormatTag;
    unsigned short  wChannels;
    unsigned long   dwSamplesPerSec;
    unsigned long   dwAvgBytesPerSec;
    unsigned short  wBlockAlign;
    unsigned short  wBitsPerSample;
    // Note: there may be additional fields here, depending upon wFormatTag.
};

Above, you can see the struct representing the format chunk. The chunkID is always "fmt " with
a trailing space (4 characters). It identifies the chunk; every other chunk has such an ID as
well. The chunkSize field contains the size of the chunk in bytes, not counting the chunkID and
chunkSize fields themselves.
The format chunk must be the first chunk in the file.
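
For uncompressed PCM data (wFormatTag equal to 1), two fields of the format chunk can be derived from the others, which gives a cheap sanity check when reading a file. The sketch below illustrates the relationship; the PcmFormat struct and both function names are hypothetical helpers, not part of the tutorial's code or of any library.

```cpp
#include <cstdint>

// Fields of the format chunk that describe uncompressed PCM audio.
// (Hypothetical helper type, introduced only for this illustration.)
struct PcmFormat
{
    std::uint16_t channels;        // wChannels: 1 = mono, 2 = stereo
    std::uint32_t samplesPerSec;   // dwSamplesPerSec: e.g. 44100
    std::uint16_t bitsPerSample;   // wBitsPerSample: e.g. 16
};

// wBlockAlign: size in bytes of one sample frame (one sample for every channel).
inline std::uint16_t blockAlign(const PcmFormat& f)
{
    // Each sample occupies a whole number of bytes, hence the rounding up.
    return static_cast<std::uint16_t>(f.channels * ((f.bitsPerSample + 7) / 8));
}

// dwAvgBytesPerSec: number of bytes needed to store one second of audio.
inline std::uint32_t avgBytesPerSec(const PcmFormat& f)
{
    return f.samplesPerSec * blockAlign(f);
}
```

For CD-quality audio (2 channels, 44100 Hz, 16 bits) this gives a block alignment of 4 bytes and 176400 bytes per second.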

The data chunk :
struct Chunk
{
    char chunkID[4];
    long chunkSize;
};
In the case of the data chunk, the chunkID contains "data". The chunkSize field contains the
size of the raw data (the samples). The data begins just after chunkSize.
So, when we read a wav file, all we have to do is:
- read the first 12 characters to check that it is really a wav file.
- read the format chunk into a struct similar to the FormatChunk struct.
- skip the extra parameters in the format chunk, if any.
- skip all chunks that are not the data chunk.
- find the data chunk, read the raw data and carry on with the encoding.

4. Importing the DLL
The DLL used for the encoding is called lame_enc.dll.
Unfortunately, this DLL was built with VC 6 from Microsoft. If we just create a lib file from
the DLL and try to import the library in BCB, we'll get an 'Unresolved external' error at link
time for each function we try to use. Because of the way the functions are declared, BCB
expects function names with a leading underscore, and the names exported by the DLL don't have
one.
To resolve this issue, we must first create a def file from our DLL. Open a console window
and type:

IMPDEF lame_enc.def lame_enc.dll

Open the lame_enc.def file with an editor (Notepad for instance) and modify it as shown here.
The added lines create aliases for the functions:
    LIBRARY     LAME_ENC.DLL
    EXPORTS
    _beCloseStream = beCloseStream
    _beDeinitStream = beDeinitStream
    _beEncodeChunk = beEncodeChunk
    _beInitStream = beInitStream
    _beVersion = beVersion
    _beWriteVBRHeader = beWriteVBRHeader
    beCloseStream                  @4
    beDeinitStream                 @3
    beEncodeChunk                  @2
    beInitStream                   @1
    beVersion                      @5
    beWriteVBRHeader               @6
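
When a DLL exports many functions, typing the alias lines by hand becomes error-prone. As a rough sketch, the lines can be generated from the list of exported names; makeAliasLines is a hypothetical helper written for this illustration, and it only produces the "_name = name" lines, to be pasted under EXPORTS above the entries IMPDEF already wrote.

```cpp
#include <string>
#include <vector>

// Builds one alias line per exported function. Each alias adds the leading
// underscore that the Borland linker looks for.
// `exports` holds the names exactly as they appear in the .def file.
inline std::string makeAliasLines(const std::vector<std::string>& exports)
{
    std::string lines;
    for(std::size_t i = 0; i < exports.size(); ++i)
        lines += "    _" + exports[i] + " = " + exports[i] + "\n";
    return lines;
}
```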

Now, we can create the lib file from our def file. We'll import that lib file in our project.
To create the lib file, type :
implib lame_enc.lib lame_enc.def

5. The code
First, you have to import the library into your project. Next, include the header file of the
DLL in your unit. In the DLL header file, you have to add extern "C" in front of every exported
function.
Here is the header with the modifications (lame_enc.h):
/*  bladedll.h
    +++++++++++++++++++++++++++
    +   Blade's Encoder DLL   +
    +++++++++++++++++++++++++++

    ------------------------------------------------------
  - Version 1.00 (7 November 1998) - Jukka Poikolainen -
  ------------------------------------------------------
*/
#ifndef ___BLADEDLL_H_INCLUDED___
#define ___BLADEDLL_H_INCLUDED___

#pragma pack(push)
#pragma pack(1)

/* encoding formats */

#define BE_CONFIG_MP3 0
#define BE_CONFIG_LAME 256

/* type definitions */

typedef    unsigned long      HBE_STREAM;
typedef    HBE_STREAM         *PHBE_STREAM;
typedef    unsigned long      BE_ERR;

/* error codes */

#define BE_ERR_SUCCESSFUL                0x00000000
#define BE_ERR_INVALID_FORMAT            0x00000001
#define BE_ERR_INVALID_FORMAT_PARAMETERS 0x00000002
#define BE_ERR_NO_MORE_HANDLES           0x00000003
#define BE_ERR_INVALID_HANDLE            0x00000004
#define BE_ERR_BUFFER_TOO_SMALL          0x00000005

/* other constants */

#define BE_MAX_HOMEPAGE 256

/* format specific variables */

#define BE_MP3_MODE_STEREO      0
#define BE_MP3_MODE_JSTEREO     1
#define BE_MP3_MODE_DUALCHANNEL 2
#define BE_MP3_MODE_MONO        3

#define MPEG1 1
#define MPEG2 0

#ifdef _BLADEDLL
#undef FLOAT
    #include <Windows.h>
#endif

enum  MPEG_QUALITY
{
    NORMAL_QUALITY = 0,
    LOW_QUALITY,
    HIGH_QUALITY,
    VOICE_QUALITY
};

typedef struct
{
    DWORD  dwConfig;      // BE_CONFIG_XXXXX
                          // Currently only BE_CONFIG_MP3 is supported
    union
    {
        struct
        {
            DWORD  dwSampleRate;  // 48000, 44100 and 32000 allowed
            BYTE   byMode;        // BE_MP3_MODE_STEREO, BE_MP3_MODE_DUALCHANNEL
                                  // BE_MP3_MODE_MONO
            WORD   wBitrate;      // 32, 40, 48, 56, 64, 80, 96, 112, 128,
                                  // 160, 192, 224, 256 and 320 allowed
            BOOL  bPrivate;
            BOOL  bCRC;
            BOOL  bCopyright;
            BOOL  bOriginal;
        }mp3;                     // BE_CONFIG_MP3
        struct
        {
            // STRUCTURE INFORMATION
            DWORD      dwStructVersion;
            DWORD      dwStructSize;
            // BASIC ENCODER SETTINGS
            DWORD      dwSampleRate;   // ALLOWED SAMPLERATE VALUES DEPENDS
                                       // ON dwMPEGVersion
            DWORD      dwReSampleRate; // DOWNSAMPLERATE, 0=ENCODER DECIDES
            INT        nMode;          // BE_MP3_MODE_STEREO, BE_MP3_MODE_DUALCHANNEL
                                       // BE_MP3_MODE_MONO
            DWORD      dwBitrate;      // CBR bitrate, VBR min bitrate
            DWORD      dwMaxBitrate;   // CBR ignored, VBR Max bitrate
            MPEG_QUALITY  nQuality;    // Quality setting (NORMAL,HIGH,LOW,VOICE)
            DWORD      dwMpegVersion;  // MPEG-1 OR MPEG-2
            DWORD      dwPsyModel;     // FUTURE USE, SET TO 0
            DWORD      dwEmphasis;     // FUTURE USE, SET TO 0

            // BIT STREAM SETTINGS
            BOOL      bPrivate;        // Set Private Bit (TRUE/FALSE)
            BOOL      bCRC;            // Insert CRC (TRUE/FALSE)
            BOOL      bCopyright;      // Set Copyright Bit (TRUE/FALSE)
            BOOL      bOriginal;       // Set Original Bit (TRUE/FALSE)
      
            // VBR STUFF
            BOOL      bWriteVBRHeader; // WRITE XING VBR HEADER (TRUE/FALSE)
            BOOL      bEnableVBR;      // USE VBR ENCODING (TRUE/FALSE)
            INT        nVBRQuality;    // VBR QUALITY 0..9
            BYTE      btReserved[255]; // FUTURE USE, SET TO 0
        }LHV1;                         // LAME header version 1

        struct
        {
            DWORD  dwSampleRate;
            BYTE  byMode;
            WORD  wBitrate;
            BYTE  byEncodingMethod;
        }aac;
    }format; 
}BE_CONFIG;

struct BE_VERSION
{
    // BladeEnc DLL Version number
    BYTE  byDLLMajorVersion;
    BYTE  byDLLMinorVersion;
    // BladeEnc Engine Version Number
    BYTE  byMajorVersion;
    BYTE  byMinorVersion;
    // DLL Release date
    BYTE  byDay;
    BYTE  byMonth;
    WORD  wYear;
    // BladeEnc Homepage URL
    CHAR  zHomepage[BE_MAX_HOMEPAGE + 1];
};

#ifndef _BLADEDLL

typedef BE_ERR  (*BEINITSTREAM)    (BE_CONFIG*, PDWORD, PDWORD, PHBE_STREAM);
typedef BE_ERR  (*BEENCODECHUNK)   (HBE_STREAM, DWORD, PSHORT, PBYTE, PDWORD);
typedef BE_ERR  (*BEDEINITSTREAM)  (HBE_STREAM, PBYTE, PDWORD);
typedef BE_ERR  (*BECLOSESTREAM)   (HBE_STREAM);
typedef VOID    (*BEVERSION)       (BE_VERSION*);

#define TEXT_BEINITSTREAM   "beInitStream"
#define TEXT_BEENCODECHUNK  "beEncodeChunk"
#define TEXT_BEDEINITSTREAM "beDeinitStream"
#define TEXT_BECLOSESTREAM  "beCloseStream"
#define TEXT_BEVERSION      "beVersion"

/*
BE_ERR beInitStream(BE_CONFIG *beConfig, PDWORD dwSamples, PDWORD dwBufferSize,
    PHBE_STREAM phbeStream);
BE_ERR beEncodeChunk(HBE_STREAM hbeStream, DWORD nSamples, PSHORT pSamples, PBYTE pOutput,
    PDWORD pdwOutput);
BE_ERR beDeinitStream(HBE_STREAM hbeStream, PBYTE pOutput, PDWORD pdwOutput);
BE_ERR beCloseStream(HBE_STREAM hbeStream);
VOID beVersion(BE_VERSION *beVersion);
*/

#else

extern "C" __declspec(dllexport) BE_ERR beInitStream(BE_CONFIG *beConfig,
    PDWORD dwSamples, PDWORD dwBufferSize, PHBE_STREAM phbeStream);
extern "C" __declspec(dllexport) BE_ERR  beEncodeChunk(HBE_STREAM hbeStream,
    DWORD nSamples, PSHORT pSamples, PBYTE pOutput, PDWORD pdwOutput);
extern "C" __declspec(dllexport) BE_ERR  beDeinitStream(HBE_STREAM hbeStream,
    PBYTE pOutput, PDWORD pdwOutput);
extern "C" __declspec(dllexport) BE_ERR  beCloseStream(HBE_STREAM hbeStream);
extern "C" __declspec(dllexport) VOID    beVersion(BE_VERSION *beVersion);

#endif
#pragma pack(pop)
#endif

As you can see in the header above, you have to add #define _BLADEDLL to your .cpp file before
including the header.

Below, you'll find the code of a little application which takes a wav file as input and encodes
it to mp3. I won't explain it further because the code is straightforward and commented. It is
not very elegant, but it is just meant to show how to use the DLL.

File Format.h
//---------------------------------------------------------------------------
#ifndef Format_H
#define Format_H
//---------------------------------------------------------------------------
struct FormatChunk
{
    char            chunkID[4];
    long            chunkSize;
    short           wFormatTag;
    unsigned short  wChannels;
    unsigned long   dwSamplesPerSec;
    unsigned long   dwAvgBytesPerSec;
    unsigned short  wBlockAlign;
    unsigned short  wBitsPerSample;
    // Note: there may be additional fields here, depending upon wFormatTag.
};

// This is the start ID of a Wave file
// it must contain 'RIFF' and 'WAVE'
char startID[12];

// contains the chunk id ('data', 'cue ' ...) and the chunk size
struct Chunk
{
    char            chunkID[4];
    long            chunkSize;
};

// a pointer to the samples in the data chunk
unsigned char     *WaveformData;

//---------------------------------------------------------------------------
#endif

File Unit1.h
//---------------------------------------------------------------------------
#ifndef Unit1H
#define Unit1H
//---------------------------------------------------------------------------
#include <Classes.hpp>
#include <Controls.hpp>
#include <StdCtrls.hpp>
#include <Forms.hpp>
#include <Dialogs.hpp>
//---------------------------------------------------------------------------
class TForm1 : public TForm
{
__published:  // IDE-managed Components
    TOpenDialog *OpenDialog1;
    TEdit *FileEdit;
    TLabel *Label1;
    TButton *Browse;
    TButton *Encode;
    void __fastcall BrowseClick(TObject *Sender);
    void __fastcall EncodeClick(TObject *Sender);
private:  // User declarations
    AnsiString OutputFileName;
public:    // User declarations
    __fastcall TForm1(TComponent* Owner);
};
//---------------------------------------------------------------------------
extern PACKAGE TForm1 *Form1;
//---------------------------------------------------------------------------
#endif

File Unit1.cpp
//---------------------------------------------------------------------------
#include <vcl.h>
#pragma hdrstop

#include <fstream>
#include <iostream>
#include "Unit1.h"

#define _BLADEDLL           // Don't forget it
#include "lame_enc.h"
#include "format.h"
//---------------------------------------------------------------------------
#pragma package(smart_init)
#pragma resource "*.dfm"
TForm1 *Form1;
//---------------------------------------------------------------------------
__fastcall TForm1::TForm1(TComponent* Owner)
        : TForm(Owner)
{
}
//---------------------------------------------------------------------------
void __fastcall TForm1::BrowseClick(TObject *Sender)
{
    OpenDialog1->InitialDir = ExtractFileDir(Application->ExeName);
    if(OpenDialog1->Execute())
    {
        FileEdit->Text = OpenDialog1->FileName;
        OutputFileName = ChangeFileExt(OpenDialog1->FileName, ".mp3");
    }
}
//---------------------------------------------------------------------------
void __fastcall TForm1::EncodeClick(TObject *Sender)
{
    if(FileEdit->Text == "")
        return;
    std::ifstream fin(FileEdit->Text.c_str(), std::ios::binary);
    if(!fin)
        return;
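    // read the 12 characters at the start of the file and check the signature
    fin.read(startID, sizeof(startID));
    if(!fin || strncmp(startID, "RIFF", 4) != 0 || strncmp(startID + 8, "WAVE", 4) != 0)
    {
        Application->MessageBox("This is not a valid Wave file",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }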

    // get the format chunk
    FormatChunk fc;
    fin.read((char*)&fc, sizeof(FormatChunk));
    // the first chunk MUST be the format chunk
    if(strncmp(fc.chunkID, "fmt ", 4) != 0)
    {
        Application->MessageBox("This is not a valid Wave file",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }
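    // the encoding loop below assumes uncompressed (PCM) 16-bit samples
    if(fc.wFormatTag != 1 || fc.wBitsPerSample != 16)
    {
        Application->MessageBox("Can only handle uncompressed 16-bit Wave files",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }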
    // initialization of Mp3 encoder
    BE_CONFIG bc;
    bc.dwConfig = BE_CONFIG_MP3;
    // 32000, 44100 and 48000 are the only sample rates allowed
    // due to encoder limitations
    if(fc.dwSamplesPerSec == 32000 || fc.dwSamplesPerSec == 44100 ||
            fc.dwSamplesPerSec == 48000)
        bc.format.mp3.dwSampleRate = fc.dwSamplesPerSec;
    else
    {
        Application->MessageBox("Unsupported sample rate",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    if(fc.wChannels == 1)
        bc.format.mp3.byMode = BE_MP3_MODE_MONO;
    else
        bc.format.mp3.byMode = BE_MP3_MODE_STEREO;
    // the size of the resulting file depends on this parameter
    // the higher the bitrate, the better the audio quality
    bc.format.mp3.wBitrate = 192;
    bc.format.mp3.bCopyright = false;
    bc.format.mp3.bCRC = false;
    bc.format.mp3.bOriginal = false;
    bc.format.mp3.bPrivate = false;
    // skip extra format chunk parameters, if any
    long extraBytes = 8 + fc.chunkSize - (long)sizeof(FormatChunk);
    if(extraBytes > 0)
        fin.seekg(extraBytes, std::ios::cur);
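    // look for the data chunk, skipping all other chunks
    Chunk chunk;
    fin.read((char*)&chunk, sizeof(Chunk));
    while(fin && strncmp(chunk.chunkID, "data", 4) != 0)
    {
        // chunks are word-aligned: an odd-sized chunk is followed by a pad byte
        fin.seekg(chunk.chunkSize + (chunk.chunkSize & 1), std::ios::cur);
        fin.read((char*)&chunk, sizeof(Chunk));
    }
    if(!fin)
    {
        Application->MessageBox("No data chunk found",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }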
    // process with the encoding
    DWORD dwNumberOfSamples;
    DWORD dwOutputBufferLength;
    HBE_STREAM hStream;
    if(beInitStream(&bc, &dwNumberOfSamples, &dwOutputBufferLength,
            &hStream) != BE_ERR_SUCCESSFUL)
    {
        Application->MessageBox("Cannot perform compression",
                            "Wav2Mp3 ERROR", MB_OK);
        return;
    }
    std::ofstream fout(OutputFileName.c_str(), std::ios::binary);
    char *Mp3Buffer = new char[dwOutputBufferLength];
    SHORT *InputBuffer = new SHORT[dwNumberOfSamples];      // SHORT = short = 16 bits

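    long nBytesRemaining = chunk.chunkSize;   // bytes of sample data left to encode
    DWORD dwBytesWritten = 0;                 // bytes of MP3 data produced by each call
    bool ok = true;
    while(ok && nBytesRemaining > 0)
    {
        // the last block is usually shorter than dwNumberOfSamples
        long nBytesToRead = dwNumberOfSamples * sizeof(SHORT);
        if(nBytesToRead > nBytesRemaining)
            nBytesToRead = nBytesRemaining;
        fin.read((char*)InputBuffer, nBytesToRead);
        DWORD dwSamplesRead = fin.gcount() / sizeof(SHORT);
        if(dwSamplesRead == 0)
            break;                            // the file is shorter than announced
        nBytesRemaining -= nBytesToRead;
        ok = beEncodeChunk(hStream, dwSamplesRead, InputBuffer,
                (BYTE*)Mp3Buffer, &dwBytesWritten) == BE_ERR_SUCCESSFUL;
        if(ok)
            fout.write(Mp3Buffer, dwBytesWritten);
    }
    // flush the encoder: beDeinitStream returns the last MP3 bytes
    if(ok && beDeinitStream(hStream, (BYTE*)Mp3Buffer, &dwBytesWritten) == BE_ERR_SUCCESSFUL)
        fout.write(Mp3Buffer, dwBytesWritten);
    beCloseStream(hStream);

    // buffers allocated with new[] must be released with delete[]
    delete[] Mp3Buffer;
    delete[] InputBuffer;
    if(!ok)
        Application->MessageBox("Cannot perform compression",
                            "Wav2Mp3 ERROR", MB_OK);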
}
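
The comment on wBitrate in the code says that the size of the resulting file depends on the bitrate. For constant-bitrate encoding the relationship is simple enough to estimate beforehand: the duration comes from the wav header, and the mp3 uses bitrate × 1000 / 8 bytes per second. The function below is a hypothetical helper illustrating that arithmetic; it is an approximation that ignores frame padding and any tags.

```cpp
#include <cstdint>

// Approximate size in bytes of a constant-bitrate MP3 produced from
// `dataBytes` bytes of PCM samples.
//   dataBytes      : chunkSize of the data chunk
//   avgBytesPerSec : dwAvgBytesPerSec from the format chunk
//   bitrateKbps    : value given to wBitrate (kilobits per second)
inline std::uint64_t estimateMp3Bytes(std::uint32_t dataBytes,
                                      std::uint32_t avgBytesPerSec,
                                      std::uint32_t bitrateKbps)
{
    if(avgBytesPerSec == 0)
        return 0;
    // duration (s) = dataBytes / avgBytesPerSec
    // MP3 bytes per second = bitrateKbps * 1000 / 8 = bitrateKbps * 125
    return static_cast<std::uint64_t>(dataBytes) * bitrateKbps * 125 / avgBytesPerSec;
}
```

For example, 10 seconds of 44100 Hz 16-bit stereo audio (1764000 bytes of samples at 176400 bytes per second) encoded at 192 kbit/s comes out at roughly 240000 bytes, about one seventh of the original size.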

 
