
Qualcomm MDP4 reading notes


1  Key Features:

Offline 2D rotator block

Uses an overlay model for average bandwidth optimization

Complete LCDC, with ASIC, Gamma/color correction

Uses line-based processing to reduce page breaks

Support for two concurrent RGB interfaces

CABL for power saving (content adaptive backlight), similar to CABC

Support for a hardware cursor on two RGB interfaces

Blending:

Support for up to four hardware overlays plus cursor

A variety of blending methods

Source/destination color key with range support

Arbitrary scaling, from 8x expansion down to 1/8 contraction

Video quality improvement

Content-adaptive sharpening and denoising

Color control (hue, saturation, intensity) and contrast enhancement

1.1  MDP pipelines

The MDP4 is composed of two configurable layer processing/mixer cores, display processing units, and interface timing generators. The primary layer processing/mixer core performs layer processing (for video and graphics), layer mixing, and LCD processing for the main LCD panel. The external layer processing/mixer core performs layer processing, layer mixing, and external output processing. In addition to the overlay, MDP4 also provides a rich set of features for format conversion, quality enhancement, blending, and power savings. Those features are performed by the modules in the video/graphics layer processor, graphics (RGB) layer processor, layer mixer, LCD processor, and TV processor. MDP4 supports a wide range of input formats with different color spaces, color depths, scan types, tile sizes, pixel packing memory layouts, resolutions, and frame rates for content from the video decoder, video front end, camera sensor, graphics, and the UI. It also supports a wide range of output formats to meet the requirements of different display panels and the external interface.

1.2  Video/graphics layer processor

The video/graphics layer processor (VG pipe) is responsible for data fetch and unpack, format conversion, scaling, and video quality enhancement for a video or graphics image. It can take its input in video (YUV) or RGB format and generates output in RGB format. The table below lists the input formats supported by the MDP.
Color space   Bit depth per pixel       Pixel packing memory layout   Tile
YCbCr420      888                       Pseudoplanar                  64 x 32 or No
YCbCr420      888                       Planar                        No
YCbCr422      888                       Pseudoplanar H2V1             No
YCbCr422      888                       Interleaved H2V1              No
RGB           888, 565, 444, 666        Interleaved                   No
ARGB          8888, 4444, 1555, 6666    Interleaved                   No

Table: MDP input formats for the VG pipe
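
As an illustration of the format conversion the VG pipe performs, here is a minimal software sketch of a per-pixel BT.601 YCbCr-to-RGB conversion. The integer coefficients are the standard limited-range ones; the MDP's own CSC matrices are programmable hardware registers, so this is only a model of the operation:

static unsigned char clamp8(int v)
{
	return (v < 0) ? 0 : (v > 255) ? 255 : (unsigned char)v;
}

/* BT.601 limited-range YCbCr to 8-bit RGB, fixed point (x256):
 *   R = 1.164*(Y-16) + 1.596*(Cr-128)
 *   G = 1.164*(Y-16) - 0.391*(Cb-128) - 0.813*(Cr-128)
 *   B = 1.164*(Y-16) + 2.018*(Cb-128) */
static void ycbcr601_to_rgb(unsigned char y, unsigned char cb, unsigned char cr,
			    unsigned char *r, unsigned char *g, unsigned char *b)
{
	int c = (int)y  - 16;
	int d = (int)cb - 128;
	int e = (int)cr - 128;

	*r = clamp8((298 * c           + 409 * e + 128) >> 8);
	*g = clamp8((298 * c - 100 * d - 208 * e + 128) >> 8);
	*b = clamp8((298 * c + 516 * d           + 128) >> 8);
}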

1.3  Graphics (RGB) layer processor

The graphics layer processor (RGB pipe) is responsible for graphics fetch/unpack, resolution conversion, and inverse Gamma correction/Gamma correction. The table below lists the input formats supported by the graphics layer.

Color space   Bit depth per pixel       Pixel packing memory layout   Tile
RGB           888, 565, 444, 666        Interleaved                   No
ARGB          8888, 4444, 1555, 6666    Interleaved                   No

Table: MDP input formats for the RGB pipe

1.4  Layer mixer (overlay processor)  (combines the outputs of 1.2 and 1.3, the V/G and RGB layer processors)

The layer mixer mixes all video and graphics layers together with the transparency specified by the alpha value. Color keying is also performed in the layer mixer. The layer mixer takes RGB from each pipeline as input, performs blending and color keying in linear space, and reduces the color depth from 36 bpp to 24 bpp.
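
As a rough software model of what the mixer does per pixel, blending with a layer alpha plus a source color key with a range could look like the sketch below. This ignores premultiplied alpha, per-pipe alpha modes, and the internal 36-bpp linear math:

struct rgb { unsigned char r, g, b; };

/* Straight (non-premultiplied) alpha blend of a foreground layer over a
 * background layer: out = fg*alpha + bg*(1 - alpha). */
static struct rgb blend(struct rgb fg, struct rgb bg, unsigned char alpha)
{
	struct rgb out;
	out.r = (fg.r * alpha + bg.r * (255 - alpha)) / 255;
	out.g = (fg.g * alpha + bg.g * (255 - alpha)) / 255;
	out.b = (fg.b * alpha + bg.b * (255 - alpha)) / 255;
	return out;
}

/* Source color key with a range: a foreground pixel inside [key_lo, key_hi]
 * on every channel is treated as transparent. */
static int color_key_match(struct rgb p, struct rgb key_lo, struct rgb key_hi)
{
	return p.r >= key_lo.r && p.r <= key_hi.r &&
	       p.g >= key_lo.g && p.g <= key_hi.g &&
	       p.b >= key_lo.b && p.b <= key_hi.b;
}

static struct rgb mix_pixel(struct rgb fg, struct rgb bg, unsigned char alpha,
			    struct rgb key_lo, struct rgb key_hi)
{
	if (color_key_match(fg, key_lo, key_hi))
		return bg;			/* keyed-out pixel: background shows through */
	return blend(fg, bg, alpha);
}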


Note that the overlay processor can be disabled or its composition can fail for a frame; in that case the work falls back to the GPU (hardware/qcom/display/libhwcomposer/hwc.cpp):
static int hwc_prepare(hwc_composer_device_1 *dev, size_t numDisplays,
                       hwc_display_contents_1_t** displays)
{
    int ret = 0;
    hwc_context_t* ctx = (hwc_context_t*)(dev);
    Locker::Autolock _l(ctx->mBlankLock);
    reset(ctx, numDisplays, displays);

    ctx->mOverlay->configBegin();
    ctx->mRotMgr->configBegin();
    ctx->mNeedsRotator = false;

    for (int32_t i = numDisplays - 1; i >= 0; i--) { // walk the display list from last to first
        hwc_display_contents_1_t *list = displays[i];
        switch(i) {
            case HWC_DISPLAY_PRIMARY:
                ret = hwc_prepare_primary(dev, list);
                break;
            case HWC_DISPLAY_EXTERNAL:
            case HWC_DISPLAY_VIRTUAL:
                ret = hwc_prepare_external(dev, list, i);
                break;
            default:
                ret = -EINVAL;
        }
    }

    ctx->mOverlay->configDone();
    ctx->mRotMgr->configDone();

    return ret;
}

static int hwc_prepare_primary(hwc_composer_device_1 *dev,
        hwc_display_contents_1_t *list) {
    hwc_context_t* ctx = (hwc_context_t*)(dev);
    const int dpy = HWC_DISPLAY_PRIMARY;
    if(UNLIKELY(!ctx->mBasePipeSetup))
        setupBasePipe(ctx);
    if (LIKELY(list && list->numHwLayers > 1) &&
            ctx->dpyAttr[dpy].isActive) {
        reset_layer_prop(ctx, dpy, list->numHwLayers - 1);
        uint32_t last = list->numHwLayers - 1;
        hwc_layer_1_t *fbLayer = &list->hwLayers[last];
        if(fbLayer->handle) {
            if(list->numHwLayers > MAX_NUM_LAYERS) { // the max number is 32.
                ctx->mFBUpdate[dpy]->prepare(ctx, list);
                return 0;
            }
            setListStats(ctx, list, dpy);
            bool ret = ctx->mMDPComp->prepare(ctx, list); // mdp composition for RGB
            if(!ret) { // If mdp composition fails, give the work to V/G layer processor
                // IF MDPcomp fails use this route
                ctx->mVidOv[dpy]->prepare(ctx, list); // V/G layer processor
                ctx->mFBUpdate[dpy]->prepare(ctx, list); // commit the screen size 
                // Use Copybit, when MDP comp fails
                if(ctx->mCopyBit[dpy])
                    ctx->mCopyBit[dpy]->prepare(ctx, list, dpy); // GPU
                ctx->mLayerCache[dpy]->updateLayerCache(list);
            }
        }
    }
    return 0;
}

static int hwc_device_open(const struct hw_module_t* module, const char* name,
                           struct hw_device_t** device)
{
    int status = -EINVAL;

    if (!strcmp(name, HWC_HARDWARE_COMPOSER)) {
        struct hwc_context_t *dev;
        dev = (hwc_context_t*)malloc(sizeof(*dev));
        memset(dev, 0, sizeof(*dev));

        //Initialize hwc context
        initContext(dev);

        //Setup HWC methods
        dev->device.common.tag          = HARDWARE_DEVICE_TAG;
        dev->device.common.version      = HWC_DEVICE_API_VERSION_1_1;
        dev->device.common.module       = const_cast<hw_module_t*>(module);
        dev->device.common.close        = hwc_device_close;
        dev->device.prepare             = hwc_prepare;
        dev->device.set                 = hwc_set;
        dev->device.eventControl        = hwc_eventControl;
        dev->device.blank               = hwc_blank;
        dev->device.query               = hwc_query;
        dev->device.registerProcs       = hwc_registerProcs;
        dev->device.dump                = hwc_dump;
        dev->device.getDisplayConfigs   = hwc_getDisplayConfigs;
        dev->device.getDisplayAttributes = hwc_getDisplayAttributes;
        *device = &dev->device.common;
        status = 0;
    }
    return status;
}

static struct hw_module_methods_t hwc_module_methods = {
    open: hwc_device_open
};

hwc_module_t HAL_MODULE_INFO_SYM = {
    common: {
        tag: HARDWARE_MODULE_TAG,
        version_major: 2,
        version_minor: 0,
        id: HWC_HARDWARE_MODULE_ID,
        name: "Qualcomm Hardware Composer Module",
        author: "CodeAurora Forum",
        methods: &hwc_module_methods,
        dso: 0,
        reserved: {0},
    }
};
SurfaceFlinger Code: frameworks/native/services/surfaceflinger

For the video/graphics (VG) layer processor, the call sequence is as follows:

-----------> VideoOverlayLowRes::prepare
-----------> VideoOverlayLowRes::configure
-----------> configureLowRes
-----------> configMdp
-----------> Overlay::commit(utils::eDest dest)
-----------> GenericPipe::commit()
-----------> Ctrl::commit()
-----------> MdpCtrl::set()
-----------> setOverlay(int fd, mdp_overlay& ov)   mdp_wrapper::setOverlay(mFd.getFD(), mOVInfo)
-----------> ioctl(fd, MSMFB_OVERLAY_SET, &ov)
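
From user space, the tail of this sequence reduces to the framebuffer overlay ioctls. Below is a hedged sketch; the struct and ioctl names follow the kernel's msm_mdp.h as I recall them, so verify the fields against your kernel headers before relying on it:

#include <string.h>
#include <sys/ioctl.h>
#include <linux/msm_mdp.h>	/* mdp_overlay, MSMFB_OVERLAY_SET, MSMFB_NEW_REQUEST */

/* Sketch: request a full-screen 1280x720 pseudoplanar YCbCr420 overlay on an
 * already-open framebuffer fd; returns the pipe id on success. */
static int overlay_set_example(int fb_fd)
{
	struct mdp_overlay ov;

	memset(&ov, 0, sizeof(ov));
	ov.src.width   = 1280;
	ov.src.height  = 720;
	ov.src.format  = MDP_Y_CBCR_H2V2;	/* pseudoplanar YCbCr420 */
	ov.src_rect.x  = 0;    ov.src_rect.y = 0;
	ov.src_rect.w  = 1280; ov.src_rect.h = 720;
	ov.dst_rect    = ov.src_rect;		/* no scaling */
	ov.z_order     = 0;
	ov.alpha       = 0xFF;
	ov.transp_mask = MDP_TRANSP_NOP;	/* no color key */
	ov.id          = MSMFB_NEW_REQUEST;

	if (ioctl(fb_fd, MSMFB_OVERLAY_SET, &ov) < 0)
		return -1;
	return ov.id;	/* filled in by the driver, used later for MSMFB_OVERLAY_PLAY */
}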

1.5  Main panel display processor (DMA_P)

The DMA_P subblock converts a blended frame from 24-bit RGB to the format suitable for LCD output. The output color depth varies from 12 bpp to 24 bpp depending on the LCD panel. To reduce the banding artifacts introduced by bit-depth reduction, destination dithering can be applied. This processor also provides modules for color correction, destination HSIC control, and adaptive backlight control.
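
To make destination dithering concrete, here is a toy ordered-dither sketch that truncates an 8-bit channel to 6 bits (24 bpp to 18 bpp) with a 4x4 Bayer matrix. The hardware block is more elaborate; this only illustrates why dithering hides banding:

/* 4x4 Bayer threshold matrix, values 0..15. */
static const unsigned char bayer4[4][4] = {
	{  0,  8,  2, 10 },
	{ 12,  4, 14,  6 },
	{  3, 11,  1,  9 },
	{ 15,  7, 13,  5 },
};

/* Reduce one 8-bit channel to 6 bits with ordered dithering.
 * The two discarded bits span 0..3, so the threshold is scaled to that range. */
static unsigned char dither_8to6(unsigned char v, int x, int y)
{
	int threshold = bayer4[y & 3][x & 3] >> 2;	/* 0..3 */
	int v6 = (v + threshold) >> 2;
	if (v6 > 63)
		v6 = 63;
	return (unsigned char)v6;
}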

1.6  External display processor (DMA_E)

The DMA_E subblock is used to convert the content from 24-bit RGB to the format suitable for analog or digital TV Out. To remove the flickering artifacts introduced by progressive-to-interlace
conversion, a deflickering filter is included in the pipe for TV Out.

1.7  MDP4 modes of operation

MDP4 can operate in three modes. In Direct Out mode, the MDP fetches the input and sends it directly to the LCD or external interface. In Frame Buffer mode and BLT mode, the MDP pipes write the processed image back to memory and then send the image data from that memory via DMA.

2  Bootloader: show pic, color, and char

/* Fill a dispwidth x disphieght rectangle at (x, y) with a solid color.
 * The framebuffer is assumed here to be 24-bpp packed RGB (3 bytes per pixel,
 * stored B-G-R); pixel[3] (alpha) only applies to 32-bpp ARGB panels and is
 * ignored on this path. */
static void show_color(unsigned int x, unsigned int y, unsigned char pixel[4])
{
	unsigned int i = 0;
	unsigned int j = 0;
	unsigned char *fb = (unsigned char *)config->base;

	for (i = y; i < (y + disphieght); i++) {
		for (j = x; j < (x + dispwidth); j++) {
			fb[(i * config->width + j) * 3 + 0] = pixel[0];   /* B */
			fb[(i * config->width + j) * 3 + 1] = pixel[1];   /* G */
			fb[(i * config->width + j) * 3 + 2] = pixel[2];   /* R */
		}
	}
}

void show_white_color(unsigned int x, unsigned int y)
{
	unsigned char pixel[4];

	/* white: 0xff, 0xff, 0xff; green: 0x00, 0xff, 0x00; red: 0x00, 0x00, 0xff; black: 0x00, 0x00, 0x00 */
	pixel[0] = 0xFF;   /* B */
	pixel[1] = 0xFF;   /* G */
	pixel[2] = 0xFF;   /* R */
	pixel[3] = 0xFF;   /* A */

	show_color(x, y, pixel);
}

static void show_pic(gimp_image info) /* pixel data produced by converting a JPG/BMP image to a raw C array with an external tool */
{
	unsigned int i = 0;
	unsigned int j = 0;

	char *lcd_buf = (char *)config->base;

	for (i = (info.mode ? (config->height - info.height) / 2 : info.y);
		i < ((info.mode ? (config->height - info.height) / 2 : info.y) + info.height); i++) {
		for (j = (config->width - info.width)/2; j < ((config->width - info.width)/2 + info.width); j++) {
			*(lcd_buf + (i * config->width + j) * 3 + 0) = *(info.pixel_data++);
			*(lcd_buf + (i * config->width + j) * 3 + 1) = *(info.pixel_data++);
			*(lcd_buf + (i * config->width + j) * 3 + 2) = *(info.pixel_data++);
		}
	}
}

void display_char(int x, int y, const unsigned char *playchar)
{
	int j = 0;
	int k = 0;
	int row = 16;              /* glyph width in pixels */
	int list = 29;             /* glyph height in lines */
	unsigned char temp = 128;  /* MSB-first bit mask */
	char *pixels;
	int arr_char[row * list];  /* expanded per-pixel colors (0xFFFFFF / 0x000000) */

	for (j = 0; j < row * list / 8; j++) {
		temp = 128;
		for (k = 0; k < 8; k++) {
			if ((playchar[j]) & (temp))
				arr_char[j*8 + k] = 0xFFFFFF; /* white */
			else
				arr_char[j*8 + k] = 0x000000;/* black */
			 temp >>= 1;
		}
	}

	pixels = config->base; /* framebuffer base address */
	pixels += (x + y * config->width) * 3;

	for (j = 0; j < list; j++) {
		for (k = 0; k < row; k++) {
			*pixels++ = (arr_char[j*row+k]) & 0xff;
			*pixels++ = (arr_char[j*row+k]>>8) & 0xff;
			*pixels++ = (arr_char[j*row+k]>>16) & 0xff;
		}
		pixels += (config->width - row) * 3;
	}
}

3  DSI clock

3.1  Types of DSI clocks

DSI bit clock:
This is the actual DSI clock which goes from the host processor to the LCD driver IC. The DSI bit clock is used as the source-synchronous bit clock for capturing serial data bits in the receiver PHY.

DSI byte clock:
The DSI byte clock is used in the lane management layer for high-speed data transmission. During HS transmission, each byte of data is accompanied by a byte clock.

DSI clock:
This is the core clock of the DSI controller.

DSI ESC clock:
For MSM8960, the ESC clock is 27 MHz. It is used in the lane management layer for escape-mode transmission.

DSI pixel clock:
The pixel clock is always required to run when transmitting data over the DSI link.

3.2  Clock relation requirement

Clock relation                             Frequency ratio
Bit clock to byte clock                    8 : 1
Byte clock to DSI clock                    1 : (number of lanes)
DSI clock to pixel clock (Video mode)      (Video mode pixel depth, bytes/pixel) : 1
DSI clock to pixel clock (non-video)       DSI clock <= pixel clock x Command mode pixel depth
Table: example DSI timing parameters

Parameter                       Value
Frame rate                      60 fps
Lane configuration              2 lanes
Pixel format                    3 bytes/pixel
Width                           540 pixels
Height                          960 lines
Hsync pulse width               8 pclks
HBP                             48 pclks
HFP                             40 pclks
Vsync pulse width               1 line
VBP                             16 lines
VFP                             15 lines
DSI reference clock (mxo)       27 MHz
Actual VCO frequency            908 MHz
Actual bit clock                454 MHz

Hsync period = display width + Hsync pulse width + HBP + HFP = 636 pclks/line
Vsync period = display height + Vsync pulse width + VBP + VFP = 992 lines/frame
bitclk = Hsync period * Vsync period * frame rate * bytes per pixel * 8 / number of lanes = 454.26 MHz
actual bitclk = 454 MHz
byteclk = bitclk / 8 = 56.75 MHz
dsiclk = byteclk * number of lanes = 56.75 * 2 = 113.50 MHz
pclk = dsiclk / bytes per pixel = 113.50 / 3 = 37.83 MHz
DSI PHY PLL VCO frequency (actual) = 908.00 MHz
Blanking (dot clock overhead) = (636 * 992 - 540 * 960) / (540 * 960) = 21.70%
Line time = Hsync period / pclk = 636 / 37.83 MHz = 16.81 us
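
The same numbers can be reproduced with a few lines of C; note that the 56.75/113.50/37.83 MHz figures above are derived from the rounded "actual" 454 MHz bit clock, whereas the ideal value is 454.26 MHz:

#include <stdio.h>

int main(void)
{
	/* Panel timings from the table above (540x960, 2 lanes, 3 bytes/pixel). */
	const double width  = 540, hsw = 8, hbp = 48, hfp = 40;
	const double height = 960, vsw = 1, vbp = 16, vfp = 15;
	const double fps = 60, bytes_per_pixel = 3, lanes = 2;

	double h_period = width  + hsw + hbp + hfp;	/* pclks per line  = 636 */
	double v_period = height + vsw + vbp + vfp;	/* lines per frame = 992 */
	double bitclk   = h_period * v_period * fps * bytes_per_pixel * 8 / lanes;
	double byteclk  = bitclk / 8;
	double dsiclk   = byteclk * lanes;
	double pclk     = dsiclk / bytes_per_pixel;

	printf("bit clock   : %.2f MHz\n", bitclk  / 1e6);	/* ~454.26 */
	printf("byte clock  : %.2f MHz\n", byteclk / 1e6);	/* ~56.78  */
	printf("dsi clock   : %.2f MHz\n", dsiclk  / 1e6);	/* ~113.56 */
	printf("pixel clock : %.2f MHz\n", pclk    / 1e6);	/* ~37.85  */
	printf("blanking    : %.2f %%\n",
	       (h_period * v_period - width * height) / (width * height) * 100);
	return 0;
}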

4  Display types

There are two kinds of panel types, smart and dumb:

The smart display panel has its own graphics memory (GRAM) on the panel side. The panel module contains a driver IC, which holds the GRAM and controls the image refresh to the glass panel from the GRAM. In this case, the MSM only needs to write a new image into the GRAM when the image changes; between updates there is no signal interaction between the MSM and the LCD module. The MDDI interface (except Auto Refresh mode with MDDI 1.2) and DSI Command mode can be used for smart panels.

The dumb display panel has no memory or timing control of its own. The MSM therefore has to control all of this, and the image shown on the glass panel is fetched from memory inside the MSM at a fixed refresh rate; the MDP has a timing generator to drive this. Even when there is no new image, such as an idle screen with the backlight on, the MDP must keep fetching the image from internal memory and sending it to the peripheral. The RGB interface or DSI Video mode can be used for this kind of dumb panel.

Whether a smart or dumb display is used depends on where the benefit is wanted. With a smart panel, power is saved because the MSM does not have to stay on when there is no image update (all MSM modules can go into power collapse), but the LCD module costs more than a dumb LCD module. With a dumb panel, the cost is lower but more power is consumed than with a smart panel.
DSI modes
There are two modes of operation for DSI-compliant peripherals, Command and Video.
1). Command mode
Command mode refers to transactions that take the form of commands and data sent to the peripheral, i.e., the LCD driver IC. Typically this is used for a smart panel with RAM and an LCD controller external to the MSM, which can self-refresh when the image is static. The MSM can go into TCXO shutdown and save more power, but there is an additional cost for the external RAM and LCDC. The flow of information is bidirectional on Lane 0 in Command mode, so the host can write data to or read data from the peripheral. The host can synchronize the data flow using a TE signal (Vsync) from the panel to avoid tearing effects.
2). Video mode
Video mode refers to transactions that take the form of a real-time pixel stream. The DSI host inside the MSM needs to refresh the image data continuously. Typically this is used for a dumb panel without external RAM. The host provides the video data, i.e., pixel values, and the synchronization information, i.e., Vsync, Hsync, data enable, and pixel clock. Video mode behaves very much like the RGB interface but uses only a couple of pins, so there is also less EMI and radio desensing.

5  MDP clocks

6  MDP advanced Features

6.1  Content Adaptive Backlight Control (CABL)
6.2  Postprocessing
    a. Source scaling, sharpening, and denoising
    b. Source dithering
    c. Source color space conversion (CSC)
6.3  Destination processing
    a. Destination CSC and HSIC
    b. Destination dithering
    c. Display color calibration

7  Rotator

8  SurfaceFlinger and Gralloc

SurfaceFlinger is the compositor in Android: it takes each application surface (or window) and blends them together with a variety of operations to form the final image ready to be displayed on the LCD. The final composited image is known as the framebuffer, and it is updated to the actual primary LCD or external display. SurfaceFlinger runs as a server process and works with many client applications at once. Communication is done over the binder mechanism, which is a type of IPC. Applications link to a SurfaceFlinger client library (known as libui) to facilitate communication with the server. Without the server running, the SurfaceFlinger client library has limited use, allowing applications direct access to the framebuffer; thus, OpenGL ES applications running from the shell can bypass the compositor altogether. The SurfaceFlinger client gives the GPU direct access to the framebuffer as if it were a regular surface. This is called SurfaceFlinger composition, where the input images are composed onto the framebuffer. Once the framebuffer is ready to be updated, SurfaceFlinger calls the rendering API to send the update command to the MSM FB driver.

Gralloc is an Android HAL module: a small library designed to allocate memory for use as image buffers and also to control the framebuffer. It supports basic memory operations, such as alloc and free, and also locking operations. The image buffers that gralloc allocates are two-dimensional image buffers, otherwise known as surfaces. In addition to alloc and free, gralloc supports lock and unlock operations. Locking a memory region is designed to protect the pixels in the buffer against access conflicts. For example, multiple readers may lock a region at the same time, but only one writer can hold the lock at once, and while a writer holds the lock no readers are permitted. As with alloc and free, the implementations of lock and unlock are vendor-specific; e.g., with a purely software renderer, lock and unlock may do nothing.
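
A minimal sketch of the alloc/lock/unlock/free cycle through the standard gralloc HAL entry points follows. This is generic HAL usage rather than the vendor implementation; the 640x480 RGBA8888 buffer is just an example, and error handling is trimmed:

#include <string.h>
#include <hardware/gralloc.h>

/* Allocate an RGBA8888 surface, lock it for CPU writes, clear it, release it. */
static int gralloc_example(void)
{
	const hw_module_t *mod;
	const gralloc_module_t *gralloc;
	alloc_device_t *alloc_dev;
	buffer_handle_t handle;
	void *vaddr;
	int stride;

	if (hw_get_module(GRALLOC_HARDWARE_MODULE_ID, &mod))
		return -1;
	gralloc = (const gralloc_module_t *)mod;
	if (gralloc_open(mod, &alloc_dev))
		return -1;

	/* A 2D image buffer ("surface"): 640x480, RGBA8888, CPU-writable. */
	if (alloc_dev->alloc(alloc_dev, 640, 480, HAL_PIXEL_FORMAT_RGBA_8888,
			     GRALLOC_USAGE_SW_WRITE_OFTEN, &handle, &stride))
		return -1;

	/* lock() grants CPU write access to the pixel region; only one writer
	 * may hold the lock at a time. */
	gralloc->lock(gralloc, handle, GRALLOC_USAGE_SW_WRITE_OFTEN,
		      0, 0, 640, 480, &vaddr);
	memset(vaddr, 0, (size_t)stride * 480 * 4);	/* stride is in pixels */
	gralloc->unlock(gralloc, handle);

	alloc_dev->free(alloc_dev, handle);
	gralloc_close(alloc_dev);
	return 0;
}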
