The Main Logic of Audio Timestamp Handling in FFmpeg

December 9, 2013

The main logic FFmpeg follows when handling audio timestamps is:

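1. The demuxer reads an AVPacket. Taking FLV input as an example, the time base is 1/1000, so the first audio packet may carry a timestamp of 46, which means 0.046 seconds.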
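
To make the time base arithmetic concrete, here is a minimal C sketch of the conversion described above. `TimeBase` and `ts_to_seconds` are hypothetical names used only for illustration; `TimeBase` is a stand-in for FFmpeg's `AVRational`, not the library type itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: one tick lasts num/den seconds. */
typedef struct {
    int num;
    int den;
} TimeBase;

/* Convert a timestamp counted in tb ticks to seconds. */
double ts_to_seconds(int64_t ts, TimeBase tb)
{
    assert(tb.den != 0);
    return (double)ts * tb.num / tb.den;
}
```

With a time base of 1/1000, `ts_to_seconds(46, (TimeBase){1, 1000})` gives approximately 0.046, matching the FLV example.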

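2. The decoder decodes the AVPacket into an AVFrame. The frame's pts is AV_NOPTS_VALUE and has to be set, otherwise every later stage runs into problems. The main call involved is av_rescale_delta: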

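// Time base in which decoded_frame->pts is currently expressed.
AVRational in_tb = decoded_frame_tb;
// Sample-accurate time base: one tick per audio sample.
AVRational fs_tb = (AVRational){1, ist->codec->sample_rate};
// Frame duration in fs_tb ticks, i.e. the number of samples in the frame.
int duration = decoded_frame->nb_samples;
// Output time base; here it is the same as fs_tb.
AVRational out_tb = (AVRational){1, ist->codec->sample_rate};

// rescale_last_pts is an int64_t that persists across frames and starts as AV_NOPTS_VALUE.
decoded_frame->pts = av_rescale_delta(in_tb, decoded_frame->pts, fs_tb, duration, &rescale_last_pts, out_tb);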

For this use, that is roughly equivalent to the following logic:

// Initialize rescale_last_pts on the first frame; it is 0 when the first decoded_frame->pts is 0.
if (rescale_last_pts == AV_NOPTS_VALUE) {
    rescale_last_pts = av_rescale_q(decoded_frame->pts, in_tb, fs_tb);
}
// fs_tb equals out_tb, so decoded_frame->pts equals rescale_last_pts.
decoded_frame->pts = av_rescale_q(rescale_last_pts, fs_tb, out_tb);
// Advance by this frame's duration so the value is ready for the next frame.
rescale_last_pts += duration;

This can be simplified further to:

    /**
    * For audio encoding, the rescale algorithm can be simplified to the following.
    */
    if (rescale_last_pts == AV_NOPTS_VALUE) {
        rescale_last_pts = 0;
    }
    decoded_frame->pts = rescale_last_pts;
    rescale_last_pts += decoded_frame->nb_samples; // duration

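In effect, each frame's duration is its nb_samples, and its pts is the running total of the samples that came before it. Because the time base is set to 1/sample_rate by default, the sample count is the pts.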
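
The accumulation can be sketched without FFmpeg at all. The following is an illustrative sketch under stated assumptions, not the ffmpeg code itself: `SamplePtsCounter`, `counter_init`, `counter_stamp`, and `NO_PTS` are hypothetical names, and `NO_PTS` merely stands in for `AV_NOPTS_VALUE`.

```c
#include <assert.h>
#include <stdint.h>

#define NO_PTS INT64_MIN /* hypothetical stand-in for AV_NOPTS_VALUE */

/* Running state: the pts to assign to the next frame, in 1/sample_rate ticks. */
typedef struct {
    int64_t next_pts;
} SamplePtsCounter;

void counter_init(SamplePtsCounter *c)
{
    c->next_pts = NO_PTS;
}

/* Return the pts for a frame of nb_samples samples, then advance the counter. */
int64_t counter_stamp(SamplePtsCounter *c, int nb_samples)
{
    assert(nb_samples >= 0);
    if (c->next_pts == NO_PTS)
        c->next_pts = 0;          /* the first frame starts at 0 */
    int64_t pts = c->next_pts;
    c->next_pts += nb_samples;    /* duration of this frame, in samples */
    return pts;
}
```

Assuming frames of 1024 samples each, three successive calls to `counter_stamp` return 0, 1024, and 2048. At a sample rate of 44100 Hz that corresponds to roughly 0, 23.2, and 46.4 milliseconds.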

3. Filtering. In practice this step leaves the timestamps unchanged.

        // correct the pts
        int64_t filtered_frame_pts = AV_NOPTS_VALUE;
        if (picref->pts != AV_NOPTS_VALUE) {
            // Rescale to the encoder time base. The output filter's time base actually equals
            // the output stream's, so this step can be skipped, and start_time is always set to 0.
            filtered_frame_pts = av_rescale_q(picref->pts, ofilter->inputs[0]->time_base, ost->codec->time_base) 
                - av_rescale_q(start_time, AV_TIME_BASE_Q, ost->codec->time_base);
        }
        
        // convert to frame
        avfilter_copy_buf_props(filtered_frame, picref);
        printf("filter -> picref_pts=%"PRId64", frame_pts=%"PRId64", filtered_pts=%"PRId64"\n", 
            picref->pts, filtered_frame->pts, filtered_frame_pts);
        filtered_frame->pts = filtered_frame_pts;

4. The encoder encodes the frame. Its main job, as far as timestamps go, is to generate the dts.

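5. Before a packet is handed to the muxer, it needs some processing. For example, when outputting an RTMP stream, the time base has to be changed to 1/1000, which is the FLV time base, i.e. milliseconds.

In addition, the timestamps should start from zero.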

    // Correct the output: force the timestamps to start at 0.
    static int64_t starttime = -1;
#if 1
    if (starttime < 0) {
        starttime = (pkt.dts < pkt.pts)? pkt.dts : pkt.pts;
    }
    pkt.dts -= starttime;
    pkt.pts -= starttime;
#endif

#if 1
    // Rescale the audio timestamps to AVRational{1, 1000} for the FLV format.
    AVRational flv_tb = (AVRational){1, 1000};
    pkt.dts = av_rescale_q(pkt.dts, ost->codec->time_base, flv_tb);
    pkt.pts = av_rescale_q(pkt.pts, ost->codec->time_base, flv_tb);
#endif
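
The two corrections above (shift to zero, then rescale to milliseconds) can be tried out in isolation. This is a minimal sketch under stated assumptions: `TimeBase`, `StartShift`, `start_shift_init`, `shift_to_zero`, and `rescale_ts` are hypothetical names, not FFmpeg APIs. `rescale_ts` rounds to the nearest integer as a simplified stand-in for `av_rescale_q`; it only accepts non-negative timestamps and has no overflow protection.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for AVRational: one tick lasts num/den seconds. */
typedef struct {
    int num;
    int den;
} TimeBase;

/* Remembers the first timestamp seen; a negative value means "not seen yet". */
typedef struct {
    int64_t start;
} StartShift;

void start_shift_init(StartShift *s)
{
    s->start = -1;
}

/* Subtract the first packet's smaller timestamp so the stream starts at 0. */
void shift_to_zero(StartShift *s, int64_t *pts, int64_t *dts)
{
    if (s->start < 0)
        s->start = (*dts < *pts) ? *dts : *pts;
    *dts -= s->start;
    *pts -= s->start;
}

/* Convert ts from one time base to another, rounding to the nearest tick. */
int64_t rescale_ts(int64_t ts, TimeBase from, TimeBase to)
{
    int64_t mul = (int64_t)from.num * to.den;
    int64_t div = (int64_t)from.den * to.num;
    assert(ts >= 0 && mul > 0 && div > 0);
    return (ts * mul + div / 2) / div;
}
```

Assuming a 44100 Hz stream with an encoder time base of 1/44100, a shifted timestamp of 2048 samples rescales to 46 in the 1/1000 time base, the same kind of millisecond value the FLV demuxer reported in step 1. Rescaling the accumulated sample count, rather than summing rounded per-frame durations, keeps the rounding error from building up over time.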

6. The last step is to write the packet:

    ret = av_interleaved_write_frame(oc, &pkt);

And that's it.


Author: parson

This post was published by parson in the General category; last updated December 9, 2013.
When reposting, please credit: The Main Logic of Audio Timestamp Handling in FFmpeg | 学步园
