"Audio and Video: Parsing H264 Bare Streams from RTP Packets"

1. Introduction

This article shows how to recover the h264 bare stream from RTP packets that carry h264-encoded data. RTP packets that transmit h264 data generally come in three types: single packets, single-time combined packets (STAP-A), and FU-A fragmented packets. The program in this article extracts the data from all three types and saves it to a file. The resulting h264 bare stream file can be played with ffplay, the player that ships with ffmpeg.

2. RTP packet format

2.1 RTP header

V: version number, 2 bits, usually 2
P: padding flag, 1 bit
X: extension flag, 1 bit
CC: CSRC count, 4 bits
M: marker flag, 1 bit
PT: payload type, 7 bits. H264 uses a dynamically assigned value (96-127); in the capture used in this article it is 100
sequence number: 16 bits, increases by one for each packet sent
timestamp: 32 bits
SSRC: synchronization source identifier, 32 bits
CSRC: contributing source identifiers, 32 bits each, CC of them. Generally not used, so the RTP packet header is generally 12 bytes (96 bits).
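
The field layout above can be turned into a small parser. The following is a minimal sketch, not the code used later in this article: the type rtp_header_fields and the function parse_rtp_header are names invented for this illustration. It reads the fields with shifts and masks instead of bit-fields, so the result does not depend on compiler bit-field layout or host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the fixed RTP header fields in host byte order. */
typedef struct {
    uint8_t  version;         /* V, 2 bits */
    uint8_t  padding;         /* P, 1 bit */
    uint8_t  extension;       /* X, 1 bit */
    uint8_t  csrc_count;      /* CC, 4 bits */
    uint8_t  marker;          /* M, 1 bit */
    uint8_t  payload_type;    /* PT, 7 bits */
    uint16_t sequence_number; /* 16 bits */
    uint32_t timestamp;       /* 32 bits */
    uint32_t ssrc;            /* 32 bits */
} rtp_header_fields;

/* Parses the 12-byte fixed header. Returns 0 on success, -1 if the
 * arguments are invalid or the buffer is shorter than 12 bytes. */
int parse_rtp_header(const uint8_t *buf, size_t len, rtp_header_fields *out)
{
    if (buf == NULL || out == NULL || len < 12) {
        return -1;
    }
    out->version      = buf[0] >> 6;
    out->padding      = (buf[0] >> 5) & 0x01;
    out->extension    = (buf[0] >> 4) & 0x01;
    out->csrc_count   = buf[0] & 0x0F;
    out->marker       = buf[1] >> 7;
    out->payload_type = buf[1] & 0x7F;
    /* Multi-byte fields are big-endian on the wire. */
    out->sequence_number = (uint16_t)((buf[2] << 8) | buf[3]);
    out->timestamp = ((uint32_t)buf[4] << 24) | ((uint32_t)buf[5] << 16) |
                     ((uint32_t)buf[6] << 8)  |  (uint32_t)buf[7];
    out->ssrc      = ((uint32_t)buf[8] << 24) | ((uint32_t)buf[9] << 16) |
                     ((uint32_t)buf[10] << 8) |  (uint32_t)buf[11];
    return 0;
}
```

Feeding it the header bytes "80 64 94 1d 00 04 e3 3c 9d 9c 66 fc" from the example in section 3.6 gives version 2, payload type 100, marker 0 and sequence number 0x941d. When CC is not 0, the payload starts 4 * CC bytes later than byte 12, which this sketch does not handle.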

The same header fields can be inspected for each packet in a Wireshark capture of the RTP stream.

2.2 RTP payload

The RTP payload is the transmitted h264 bare stream data, and the task now is to restore the h264 bare stream from these payloads.

3. H264 data format in the RTP packet payload

The H264 data carried in an RTP packet consists of NALU Header + NALU Payload. The NALU Header of single packets and combined packets has one layout, and the NALU Header of fragmented packets has another. Likewise, the NALU Payload of single packets and fragmented packets is handled differently from the payload of combined packets.

3.1 H264 data type in RTP packet

NAL type    Packet      Description
1-23        NAL unit    A single NALU packet
24          STAP-A      Single-time combined (aggregation) packet
28          FU-A        Fragmentation unit
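
The table maps directly onto a check of the low five bits of the first payload byte. This is a small sketch under that reading; the enum h264_rtp_kind and the function classify_h264_payload are hypothetical names chosen for the example.

```c
#include <stdint.h>

/* Hypothetical classification of an RTP payload that carries h264. */
typedef enum {
    PKT_INVALID,      /* type 0 */
    PKT_SINGLE_NALU,  /* types 1-23 */
    PKT_STAP_A,       /* type 24 */
    PKT_FU_A,         /* type 28 */
    PKT_UNSUPPORTED   /* every other type, not covered in this article */
} h264_rtp_kind;

/* first_payload_byte is the byte right after the RTP header. */
h264_rtp_kind classify_h264_payload(uint8_t first_payload_byte)
{
    uint8_t type = first_payload_byte & 0x1F;  /* low five bits */

    if (type == 0) {
        return PKT_INVALID;
    }
    if (type <= 23) {
        return PKT_SINGLE_NALU;
    }
    if (type == 24) {
        return PKT_STAP_A;
    }
    if (type == 28) {
        return PKT_FU_A;
    }
    return PKT_UNSUPPORTED;
}
```

For instance, 0x67 is classified as a single NALU packet, 0x78 as STAP-A and 0x7c as FU-A.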

3.2 Types of NALU Units

NAL type    Content
1           Slice of a non-IDR picture, without data partitioning
2           Slice data partition A of a non-IDR picture
3           Slice data partition B of a non-IDR picture
4           Slice data partition C of a non-IDR picture
5           Slice of an IDR picture
6           Supplemental Enhancement Information (SEI)
7           Sequence Parameter Set (SPS)
8           Picture Parameter Set (PPS)
9           Access unit delimiter
10          End of sequence
11          End of stream
12          Filler data

3.3 NALU Header for Single Packet and Combined Packet

After the 12-byte header is removed from an RTP packet, the thirteenth byte is the NALU Header. The part that matters most is its low five bits, which give the type of the H264 data, that is, the type of the NALU unit. For example, in the byte 0x67 (0110 0111) the low five bits are "00111" = 7, which is an SPS.
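
As a sketch of how the NALU Header byte splits into its three fields, the snippet below decodes it with shifts and masks and maps a few of the types from section 3.2 to short names. The names nal_header_fields, decode_nal_header and nal_type_name are made up for this illustration.

```c
#include <stdint.h>

/* Hypothetical helper type for the three fields of the one-byte NALU Header. */
typedef struct {
    uint8_t forbidden_zero_bit; /* F, bit 7, should be 0 */
    uint8_t nal_ref_idc;        /* NRI, bits 6-5 */
    uint8_t nal_unit_type;      /* type, bits 4-0 */
} nal_header_fields;

nal_header_fields decode_nal_header(uint8_t header_byte)
{
    nal_header_fields f;
    f.forbidden_zero_bit = header_byte >> 7;
    f.nal_ref_idc        = (header_byte >> 5) & 0x03;
    f.nal_unit_type      = header_byte & 0x1F;
    return f;
}

/* Short names for the types that show up most often in this article. */
const char *nal_type_name(uint8_t nal_unit_type)
{
    switch (nal_unit_type) {
    case 1:  return "non-IDR slice";
    case 5:  return "IDR slice";
    case 6:  return "SEI";
    case 7:  return "SPS";
    case 8:  return "PPS";
    case 24: return "STAP-A";
    case 28: return "FU-A";
    default: return "other";
    }
}
```

Decoding 0x67 gives F = 0, NRI = 3 and type 7 (SPS); decoding 0x68 gives type 8 (PPS).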

3.4 NALU Header of Fragmented Packet

The NALU Header of a fragmented packet consists of the FU indicator followed by the FU Header. After the 12-byte header is removed from an RTP packet, the thirteenth byte is the FU indicator, and its low five bits "11100" (28) mark the packet as FU-A. The fourteenth byte is the FU Header, and its low five bits give the type of the fragmented NALU unit, for example "00101" (5) for a slice of an IDR picture. The NALU Header written into the bare stream file is rebuilt from the high three bits of the FU indicator and the low five bits of the FU Header. For example, an FU indicator of 0x7c and an FU Header of 0x85 give (0x7c & 0xe0) | (0x85 & 0x1f) = 0x65.

Fragmented packets are further divided into start packets, intermediate packets and end packets. The M flag in the RTP header gives a first hint: M = 1 marks the last packet of a frame, which is the end packet when the frame consists of a single NALU unit, and M = 0 means a start packet or an intermediate packet, which can then be told apart by bit 7 of the FU Header (counting from 0 at the least significant bit). The more direct test uses bits 7 and 6 of the FU Header (the S and E bits): 10 is the start packet, 00 is an intermediate packet, and 01 is the end packet.
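
The two operations described in this section, rebuilding the NALU Header and telling start, intermediate and end packets apart, fit in a few lines. This is an illustrative sketch; fu_position, fu_fragment_position and rebuild_nal_header are hypothetical names, and the position is decided from the S and E bits only.

```c
#include <stdint.h>

/* Hypothetical result type for the position of an FU-A fragment. */
typedef enum { FU_START, FU_MIDDLE, FU_END, FU_INVALID } fu_position;

/* Reads bit 7 (S) and bit 6 (E) of the FU Header. */
fu_position fu_fragment_position(uint8_t fu_header)
{
    int s = (fu_header >> 7) & 0x01;
    int e = (fu_header >> 6) & 0x01;

    if (s && e) {
        return FU_INVALID;  /* one fragment cannot be both start and end */
    }
    if (s) {
        return FU_START;
    }
    if (e) {
        return FU_END;
    }
    return FU_MIDDLE;
}

/* High three bits (F, NRI) from the FU indicator, low five bits (type)
 * from the FU Header. */
uint8_t rebuild_nal_header(uint8_t fu_indicator, uint8_t fu_header)
{
    return (uint8_t)((fu_indicator & 0xE0) | (fu_header & 0x1F));
}
```

With an FU indicator of 0x7c and an FU Header of 0x85, rebuild_nal_header returns 0x65 and fu_fragment_position reports a start packet.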

3.5 NALU Payload for Single Packet and Fragment Packet

The NALU Payload of a single packet is the data of that NALU unit, starting right after the one-byte NALU Header. For fragmented packets, the bytes after the FU Header of each fragment are concatenated in packet order to form the data of one NALU unit.

3.6 NALU Payload of Combined Packet

A combined packet carries several NALU units, each preceded by a two-byte length field. They have to be extracted one at a time, and each is written to the file with the start code "00 00 00 01" in front of it so that it forms a separate NALU unit.
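
Reassembling one NALU unit from its fragments can be sketched as follows. This is a simplified illustration and not the program from section 5: fu_assembler, fu_a_push and the buffer size FU_BUF_CAP are assumptions made for the example, packets are assumed to arrive in order, and packet loss is only detected when the start fragment is missing.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FU_BUF_CAP 4096  /* arbitrary capacity chosen for this sketch */

/* Hypothetical reassembly state for one fragmented NALU unit. */
typedef struct {
    uint8_t data[FU_BUF_CAP]; /* start code + NALU Header + NALU data */
    size_t  len;              /* valid bytes in data */
    int     in_progress;      /* 1 after a start fragment was seen */
} fu_assembler;

/* payload points at the FU indicator (first byte after the RTP header).
 * Returns 1 when a->data holds a complete NALU unit, 0 when more
 * fragments are needed, -1 on malformed input or overflow. */
int fu_a_push(fu_assembler *a, const uint8_t *payload, size_t payload_len)
{
    static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };
    int start, end;
    size_t n;

    if (a == NULL || payload == NULL || payload_len < 2) {
        return -1;
    }
    if ((payload[0] & 0x1F) != 28) {
        return -1;                      /* not an FU-A packet */
    }
    start = (payload[1] >> 7) & 0x01;   /* S bit */
    end   = (payload[1] >> 6) & 0x01;   /* E bit */
    if (start && end) {
        return -1;
    }
    if (start) {
        /* New NALU unit: start code, then the rebuilt NALU Header. */
        memcpy(a->data, start_code, sizeof start_code);
        a->data[4] = (uint8_t)((payload[0] & 0xE0) | (payload[1] & 0x1F));
        a->len = 5;
        a->in_progress = 1;
    } else if (!a->in_progress) {
        return -1;                      /* start fragment was never seen */
    }
    n = payload_len - 2;                /* bytes after the FU Header */
    if (a->len + n > FU_BUF_CAP) {
        a->in_progress = 0;
        return -1;
    }
    memcpy(a->data + a->len, payload + 2, n);
    a->len += n;
    if (end) {
        a->in_progress = 0;
        return 1;
    }
    return 0;
}
```

A caller would zero-initialise one fu_assembler, call fu_a_push for every FU-A payload, and write a->data (a->len bytes) to the output file each time the function returns 1.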


For example, take the RTP packet shown below. "80 64 94 1d 00 04 e3 3c 9d 9c 66 fc" is the RTP packet header. 78 is the NALU Header, and its low five bits "11000" = 24 indicate that the packet is a combined packet. "00 0f" is the length (15 bytes) of the first NALU unit (NALU header + payload), "67 42 c0 29 43 23 50 16 87 a4 03 c2 21 1a 80". 67 is its NALU header, and the low five bits "00111" indicate that the NALU unit is of type SPS. "00 04" is the length of the second NALU unit (NALU header + payload), "68 48 e3 c8". 68 is its NALU header, and the low five bits "01000" indicate that the NALU unit is of type PPS.

/*
        80 64 94 1d 00 04 e3 3c 9d 9c 66 fc
        78 00 0f 67 42 c0 29 43 23 50 16 87
        a4 03 c2 21 1a 80 00 04 68 48 e3 c8
*/
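
The walk through the example packet can be written as a loop over the length-prefixed NALU units. The function below is a minimal sketch with a hypothetical name, stap_a_to_annexb; it expects the payload without the 12-byte RTP header and writes each NALU unit with a 4-byte start code into a caller-supplied buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* payload points at the STAP-A header byte (0x78 in the example).
 * Returns the number of bytes written to out, or -1 if the packet is
 * malformed or out is too small. */
long stap_a_to_annexb(const uint8_t *payload, size_t payload_len,
                      uint8_t *out, size_t out_cap)
{
    static const uint8_t start_code[4] = { 0x00, 0x00, 0x00, 0x01 };
    size_t pos = 1;       /* skip the STAP-A header byte */
    size_t written = 0;

    if (payload == NULL || out == NULL || payload_len < 1) {
        return -1;
    }
    if ((payload[0] & 0x1F) != 24) {
        return -1;        /* not a STAP-A packet */
    }
    while (pos < payload_len) {
        size_t nalu_size;

        if (payload_len - pos < 2) {
            return -1;    /* truncated length field */
        }
        /* The length field is 16 bits, big-endian. */
        nalu_size = ((size_t)payload[pos] << 8) | payload[pos + 1];
        pos += 2;
        if (nalu_size == 0 || nalu_size > payload_len - pos) {
            return -1;    /* length points past the end of the packet */
        }
        if (out_cap - written < sizeof start_code + nalu_size) {
            return -1;    /* output buffer too small */
        }
        memcpy(out + written, start_code, sizeof start_code);
        memcpy(out + written + sizeof start_code, payload + pos, nalu_size);
        written += sizeof start_code + nalu_size;
        pos += nalu_size;
    }
    return (long)written;
}
```

For the 24 payload bytes of the example packet this produces 27 bytes: a start code, the 15-byte SPS, a second start code and the 4-byte PPS.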

4. H264 data format in the file

An h264 bare stream file is a sequence of NALU units, and each NALU unit is written as the start code 00 00 00 01 + NALU Header + NALU data.

5. Unpacking program that generates the h264 bare stream file

The generated receive.h264 file can be played directly with ffplay receive.h264.
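
One quick way to check a generated file is to scan it for start codes and list the type of each NALU unit that follows. The sketch below works on an in-memory buffer; next_nalu_offset and list_nalu_types are hypothetical helper names. It searches for the 3-byte pattern 00 00 01, which also matches the last three bytes of the 4-byte start code used in this article.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first byte after the next start code found at
 * or after `from`, or len if there is none. */
size_t next_nalu_offset(const uint8_t *buf, size_t len, size_t from)
{
    size_t i;

    for (i = from; i + 3 <= len; i++) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01) {
            return i + 3;
        }
    }
    return len;
}

/* Stores the type of every NALU unit in buf into types (at most
 * max_types entries) and returns how many were stored. */
size_t list_nalu_types(const uint8_t *buf, size_t len,
                       uint8_t *types, size_t max_types)
{
    size_t count = 0;
    size_t pos = next_nalu_offset(buf, len, 0);

    while (pos < len && count < max_types) {
        types[count++] = buf[pos] & 0x1F;  /* low five bits of the NALU Header */
        pos = next_nalu_offset(buf, len, pos);
    }
    return count;
}
```

A stream that starts with an SPS, a PPS and an IDR slice would yield the types 7, 8 and 5 in that order.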

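#include <stdio.h>      // FILE, fopen, fwrite, putc
#include <stdlib.h>     // malloc, calloc, free, exit
#include <string.h>     // memcpy, memset
#include <QDebug>       // qDebug()
#include <QString>      // QString

#define PARESERTP
#define TEST_PARESERTP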

#ifdef TEST_PARESERTP
static FILE* poutfile = NULL;
const char* outputfilename = "G:\\receive.h264";

int initTest()
{
    poutfile = fopen(outputfilename, "ab+");
    if (!poutfile)
    {
        return -1;
    }
    return 0;
}

void uninitTest()
{
    if (poutfile)
    {
        fclose(poutfile);
        poutfile = NULL;
    }
}

#endif

#ifdef PARESERTP

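#define DC_PRINT_DEBUG qDebug() << __FUNCTION__ << __LINE__
#define MAXDATASIZE 2048

// The declarations below are not shown in the original listing although the
// code uses them. They are assumptions added so that the listing compiles.
typedef unsigned char BYTE;               // same definition as in <windows.h>
#define MAXFREAMLEN (1024 * 1024)         // assumed capacity of the FU-A reassembly buffer
BYTE fuData[MAXFREAMLEN];                 // reassembled FU-A data of the current NALU unit
unsigned int fuDataLen = 0;               // number of valid bytes in fuData
int isFU = 0;                             // 1 once a first fragment has been received
int fuReviceEnd = 0;                      // 1 once the last fragment has been received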


typedef struct
{
    unsigned char version;      //!< Version, 2 bits, MUST be 0x2
    unsigned char padding;      //!< Padding bit, Padding MUST NOT be used
    unsigned char extension;    //!< Extension, MUST be zero
    unsigned char cc;           //!< CSRC count, normally 0 in the absence of RTP mixers
    unsigned char marker;       //!< Marker bit
    unsigned char pt;           //!< 7 bits, Payload Type, dynamically established
    unsigned int seq_no;        //!< RTP sequence number (16 bits), incremented by one for each sent packet
    unsigned int timestamp;     //!< timestamp, 90 kHz clock for H.264
    unsigned int ssrc;          //!< Synchronization Source, chosen randomly
    unsigned char* payload;     //!< the payload including payload headers
    unsigned int paylen;        //!< length of payload in bytes
} RTPpacket_t;

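// Bit-field view of the 12-byte fixed RTP header. The field order assumes a
// compiler that allocates bit-fields from the low bits upward (the usual
// little-endian layout). After a memcpy from the packet, seq_no, timestamp
// and ssrc are still in network byte order.
typedef struct
{
    /* byte 0 */
    unsigned short csrc_len : 4;        /* CSRC count, expect 0 */
    unsigned short extension : 1;       /* extension bit, expect 0 */
    unsigned short padding : 1;         /* padding bit, expect 0 */
    unsigned short version : 2;         /* expect 2 */
    /* byte 1 */
    unsigned short payloadtype : 7;     /* payload type, 100 in the capture used here */
    unsigned short marker : 1;          /* 1 on the last packet of a frame */
    /* bytes 2,3 */
    unsigned short seq_no;
    /* bytes 4-7 */
    unsigned int timestamp;
    /* bytes 8-11 */
    unsigned int ssrc;                  /* synchronization source identifier */
} RTP_FIXED_HEADER;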


typedef struct
{
    unsigned char forbidden_bit;           //! Should always be FALSE
    unsigned char nal_reference_idc;       //! NALU_PRIORITY_xxxx
    unsigned char nal_unit_type;           //! NALU_TYPE_xxxx
    unsigned int startcodeprefix_len;      //! length of the start code prefix in bytes
    unsigned int len;                      //! length of the NAL unit including its header, i.e. the bytes between one start code and the next
    unsigned int max_size;                 //! capacity of buf in bytes
    unsigned char* buf;                    //! NAL unit data, beginning with the NAL header
    unsigned int lost_packets;             //! reserved
} NALU_t;

typedef struct
{
    //byte 0
    unsigned char TYPE : 5;
    unsigned char NRI : 2;
    unsigned char F : 1;
} NALU_HEADER; // 1 BYTE 


typedef struct
{
    //byte 0
    unsigned char TYPE : 5;
    unsigned char NRI : 2;
    unsigned char F : 1;
} FU_INDICATOR; // 1 BYTE 


typedef struct
{
    //byte 0
    unsigned char TYPE : 5;
    unsigned char R : 1;
    unsigned char E : 1;
    unsigned char S : 1;
} FU_HEADER;   // 1 BYTES 


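NALU_t* AllocNALU(int buffersize)
{
    NALU_t* n;

    if (buffersize <= 0 || (n = (NALU_t*)calloc(1, sizeof(NALU_t))) == NULL)
    {
        //DC_PRINT_DEBUG<<Str.sprintf("AllocNALU Error: Allocate Memory To NALU_t Failed ");
        exit(EXIT_FAILURE);
    }

    // Allocate the data buffer that buffersize asks for
    n->max_size = buffersize;
    if ((n->buf = (unsigned char*)calloc(buffersize, sizeof(unsigned char))) == NULL)
    {
        free(n);
        exit(EXIT_FAILURE);
    }
    return n;
}

void FreeNALU(NALU_t* n)
{
    if (n)
    {
        free(n->buf);   // free(NULL) is a no-op
        free(n);
    }
}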


void rtp_unpackage(char* bufIn, int len, BYTE* outbuff, unsigned int* dataLen)
{
    unsigned char recvbuf[1500] = { 0 };
    RTPpacket_t* p = NULL;
    RTP_FIXED_HEADER* rtp_hdr = NULL;
    NALU_HEADER* nalu_hdr = NULL;

    FU_INDICATOR* fu_ind = NULL;
    FU_HEADER* fu_hdr = NULL;
    int total_bytes = 0;
    static int total_recved = 0;
    
    QString Str;
    static BYTE byHeadLong[] = { 0x00, 0x00, 0x00, 0x01 };
    int offest = 0;

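    *dataLen = 0;
    // Need the 12-byte RTP header plus at least one payload byte, and the packet must fit in recvbuf
    if (bufIn == NULL || len < 13 || len > (int)sizeof(recvbuf))
    {
        DC_PRINT_DEBUG << Str.sprintf("invalid rtp packet length %d\n", len);
        return;
    }
    memcpy(recvbuf, bufIn, len);          //copy rtp package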

    //begin rtp_payload and rtp_header
    if ((p = (RTPpacket_t*)malloc(sizeof(RTPpacket_t))) == NULL)
    {
        DC_PRINT_DEBUG << Str.sprintf("RTPpacket_t MEMORY ERROR\n");
        *dataLen = 0;
        return;
    }

    if ((p->payload = (unsigned char*)malloc(MAXDATASIZE)) == NULL)
    {
        DC_PRINT_DEBUG << Str.sprintf("RTPpacket_t payload MEMORY ERROR\n");
        free(p);
        *dataLen = 0;
        return;
    }

    if ((rtp_hdr = (RTP_FIXED_HEADER*)malloc(sizeof(RTP_FIXED_HEADER))) == NULL)
    {
        DC_PRINT_DEBUG << Str.sprintf("RTP_FIXED_HEADER MEMORY ERROR\n");
        free(p->payload);
        free(p);
        *dataLen = 0;
        return;
    }

    // rtp_hdr =(RTP_FIXED_HEADER*)&recvbuf[0]; 
    memcpy((void*)rtp_hdr, (void*)&recvbuf[0], sizeof(RTP_FIXED_HEADER));

    p->version = rtp_hdr->version;
    p->padding = rtp_hdr->padding;
    p->extension = rtp_hdr->extension;
    p->cc = rtp_hdr->csrc_len;
    p->marker = rtp_hdr->marker;
    p->pt = rtp_hdr->payloadtype;
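    // Multi-byte fields are big-endian on the wire, so assemble them from the raw bytes
    p->seq_no = (recvbuf[2] << 8) | recvbuf[3];
    p->timestamp = ((unsigned int)recvbuf[4] << 24) | (recvbuf[5] << 16) | (recvbuf[6] << 8) | recvbuf[7];
    p->ssrc = ((unsigned int)recvbuf[8] << 24) | (recvbuf[9] << 16) | (recvbuf[10] << 8) | recvbuf[11];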
    // DC_PRINT_DEBUG << Str.sprintf("my rtp decode versoin: %d,payload type: 0x%x,seq number: 0x%x,timestamp: 0x%x,ssrc: 0x%x\n",
    //     rtp_hdr->version, rtp_hdr->payloadtype, rtp_hdr->seq_no, rtp_hdr->timestamp, rtp_hdr->ssrc);

    //end rtp_payload and rtp_header
    //
    //begin nal_hdr

    if ((nalu_hdr = (NALU_HEADER*)malloc(sizeof(NALU_HEADER))) == NULL)
    {
        DC_PRINT_DEBUG << Str.sprintf("NALU_HEADER MEMORY ERROR\n");
        free(rtp_hdr);
        free(p->payload);
        free(p);
        *dataLen = 0;
        return;
    }

    memcpy((void*)nalu_hdr, (void*)&recvbuf[12], 1);

    // DC_PRINT_DEBUG << Str.sprintf("forbidden_zero_bit: %d\n", nalu_hdr->F);
    // DC_PRINT_DEBUG << Str.sprintf("nal_reference_idc:  %d\n", nalu_hdr->NRI);
    // DC_PRINT_DEBUG << Str.sprintf("nal type: %d\n", nalu_hdr->TYPE);

    //end nal_hdr

    //Start unpacking. Three h264 packet types are handled: single packet, STAP-A combined packet and FU-A fragmented packet
    if (nalu_hdr->TYPE == 0)
    {
        DC_PRINT_DEBUG << Str.sprintf("pkt error, type 0 is not defined\n");
    }
    else if (nalu_hdr->TYPE > 0 && nalu_hdr->TYPE < 24)  //Single packet 0x00 < type < 0x18
    {
        DC_PRINT_DEBUG << Str.sprintf("cur pkt is single\n");

#ifdef TEST_PARESERTP
        if (poutfile)
            fwrite(byHeadLong, 1, sizeof(byHeadLong), poutfile);	//write the start code 0x00000001
#endif
        memcpy(outbuff + total_bytes, byHeadLong, sizeof(byHeadLong));
        total_bytes += sizeof(byHeadLong);

#ifdef TEST_PARESERTP
        if(poutfile)
            fwrite(nalu_hdr, 1, 1, poutfile);	//write NAL_HEADER
#endif
        memcpy(outbuff + total_bytes, nalu_hdr, sizeof(NALU_HEADER));
        total_bytes += sizeof(NALU_HEADER);

        memcpy(p->payload, &recvbuf[13], len - 13);
        p->paylen = len - 13;

#ifdef TEST_PARESERTP
        if(poutfile)
            fwrite(p->payload, 1, p->paylen, poutfile);	//write NAL data
#endif
        memcpy(outbuff + total_bytes, p->payload, len - 13);
        total_bytes += p->paylen;

    }
    else if (nalu_hdr->TYPE == 24)                    //STAP-A single-time combined packet 0x18
    {
        int pktLen = len - 13;                        //bytes left after the RTP header and the STAP-A header
        /*
        80 64 94 1d 00 04 e3 3c 9d 9c 66 fc
        78 00 0f 67 42 c0 29 43 23 50 16 87
        a4 03 c2 21 1a 80 00 04 68 48 e3 c8
        */
        unsigned short naluSize;
        unsigned int nextNaluSizeIndex = 13;
        while (pktLen > 2)                            //need the 2-byte size field plus at least one data byte
        {
            naluSize = recvbuf[nextNaluSizeIndex] << 8 | recvbuf[nextNaluSizeIndex + 1];
            if (naluSize == 0 || naluSize > pktLen - 2)
            {
                //A bad size would otherwise read past the packet or loop forever
                DC_PRINT_DEBUG << Str.sprintf("STAP-A nalu size %d does not fit the packet\n", naluSize);
                break;
            }

#ifdef TEST_PARESERTP
            if (poutfile)
                fwrite(byHeadLong, 1, sizeof(byHeadLong), poutfile);	//write the start code 0x00000001
#endif
            memcpy(outbuff + total_bytes, byHeadLong, sizeof(byHeadLong));
            total_bytes += sizeof(byHeadLong);
#ifdef TEST_PARESERTP
            if (poutfile)
                fwrite(&recvbuf[nextNaluSizeIndex + 2], naluSize, 1, poutfile);
#endif
            memcpy(outbuff + total_bytes, &recvbuf[nextNaluSizeIndex + 2], naluSize);
            total_bytes += naluSize;

            nextNaluSizeIndex = nextNaluSizeIndex + naluSize + 2;
            pktLen = pktLen - naluSize - 2;
        }
    }
    else if (nalu_hdr->TYPE == 28)                     //FU-A fragmented packet (FU-B, type 29, carries an extra 2-byte DON field and is not handled here)
    {
        fu_ind = (FU_INDICATOR*)malloc(sizeof(FU_INDICATOR));
        fu_hdr = (FU_HEADER*)malloc(sizeof(FU_HEADER));
        if (fu_ind == NULL || fu_hdr == NULL || len < 14)   //len < 14 means the FU header is missing
        {
            DC_PRINT_DEBUG << Str.sprintf("FU-A MEMORY ERROR or packet too short\n");
            free(fu_ind);
            free(fu_hdr);
            free(nalu_hdr);
            free(rtp_hdr);
            free(p->payload);
            free(p);
            *dataLen = 0;
            return;
        }

        // fu_ind=(FU_INDICATOR*)&recvbuf[12]; //The fragmented package uses FU_INDICATOR instead of NALU_HEADER
        memcpy((void*)fu_ind, (void*)&recvbuf[12], 1);  // 0x7c

        
        // DC_PRINT_DEBUG << Str.sprintf("FU_INDICATOR->F     :%d\n", fu_ind->F);
        // DC_PRINT_DEBUG << Str.sprintf("FU_INDICATOR->NRI   :%d\n", fu_ind->NRI);
        // DC_PRINT_DEBUG << Str.sprintf("FU_INDICATOR->TYPE  :%d\n", fu_ind->TYPE);


        // fu_hdr=(FU_HEADER*)&recvbuf[13]; //FU_HEADER assignment
        memcpy((void*)fu_hdr, (void*)&recvbuf[13], 1);  //0x85
        
        // DC_PRINT_DEBUG << Str.sprintf("FU_HEADER->S        :%d\n", fu_hdr->S);
        // DC_PRINT_DEBUG << Str.sprintf("FU_HEADER->E        :%d\n", fu_hdr->E);
        // DC_PRINT_DEBUG << Str.sprintf("FU_HEADER->R        :%d\n", fu_hdr->R);
        // DC_PRINT_DEBUG << Str.sprintf("FU_HEADER->TYPE     :%d\n", fu_hdr->TYPE);


        if (fu_hdr->E == 1)                            //The last packet of the fragmented packet (E bit set)
        {
            DC_PRINT_DEBUG << Str.sprintf("cur pkt is FU-A last pkt\n");

            memcpy(p->payload, &recvbuf[14], len - 14);
            p->paylen = len - 14;
#ifdef TEST_PARESERTP
            if(poutfile)
                fwrite(p->payload, 1, p->paylen, poutfile);	//write NAL data
#endif
            memcpy(outbuff + total_bytes, p->payload, p->paylen);
            total_bytes += p->paylen;
            DC_PRINT_DEBUG << Str.sprintf("payload len = %u\n", p->paylen);

            // Save all fu-a subpackage data to memory, unless it would overflow fuData
            if (fuDataLen + (unsigned int)total_bytes <= MAXFREAMLEN)
            {
                memcpy(fuData + fuDataLen, outbuff, total_bytes);
                fuDataLen += total_bytes;
            }
            fuReviceEnd = 1; // Received all FU packet data
        }
        else if (rtp_hdr->marker == 0)                 //Fragmented packets but not the last packet
        {
            if (fu_hdr->S == 1)                        //The first packet of the fragment series
            {
                unsigned char F;
                unsigned char NRI;
                unsigned char TYPE;
                unsigned char nh;
                DC_PRINT_DEBUG << Str.sprintf("cur pkt is FU-A first pkt\n");

#ifdef TEST_PARESERTP
                if (poutfile)
                    fwrite(byHeadLong, 1, sizeof(byHeadLong), poutfile);	//write the start code 0x00000001
#endif
                memcpy(outbuff + total_bytes, byHeadLong, sizeof(byHeadLong));
                total_bytes += sizeof(byHeadLong);

                //Rebuild the NAL header: F and NRI from the FU indicator, type from the FU header
                F = fu_ind->F << 7;
                NRI = fu_ind->NRI << 5;
                TYPE = fu_hdr->TYPE;
                nh = F | NRI | TYPE;   //NAL HEADER
#ifdef TEST_PARESERTP
                if (poutfile)
                    putc(nh, poutfile);
#endif
                memcpy(outbuff + total_bytes, &nh, 1);
                total_bytes += 1;

                memcpy(p->payload, &recvbuf[14], len - 14);
                p->paylen = len - 14;
#ifdef TEST_PARESERTP
                if(poutfile)
                    fwrite(p->payload, 1, p->paylen, poutfile);
#endif
                memcpy(outbuff + total_bytes, p->payload, p->paylen);
                total_bytes += p->paylen;

                // DC_PRINT_DEBUG << Str.sprintf("payload len = %d\n", p->paylen);
                // Start a new reassembly buffer for this NALU unit
                memset(fuData, 0, MAXFREAMLEN);
                fuDataLen = 0;
                isFU = 1; //first packet received
                memcpy(fuData + fuDataLen, outbuff, total_bytes);
                fuDataLen += total_bytes;

            }
            else                                      //intermediate fragment
            {
                memcpy(p->payload, &recvbuf[14], len - 14);
                p->paylen = len - 14;
#ifdef TEST_PARESERTP
                if(poutfile)
                    fwrite(p->payload, 1, p->paylen, poutfile);
#endif
                memcpy(outbuff + total_bytes, p->payload, p->paylen);
                total_bytes += p->paylen;
                // DC_PRINT_DEBUG << Str.sprintf("payload len = %d\n", p->paylen);

                // Save all fu-a subpackage data to memory, unless it would overflow fuData
                if (fuDataLen + (unsigned int)total_bytes <= MAXFREAMLEN)
                {
                    memcpy(fuData + fuDataLen, outbuff, total_bytes);
                    fuDataLen += total_bytes;
                }
            }
        }
    }
    else
    {
        DC_PRINT_DEBUG << Str.sprintf("unsupported or undefined packet type %d\n", nalu_hdr->TYPE);
    }

    // DC_PRINT_DEBUG<<"rtp decode total_bytes: "<<total_bytes;
    *dataLen = total_bytes;
    total_bytes = 0;
    memset(recvbuf, 0, 1500);
    free(fu_ind);
    free(fu_hdr);
    free(nalu_hdr);
    free(rtp_hdr);
    free(p->payload);
    free(p);

    //end unpacking
    //
    return;
}
#endif

Tags: network rtc

Posted by eektech909 on Sun, 02 Oct 2022 10:28:15 +1030