Let's see how to parse the AVCVIDEOPACKET according to the FLV video tag specification. As an example, I took one of the YouTube videos to show where the record is located. In the picture below you can see the first video packet inside the red rectangle (1).
1. FLV Tag. First red rectangle.
09 - TagType (UB[5]) = 9: video packet. The preceding fields in the same byte are Reserved (UB[2]) = 0 and Filter (UB[1]) = 0, so the three fields together occupy 1 byte (UB[2] + UB[1] + UB[5]).
00 00 2F - DataSize (UI[24]) = 47 bytes: length of the message, i.e. the number of bytes after StreamID to the end of the tag (equal to the length of the tag minus 11).
00 00 00 - Timestamp(UI[24]): Time in milliseconds at which the data in this tag applies. This value is relative to the first tag in the FLV file, which always has a timestamp of 0.
00 - TimestampExtended (UI[8]): extension of the Timestamp field to form an SI32 value. This byte holds the upper 8 bits, and the Timestamp field holds the lower 24 bits.
00 00 00 - StreamID (UI[24]): Always 0.
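The 11 header bytes listed above can be decoded with a few integer conversions. This is a minimal sketch, not a full FLV reader: the names `FlvTagHeader` and `parse_flv_tag_header` are made up for illustration, and the timestamp is combined as an unsigned value for simplicity.

```python
from typing import NamedTuple


class FlvTagHeader(NamedTuple):
    tag_type: int
    data_size: int
    timestamp_ms: int
    stream_id: int


def parse_flv_tag_header(buf: bytes) -> FlvTagHeader:
    """Decode the fixed 11-byte FLV tag header."""
    if len(buf) < 11:
        raise ValueError("FLV tag header needs 11 bytes")
    tag_type = buf[0] & 0x1F                      # low 5 bits; top 3 are Reserved + Filter
    data_size = int.from_bytes(buf[1:4], "big")   # UI24
    ts_low = int.from_bytes(buf[4:7], "big")      # UI24, lower 24 bits of the timestamp
    ts_ext = buf[7]                               # UI8, upper 8 bits of the timestamp
    timestamp_ms = (ts_ext << 24) | ts_low        # treated as unsigned in this sketch
    stream_id = int.from_bytes(buf[8:11], "big")  # UI24, always 0
    return FlvTagHeader(tag_type, data_size, timestamp_ms, stream_id)


# The header bytes from the first red rectangle.
header = parse_flv_tag_header(bytes.fromhex("09 00 00 2F 00 00 00 00 00 00 00"))
print(header)
```

For the bytes in the example this gives a tag type of 9, a data size of 47, and a timestamp and stream ID of 0.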
2. VideoTagHeader. Yellow rectangle.
17 - Two fields packed into one byte. FrameType (UB[4]) = 1 means it's a key frame, and CodecID (UB[4]) = 7 means it's an AVC-encoded frame.
00 - AVCPacketType (UI[8]) = 0: AVC sequence header.
00 00 00 - CompositionTime (SI[24]): composition time offset. Defined only for AVCPacketType == 1; otherwise always 0.
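As a rough illustration of the yellow rectangle, the sketch below splits the first byte into its two 4-bit fields and reads the AVC-specific bytes only when CodecID is 7. The function name `parse_video_tag_header` and the dictionary keys are assumptions made for this example.

```python
def parse_video_tag_header(buf: bytes) -> dict:
    """Decode FrameType/CodecID and, for AVC, AVCPacketType and CompositionTime."""
    if len(buf) < 1:
        raise ValueError("empty video tag")
    fields = {
        "frame_type": buf[0] >> 4,    # high nibble: 1 = key frame
        "codec_id": buf[0] & 0x0F,    # low nibble: 7 = AVC
    }
    if fields["codec_id"] == 7:
        if len(buf) < 5:
            raise ValueError("AVC video tag header needs 5 bytes")
        fields["avc_packet_type"] = buf[1]  # 0 = sequence header, 1 = NALU
        # CompositionTime is a signed 24-bit big-endian offset.
        fields["composition_time"] = int.from_bytes(buf[2:5], "big", signed=True)
    return fields


# The five bytes from the yellow rectangle.
video_header = parse_video_tag_header(bytes.fromhex("17 00 00 00 00"))
print(video_header)
```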
3. VideoTagBody. White rectangle. If AVCPacketType == 0, the body is an AVCDecoderConfigurationRecord.
01 - configurationVersion (UI[8]) = 1
4D - AVCProfileIndication (UI[8]) = 77 (Main profile) contains the profile code as defined in ISO/IEC 14496-10.
40 - profile_compatibility (UI[8]) = 64 is a byte defined exactly the same as the byte which occurs between profile_idc and level_idc in a sequence parameter set (SPS), as defined in ISO/IEC 14496-10.
1E - AVCLevelIndication (UI[8]) = 30 (level 3.0) contains the level code as defined in ISO/IEC 14496-10.
FF - Two fields packed into one byte: reserved = ‘111111’b and lengthSizeMinusOne (UI[2]) = 11b = 3. The latter indicates the length in bytes of the NALUnitLength field in an AVC video sample or AVC parameter set sample, minus one, so NALUnitLength is 4 bytes long here.
E1 - Two fields packed into one byte: reserved = ‘111’b and numOfSequenceParameterSets (UI[5]) = 00001b = 1. The latter indicates the number of SPSs that are used as the initial set of SPSs for decoding the AVC elementary stream.
The next fields are repeated in a loop.
for (i = 0; i < numOfSequenceParameterSets; i++) {
    unsigned int(16) sequenceParameterSetLength;
    bit(8*sequenceParameterSetLength) sequenceParameterSetNALUnit;
}
------ Start Cycle ------
4. Green rectangle. 00 1B - sequenceParameterSetLength (UI[16]) = 27 indicates the length in bytes of the SPS NAL unit as defined in ISO/IEC 14496-10.
5. Green rectangle. 67 4D 40 ... B1 72 40 - sequenceParameterSetNALUnit (bit(8*sequenceParameterSetLength) = 8*27 bits = 27 bytes) contains an SPS NAL unit, as specified in ISO/IEC 14496-10. SPSs shall occur in order of ascending parameter set identifier with gaps being allowed.
------ End Cycle ------
01 - numOfPictureParameterSets (UI[8]) = 1 indicates the number of picture parameter sets (PPSs) that are used as the initial set of PPSs for decoding the AVC elementary stream.
The next fields are repeated in a loop.
for (i = 0; i < numOfPictureParameterSets; i++) {
    unsigned int(16) pictureParameterSetLength;
    bit(8*pictureParameterSetLength) pictureParameterSetNALUnit;
}
------ Start Cycle ------
6. Green rectangle. 00 04 - pictureParameterSetLength (UI[16]) = 4 indicates the length in bytes of the PPS NAL unit as defined in ISO/IEC 14496-10.
7. White rectangle. 68 EE 32 C8 - pictureParameterSetNALUnit (bit(8*pictureParameterSetLength) = 8*4 bits = 4 bytes) contains a PPS NAL unit, as specified in ISO/IEC 14496-10. PPSs shall occur in order of ascending parameter set identifier with gaps being allowed.
------ End Cycle ------
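Putting the record fields and both loops together, a parser for the AVCDecoderConfigurationRecord could look roughly like the sketch below. The function and key names are invented for illustration. The fixed fields and the PPS in the sample input come from the dump, but the SPS is a 4-byte placeholder (67 AA BB CC), because the real 27-byte SPS is not shown in full here.

```python
def parse_avc_decoder_configuration_record(buf: bytes) -> dict:
    """Parse an AVCDecoderConfigurationRecord into a plain dict."""

    def take(pos: int, count: int) -> bytes:
        # Slice with a bounds check so truncated input fails loudly.
        if pos + count > len(buf):
            raise ValueError("record is truncated")
        return buf[pos:pos + count]

    def read_parameter_sets(pos: int, count: int):
        # Each entry is a 16-bit big-endian length followed by the NAL unit.
        units = []
        for _ in range(count):
            length = int.from_bytes(take(pos, 2), "big")
            pos += 2
            units.append(take(pos, length))
            pos += length
        return units, pos

    fixed = take(0, 6)
    # numOfSequenceParameterSets is the low 5 bits of the sixth byte.
    sps_units, pos = read_parameter_sets(6, fixed[5] & 0x1F)
    num_pps = take(pos, 1)[0]
    pps_units, pos = read_parameter_sets(pos + 1, num_pps)
    return {
        "configuration_version": fixed[0],
        "profile_indication": fixed[1],
        "profile_compatibility": fixed[2],
        "level_indication": fixed[3],
        "nal_unit_length_size": (fixed[4] & 0x03) + 1,  # lengthSizeMinusOne + 1
        "sps": sps_units,
        "pps": pps_units,
    }


# Placeholder SPS: NOT the real 27-byte SPS from the capture.
record = bytes.fromhex(
    "01 4D 40 1E FF E1 "   # version, profile, compatibility, level, FF, E1
    "00 04 67 AA BB CC "   # SPS length = 4, placeholder SPS bytes
    "01 "                  # numOfPictureParameterSets
    "00 04 68 EE 32 C8"    # PPS length = 4, PPS NAL unit from the dump
)
config = parse_avc_decoder_configuration_record(record)
print(config["profile_indication"], config["level_indication"],
      config["nal_unit_length_size"])
print(config["pps"][0].hex())
```

With the real tag body, the only difference would be an SPS length of 27 followed by the full SPS bytes.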
8. Red rectangle. 00 00 00 3A - PreviousTagSize1 (UI[32]) = 58 bytes. Size of the previous tag, including its header, in bytes. For FLV version 1, this value is 11 plus the DataSize of the previous tag. You can check the length of the tag by adding up the pictured sections 1 to 7: it's exactly 58 bytes.
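The size check can be written out as simple arithmetic. The sketch below adds up the byte counts of the body pieces from sections 2 to 7 and compares the result with the PreviousTagSize bytes; the dictionary labels are only descriptive names chosen for this example.

```python
FLV_TAG_HEADER_SIZE = 11  # TagType through StreamID

# Byte counts of each piece of the tag body in the example.
body_parts = {
    "VideoTagHeader": 1 + 1 + 3,       # 17, AVCPacketType, CompositionTime
    "record fixed fields": 6,          # 01 4D 40 1E FF E1
    "sequenceParameterSetLength": 2,
    "SPS NAL unit": 27,
    "numOfPictureParameterSets": 1,
    "pictureParameterSetLength": 2,
    "PPS NAL unit": 4,
}
data_size = sum(body_parts.values())
previous_tag_size = int.from_bytes(bytes.fromhex("00 00 00 3A"), "big")

print(data_size)                                             # 47
print(previous_tag_size == FLV_TAG_HEADER_SIZE + data_size)  # True
```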
Summary
Parsing the H.264 sequence header (AVCDecoderConfigurationRecord) should give the following data:
AVCProfileIndication
profile_compatibility
AVCLevelIndication
lengthSizeMinusOne (length in bytes of the NALUnitLength field, minus one)
numOfSequenceParameterSets
sequenceParameterSetLength
sequenceParameterSetNALUnit
numOfPictureParameterSets
pictureParameterSetLength
pictureParameterSetNALUnit
Download
1. Adobe Flash Video File Format Specification Version 10.1
2. Advanced Video Coding (AVC) file format
Hi Vadym,
This post was very helpful for a project I'm working on. I also need to parse AVCVIDEOPACKET when AVCPacketType == 1. In this case, the spec says that AVCVIDEOPACKET contains "One or more NALUs (Full frames are required)". Do you know how to parse those NALUs? Are they somehow framed? If you could put an example it would be great.
Thanks!
Good day Fernando!
There are two formats for NALU packaging:
The first is Annex B, which contains the SPS and PPS data, and you can parse it. If I understand correctly, this data is repeated throughout the whole H.264 stream.
The second is the AVCC format. In this case we don't have SPS and PPS data with every NALU, and the decoder "needs to know what the is the size of the nal_size field, in bytes".
http://aviadr1.blogspot.com/2010/05/h264-extradata-partially-explained-for.html
Also, you have to consider which container you need to parse. These formats are not codec-related, because they are used to build frames inside a container. For example, the FLV container uses the AVCC format, which requires an AVC decoder configuration record at the start, so the decoder doesn't need to parse each NALU header.
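As a rough sketch of the AVCC framing, and assuming each NALU in an AVCPacketType == 1 payload is prefixed with a big-endian length of lengthSizeMinusOne + 1 bytes, the payload could be split like this. The function name `split_avcc_nal_units` and the sample bytes are made up for illustration and are not taken from a real stream.

```python
from typing import List


def split_avcc_nal_units(payload: bytes, nal_length_size: int = 4) -> List[bytes]:
    """Split a length-prefixed (AVCC) payload into individual NAL units."""
    units = []
    pos = 0
    while pos < len(payload):
        if pos + nal_length_size > len(payload):
            raise ValueError("truncated NAL unit length field")
        length = int.from_bytes(payload[pos:pos + nal_length_size], "big")
        pos += nal_length_size
        if pos + length > len(payload):
            raise ValueError("NAL unit runs past the end of the payload")
        units.append(payload[pos:pos + length])
        pos += length
    return units


# Hypothetical payload: a 2-byte NALU followed by a 3-byte NALU.
sample = bytes.fromhex("00 00 00 02 65 88 00 00 00 03 41 9A 00")
nal_units = split_avcc_nal_units(sample, nal_length_size=4)
for unit in nal_units:
    # nal_unit_type is the low 5 bits of the first NALU byte.
    print(unit[0] & 0x1F, unit.hex())
```

The `nal_length_size` argument would come from the sequence header sent earlier with AVCPacketType == 0.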
Hi Vadym,
Thank you for the post. I am also working on a related project and was hoping that you could help me to understand something.
My FLV stream plays fine assuming that I play from the start and everything in order... however I want to accommodate "late joiners" to the feed. I have been able to make this work with H.263 and AAC codec... where I needed to send the FLV header, Meta, and first AAC packet so late joiners could start the stream correctly.
However for AVC...
My feed sends the first video packet with AVCPacketType = 0, then all the rest are AVCPacketType = 1. I have tried making sure that the AVCPacketType = 0 packet is the first video packet in the stream, but this does not seem to work.
Any idea on what I am missing here?
Thanks :)