This section covers practical aspects of VAD/CNG: packetization, interoperability, bandwidth saving, and testing.
VAD/CNG is packetized like any other voice payload, with minor exceptions in the payload type, the marker bit setting, and the handling of multiple frames. VAD algorithms operate on 10-ms frames or on the basic frame size of the codec. When the implementation uses smaller 5-ms frames, these algorithms can be made to operate on two 5-ms frames at a time. The simple power-based VAD-I can operate with flexible frame sizes of 5, 10, and 20 ms.
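The payload type and marker bit deviations can be illustrated with a minimal sketch of the 12-byte RTP fixed header. It assumes the static payload types 0 (G.711 mu-law, RFC 3551) and 13 (comfort noise at 8 kHz, RFC 3389), and the RFC 3551 convention of setting the marker bit on the first speech packet after a silence period. The function name `rtp_header` and the sequence, timestamp, and SSRC values are illustrative only and do not come from any particular implementation.

```python
import struct

PT_PCMU = 0   # assumed: static payload type for G.711 mu-law (RFC 3551)
PT_CN = 13    # assumed: static payload type for comfort noise, 8 kHz (RFC 3389)


def rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Pack a 12-byte RTP fixed header (version 2, no padding, extension, or CSRC)."""
    byte0 = 2 << 6                                    # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)  # M bit + 7-bit payload type
    return struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


# Last speech packet of a talkspurt, a VAD/CNG (SID) packet, and the first
# speech packet of the next talkspurt, which carries the marker bit.
speech = rtp_header(PT_PCMU, seq=100, timestamp=16000, ssrc=0x1234)
sid = rtp_header(PT_CN, seq=101, timestamp=16160, ssrc=0x1234)
resume = rtp_header(PT_PCMU, seq=102, timestamp=24160, ssrc=0x1234, marker=True)
```

In this sketch only the second header byte differs in form between the three packets: it carries payload type 0 for speech, 13 for the VAD/CNG packet, and payload type 0 with the marker bit set for the first packet of the new talkspurt.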
A VAD packet can appear immediately after a speech packet, at the speech-to-silence transition. For the duration of at least two more voice frames after that VAD packet, no VAD packets are delivered on the network, which is similar to hangover operation. Once this VAD hangover time has elapsed, VAD packets can again be sent, and the DTX algorithm decides when to send the next updated VAD packet. A speech packet can be adjacent to a VAD packet on either side; that is, a speech packet can precede or follow the VAD packet.
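The packet sequence described above can be sketched as a small scheduling function. This is an illustration under stated assumptions, not the DTX algorithm of any specific codec: the name `dtx_schedule`, the `min_gap` parameter, and the per-frame `noise_changed` flag (standing in for whatever criterion the DTX algorithm uses to decide on an update) are all hypothetical. The sketch also assumes the stream starts with speech.

```python
def dtx_schedule(frames, min_gap=2):
    """Decide what is transmitted for each frame.

    frames: iterable of (is_speech, noise_changed) tuples, one per frame.
    Returns a list with 'SPEECH', 'SID' (a VAD/CNG packet), or None
    (nothing sent) for each input frame.
    """
    out = []
    prev_speech = False
    since_sid = 0  # silence frames elapsed since the last SID packet
    for is_speech, noise_changed in frames:
        if is_speech:
            out.append("SPEECH")
        elif prev_speech:
            # Speech-to-silence transition: send a VAD packet right away.
            out.append("SID")
            since_sid = 0
        else:
            since_sid += 1
            # Suppress updates for at least min_gap frames, then send one
            # only when the (hypothetical) update criterion fires.
            if since_sid > min_gap and noise_changed:
                out.append("SID")
                since_sid = 0
            else:
                out.append(None)
        prev_speech = is_speech
    return out


frames = [(True, False), (True, False), (False, False), (False, True),
          (False, True), (False, True), (False, False), (True, False)]
schedule = dtx_schedule(frames)
# schedule == ['SPEECH', 'SPEECH', 'SID', None, None, 'SID', None, 'SPEECH']
```

In the example, the two frames after the first VAD packet send nothing even though the noise estimate changed, and the update goes out only on the third silence frame. The final speech frame directly follows a silence frame, showing that speech can resume at any point.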
Voice solutions carry multiple frames per packet, with RTP payload durations of up to 80 ms. At the RTP input, several frames are collected to form a packet of the required duration. In multiframe packetization, to send any VAD packet, RTP has to release the available voice frames without ...