Network address translators (NATs) are used to overcome the shortage of IPv4 addresses by hiding an enterprise's or even an operator's network behind one or a few IP addresses. The devices behind the NAT use private IP addresses that are not routable in the public Internet. The Session Initiation Protocol (SIP) has established itself as the de facto standard for voice over IP (VoIP) communication.[1] In order to establish a call, a caller sends a SIP message that contains its own IP address. The callee is supposed to reply with a SIP message destined to the IP address included in the received SIP message. This does not work if the caller is behind a NAT and is using a private IP address.
Probably the single biggest mistake in SIP design was ignoring the existence of NATs. This error stemmed from a belief among the IETF leadership that the IP address space would be exhausted more rapidly, which would necessitate a global upgrade to IPv6 and eliminate the need for NATs. The SIP standard assumed that NATs do not exist, an assumption that turned out to be wrong. SIP simply did not work for the majority of Internet users, who are behind NATs. At the same time it became apparent that the standardization life cycle is slower than the pace of the market: Session Border Controllers (SBCs)[2] were born and began to fix what the standards had failed to address, namely NAT traversal.
If a user agent is located behind a NAT, it will use a private IP address as its contact address in the Contact and Via headers as well as in the SDP part. This information is useless for anyone trying to contact this user agent from the public Internet. There are different NAT traversal solutions, such as STUN, TURN and ICE.[3] Which solution to use depends on the behavior of the NAT and the call scenario. When an SBC is used to solve the NAT traversal issues, the most common approach is for the SBC to act as the public interface of the user agents.[4] This is achieved by replacing the user agent's contact information with that of the SBC.
In order for a user agent to be reachable through the public interfaces of an SBC, the SBC manipulates the registration information of the user agent. The user agent includes its private IP address as its contact information in the REGISTER requests. Calls to this address would fail, since it is not publicly routable. The SBC therefore replaces the information in the Contact header with its own IP address. This is the information that is then registered at the registrar. Calls destined to the user are then directed to the SBC.
In order for the SBC to know which user agent is actually being contacted, the SBC can keep a local copy of the user agent's registration. The local copy includes the private IP address and the user's SIP URI, as well as the public IP address that the NAT assigned to the SIP message and that appears in its IP header.
Alternatively, the SBC can store this information in the forwarded SIP messages. This is displayed in the figure here. The user's contact information is combined in a special format and added as an additional parameter to the Contact header. The contact information includes the user's private IP address and SIP URI as well as the public IP address in the IP header of the SIP message. When the registrar receives a request for the user, it returns the complete contact information to the proxy, which includes this information in the SIP message. The SBC can then retrieve this information from the SIP request and use it to route the request to the user.
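The idea of carrying the user agent's addresses inside the registered contact can be sketched as follows. This is only an illustration under stated assumptions: the parameter name `x-ua`, the `|` separator, the host name `sbc.example.com` and both function names are hypothetical and are not taken from any SBC product or SIP specification.

```python
from urllib.parse import quote, unquote

SBC_HOST = "sbc.example.com"  # hypothetical public host name of the SBC


def encode_contact(user, private_addr, nat_addr):
    """Build the Contact URI that the SBC registers on behalf of a user agent.

    private_addr is the address the user agent wrote into its own Contact
    header; nat_addr is the public source address seen in the IP header.
    Both are packed into one percent-encoded URI parameter (assumed format).
    """
    blob = quote(f"{private_addr}|{nat_addr}", safe="")
    return f"sip:{user}@{SBC_HOST};x-ua={blob}"


def decode_contact(uri):
    """Recover (user, private_addr, nat_addr) from a URI built by encode_contact."""
    addr_part, _, params = uri.partition(";")
    user = addr_part[len("sip:"):].split("@", 1)[0]
    for param in params.split(";"):
        name, _, value = param.partition("=")
        if name == "x-ua":
            private_addr, nat_addr = unquote(value).split("|", 1)
            return user, private_addr, nat_addr
    raise ValueError("no x-ua parameter in URI")
```

Because the registrar stores and returns the URI unchanged, any SBC that later receives a request for this contact can call `decode_contact` and forward the request to the NAT's public address without consulting a local database.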
Adding the user agent's contact information to the registered contact information has several advantages. As the SBC does not have to keep local registration information, this solution is simple to implement and does not require memory for storing the information. Further, requests destined to the user agent do not necessarily have to traverse the SBC that processed the user agent's registration messages. Any SBC that can reach the user agent can correctly route messages destined to it based on the information included in the SIP request. This advantage applies, however, only in some cases. If the NAT in front of the user agent accepts traffic only from IP addresses that the user agent has previously contacted, then only the SBC that processed the user agent's REGISTER requests will be able to contact the user agent.
The other option is to keep a local copy of the registration information, which can, however, increase the processing requirements on the SBC. The SBC has to manage a local registration database. Besides the memory requirements, the SBC has to replicate this information to a backup system if it is to be highly available. This further increases the processing requirements on the SBC and the bandwidth consumption.
However, keeping a local copy of the registration information has its advantages as well. When it receives a message from a user agent, a network address translator binds the private IP address of the user agent to a public IP address. This binding remains active for a limited period of time, the binding period. If the user agent does not send or receive any messages for longer than the binding period, the NAT deletes the binding and the user agent is no longer reachable from the outside. To keep the binding active, the user agent has to refresh it regularly. This is achieved by sending REGISTER requests at intervals shorter than the binding period. As REGISTER messages usually have to be authenticated, dealing with REGISTER messages sent at a high frequency would impose a considerable performance hit on the operator's infrastructure. SBCs can help to offload this work. When a user agent sends its first REGISTER request, the SBC forwards the request to the operator's registration servers. Once the registration has been successfully authenticated and accepted by the operator, the SBC keeps a local copy of the registration information. Instead of forwarding each incoming REGISTER request to the operator's registration servers, the SBC sends REGISTER requests to the registration servers only at rather large time intervals (in the range of hours). Registration requests arriving from the user agent that do not change the registration information are answered by the SBC itself. The SBC also informs the registration server once the local registration expires or changes.
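The offloading decision described above can be sketched as a small cache. The class name, method names and the one-hour default are illustrative assumptions, not part of any standard or product; authentication, expiry handling and notifying the registrar are deliberately left out.

```python
class RegistrationCache:
    """Decide whether a REGISTER must go upstream or can be answered locally."""

    def __init__(self, upstream_interval=3600.0):
        # Assumed interval (seconds) between refreshes sent to the registrar.
        self.upstream_interval = upstream_interval
        self._entries = {}  # address-of-record -> (contact, last_forwarded)

    def should_forward(self, aor, contact, now):
        """Return True if this REGISTER has to reach the registration servers."""
        entry = self._entries.get(aor)
        if entry is None:
            return True  # first registration: always forward
        cached_contact, last_forwarded = entry
        if cached_contact != contact:
            return True  # registration information changed
        # Unchanged refresh: forward only when the upstream interval elapsed.
        return now - last_forwarded >= self.upstream_interval

    def record_accepted(self, aor, contact, now):
        """Store a registration after the registrar accepted it."""
        self._entries[aor] = (contact, now)
```

With this logic, a user agent that refreshes every 30 seconds to keep its NAT binding alive causes one upstream REGISTER per `upstream_interval`; every other refresh is answered by the SBC itself.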
Similar to the registration case, the SBC also includes itself in the path of INVITE and other request messages. When receiving an INVITE from a user agent behind a NAT, the SBC adds a Via header with its own address, replaces the information in the Contact header with its own address, and also replaces the address information in the SDP body with its own address. As a result, all SIP messages and media packets traverse the SBC.
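The SDP part of this rewriting can be illustrated with a minimal sketch. It assumes a simple IPv4 audio-only session description and touches only the `c=` and `m=audio` lines; the function name and the relay address used in the example are hypothetical, and a real SBC has to handle further cases such as media-level `c=` lines, several media streams and IPv6.

```python
def rewrite_sdp(sdp, relay_ip, relay_port):
    """Point the connection address and audio port of an SDP body at the SBC."""
    out = []
    for line in sdp.splitlines():
        if line.startswith("c=IN IP4 "):
            # Replace the private connection address with the relay address.
            line = f"c=IN IP4 {relay_ip}"
        elif line.startswith("m=audio "):
            # m=<media> <port> <proto> <fmt ...>: swap only the port field.
            fields = line.split(" ")
            fields[1] = str(relay_port)
            line = " ".join(fields)
        out.append(line)
    return "\r\n".join(out) + "\r\n"


offer = (
    "v=0\r\n"
    "o=alice 1 1 IN IP4 10.0.0.5\r\n"
    "s=-\r\n"
    "c=IN IP4 10.0.0.5\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)
rewritten = rewrite_sdp(offer, "198.51.100.10", 20000)
```

After the call, `rewritten` advertises the relay address and port, so the remote party sends its media to the SBC rather than to the unroutable private address.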
After the establishment of a call using SIP, media packets (voice, video or data) are exchanged, usually using the Real-time Transport Protocol (RTP). While NAT traversal of SIP messages may appear complicated, the more complex task is enabling media to traverse NATs. The initial problem statement is the same: if SIP devices behind NATs advertise their private IP addresses, their peers on the other side of the NATs cannot route traffic to them. The solution SBCs came up with simply ignores the way SIP works. Instead of sending media to the IP address and port number advertised in the SDP bodies, SBCs send media for a user agent symmetrically back to where the agent has sent its own media from. This symmetric communication typically works because it is the traffic pattern NAT manufacturers were used to before the arrival of VoIP.
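The symmetric behavior can be sketched as a table that learns, for each leg of a call, the source address of the first media packet received. The class and method names are illustrative assumptions; this simplified version learns an address once and never updates it, whereas real implementations may apply additional checks or relearn the address.

```python
class MediaLatch:
    """Learn where to send media for each call leg from observed traffic."""

    def __init__(self):
        self._learned = {}  # leg name -> (ip, port) of first packet seen

    def on_packet(self, leg, source):
        """Record the source address of the first packet received on a leg."""
        self._learned.setdefault(leg, source)

    def destination(self, leg):
        """Return where media for this leg should be sent, or None if unknown.

        The address advertised in SDP is deliberately never consulted.
        """
        return self._learned.get(leg)
```

Until `on_packet` has been called for a leg, `destination` returns `None`: the relay has nowhere to send media for a user agent that has not yet sent any packet itself.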
It is important to know that while this mostly works, it has several limitations. First of all, it only works with clients that are built the "symmetric way", i.e., they use the same port for sending and receiving media. Fortunately, this is nowadays true of the majority of available equipment.
The other noticeable disadvantage is "triangular routing": an SBC must relay all VoIP traffic for a call, to make the paths caller-SBC and SBC-callee symmetric. That is in fact quite an overhead for a VoIP operator. With the most common codec, G.711, a relayed call consumes four 87.2 kbit/s streams: two outbound, two inbound.
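The 87.2 kbit/s figure can be reproduced with a short calculation. The breakdown below is an assumption, since the text does not state how the figure is derived: it uses the commonly cited parameters of 20 ms packetization (160 payload bytes, 50 packets per second) and per-packet overhead of 12 bytes RTP, 8 bytes UDP, 20 bytes IPv4 and 18 bytes Ethernet framing.

```python
def stream_kbps(payload_bytes=160, packets_per_second=50,
                overhead_bytes=12 + 8 + 20 + 18):
    """Bandwidth of one one-way media stream in kbit/s.

    Defaults model G.711 at 20 ms packetization with RTP, UDP, IPv4 and
    Ethernet overhead (assumed values, see the lead-in).
    """
    bits_per_packet = (payload_bytes + overhead_bytes) * 8
    return bits_per_packet * packets_per_second / 1000


def relayed_call_kbps(streams=4):
    """Total load on the relay: two inbound and two outbound streams."""
    return streams * stream_kbps()
```

Under these assumptions one stream comes to 87.2 kbit/s, and a single relayed call loads the SBC with about 348.8 kbit/s in total.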
Other troublesome limitations may occur as well. For example, if a SIP device uses voice activity detection (VAD) and fails to send any voice packets initially, the SBC will not learn its address and will therefore not forward incoming media to it.