In computer networking, a flit (flow control unit or flow control digit) is a link-level atomic piece that forms a network packet or stream.[1] The first flit, called the header flit holds information about this packet's route (namely the destination address) and sets up the routing behavior for all subsequent flits associated with the packet. The header flit is followed by zero or more body flits, containing the actual payload of data. The final flit, called the tail flit, performs some book keeping to close the connection between the two nodes.
A virtual connection holds the state needed to coordinate the handling of the flits of a packet. At a minimum, this state identifies the output port of the current node for the next hop of the route and the state of the virtual connection (idle, waiting for resources, or active). The virtual connection may also include pointers to the flits of the packet that are buffered on the current node and the number of flit buffers available on the next node.[2]
The growing need for performance from computing systems drove the industry into the multi-core and many-core arena. In this setup, the execution of a kernel (a program) is split across multiple processors and the computation happens in parallel, thus ensuring performance with respect to execution time. This however implies that the processors must now be able to communicate with each other and exchange data and control signals seamlessly. One straightforward approach is the bus based interconnect, a group of wires connecting all the processors. This approach is however not scalable as the number of processors in the system increase. Hence, a scalable high performance interconnect network lies at the core of parallel computer architecture.
The formal definition of an interconnection network
"An interconnection network I is represented by a strongly connected directed multigraph, I = G(N,C). The set of vertices of the multigraph N includes the set of processing element nodes P and the set of router nodes RT. The set of arcs C represents the set of unidirectional channels (possibly virtual) that connect either the processing elements to the routers or the routers to each other".[3]
The primary expectation of an interconnection network is to have as low a latency as possible, that is the time taken to transfer a message from one node to another should be minimal, while allowing a large number of such transactions to take place concurrently.[4] As with any other engineering design trade offs, the interconnection network must accomplish these traits while keeping the cost of implementation as low as possible. Having discussed what is expected of a network, let us look at a few design points that can be tweaked to obtain the necessary performance.
The basic building blocks of an interconnection network are its topology, routing algorithm, switching strategy and the flow control mechanism.
Topology: This refers to the general infrastructure of the interconnection network; the pattern in which multiple processors are connected. This pattern could either be regular or irregular, though many multi-core architectures today use highly regular interconnection networks.
Routing algorithm: This determines which path the message must take in order to ensure delivery to the destination node. The choice of the path is based on multiple metrics such as latency, security and number of nodes involved etc. There are many different routing algorithms, providing different guarantees and offering different performance trade-offs.
Switching strategy: The routing algorithm only determines the path that a message must take to reach its destination node. The actual traversal of the message within the network is the responsibility of the switching strategy. There are basically two types of switching strategies, a circuit switched network is a network where a path is reserved and blocked off from other messages, till the message is delivered to its destination node. A famous example of circuit switched network is the telephone services, which establish a circuit through many switches for a call. The alternative approach is the packet switched network where messages are broken down into smaller compact entities called packets. Each packet contains a part of data in addition to a sequence number. This implies that each packet can now be transferred individually and assembled at the destination based on the sequence number.
Flow control: Note that we have previously established the fact that multiple messages can flow through the interconnect network at any given time. It is the responsibility of the flow control mechanism implemented at the router level to decide which message gets to flow and which message is held back.
Every network has a width w
It is important to note that flits represent logical units of information, while phits represent the physical domain, that is, phits represent the number of bits that can be transferred in parallel in a single cycle. Consider the Cray T3D.[5] It has an interconnection network which uses flit level message flow control wherein each flit is composed of eight 16-bit phits. That means its flit size is 128bits and phit size is 16bits. Also consider the IBM SP2 switch.[6] It also uses the flit level message flow control, but its flit size is equal to its phit size, which is set to 8 bits.
Note that the message size is the dominant deciding factor (among many others) in deciding the flit widths. Based on the message size, there are two conflicting design choices:
Based on the size of the packets, the width of the physical link between two routers has to be decided. Meaning, if the packet size is large, the link width also has to be kept large, however, a larger link width implies more area and higher power dissipation. In general, link widths are kept to a minimum. The link width (which also decides the phit width) now factors into deciding the flit width.[7]
At this point, it is important to note that though inter-router transfers are necessarily constructed in terms of phits, the switching techniques deal in terms of flits. For more details regarding the various switching techniques refer wormhole switching and cut-through switching. Since the majority of switching techniques work on flits, they also have a major impact in deciding the flit width. Other determining factors include reliability, performance and implementation complexity.
Consider an example of how packets are transmitted in terms of flits. In this case we have a packet transmitting between A and B in the figure. The packet transmitting process is happening in the following steps.
A flit (flow control units/digits) is a unit amount of data when the message is transmitting in link-level. The flit can be accepted or rejected at the receiver side based on the flow control protocol and the size of the receive buffer. The mechanism of link-level flow control is allowing the receiver to send a continuous signals stream to control if it should keep sending flits or stop sending flits. When a packet is transmitted over a link, the packet will need to be split into multiple flits before the transmitting begin.