Option trading software for nse12 comments
Royal de bank binary option brokers
This invention first presents SRAM based pipeline IP lookup architectures including an SRAM based systolic array architecture that utilizes multi-pipeline parallelism idea and elaborates on it as the base architecture highlighting its advantages. In this base architecture a multitude of intersecting and different length pipelines are constructed on a two dimensional array of processing elements in a circular fashion. The architecture supports the use of any type of prefix tree instead of conventional binary prefix tree.
The invention secondly proposes a novel use of an alternative and more advantageous prefix tree based on binomial spanning tree to achieve a substantial performance increase. The new approach, enhanced with other extensions including four-side input and three-pointer implementations, considerably increases the parallelism and search capability of the base architecture and provides a much higher throughput than all existing IP lookup approaches making, for example, a 7 Tbps router IP lookup front end speed possible.
Although theoretical worst-case lookup delay in this systolic array structure is high, the average delay is quite low, large delays being observed only rarely. The structure in its new form is scalable in terms of processing elements and is also well suited for the IPv6 addressing scheme. IP lookup solutions can be categorized into two main groups as software and hardware based approaches.
For software based solutions of LPM, the most commonly encountered data structure is the binary tree, in which the prefixes in the routing table are represented by marked tree nodes and the path from root to a node represents a prefix. IP lookup is performed by traversing the binary trie using the bits in the searched IP address. When a leaf node or a null pointer is reached, search operation terminates and the last matched prefix is selected as the longest matched prefix.
If all the prefix nodes are pushed to leaves, then the binary trie is called as leaf-pushed binary trie. In a leaf-pushed trie, a non-leaf node contains only pointers to its children and a leaf node contains only port number corresponding to its associated prefix.
Single SRAM based IP lookup solutions need multiple memory accesses during the trie traversal in order to complete a single search process. During the search for an IP address, a new incoming search request waits for the ongoing lookup to finish up. In order to speed this up, various SRAM based pipeline architectures have been proposed , , , , , , , , , , .
In these architectures, the trie is partitioned and mapped onto pipelines. These pipelines are composed of separate memory banks that are connected in sequence.
The trie traversal is then performed on these separate and multiple memory elements through the pipeline. There are enough memory elements and no stage is accessed more than once during a search in a conventional one dimensional pipeline architecture.
Although throughput is improved using a pipeline, an ordinary mapping of the binary trie onto the pipeline stages yields unbalanced memory utilization.
Various different solutions have been proposed to address the memory balancing problem , , , , . This approach is based on dividing the binary trie into subtrees and allocating each subtree starting point to a different pipeline stage to create a balanced pipeline. The starting stage for each search is then determined by a hash function.
The subtrees are stored in pipeline stages by using a level-based mapping where the trie nodes in same level are stored in the same pipeline stage. The depth of each subtree is selected to be less than or equal to the number of pipeline stages but the pipelines may roll back from the last stage to the first hence being called as ring structure.
This pipeline architecture has two virtual data paths. The first path is active during the odd clock cycles and it is used for finding the starting stage.
In other words, when a search is started, the pipeline is traversed until the relevant subtree root is found. The second data path is active during the even clock cycles and it correspond to the actual search continuing until the last stage of the pipeline. When the search is completed, the result propagates to the final stage for output through the remaining stages. Hence each pipeline stage works with a speed twice the maximum input packet rate. The throughput of the described Baboescu et al.
CAMP has a circular pipeline of memory blocks. The first block performs a direct lookup on the first r-bits of the address to find the stage where the root node of the corresponding subtree is stored. CAMP has multiple packet entry and exit points in order to improve the throughput.
Initially the trie is split into one root sub-trie and multiple leaf subtries. Root subtrie handles first r- bits of the IP address implemented as a table and a maximum of 2 r subtries are independently mapped to the pipeline.
Each pipeline stage has two entry and one exit points. Access conflicts are then solved by using a FIFO queue in each stage for the incoming requests. A request waits for an idle cycle to enter a pipeline stage. CAMP architecture can achieve a throughput of 0. With an alternative mapping based on the height rather than the levels of the search tree, a worst case per stage memory bound has been obtained .
Since the height of the nodes changes when the prefix distribution changes upon route updates, this mapping becomes dynamic. The trie is initially partitioned into subtrees which are then mapped onto the pipelines. In order to perform a balanced mapping, they proposed an approximation algorithm. Also within each pipeline, node to stage mapping is done and nops no operations  are used in some stages to balance the trie node distribution.
This approach is efficient in terms of memory consumption but the search process is complicated. This pipeline is the one that stores the corresponding subtree and the address of the subtree's root in the first stage of the pipeline. POLP architecture also uses prefix caching for short term traffic bias, whereas the long term traffic among pipelines is balanced by an exchange based algorithm. This algorithm remaps the subtrees to the pipelines dynamically, bringing an extra disruption to the search process.
POLP is improved further in later studies. For example a bidirectional linear pipeline is introduced in  and the idea of flow caching from Layer 4 switching is introduced in . Flow caching eliminates long waiting times for cache updates. US 7,, B2 "Processor having systolic array pipeline for processing data packets". The "IP lookup unit" of the network processor presented in the above invention is also designed by using systolic array architecture.
But each stage of the systolic array architecture is composed of a small register file and functional units and exhibits a single pipelining behavior in one dimension only. The new architecture presented in this invention is a systolic array architecture that employs parallel pipelines in two dimensions. A multitude of intersecting and different length pipelines are constructed on the two dimensional array of PEs in a circular fashion and hence more benefit is obtained from parallelism.
US 7,, Bl "Processor having systolic array pipeline for processing data packets". In the existing patent, a single on-chip memory unit is used for storing the whole routing table. In the structure developed within the present invention, each PE in the systolic array architecture maintains a smaller memory unit and therefore by employing more than one memory unit, it is possible to make parallel and independent memory accesses to each unit that increases the parallelism further.
In the existing patent, all units are programmable. In the new structure within the present invention, each PE in the systolic array triggers the next one and hence there is no need for programmability. US 7,, Bl "Redundant packet routing and switching device and method".
The structure proposed in the existing patent operates at line speeds of 40 Gbps. On the other hand, the performance level achieved with the new structure developed within this invention is about 7 Tbps. The growth of the number of hosts on Internet and the increase in line speeds have resulted in powerful forwarding engines that can operate at line data rates and can cope with increasing numbers of routing table entries. In early days of Internet the simple class-based addressing scheme was sufficient but with the introduction of the classless inter domain routing CIDR scheme IP route lookup has become a major task for an Internet router.
CIDR has two major benefits. First, prefixes can be of arbitrary length in CIDR and hence the address space can be used more efficiently.
Second, CIDR allows arbitrary aggregation of networks and therefore the number of routing table entries decreases, which slows down the growth of forwarding tables. In CIDR, addresses may match two or more entries in a routing table because of prefix overlap. In such cases for a correct decision, the router must find the most specific match, which is referred as the longest prefix matching LPM. LPM is harder than finding the exact match because the destination address of an arriving packet does not carry any information about the length of the longest matching prefix.
For LPM, there are both software and hardware based solutions proposed in the literature , . For measuring the success of these IP route lookup solutions various metrics such as number of memory accesses, memory size, update time, scalability and flexibility are used. A key can be searched in an entire list of pre-stored memory entries in one clock cycle using a content addressable memory CAM.
The conventional CAMs store binary values and hence can be searched only for an exact match. Ternary content addressable memory TCAM , in which each cell can take values 0, 1 or don't care, is more powerful because don't cares may act as wildcards during a search and hence LPM can be solved naturally in one cycle .
Although TCAMs are quite popular, they have high cost and high power consumption as major drawbacks , , , , , , , , , . Other handicaps are their slow updating rate and low scalability. Internet requires a forwarding table to be updated frequently to reflect the route changes and a single update, which includes either adding or deleting a prefix, may involve multiple TCAM entry moves in order to maintain the order of the TCAM entries.
Low scalability arises from the need for a priority encoder at the TCAM output stage. But if LPM is solved using an SRAM, the number of memory accesses is determined by the average depth of the binary tree that stores prefixes in the routing table. The tree traversal process on a single SRAM needs multiple memory accesses and hence multiple clock cycles for finding the longest matched prefix.
Therefore, SRAM based pipeline architectures, which improve the throughput, have also become popular , . The first parallel multi-pipeline SRAM based architecture for IP lookup, in which each pipeline can operate concurrently, has appeared in  and enhanced in . A systolic array is a natural structure for multiple and intersecting pipelines. For this purpose, i a special systolic array processing element PE composed of a basic SRAM, FIFO queue and associated peripheral logic circuitry is designed, ii invented PEs are organized in a two dimensional circular structure to get different length and intersecting pipelines, iii a suitable and efficient tree mapping scheme is devised iv the corresponding IP lookup process is presented.
In SAAFFIL, instead of using n parallel pipelines, each of which has m stages with a total of nm resources, a circular structure in the form of rn x mn square grid of resources is used.
Overall, in the invention, the following major contributions are made:. For this purpose in the invention;. For this purpose, the following are considered;.
The effectiveness of the invention is demonstrated through simulations using both synthetic and real life IP traces. The prefix tree of any kind can be mapped onto the invented two dimensional array of PEs. Use of binomial spanning tree instead of binary search tree, which necessitates modifications in the architecture, brings the following advantages in addition to SAAFFIL advantages:.
To benefit from the binomial spanning tree, efficient implementations of it should be employed. In this invention, a multi-pointer binomial spanning tree implementation is used as an extension.