• Signal processors target high-performance, portable applications

    With the performance and portability requirements of critical applications in markets such as industrial automation, embedded vision, video surveillance, and medical imaging on the rise, embedded designers are looking for low-power, cost-optimized components. With these applications in mi…

  • Real-time performance: Build or buy?

    Ever-growing demands and challenges could render in-house OS development a thing of the past.

  • Roving Reporter: Verifying code performance on multicore devices

    As designers transition to the latest generation multicore processors, software must be divided into separate partitions to gain the performance benefits of parallel execution.  With Intel® Core™ processors, developers have access to multiple techniques to enable this performance gain including symmetric or asymmetric multiprocessing and virtualization. In the symmetric multiprocessing (SMP) configuration, a single operating system allocates threads or tasks across the available cores while managing common memory and hardware resources. Asymmetric multiprocessing (AMP) allows each core to run independent software so that a single system can easily combine real-time, deterministic tasks with a graphical user interface. With virtualization, a hypervisor isolates and allocates system resources between the operating environments so that real-time, general-purpose, and legacy software can be readily integrated in a multicore system.


    Each of these performance improvement techniques carries some risk, especially for safety-critical applications. For example, parallel, multithreaded applications are vulnerable to “race conditions,” which occur when concurrent routines have access to a shared memory location and at least one of them modifies it. Software failures and bugs caused by race conditions are not deterministic and may be extremely difficult to locate with normal testing procedures. When developing parallel software, programmers must anticipate these simultaneous activities: once a routine begins to access shared memory, it must lock out other activities until the transaction is completed. Although this approach sounds feasible, in practice it may run into problems, such as deadlock, if two routines try to “lock out” other activities simultaneously. To tackle these somewhat subtle coding problems, a number of software vendors have integrated analysis tools into their development packages to ease the transition from serial applications to parallelism.
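
    As a rough illustration of the locking discipline described above, the following Python sketch protects a shared value with a lock and avoids the “two routines locking each other out” problem by always acquiring locks in a fixed order. The `Account` class, the function names, and the amounts are hypothetical and chosen only for this example; they do not come from any of the tools discussed in this article.

```python
import threading


class Account:
    """Hypothetical shared resource guarded by its own lock."""

    def __init__(self, account_id, balance):
        self.account_id = account_id
        self.balance = balance
        self.lock = threading.Lock()


def deposit(account, amount, times):
    # The read-modify-write of balance is the critical section;
    # holding the lock keeps other threads out until it completes.
    for _ in range(times):
        with account.lock:
            account.balance += amount


def transfer(src, dst, amount):
    # Two locks are needed. Acquiring them in a fixed global order
    # (lowest account_id first) means two opposite transfers can never
    # each hold one lock while waiting for the other.
    first, second = sorted((src, dst), key=lambda a: a.account_id)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount


def run_demo():
    a = Account(1, 1000)
    b = Account(2, 1000)
    threads = [threading.Thread(target=deposit, args=(a, 1, 10000))
               for _ in range(4)]
    threads += [threading.Thread(target=transfer, args=(a, b, 1))
                for _ in range(50)]
    threads += [threading.Thread(target=transfer, args=(b, a, 1))
                for _ in range(50)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return a.balance, b.balance


if __name__ == "__main__":
    print(run_demo())
```

    Without the lock in `deposit`, concurrent updates could be lost intermittently, which is the kind of non-deterministic failure that ordinary testing tends to miss. Without the ordering in `transfer`, opposite transfers could block each other indefinitely.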


    For C, C++, Java, and C# software development projects, Klocwork offers the Insight static code analysis tools to automatically locate critical programming bugs and security vulnerabilities in source code (see Figure 1). Klocwork has developed several approaches to locate and eliminate coding problems associated with multicore architectures. In a whitepaper entitled “Developing Software in a Multicore & Multiprocessor World,” Klocwork CTO Gwyn Fisher discusses techniques to identify unique software problems, specifically concurrency errors and endian incompatibilities. The paper also cites a VDC Research report finding that multicore and multiprocessor software projects are 4.5 times more expensive and have 25% longer schedules than their single-core equivalents. In addition, Klocwork offers an on-demand webinar outlining the complexity of porting software to multiprocessor architectures and demonstrating the use of its Truepath automated, whole-program analysis engine.




    Unlike other source code analyzers that run as separate tools, DoubleCheck from Green Hills Software is an integrated static analyzer built directly into the MULTI Integrated Development Environment allowing compilation and defect analysis in the same pass. DoubleCheck evaluates potential execution paths through the code to determine how the values of program variables could change across these paths. To simplify the debugging of complex projects with multiple threads of execution, multiple cores, or multiple boards, the Green Hills Integrated Target List displays all system components hierarchically, making it easier to see relationships among applications, address spaces, tasks, and threads (See figure 2). Status information is displayed for all components, so you can quickly check the system state. The target list in the debugger allows you to follow application execution from one context to another with a single click. You can watch as different threads interact and sort out complex interdependencies easily.




    After software has been successfully divided into secure code segments using static analysis tools, the 2nd generation Intel® Core™ architecture provides multiple features that augment the performance benefits of parallel execution. For example, specialized Intel® functions such as Extended Page Tables (EPT) and Page Attribute Table (PAT) provide a hardware assist to the partitioning and allocation of physical memory among multiple cores. The processors also feature Intel® Virtualization Technology for flexible virtualization and Intel® QuickPath Technology to maximize multi-core performance. If you are starting a new multicore project and have questions about static analysis tools, please share your concerns with fellow followers of the Intel® Embedded Community. You can also keep up with the latest technical details and product announcements at the Embedded Computing Design archives on Multicore static code analysis.


    To view other community content on interoperability, see “Interoperability – Top Picks.”


    Warren Webb
    OpenSystems Media®, by special arrangement with Intel® Embedded Alliance


    Klocwork and Green Hills Software are Affiliate members of the Intel® Embedded Alliance.

  • Embedded goes virtual

    Virtualization software facilitates the simplified design, easy upgradability, and increased optimization of embedded systems.

  • Techniques, tools, and tips impart design insight

    Editorial Director Warren Webb walks the tightrope between performance and power dissipation, a topic featured in this issue of Embedded Computing Design.

  • Think long term on design decisions, day-to-day or app-specific

    Editorial Director Warren Webb’s take on the importance of memory architecture selection in embedded device design and a roundup of this edition of Embedded Computing Design.

  • Bridge joins interconnect protocols

    As embedded designers start each new project, they face the potential challenge of matching the latest high-performance computing devices with legacy peripherals. Targeting these challenges, Integrated Device Technology (IDT) recently announced a protocol-conversion bridge allowing designe…

  • Roving Reporter: Breaking Networking Performance Barriers with Multi-core Packet Acceleration

    Telecom infrastructure networks are experiencing dramatic increases in data traffic, largely driven by multimedia content from wireless and wired devices. This growth is challenging service providers to meet network performance demands while growing their average revenue per user. From the hardware perspective, the latest Intel Xeon® processors provide the efficient performance needed to deal with new demands while allowing network equipment manufacturers to consolidate application, control, and packet processing on the same platform for a more efficient solution. Because traditional software solutions have not been optimized for network performance, telecom equipment providers must find new techniques to combine operating systems, networking software, and multi-core silicon to address business and performance challenges in today’s highly competitive market.


    Offering a software solution for network workload consolidation, Wind River recently announced a networking acceleration stack for Linux and VxWorks aimed at accelerating IP packet forwarding on carrier-grade telecommunications equipment. The Wind River Network Acceleration Platform manages processing operations over multiple cores to accelerate control and data plane activities and deliver multiple gigabit Ethernet wire-speed performance (See figure 1). By adopting the multi-core asymmetric multiprocessing (AMP) approach the Network Acceleration Platform enhances the standard Linux networking stack to support high-performance network acceleration capabilities exceeding what is possible using symmetric multiprocessing (SMP) mode alone. Offering 10 times the performance of standard Linux configurations, Wind River Network Acceleration Platform helps eliminate bottlenecks not only in moving packets through the silicon itself, but throughout the entire networking platform. The latest release expands hardware support for the Intel Xeon next-generation multicore processors and allows network performance to scale efficiently with the number of cores in a processor. In addition, Wind River is providing Network Acceleration developers with multi-core enabled development, testing, debugging, and simulation tools required to simulate and test systems in complex multi-core environments.




    Extending packet processing performance, Intel recently announced its 32nm-fabricated Xeon® 5600 processors as the next-generation upgrade to the Xeon® 5500 family. In addition to several enterprise-class versions of the Xeon® 5600, Intel introduced four models with seven-year lifecycle support for embedded applications. The 2.4GHz, 80 Watt Xeon® E5645 and the 2.0GHz, 60 Watt Xeon® L5638 each provide six cores and 12 threads. There are two quad-core embedded versions: the 2.4GHz, 80 Watt Xeon® E5620 and the 1.86GHz, 40 Watt Xeon® L5618. The Intel Xeon® processor 5600 series includes Intel® AES New Instructions (Intel® AES-NI), improving performance for disk and database encryption plus secure Internet transactions. The processors also feature Intel® Virtualization Technology for flexible virtualization, Intel® QuickPath Technology to maximize multi-core performance, plus Intel® Hyper-Threading Technology to deliver top performance for bandwidth-intensive applications.


    6WINDGate from 6WIND is another software solution that provides packet processing optimization for networking equipment, wireless infrastructure, security appliances and data centers (See figure 2). The software provides up to ten times the packet processing performance of a standard networking stack and significantly improves the price-performance and power-performance ratios of networking equipment. 6WINDGate is compatible with standard operating system APIs to ensure that clients can migrate either from a single-core to a multi-core platform, or from one multi-core platform to another, without needing to rewrite their existing software.  On a dual-core Intel® Xeon® processor E5645 platform with a clock speed of 3.33GHz, 6WINDGate delivers over 16 million packets per second, per core of IP forwarding performance, thereby forwarding 10Gbps of network traffic in each core. This performance scales linearly with the number of cores configured to run 6WINDGate until the maximum bandwidth of the hardware platform is reached. Processor cores not used to run 6WINDGate are available to run value-added application software or Virtual Machines (VMs), resulting in an efficient and flexible system for advanced networking equipment.
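
    Packet rates and link bandwidth are related by frame size, so a short calculation helps put per-core figures like those above in context. The sketch below is a generic back-of-the-envelope conversion, not 6WIND's measurement method. The article does not state the packet size behind the quoted figures, so the 64-byte minimum Ethernet frame used here is an assumption, as are the function names.

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 7-byte preamble + 1-byte start-of-frame delimiter + 12-byte inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def wire_rate_bps(packets_per_second, frame_bytes):
    """Bits per second consumed on the link by a given packet rate."""
    return packets_per_second * (frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8


def max_packets_per_second(link_bps, frame_bytes):
    """Highest packet rate a link can carry at a given frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


if __name__ == "__main__":
    ten_gbe = 10_000_000_000
    # Assumed frame sizes: 64 bytes (Ethernet minimum) and 1518 bytes
    # (standard maximum without VLAN tag or jumbo frames).
    for size in (64, 1518):
        print(size, round(max_packets_per_second(ten_gbe, size)))
```

    Under these assumptions, a 10Gbps link carries roughly 14.9 million minimum-size frames per second and far fewer large frames, which is why packet-per-second figures are only comparable when the frame size is known.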




    Providing off-the-shelf hardware to support the transition to integrated network workloads, Emerson Network Power recently introduced a new AdvancedTCA server blade featuring a six-core processor from the Intel® Xeon® processor 5600 series to deliver optimized virtualization and power management. The dual-processor ATCA-7365 is the company’s highest-performance 10Gbps ATCA server blade to date and supports up to 96GB DDR3 memory (See figure 3).  Designed to enable network and service providers to lower their capital and operating expenses, the off-the-shelf ATCA-7365 supports a broad array of communications applications that require high network throughput such as telecom. The ATCA-7365 is available with a variety of rear transition module (RTM) variants to support different I/O configurations and can be configured with a variety of software offerings including Red Hat Enterprise Linux 5.4, Wind River Platform for Network Equipment Linux Edition 3.0 and Microsoft Windows Server 2008.




    With new software solutions to complement the latest Xeon processors, telecom equipment system providers have fresh options to consolidate processing workloads and improve network infrastructure performance. You can find more information and technical articles on Intel network acceleration architecture at the Intel® Embedded Community page on Xeon processors. If you are starting a new telecom or packet acceleration project and have questions, please share your concerns with fellow followers of the Intel® Embedded Community. You can also keep up with the latest technical details and product announcements at the Embedded Computing Design archives on Multi-core Packet Acceleration.


    To view other community content on workload consolidation, see “Workload Consolidation – Top Picks.”


    Warren Webb
    OpenSystems Media®, by special arrangement with Intel® Embedded Alliance


    Emerson Network Power is a Premier member of the Intel® Embedded Alliance. Wind River Systems is an Associate member and 6WIND is an Affiliate member of the Alliance.

  • Building universal connectivity

    Warren’s thoughts on the maturation of ubiquitous, pervasive computing, and an introduction to the Annual Resource Guide edition of Embedded Computing Design.

  • Verifying vital designs

    Device certification, software analysis, and reduced power consumption are all ubiquitous issues in the embedded design industry. Here, Warren introduces the remedies contained in this edition of Embedded Computing Design.

