Monday, March 15. 2010
This series of posts is dedicated to the Altium NanoBoard 3000. I had the opportunity to play with it and would like to share my experience with you.
The NanoBoard was released in September 2009 by Altium Ltd. It's a rapid prototyping platform for digital electronic designs consisting of an evaluation board, design software and royalty-free IP for use in the onboard FPGA. In short, all the tools you need to start implementing your ideas. You can stop reading now, buy it from e.g. Newark for $395 and get started. The 3000 is part of Altium's NanoBoard family, a complementary product line besides the well-known EDA tools. Maybe you're already working with Altium Designer.
When I received the delivery and removed the outer packaging I was impressed. Note, I had not even opened the box yet. It reminded me of a consumer lifestyle product; it could have been a mobile phone or a high-end notebook. Apple is well known for this kind of great packaging. Not bad for an evaluation board!
Opening the box reveals the board plus software. The black PCB with gold contacts has an undeniable elegance. Underneath is a separate box containing the accessories: desktop stand, speaker board, IR remote (yes, a remote control) and the power supply.
The Quickstart Guide provides instructions for mounting the desktop stand and connecting the speaker board. After 10 minutes I had the final setup sitting on my desk. The software is shipped on a DVD. Installation on Windows completed successfully after a couple of minutes. No license is required. The eval board is your license/dongle.
Even without having worked with it yet, this is clearly a highlight among the evaluation kits on my shelf. It's obvious that this product was designed by a team of professionals with a precise and common vision in mind. Every detail has been fine-tuned, from the packaging to the PCB. A good example of holistic product design. Thumbs up.
And now you know why it is worth considering the in-the-box experience. More details in the next post...
Sunday, October 4. 2009
With the growing popularity of single board computers (SBCs), especially those based on x86-compatible CPUs, more and more applications are making their way into the embedded systems domain. What we're experiencing is the reach of traditional desktop-like applications into areas that were restricted to fully custom-developed, RTOS-driven hardware only a few months ago.
SBC with enclosure
Vice versa, there's a complementary trend of traditional embedded applications moving from custom hardware and niche software platforms to single board computers powered by mainstream operating systems, e.g. Linux or Windows Embedded. Driven by narrow market windows, cost pressure and short technology cycles, product makers try to keep up with the pace by using pre-designed and verified hardware platforms and/or open source software frameworks that hide the underlying complexity from their development teams.
This convergence from both sides of the embedded spectrum has gained momentum with the introduction of the Intel Atom architecture (and its compatibles) early last year. Its low TDP in combination with well-known interface standards, e.g. USB, Gigabit Ethernet or SATA, has led to fanless and compact board designs, perfectly suited for single board computers used in industrial applications.
So-called computer-on-modules (COMs) remove the burden of designing an entire processor platform from hardware engineers and shift the focus back to their core business: implementing solutions for their customers. A carrier or base board with a standardized connector to the processor module, e.g. ETX/COM Express, provides application-specific functionality through interfaces like USB or PCI Express. The same applies to the software development approach: a commodity Linux distribution replaces board support packages, embedded network stacks and error-prone third-party libraries. It should be obvious that x86 compatibility gives you access to a giant amount of existing software and one of the largest developer communities today.
Computer On Module
Let's take an image processing system (as could be used for quality inspection on an assembly line) as an example. The task is to capture N images every second, perform feature detection on each image, store the result and assert a signal in case of missing features to sort out items not conforming to spec (see following picture).
The traditional embedded approach would be to connect a camera module, e.g. via Camera Link, to your custom-designed PCB equipped with a decent digital signal processor (DSP) and an output interface to connect to the real world, e.g. serial/parallel GPIO. While there is nothing wrong with that, the questions are: Is it worth designing (and maintaining!) fully custom hardware and software? Do we need hard real-time? How fast can you deliver the product? And at what cost? Basing our system on a single board computer instead leads to the following simplified system-level architecture.
We use a CCD-based camera module that supports continuous streaming over USB, like the Lumenera Lm075. Isochronous transfers ensure a bounded transmission latency. Drivers and an SDK are available, so image processing algorithms can be developed and tested right from the beginning on any PC with USB connectivity. Yes, this is a huge benefit! If you've ever had to set up a vendor-specific DSP development environment, you know what I'm talking about... "Hell! Why can't I connect to the evaluation board!"
The good news is that current x86 architectures implement powerful DSP-style operations via Streaming SIMD Extensions (http://en.wikipedia.org/wiki/SSE2) and are capable of dealing with double precision numbers, which most DSPs still aren't. Examples of how to leverage these instructions for algorithm acceleration can be found here (Using SSE for image processing).
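To give you a flavor, here's a minimal sketch (plain C with SSE2 intrinsics) of a thresholding step, a typical building block of feature detection. I'm assuming 16-byte-aligned buffers whose length is a multiple of 16; the XOR with 0x80 works around the fact that SSE2 only offers a signed byte compare:

#include <emmintrin.h>   /* SSE2 intrinsics */
#include <stddef.h>
#include <stdint.h>

/* Binarize an 8-bit grayscale image: pixel > threshold -> 0xFF, else 0x00.
   Processes 16 pixels per iteration. */
void threshold_sse2(const uint8_t *src, uint8_t *dst,
                    size_t len, uint8_t threshold)
{
    const __m128i sign = _mm_set1_epi8((char)0x80);
    /* bias the threshold once; biasing both compare operands turns the
       signed byte compare into an unsigned one */
    const __m128i thr = _mm_xor_si128(_mm_set1_epi8((char)threshold), sign);

    for (size_t i = 0; i < len; i += 16) {
        __m128i px  = _mm_load_si128((const __m128i *)(src + i));
        __m128i cmp = _mm_cmpgt_epi8(_mm_xor_si128(px, sign), thr);
        _mm_store_si128((__m128i *)(dst + i), cmp);
    }
}

The same loop written byte-wise in plain C makes a good baseline for measuring the actual speedup on your target Atom board.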
Once the core algorithm is implemented, the outcome of each processed image needs to be (1) logged and (2) in case of errors signaled to external hardware dealing with the failing item, e.g. sorting it out. The first task can be accomplished by simply logging to a file or, thanks to the Ethernet port, by storing it in a SQL database. External signaling is slightly harder but can be done via SuperIO chipsets or GPIO pins if available. Usually, this involves another level conversion or buffering, e.g. to 24V for industrial automation systems. And we're done...
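For the signaling part, here's a rough sketch assuming your board exposes its pins through the Linux sysfs GPIO interface; pin number 24 is just a placeholder, the real mapping depends on your chipset/SuperIO driver:

#include <stdio.h>

/* Write 0 or 1 to a GPIO pin via /sys/class/gpio. The pin must have been
   exported beforehand (echo 24 > /sys/class/gpio/export) and configured
   as an output. */
static int gpio_write(int gpio, int value)
{
    char path[64];
    snprintf(path, sizeof(path), "/sys/class/gpio/gpio%d/value", gpio);
    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;                 /* pin not exported or no permission */
    fprintf(f, "%d", value ? 1 : 0);
    fclose(f);
    return 0;
}

Calling gpio_write(24, 1) from the inspection loop then asserts the reject line; an external driver stage buffers this logic-level signal up to the 24V expected by the automation equipment.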
Well, what I've left out for the sake of brevity is the timing part. The critical path from image acquisition to external signaling needs to be carefully evaluated and fine-tuned to meet your timing requirements. Hard real-time is only available with RT-patched kernels, but can be accomplished as well.
The computing architecture described in this post is not restricted to image processing applications; in fact, it can be applied to many problems in the embedded systems domain. It integrates perfectly into corporate IT networks thanks to its Ethernet capability and can be administered/monitored remotely. The benefit of such an architecture is that the overall system design effort shifts noticeably to the software side, where your expertise is located: implementing core business logic on top of a proven and standardized platform.
Sunday, March 8. 2009
You can guess it from the title: I attended embedded world again this year. This is my personal review of Europe's biggest exhibition for embedded technologies, held in Nuremberg (Germany) from March 3rd to 5th.
I like the narrow focus on embedded hardware, software and services. The whole event fits into only four pavilions but has plenty to offer. Unlike some big events (e.g. electronica, CeBIT) it's still growing: a 25% increase in exhibitors compared to last year. (I've found no official statistics about the visitor numbers yet.)
As last year, I took the train and arrived in Nuremberg around 11am. Passing the first busy booths gave me a good feeling; it seemed like the recession had been dispelled from these pavilions. Maybe I'm wrong, but the general mood of the visitors didn't show any sign of crisis. Of course, everybody is aware of the exceptional situation and I expect more people to be laid off in the electronics industry, but despite all the bad news the booths were well attended.
I've noticed an interesting trend going on in the computer-on-module (COM) business. Multiple vendors are jumping on the bandwagon of increased system integration. In order to reduce the number of components, save PCB space and power, they have started to offer modules with on-board FPGAs. The FPGA is connected through a 1x PCIe lane to the chipset and provides external (serial) connectivity for I2C, CAN, Ethernet etc. This sounds contradictory to reducing power? Wait. The best news is that you have access to the unused FPGA fabric and can insert your own logic there. Even better, get rid of the CAN core and friends and occupy it all!
In the software pavilion I went to the Trolls (aka Qt Software) and talked about the latest Qt release. Starting with version 4.5, the Qt framework is also offered as an LGPL-licensed package. That means you can link your closed source applications to the library without paying any license fees. That's good news for independent developers and micro-ISVs. I was wondering how Qt Software is making money now. I got a lengthy explanation which can be summarized as: more developers, more mobile applications, higher Nokia phone sales. Nokia pays the bills. Additionally, the stand-alone Qt Extended product is discontinued; the last release will be 4.4.3. All Qt Extended features will later be moved into the Qt framework.
That's it so far. In case things should get worse next year, we can still ask Gov. Schwarzenegger for a keynote... erm, click here.
Friday, February 6. 2009
While writing my last post I remembered how I came across parallel programming for the first time. I was at university, looking into cracking passwords from hashes, e.g. from /etc/shadow on UNIX-like systems, using John the Ripper. BTW, when saying cracking I mean the real brute-force approach, no dictionaries and stuff. The SUN workstations were too slow at that time. So I bought a bunch of 3G base station processor boards on eBay (see picture), armed with 4 DSPs and a PowerPC. Pretty hot stuff at that time (and ridiculously cheap, about 10 EUR each). They had been used by a huge German company (starting with S) in a development project and were sold off at the end.
I had no clue about FPGAs at that time, and DSPs were my natural choice. I was young, I knew how to program in C, I was using an open source password cracking tool and the DSPs simply matched my number crunching requirements. What more could I need? Erm... board manuals, schematics, professional development tools (yep, no gcc port for the TMS320C6x available), debugging cables, probably a VME backplane and more knowledge about these architectures, as I found out. Finally I managed to power the board using an old PC power supply and connected to the console port. Woohoo... it was still working, and the bootloader was looking for an FTP server to pull the OS image from. Those were my first steps into the field of parallel computing...
To bring the story to an end: I never ran any code on the DSPs due to the lack of board manuals, schematics, professional development tools... you name it. However, I learned that (1) certain computing tasks can be broken down into a restricted set of arithmetic operations but need to run at the maximum achievable speed, and (2) for a class of problems it makes sense to split the work among multiple processors (or cores) to finish the computation in less time.
From today's perspective I've discovered another interesting aspect, which is more related to the overall system concept. If you take a closer look at the architecture of the telco blade, you'll find similarities to modern processor architectures. There are specialized processing units (4 DSPs) grouped around a central processing unit (PowerPC) on the PCB. Does this remind you of something? No? Replace PowerPC with PPE, DSPs with SPEs and PCB with die, and you'll get pretty close to the Cell BE architecture (see figure below, taken from "Introduction to the Cell Broadband Engine").
Lessons learned from application-specific processor boards lead to mainstream processor architectures on a single die. I'm wondering if the telecommunications industry is still the number one driver of processor evolution. Or is it the gaming industry with its demand for high-bandwidth graphics hardware that is most influential on modern processor designs?
Sunday, January 18. 2009
While my last post focused on the hardware side of watchdog timers (WDTs), I will now discuss the more high-level/software concepts of WDTs. So, if you haven't read part 1, now is the time! Done? Here we go.
I'm assuming we have a software application that requires a full-fledged operating system (OS) and relies on a bunch of peripheral hardware besides the CPU, e.g. hard disks, network adapters etc. Let's call it a server, since that might be a valid use case. OK. Our application is supposed to run 24/7, somewhere deeply buried at a customer's site, and it's extremely costly to send out tech-support staff for on-site fixes, just to discover that e.g. someone played around with our settings or temporarily disabled the air conditioning for maintenance, causing the system to lock up.
Being smart and having read part 1, we pull out this timer thing, connect it to the reset line of our server and we're set, right? Wrong. The problem with this approach is its simplicity and tempting ease of implementation. I understand that you want to get things done and push the box out of the door. But this approach brings further implications we haven't dealt with yet. Remember, our software is not running on a microcontroller. A hard reset should only be the last resort since it puts a lot more stress on all components than a safe shutdown. And what if there really is a broken piece of hardware, or the air conditioning runs amok and starts heating? It will result in an endless reboot-reset cycle causing even more harm.
Let's tackle the "endless reboot-reset cycle" problem first, since the fix can be applied without changes to the hard reset implementation. The idea is quite simple: we extend the WDT by counting the number of timeouts. If a timeout occurs, our application is obviously not running the way it was intended to. So we maintain a counter or some flags (ideally this is done in hardware) and additionally log the time in a non-volatile way. Observing more than X timeouts in a timeframe of less than Y seconds (insert appropriate values for your application) will power down the system. Now, human intervention is required for a restart.
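As a concrete sketch, the guard could look like the C snippet below. MAX_TIMEOUTS and WINDOW_SECONDS stand in for X and Y; in a real system the timestamps would live in non-volatile storage (EEPROM, battery-backed RAM) so they survive the reset:

#include <stdbool.h>
#include <time.h>

#define MAX_TIMEOUTS   3      /* X: tolerated timeouts...       */
#define WINDOW_SECONDS 600    /* ...Y: within this many seconds */

static time_t timeouts[MAX_TIMEOUTS];   /* sliding window of timestamps */
static int    count = 0;

/* Called on every WDT timeout. Returns true if the system should power
   down and wait for human intervention instead of resetting again. */
bool too_many_timeouts(time_t now)
{
    if (count == MAX_TIMEOUTS) {
        /* window full: drop the oldest timestamp */
        for (int i = 1; i < MAX_TIMEOUTS; i++)
            timeouts[i - 1] = timeouts[i];
        count--;
    }
    timeouts[count++] = now;

    /* MAX_TIMEOUTS timeouts within less than WINDOW_SECONDS? */
    return count == MAX_TIMEOUTS && (now - timeouts[0]) < WINDOW_SECONDS;
}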
I've tried to illustrate this in a loose UML-style statechart (see figure), with the application process on the left-hand side and the watchdog process on the right-hand side. The dashed line denotes concurrency. State names are written in bold and underlined. Lines with arrows denote state transitions; transitions can be conditional (with a label) or unconditional (without a label). Including the WDT reload/ping of the application process was a bit tricky. Note the arrow overlapping the dashed line; I've found no better notation. If somebody can shed light on that issue, feel free to comment.
In my next post I will cover the "safe shutdown" issue and why it is important to split this functionality from your core software.