With the growing popularity of single board computers (SBCs), especially those based on x86-compatible CPUs, more and more applications are making their way into the embedded systems domain. What we're experiencing is the reach of traditional desktop-like applications into areas that, until a few months ago, were restricted to fully custom-developed, RTOS-driven hardware.
SBC with enclosure
Conversely, there's a complementary trend of traditional embedded applications moving from custom hardware and niche software platforms to single board computers powered by mainstream operating systems, e.g. Linux or Windows Embedded. Driven by narrow market windows, cost pressure and short technology cycles, product makers try to keep pace by using pre-designed, verified hardware platforms and/or open source software frameworks that hide the underlying complexity from their development teams.
This convergence from both sides of the embedded spectrum has gained momentum with the introduction of the Intel Atom architecture (and its compatibles) early last year. Its low TDP, in combination with well-known interface standards such as USB, Gigabit Ethernet or SATA, has led to fanless and compact board designs, perfectly suited for single board computers used in industrial applications.
So-called computer-on-modules remove the burden of designing an entire processor platform from hardware engineers and shift the focus back to their core business: implementing solutions for their customers. A carrier or base board, attached to the processor module through a standardized connector such as ETX/COM Express, provides application-specific functionality through interfaces like USB or PCI Express. The same applies to the software development approach: a commodity Linux distribution replaces board support packages, embedded network stacks and error-prone third-party libraries. It should be obvious that x86 compatibility gives you access to a huge amount of existing software and one of the largest developer communities today.
Computer On Module
Let's take an image processing system as an example, one that could be used for quality inspection on an assembly line. The task is to capture N images every second, run feature detection on each image, store the result, and assert a signal whenever features are missing so that items not conforming to spec can be sorted out (see following picture).
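To make the task a bit more concrete, here is a minimal sketch in C of the per-frame decision. Everything in it is an assumption for illustration: the `inspection_result` record, the function names, and especially `count_features`, which is only a stand-in (it counts dark-to-bright transitions) for whatever feature detection your product actually needs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Outcome of inspecting one item (hypothetical record layout). */
typedef struct {
    uint32_t frame_id;
    uint32_t features_found;
    bool     reject;          /* true: assert the sort-out signal */
} inspection_result;

/* Time budget per frame when capturing N frames per second. */
static uint64_t frame_period_ns(uint32_t frames_per_second)
{
    return frames_per_second ? 1000000000ull / frames_per_second : 0;
}

/* Stand-in detector: counts dark-to-bright transitions along the buffer.
 * A real system plugs in its own feature detection here. */
static uint32_t count_features(const uint8_t *px, size_t len, uint8_t threshold)
{
    uint32_t count = 0;
    for (size_t i = 1; i < len; ++i)
        if (px[i - 1] < threshold && px[i] >= threshold)
            ++count;
    return count;
}

/* Inspect one frame: reject the item if fewer features than expected show up. */
static inspection_result inspect_frame(uint32_t frame_id, const uint8_t *px,
                                       size_t len, uint8_t threshold,
                                       uint32_t expected_features)
{
    inspection_result r;
    r.frame_id = frame_id;
    r.features_found = count_features(px, len, threshold);
    r.reject = r.features_found < expected_features;
    return r;
}
```

The surrounding loop would call `inspect_frame` once per captured image, store the returned record, and drive the external signal from `reject`, all within the budget given by `frame_period_ns`.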

The traditional embedded approach would be to connect a camera module, e.g. via Camera Link, to your custom-designed PCB equipped with a decent digital signal processor (DSP) and an output interface to the real world, e.g. serial or parallel GPIO. While there is nothing wrong with that, the question is: is it worth designing (and maintaining!) fully custom hardware and software? Do we need hard real-time? How fast can you deliver the product? And at what cost? Basing our system on a single board computer instead leads to the following simplified system-level architecture.

We use a CCD-based camera module that supports continuous streaming over USB, like the Lumenera Lm075. Isochronous transfers ensure a bounded transmission latency. Drivers and an SDK are available, so image processing algorithms can be developed and tested right from the beginning on any PC with USB connectivity. Yes, this is a huge benefit! If you've ever had to set up a vendor-specific DSP development environment, you know what I'm talking about... "Hell! Why can't I connect to the evaluation board!"
The good news is that current x86 architectures implement powerful DSP-style operations such as the Streaming SIMD Extensions (http://en.wikipedia.org/wiki/SSE2) and can handle double-precision numbers, which most DSPs still cannot. Examples of how to leverage these instructions for algorithm acceleration can be found here (Using SSE for image processing).
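As a small taste of what SSE2 buys you, here is a sketch that binarizes an 8-bit grayscale buffer 16 pixels at a time. It is not taken from the linked article; the function name `threshold_u8` and the choice of thresholding as the operation are my own assumptions. The intrinsics come from `<emmintrin.h>`, and the code falls back to a plain loop when the compiler does not define `__SSE2__`.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics */
#endif

/* Binarize an 8-bit grayscale buffer:
 * dst[i] = 255 if src[i] >= threshold, else 0. */
static void threshold_u8(const uint8_t *src, uint8_t *dst, size_t len,
                         uint8_t threshold)
{
    size_t i = 0;
#if defined(__SSE2__)
    const __m128i t = _mm_set1_epi8((char)threshold);
    for (; i + 16 <= len; i += 16) {
        /* Unaligned load, so the buffers need no special alignment. */
        __m128i v = _mm_loadu_si128((const __m128i *)(src + i));
        /* SSE2 has no unsigned byte compare, but max(v, t) == v
         * holds exactly where v >= t for unsigned bytes. */
        __m128i mask = _mm_cmpeq_epi8(_mm_max_epu8(v, t), v);
        _mm_storeu_si128((__m128i *)(dst + i), mask);
    }
#endif
    /* Scalar tail (or the whole buffer when SSE2 is unavailable). */
    for (; i < len; ++i)
        dst[i] = (src[i] >= threshold) ? 255 : 0;
}
```

The vector loop handles 16 pixels per iteration; whether that translates into a real speedup for your algorithm depends on memory bandwidth and on what the compiler already vectorizes for you, so measure before committing to hand-written intrinsics.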
Once the core algorithm is implemented, the outcome of each processed image needs to be (1) logged and (2), in case of errors, signaled to external hardware that deals with the failing item, e.g. by sorting it out. The first task can be accomplished by simple logging to a file or, thanks to the Ethernet port, by storage in an SQL database. External signaling is slightly harder but can be done via Super I/O chipsets or GPIO pins, if available. Usually, this involves additional level conversion or buffering, e.g. to 24 V for industrial automation systems. And we're done...
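Both steps can be sketched in a few lines of C. The CSV layout, the function names and the file paths are hypothetical. The signaling part assumes the output pin is exposed as a file you can write "0" or "1" to, as with the legacy Linux sysfs GPIO interface (`/sys/class/gpio/gpioN/value`); a Super I/O chip typically needs chip-specific register access instead, which is not shown here.

```c
#include <stdbool.h>
#include <stdio.h>

/* Append one result line ("frame_id,features,PASS|FAIL") to a log file.
 * Returns 0 on success, -1 on error. */
static int log_result(const char *log_path, unsigned frame_id,
                      unsigned features, bool pass)
{
    FILE *f = fopen(log_path, "a");
    if (!f)
        return -1;
    int ok = fprintf(f, "%u,%u,%s\n", frame_id, features,
                     pass ? "PASS" : "FAIL") > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Drive a digital output by writing "1" or "0" to its value file.
 * value_path is an assumption, e.g. a sysfs GPIO value file.
 * Returns 0 on success, -1 on error. */
static int set_reject_signal(const char *value_path, bool active)
{
    FILE *f = fopen(value_path, "w");
    if (!f)
        return -1;
    int ok = fputs(active ? "1" : "0", f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Opening and closing the files on every call keeps the sketch simple, but in the real loop you would keep them open, because that overhead sits right on the timing-critical path.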
Well, what I've left out for the sake of brevity is the timing part. The critical path from image acquisition to external signaling needs to be carefully evaluated and fine-tuned to meet your timing requirements. Hard real-time behavior is not available with a stock kernel; it requires an RT-patched kernel, but it can be achieved.
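A reasonable first step in evaluating that critical path is simply to measure it. The sketch below uses the POSIX call `clock_gettime` with `CLOCK_MONOTONIC` to timestamp each iteration and keeps track of the worst case and the number of missed deadlines. The `latency_stats` structure and the function names are made up for this example, and measuring tells you nothing about guarantees; it only shows how far you are from your budget.

```c
#define _POSIX_C_SOURCE 200809L  /* for clock_gettime */
#include <stdint.h>
#include <time.h>

/* Current monotonic time in nanoseconds (unaffected by wall-clock changes). */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Running latency statistics for the acquisition-to-signal path. */
typedef struct {
    uint64_t worst_ns;   /* longest iteration seen so far */
    uint64_t samples;    /* number of iterations recorded */
    uint64_t misses;     /* iterations that exceeded the deadline */
} latency_stats;

/* Record one iteration that ran from start_ns to end_ns. */
static void latency_record(latency_stats *s, uint64_t start_ns,
                           uint64_t end_ns, uint64_t deadline_ns)
{
    uint64_t elapsed = end_ns - start_ns;
    if (elapsed > s->worst_ns)
        s->worst_ns = elapsed;
    if (elapsed > deadline_ns)
        ++s->misses;
    ++s->samples;
}

/* Usage inside the processing loop (40 ms budget as an example):
 *   uint64_t t0 = now_ns();
 *   ... capture, detect, signal ...
 *   latency_record(&stats, t0, now_ns(), 40000000ull);
 */
```

If `misses` stays at zero under realistic load with a comfortable margin, a stock kernel may be good enough; if not, that is the point where an RT-patched kernel and real-time scheduling priorities come into play.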
The computing architecture described in this post is not restricted to image processing applications; in fact, it can be applied to many problems in the embedded systems domain. It integrates perfectly into corporate IT networks thanks to its Ethernet capability and can be administered and monitored remotely. The benefit of using an architecture similar to the given example is that the overall system design effort shifts noticeably to the software side, where your expertise lies: implementing core business logic on a proven and standardized platform.