While writing my last post I remembered how I first came across parallel programming. I was at university and at that time looking into cracking passwords from hashes, e.g. from /etc/shadow on UNIX-like systems, using John the Ripper. BTW, when I say cracking I mean the real brute-force approach, no dictionaries and stuff. The SUN workstations were too slow at that time. So I bought a bunch of 3G base station processor boards on eBay (see picture), each armed with 4 DSPs and a PowerPC. Pretty hot stuff at the time (and ridiculously cheap, about 10 EUR each). They had been used by a huge German company (starting with S) in a development project and were sold off at the end.
I had no clue about FPGAs at that time, so DSPs were my natural choice. I was young, I knew how to program in C, I was using an open source password cracking tool, and the DSPs simply matched my number crunching requirements. What more could I need? Erm...board manuals, schematics, professional development tools (yep, no gcc port for the TMS320C6x available), debugging cables, probably a VME backplane and, as I found out, more knowledge about these architectures. Finally I managed to power the board using an old PC power supply and connected to the console port. Woohoo...it was still working, and the bootloader was looking for an FTP server to pull the OS image from. Those were my first steps into the field of parallel computing...
To bring the story to an end: I never ran any code on the DSPs due to the lack of board manuals, schematics, professional development tools...you name it. However, I learned that (1) certain computing tasks can be broken down into a restricted set of arithmetic operations that need to run at the maximum achievable speed, and (2) for a whole class of problems it makes sense to split the work among multiple processors (or cores) to finish computations in less time.
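Point (2) is exactly what brute-force cracking looks like in practice: the keyspace is embarrassingly parallel, so each processor can grind through its own contiguous slice of candidates independently. Here is a minimal sketch in C using POSIX threads; the "hash" is a toy stand-in (a djb2-style function, nothing like real crypt/MD5), and the 4-character lowercase keyspace, thread count, and function names are mine for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define PW_LEN   4
#define ALPHA    26
#define NTHREADS 4   /* one slice per "DSP" */

/* Toy stand-in for a real password hash (do NOT use for anything real). */
static uint32_t toy_hash(const char *s) {
    uint32_t h = 5381;
    for (; *s; s++) h = h * 33 + (uint8_t)*s;
    return h;
}

/* Map an index in [0, 26^4) to a candidate password "aaaa".."zzzz". */
static void index_to_pw(uint64_t idx, char out[PW_LEN + 1]) {
    for (int i = PW_LEN - 1; i >= 0; i--) { out[i] = 'a' + idx % ALPHA; idx /= ALPHA; }
    out[PW_LEN] = '\0';
}

typedef struct { uint64_t lo, hi; uint32_t target; } job_t;

static atomic_int found;
static char result[PW_LEN + 1];

/* Each worker scans its own contiguous slice of the keyspace. */
static void *worker(void *arg) {
    job_t *j = arg;
    char cand[PW_LEN + 1];
    for (uint64_t i = j->lo; i < j->hi && !atomic_load(&found); i++) {
        index_to_pw(i, cand);
        if (toy_hash(cand) == j->target) {
            if (!atomic_exchange(&found, 1)) memcpy(result, cand, sizeof cand);
            return NULL;
        }
    }
    return NULL;
}

/* Split the 26^4 keyspace into NTHREADS slices and search in parallel. */
static const char *crack(uint32_t target) {
    uint64_t total = 1;
    for (int i = 0; i < PW_LEN; i++) total *= ALPHA;
    pthread_t th[NTHREADS];
    job_t jobs[NTHREADS];
    atomic_store(&found, 0);
    for (int t = 0; t < NTHREADS; t++) {
        jobs[t].lo = total * t / NTHREADS;
        jobs[t].hi = total * (t + 1) / NTHREADS;
        jobs[t].target = target;
        pthread_create(&th[t], NULL, worker, &jobs[t]);
    }
    for (int t = 0; t < NTHREADS; t++) pthread_join(th[t], NULL);
    return atomic_load(&found) ? result : NULL;
}
```

The only coordination between workers is a single "found" flag; everything else is independent number crunching, which is why this kind of workload maps so naturally onto DSPs and, later, GPUs.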
From today's perspective I've discovered another interesting aspect, which is more related to the overall system concept. If you take a closer look at the architecture of the telco blade, you'll find similarities to modern processor architectures. There are specialized processing units (4 DSPs) grouped around a central processing unit (a PowerPC) on the PCB. Does this remind you of something? No? Replace PowerPC with PPE, DSPs with SPEs and PCB with die, and you get pretty close to the CellBE architecture (see figure below, taken from "Introduction to the Cell Broadband Engine").
Lessons learned from application-specific processor boards have led to mainstream processor architectures on a single die. I wonder whether the telecommunications industry is still the number one driver of processor evolution, or whether it is the gaming industry, with its demand for high-bandwidth graphics hardware, that is most influential on modern processor designs.
General-purpose GPU computing, e.g. with NVIDIA's CUDA, is entering the mainstream. The other day I was playing with it and can attest that it was a breeze to install and to run simple applications on my laptop.
For things like password cracking it might be a cheaper option than using FPGAs. GPUs are cheaper, more visible and more accessible than FPGAs and, most importantly, way easier to program, e.g. with C-style languages like CUDA or OpenCL. The learning curve of the typical VHDL/Verilog FPGA flow is still too steep.