Thoughts on a modular Mac Pro

There have been some rumours recently that the next iteration of the Mac Pro is going to be modular, but we have had very little information about how this modularity might be expressed.

In some ways the current Mac Pro is already quite modular, at least compared to every other Mac or MacBook. You have easy access to lots of RAM slots, and you have multiple standards-compliant disk bays, PCI slots and CPU sockets. This gives the machine an extremely atypical level of upgradeability and expandability for a Mac, although it is perfectly normal for a PC.

Even with that modularity in mind, the machine itself is fairly monolithic - if you need more than 4 disk drives, or more PCI cards than it can take, you have limited or no expansion options. You could burn a PCI slot on a hardware disk controller and attach some disks to it externally, but you quickly descend into an exploding mess of power supplies, cables and cooling fans.

If Apple does go modular, the easiest and most obvious answer is that they slim down the main Pro itself and decree that all expansion shall take place over Thunderbolt (currently 10Gb/s bidirectional, but moving to 20Gb/s bidirectional later this year when the Thunderbolt 2 Falcon Ridge controllers launch). This is a reasonable option, but even though Thunderbolt is essentially an external PCI-Express bus, its available bandwidth is considerably lower than the peak levels found on an internal PCI-E bus (currently around 125Gb/s).

A much better option, it seems to me, would be to go radically modular and expand the Mac itself, but how could that be possible? How can you just snap on some more PCI slots if you want those, or some more disks if that's what you need?
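To put rough numbers on that gap, here is a minimal Python sketch using the Thunderbolt figures quoted above. It assumes the "around 125Gb/s" figure corresponds to a 16-lane PCI-Express 3.0 slot (8 GT/s per lane with 128b/130b encoding, counted in one direction); that mapping is an assumption rather than something stated above, and the function and constant names are purely illustrative.

```python
# Thunderbolt figures as quoted in the post, in gigabits per second.
THUNDERBOLT_1_GBPS = 10.0
THUNDERBOLT_2_GBPS = 20.0


def pcie3_gbps(lanes: int) -> float:
    """Usable PCIe 3.0 bandwidth in one direction, in Gb/s.

    Each lane signals at 8 GT/s and 128b/130b encoding leaves
    128 payload bits out of every 130 bits on the wire.
    """
    return 8.0 * (128 / 130) * lanes


# Assumption: the internal slot being compared is a 16-lane PCIe 3.0 slot.
x16 = pcie3_gbps(16)

print(f"PCIe 3.0 x16 slot: {x16:.1f} Gb/s")
print(f"vs Thunderbolt:    {x16 / THUNDERBOLT_1_GBPS:.1f}x")
print(f"vs Thunderbolt 2:  {x16 / THUNDERBOLT_2_GBPS:.1f}x")
```

Under those assumptions, even Thunderbolt 2 carries only a fraction of what a single full-width internal slot can, which is why Thunderbolt-only expansion looks like a compromise for heavy IO.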
I will say at this point that I have absolutely no concrete information and I am not an electronic engineer, so what you read below is poorly informed speculation and should be treated as such :)

I think the answer is Intel's QuickPath Interconnect (QPI), a high bandwidth (over 200Gb/s), low latency point-to-point communication bus for connecting the main components of an Intel-based computer. If you have any Intel CPU made since around 2009, you probably have a QPI bus in your computer.

Looking at the latest iteration of their CPUs, QPI is always present. On the uniprocessor CPUs it is used inside the chip package to connect the CPU cores to the elements of the northbridge that have migrated into the package (such as the PCI-Express controller), but on these chips the QPI bus is not presented externally. On the multiprocessor-capable chips it is presented externally, and is the normal way to interconnect the CPUs themselves, but it can also be used for other point-to-point links, such as additional northbridges providing PCI-Express buses.

So you could buy a central module from Apple that contains 1, 2 or 4 CPUs (assuming Ivy Bridge Xeons) and all of the associated RAM slots, with maybe two minimal disk bays for the core OS to boot from, and a few USB 3.0 and Thunderbolt ports. For the very lightest of users, this would likely be a complete computer - you have some disk, some RAM and CPUs, and assuming the Xeons carry integrated GPUs, the Thunderbolt ports can output video. It would not be much of a workstation, but it would essentially be a beefed-up Mac Mini.

I would then envision two kinds of modules that would stack on to the central module. The simplest kind would be something like a module with a disk controller chip and a load of disk bays. Not needing the raw power of QPI, this would simply connect to the existing PCI-Express bus of the main module.
There would clearly be a limit to how many of these modules you could connect, since there are a limited number of PCI-E lanes provided by any one controller (typically around 40 lanes on current chipsets), but with the second type of module you could take the expansion up a considerable number of notches.

That second kind would have a large and dense connector carrying a QPI link. These modules could then attach whatever they wanted to the system - more CPUs (up to whatever maximum is supported by that generation of Xeon, likely 8 in Ivy Bridge), or very powerful IO modules.

My current working example of this is a module that is tasked with capturing multiple 4K video streams to disk simultaneously. This module would provide its own PCI-Express controller (linked back to the main module over QPI), internally connected to a number of video capture chips/cards and to one or more disk controller chips/cards, which would in turn connect to a number of disk bays. It sounds a lot like what would happen inside a normal PC, just without the CPU/RAM, and that's because it's exactly that.

This would allow all of the video capture to happen within the module. It would be controlled as normal from the software running in the main module, which would issue the same instructions as if the capture hardware were on the main system PCI-E bus, causing the capture cards to use DMA to write their raw video directly to the disk controller. The difference would be that there would be no other hardware on the module's PCI-E bus, so you would be able to make reasonable promises around latency and bandwidth, knowing that no user is going to have a crazy extra set of cards in PCI slots competing for bandwidth. That holds even if you have two of these modules capturing a really silly amount of video simultaneously. It's a model for being able to do vast amounts of IO in parallel in a single computer.
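As a back-of-envelope check on the capture example, the sketch below estimates the raw data rate of one uncompressed 4K stream and how many such streams a link of a given speed could carry. Every parameter here is an assumption for illustration (3840x2160 frames, 24 bits per pixel, 30 frames per second), the helper names are hypothetical, and it ignores protocol overhead and the second trip each frame makes from the capture card to the disk controller.

```python
def stream_gbps(width: int = 3840, height: int = 2160,
                bits_per_pixel: int = 24, fps: int = 30) -> float:
    """Raw data rate of one uncompressed video stream, in Gb/s.

    The defaults are assumed values, not figures from any real capture card.
    """
    return width * height * bits_per_pixel * fps / 1e9


def max_streams(link_gbps: float, per_stream_gbps: float) -> int:
    """Whole streams that fit on a link, ignoring all overheads."""
    return int(link_gbps // per_stream_gbps)


per_stream = stream_gbps()
print(f"One stream: {per_stream:.2f} Gb/s")

# Thunderbolt figures are the ones quoted in the post; the last entry
# uses the post's approximate figure for an internal PCI-E bus.
for name, gbps in [("Thunderbolt", 10.0),
                   ("Thunderbolt 2", 20.0),
                   ("Internal PCI-E", 125.0)]:
    print(f"{name}: up to {max_streams(gbps, per_stream)} streams")
```

The exact counts matter much less than the shape of the result: under these assumptions a handful of streams is enough to saturate an external link, while a dedicated internal bus inside the module has plenty of headroom.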
There would almost certainly need to be a fairly low limit on the number of QPI modules that could attach to the system, but being able to snap on even two or three modules would elevate the maximum capabilities of the Pro to levels far beyond almost any other desktop workstation.

As a prospective owner of the new Mac Pro, my two reasonable fears about all this are:

  • They go for the Thunderbolt-only route and my desk looks like an awful, noisy mess
  • They go for the radical modularity and I can't afford even the core module

(While I'm throwing around random predictions, I might as well shoot for a name for the radical modularity model. I would stick with the Lightning/Thunderbolt IO names and call it Super Cell.)

Edit: I'd like to credit Thomas Hurst for helping to shape some of my thinking about QPI.