VIPR: Virtual and Physical Infrastructure Portfolio Redesign

The Virtual and Physical Infrastructure Portfolio Redesign (VIPR) is a highly resilient, cost-effective, large-scale, general-purpose IT infrastructure that serves the whole University. The need for it has been publicly acknowledged since the Vice-Chancellor’s 2013 Oration, and the current service, built incrementally over many years, is no longer fit for purpose:

Inevitably, virtual infrastructure has entirely tangible costs, and these too must be funded. So overseeing the University’s digital investment has necessarily become a significant feature of how we now approach planning and resource allocation. A new IT Committee reports directly to Council, an indication of the priority being accorded to this area and a recognition of the importance of the digital challenge – especially in an institutional culture as decentralised and varied as ours.

The University needs larger, preservation-level storage. We need dual-stack (IPv4/IPv6) networking. We need big data. We need reasonable speed.

It occurred to me that the limiting factor in all the infrastructures we run was the network. All of them have a network that wasn’t really quite fast enough. Wasn’t really scalable enough. Wasn’t really big enough. Didn’t have enough ports. Didn’t have enough VLAN capability. Didn’t have a fast enough connection to the outside world. Didn’t have enough reliability. The list went on.

Why don’t I build one? I spent 3 years hanging around with a bunch of networking supremos in my last job. How hard can it be? Turns out, fairly tricky actually. But undeniably a lot of fun.

Now, when designing an infrastructure that is going to take on and “absorb” other infrastructures, the only safe way to proceed is to list all of the requirements from all of the smaller infrastructures and create a master list (in mathematics they call it a “superset”) of all the requirements. I did this.
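A minimal sketch of that "master list" idea: take the union of each infrastructure's requirement set, and the result is a superset of every input list. The infrastructure names and requirements below are illustrative, not the actual VIPR inputs.

```python
# Illustrative per-infrastructure requirement sets (made-up examples).
requirements = {
    "vm-hosting": {"dual-site failover", "10GE ports", "VLAN trunking"},
    "big-data": {"40GE interconnect", "bulk storage", "dual-site failover"},
    "preservation": {"bulk storage", "off-site replication"},
}

# The master list is the union of all the smaller lists.
master = set().union(*requirements.values())

# Sanity check: the master list is a superset of each input list.
assert all(master >= reqs for reqs in requirements.values())

print(sorted(master))
```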

The big one that jumped out and bit me was the need for dual-site failover. It’s a tricky thing to get right and I’d argue very few people have done it well. I’m not even sure we’ve got it entirely right with VIPR. But I think we are clear on how we will proceed and improve it. And we’ve done the best we can thus far with the technologies available to us.

The other major thing that jumped out at me was the “big data” requirement. Just how much cross-talk bandwidth between datacentres does a big data platform (or worse, two big data platforms, talking to each other) need? The answer to that is “it depends”. It depends on how much storage you have, how many disks, how localised or dispersed your storage nodes are, what storage technology you need to use, whether resilience and mirroring are factors, and so on. It’s a really non-trivial exercise to factor in.

I crunched the numbers and came to the conclusion that we needed 40GE interconnections between whatever our two data centres were. The next obvious question was “how far can I stretch 40GE?”
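The shape of that number-crunching can be sketched with a back-of-envelope calculation. The figures below are illustrative assumptions, not the actual VIPR workings: assume storage writes at one site are mirrored to the other, so the inter-site link must carry the aggregate write rate plus headroom.

```python
# Illustrative inputs (assumed, not the real VIPR figures).
storage_nodes = 20          # storage nodes whose writes cross the link
write_mb_s_per_node = 150   # sustained write rate per node, MB/s
headroom = 1.5              # factor for bursts, rebuilds, and growth

# Convert aggregate MB/s to Gbit/s: multiply by 8 bits/byte, divide by 1000.
aggregate_gbit_s = storage_nodes * write_mb_s_per_node * 8 / 1000
required_gbit_s = aggregate_gbit_s * headroom

print(f"aggregate mirrored writes: {aggregate_gbit_s:.0f} Gbit/s")
print(f"with headroom:             {required_gbit_s:.0f} Gbit/s")
```

With these assumed inputs the requirement lands at 36 Gbit/s, i.e. squarely in 40GE territory; different assumptions move the answer, which is exactly the “it depends” above.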

Once again: “it depends”. There are two modes of operation for fibre, long haul and short haul. Long-haul 40GE is really, eye-wateringly expensive. Short-haul 40GE is affordable, but its reach tops out at around 2 km. Any more than that and you need long haul.

So, then I started looking for data centres that fitted the other VIPR requirements. They were, quite simply:

· USDC is one of them

· The other needs to be within 2 km of the USDC, as the fibre runs

· There has to be adequate power

Last May we started talking collaboration with Adrian from NSMS. NSMS have two big hitters on their immediate horizon: the end of the ICT rig (support ends imminently and a new home needs to be found for all those VMs) and the Integrated Communications Project, which needs its own dedicated, and rather large, cluster.

But could we build an infrastructure that could run it all?

Eyes automatically fell on IT Services in Banbury Road. Despite its projected closure and the relocation of services, it is still a viable DC and the only game in town. The fibre-run distance from USDC is greater than 2 km, though, so we had to look into long-haul 40GE optics.

By the end of September 2014, we had a network design, not one, but two homes, power, a collaborative team, budgets and a way forward. What else does a boy need?

—Ashley Woltering
