The School of Computer Science (SCS) facilities are currently undergoing major renovations, including the entire HP5100 wing of the Herzberg Building. To allow construction work to proceed, the wing must be temporarily vacated and its contents relocated.

One of the most significant challenges in this process is moving the school’s server room facility. This infrastructure supports departmental servers, specialized research equipment, and the OpenStack cloud platform used by the school.

SCS server room relocation led by Andrew Pullin

The stakes are high. More than 2,000 undergraduate students rely on the OpenStack cloud for course assignments and laboratory work, with usage peaking during the fall and winter academic terms. At the same time, graduate students and researchers depend on the system around the clock to run Computer Science and Data Science experiments, simulations, and long-running computational workloads.

This creates a difficult question:

  • How do you relocate a critical server facility in the middle of the winter term while supporting more than 2,000 students, faculty, and staff who depend on it 24/7, when the infrastructure runs one of the most complex cloud platforms ever created: OpenStack?

The School of Computer Science partnered with the university’s Information Technology Services (ITS), which offered temporary server space in the Carleton Library to host the infrastructure during the renovation.

With a relocation site secured, the team considered two primary strategies for moving the server facility.

Option 1: Full Shutdown and Rapid Relocation

  • Under this approach, the entire server room would be powered down, physically moved to the new location in a single day, and then reassembled and brought back online.
  • The advantage of this strategy is speed. In the best-case scenario, the move could be completed within a day, resulting in only one to two days of downtime for users.
  • However, the risks were significant. If multiple servers failed to start after the move, or if a critical infrastructure node encountered problems, the entire OpenStack environment could remain offline for an extended period. In a worst-case scenario, service outages could stretch into days or even weeks while systems were repaired and reconfigured.

Option 2: Live Migration and Incremental Relocation

  • The second option involved a slower, more deliberate process: migrating servers individually while gradually relocating hardware to the new facility.
  • Although this approach would take considerably longer, it offered a key advantage. Each server could be handled carefully and validated before proceeding to the next. If a problem occurred, it would affect only a single system rather than the entire infrastructure.
  • This incremental strategy significantly reduced the risk of a prolonged outage and ensured the OpenStack cloud could remain operational throughout the relocation.

The SCS technical staff ultimately chose this incremental strategy. The effort was led by Andrew Pullin, who coordinated the migration plan and oversaw the relocation process.

Before any hardware was moved, the team first ensured the necessary network infrastructure was in place. The SCS subnet was extended between the Herzberg Building and the Carleton Library, effectively spanning both locations. Because the OpenStack cloud requires its infrastructure nodes to reside on the same subnet, this network configuration was critical.

With the subnet extended across both buildings, OpenStack could treat servers located in Herzberg and those in the library as part of the same environment, regardless of the physical distance between them. This allowed systems to be relocated gradually while remaining fully integrated with the existing cloud infrastructure.
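
To make the same-subnet requirement concrete, the sketch below shows the kind of sanity check involved, using only Python's standard library. The CIDR block and node names are hypothetical placeholders, not the actual SCS addressing:

    import ipaddress

    # Hypothetical stretched SCS subnet spanning both buildings.
    scs_subnet = ipaddress.ip_network("172.17.0.0/22")

    # Hypothetical infrastructure nodes, one in each building.
    nodes = {
        "herzberg-controller": ipaddress.ip_address("172.17.0.10"),
        "library-controller": ipaddress.ip_address("172.17.2.10"),
    }

    for name, addr in nodes.items():
        status = "ok" if addr in scs_subnet else "OUTSIDE SUBNET"
        print(f"{name}: {addr} -> {status}")

Both sample addresses fall inside 172.17.0.0/22, so OpenStack would see the two controllers as neighbours on a single network even though they sit in different buildings.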

SCS tech staff: Karim Ismail and Andrew Pullin configuring a GPU server

The OpenStack environment runs on a virtualized cloud infrastructure, where workloads exist as server images rather than being tied to specific physical machines. This architecture proved to be a major advantage during the relocation.

Virtual machine images could be migrated to the library facility ahead of the physical move. Each night, servers in Herzberg copied their images to the library location. By morning, once the workloads had successfully migrated, the now-vacant physical server in Herzberg could be safely powered down, removed, and transported to the new facility.

This approach significantly reduced risk. If an issue occurred during migration, it would affect only a single server rather than the entire cloud environment, allowing problems to be isolated and resolved without disrupting the broader system.
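
For a sense of what emptying a compute node can look like in practice, here is a minimal sketch using the openstacksdk Python client: it live-migrates every instance off one node so the hardware can be powered down and moved. The cloud profile name and hostname are hypothetical, and the actual SCS tooling and nightly procedure were certainly more involved:

    import openstack

    # Admin connection via a clouds.yaml profile (profile name is hypothetical).
    conn = openstack.connect(cloud="scs-cloud")

    # The compute node being emptied tonight (hypothetical hostname).
    source_host = "herzberg-cn01"

    # Find every instance currently running on that node.
    for server in conn.compute.servers(all_projects=True, host=source_host):
        print(f"live-migrating {server.name} off {source_host}...")
        # Let the scheduler pick the destination; 'auto' selects block vs.
        # shared-storage migration as appropriate.
        conn.compute.live_migrate_server(server, host=None, block_migration="auto")
        # Validate each instance is ACTIVE again before touching the next one.
        conn.compute.wait_for_server(server, status="ACTIVE", wait=600)

    print(f"{source_host} is empty and can be shut down and moved.")

Handling one instance at a time mirrors the incremental strategy: if a migration fails, only that workload needs attention, and the rest of the cloud keeps running.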

Winter, however, introduced an entirely different challenge.

This year Ottawa experienced a particularly harsh and snowy winter, so much so that the Rideau Canal remained open for 56 days of skating (that’s unusually long). Moving more than 55 servers across campus in freezing conditions is not a trivial task.

Rideau skating canal

Fortunately, Carleton University has a unique advantage: its extensive underground tunnel system.

The Supervisor of Operations at ITS, John MacGillivray, helped coordinate the move, using a golf cart and trailer to ferry equipment through the Carleton tunnels. Moving the equipment in small batches allowed the team to safely relocate servers between buildings without exposing them to the winter weather. What might have been a logistical nightmare outdoors instead became an efficient relocation route beneath the campus.

Over the course of less than two months, Andrew Pullin and the SCS technical team successfully migrated the entire environment in the middle of the winter academic term. The move included:

  • 4 racks of equipment
  • 25 compute nodes totaling 1,672 CPU cores
  • 24 GPU servers containing 138 GPUs
  • The full OpenStack infrastructure stack
  • Supporting networking and storage systems

In the end, like magic, the SCS cloud was relocated in the middle of the winter term, completely transparent to end users, who continued their work without ever realising the servers themselves had physically moved across campus.

Server room relocation timelapse: front and back of each of the 4 racks

Modern network administration and virtualization technologies made this complex relocation possible.

Of course, there is one small catch.

When the renovations are finished… everything will have to be moved back.