Most people probably don’t realize just how much our devices are time-driven, whether it’s your phone, your laptop or a network server. For the most part, timekeeping has been an esoteric chore, taken care of by a limited number of hardware manufacturers. While their devices served their purpose, a couple of Facebook engineers decided there had to be a better way. So they built a new, more accurate timekeeping device that fits on a PCI Express (PCIe) card, and contributed it to the Open Compute Project as an open source project.
At a basic level, says Oleg Obleukhov, a production engineer at Facebook, timekeeping simply means each device pinging a time server to make sure they are all reporting the same time. “Almost every single electronic device today uses NTP, Network Time Synchronization Protocol, which you have on your phone, on your watch, on your laptop, everywhere, and they all connect to these NTP servers where they just go and say, ‘what time is it’ and the NTP server provides the time,” he explained.
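To make the “what time is it” exchange a little more concrete, here is a minimal sketch of the standard arithmetic an NTP-style client applies to one request and reply. It is not Facebook’s code; the function name and the sample timestamps are made up for illustration, and it assumes the network delay is the same in both directions.

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """Estimate clock offset and round-trip delay from one request/reply.

    t0: client clock when the request was sent
    t1: server clock when the request arrived
    t2: server clock when the reply was sent
    t3: client clock when the reply arrived
    All values are in seconds. Assumes symmetric network delay.
    """
    offset = ((t1 - t0) + (t2 - t3)) / 2.0  # how far the client is behind the server
    delay = (t3 - t0) - (t2 - t1)           # time spent on the wire, both ways
    return offset, delay


# Hypothetical exchange: the client clock runs 0.5 s behind the server,
# each network hop takes 20 ms and the server needs 10 ms to answer.
offset, delay = ntp_offset_and_delay(100.000, 100.520, 100.530, 100.050)
print(f"offset ~ {offset:.3f} s, round-trip delay ~ {delay:.3f} s")
```

The client then nudges its own clock by the estimated offset, which is why the accuracy of the server at the other end matters so much.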
Before Facebook developed a new way of doing this, there were basically two ways to check the time. If you were a developer, you probably used something like Facebook.com as a time-checking mechanism, but a company like Facebook, working at massive scale, needed something that worked even when there wasn’t an internet connection. Companies running data centers have a hardware device called Stratum One, which is a big box that sits in the data center and has no other job than acting as the timekeeper.
Because these timekeeping boxes were built by a handful of companies over many years, they were solid and worked, but it was hard to get new features. What’s more, companies like Facebook couldn’t control the boxes because of their proprietary nature. Obleukhov and his colleague, research scientist Ahmad Byagowi, began to attack the problem by looking for a way to build such a device as a PCIe card, using off-the-shelf parts, that you could stick into any PC with an open slot.
They literally drew the first design on an iPad and began to build that vision into a prototype. A time appliance relies on a couple of key components: a GNSS receiver and what’s called a high-stability oscillator. In a blog post describing the project, Obleukhov and Byagowi explained the role of these two parts:
“It all starts from a GNSS receiver that provides the time of day (ToD) as well as the 1 pulse per second (PPS). When the receiver is backed by a high-stability oscillator (e.g., an atomic clock or an oven-controlled crystal oscillator), it can provide time that is nanosecond-accurate. The time is delivered across the network via an off-the-shelf network card,” the two engineers wrote.
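As a rough illustration of how those two signals fit together, the sketch below builds a nanosecond timestamp from the time of day at the last PPS edge plus the number of oscillator ticks counted since that edge. This is a simplified assumption about the general technique, not the Time Card’s actual firmware; the function name, the 10 MHz frequency and the sample values are all hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def timestamp_ns(tod_seconds, ticks_since_pps, oscillator_hz):
    """Estimate the current time in nanoseconds.

    tod_seconds:     whole-second time of day reported for the last PPS edge
    ticks_since_pps: oscillator ticks counted since that PPS edge
    oscillator_hz:   nominal oscillator frequency (assumed exact here)
    """
    # The PPS edge marks the start of the second; the oscillator fills in
    # the fraction of a second that has elapsed since then.
    elapsed_ns = (ticks_since_pps * NS_PER_SECOND) // oscillator_hz
    return tod_seconds * NS_PER_SECOND + elapsed_ns


# Hypothetical reading: a 10 MHz oscillator has ticked 2,500,000 times
# (a quarter of a second) since the last PPS edge.
now_ns = timestamp_ns(1_700_000_000, 2_500_000, 10_000_000)
print(now_ns)
```

In this picture, the more stable the oscillator, the less the fractional part drifts between pulses, or during stretches when the GNSS signal is unavailable.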
It all sounds pretty basic when described like this, but it’s actually quite complex. Perhaps that’s why nobody had thought to attack the problem in this way, with most companies simply accepting that the existing methods of determining time worked fine. But these two Facebook engineers were annoyed by the limitations of those approaches and decided to build something better themselves.
“A lot of it came from frustration. We were frustrated with whatever exists in the market, and we needed certain features like security features to maintain different things and monitor what’s going on. And we had to always ask the vendors [for these new features] and every time a request would take like six months to one year, and [it wouldn’t be exactly what we wanted] and we had to change things all the time, so that’s why we had to basically make this from scratch in this way,” Obleukhov said.
One thing that made it possible to put a timekeeping device on a PCIe card was the advance in miniaturization of the atomic clock/oscillator. With their frustration coinciding with what the technology could now do, the two engineers realized they could build this themselves if they dedicated themselves to the task.
As the design began coming together, the pair decided to make it flexible, so that other engineers could build on the basic design and drop in whatever components met their needs. Some might need highly sophisticated, expensive parts, while others could get away with much cheaper ones, depending on their requirements.
They also decided early on to open source the design process, and to involve the Open Compute Project so that other companies and engineers could contribute to the design. “It was actually going to be open source from the get-go, and the reason for that is we needed to have community support. I didn’t want it to be just one in-house project and let’s say if I lost interest or the businesses lost interest [it could go away]. I wanted this to [keep going] regardless [of what happened],” Obleukhov said.
Today there are a dozen vendors involved in the project and a number of cards out there, including the one designed by these engineers as well as a commercial offering from Orolia. The goal is to keep improving the design, and because it is open source, the community of companies and engineers involved can continue to do so.