Exclusive: Microsoft researchers, in partnership with academia, have published a paper detailing how they have dramatically increased the speed of homomorphic encryption systems.
With a standard encryption system, data is scrambled and then decrypted when it needs to be processed, leaving it vulnerable to theft. Homomorphic encryption, first proposed in 1978 but only really refined in the last decade thanks to increasing computing power, allows software to analyze and modify encrypted data without decrypting it into plaintext first. The information stays encrypted while operations are performed on it – provided you have the correct key, of course.
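To make the idea concrete, here is a minimal sketch using textbook RSA, which happens to be homomorphic for multiplication only. It is not the scheme used in the Microsoft paper, it is not fully homomorphic, and the tiny primes make it completely insecure; the function names are illustrative. The point is only that a party holding two ciphertexts can combine them without ever seeing the plaintexts.

```python
# Toy illustration only: textbook RSA with tiny primes (insecure, no padding).
# It supports multiplication on ciphertexts, not the addition-plus-multiplication
# that a fully homomorphic scheme provides.

P, Q = 61, 53                 # small demo primes (assumption, for illustration)
N = P * Q                     # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                        # public exponent
D = pow(E, -1, PHI)           # private exponent (modular inverse, Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


def multiply_encrypted(c1: int, c2: int) -> int:
    """Multiply two ciphertexts; the result decrypts to the product of the
    plaintexts (mod N). No key is needed for this step."""
    return (c1 * c2) % N


if __name__ == "__main__":
    ca, cb = encrypt(7), encrypt(6)
    product = multiply_encrypted(ca, cb)   # computed without decrypting
    print(decrypt(product))                # prints 42
```

Only the holder of the private exponent can read the result; whoever runs `multiply_encrypted` sees nothing but ciphertext.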
This has major advantages from a security standpoint. Hospital records can be examined without compromising patient privacy, financial data can be analyzed without opening it up to theft, and it’s perfect for a computing environment where so much data is cloud-based on someone else’s servers.
There is, of course, a problem. The first fully working homomorphic encryption system, built by Craig Gentry (now an IBM Research cryptographer), was incredibly slow, taking 100 trillion times as long to perform calculations on encrypted data as on plaintext.
IBM has sped things up considerably, making calculations on a 16-core server over two million times faster than past systems, and has open-sourced part of the technology. But, in a new paper [PDF], Microsoft thinks it’s made a huge leap forward in applying the encryption system to deep learning neural networks.
Professor Kristin Lauter, principal research manager at Microsoft, told The Register that the team has developed CryptoNets that process the encrypted data. The team claims that its optical recognition system is capable of making 51,000 predictions per hour with 99 per cent accuracy.
The key to Redmond’s approach is in the pre-processing work. The researchers need to know in advance the complexity of the arithmetic circuit that is to be applied to the data. They need to structure the neural network appropriately and keep data loads small enough so the computer handling them isn’t over-worked.
To make this possible, the team developed the Simple Encrypted Arithmetic Library (SEAL) – code which it revealed last November. Detailed parameters have to be set up before the data run is attempted, to keep multiplication levels low.
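Why the circuit has to be known up front can be sketched in a few lines. The following is a hypothetical planning helper, not SEAL's API and not the network from the paper: it assumes that only ciphertext-by-ciphertext multiplications consume a "level", that weighted sums with plaintext weights cost none, and that each squaring activation costs one. The layer list and the depth budget are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Layer:
    name: str
    ct_mult_levels: int  # ciphertext-by-ciphertext multiplication levels added


def multiplicative_depth(layers: List[Layer]) -> int:
    """Total multiplication levels for layers applied one after another."""
    return sum(layer.ct_mult_levels for layer in layers)


def fits_budget(layers: List[Layer], max_depth: int) -> bool:
    """True if the network stays within the depth the parameters allow."""
    return multiplicative_depth(layers) <= max_depth


# Hypothetical network: the layer choice is an assumption, not the paper's design.
HYPOTHETICAL_NETWORK = [
    Layer("weighted sum (plaintext weights)", 0),
    Layer("square activation", 1),
    Layer("weighted sum (plaintext weights)", 0),
    Layer("square activation", 1),
    Layer("weighted sum (plaintext weights)", 0),
]

if __name__ == "__main__":
    depth = multiplicative_depth(HYPOTHETICAL_NETWORK)
    print(f"multiplicative depth: {depth}")
    print(f"fits a budget of 2 levels: {fits_budget(HYPOTHETICAL_NETWORK, 2)}")
```

In a real deployment the depth figure would feed into the choice of encryption parameters; a deeper circuit generally means larger parameters and slower arithmetic, which is why the researchers keep the multiplication levels low.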
In testing, the team used 28 x 28-pixel images of handwritten digits taken from the Modified National Institute of Standards and Technology (MNIST) database and ran 50,000 samples through to train the system. They then tried a full run on an additional 10,000 characters to test accuracy.
The test rig was a PC with a single Intel Xeon E5-1620 CPU running at 3.5GHz, with 16GB of RAM, running Windows 10. They structured the data in parallel, and the computer ran 51,739 predictions per hour with an accuracy rate of 99 per cent.
There’s still a lot of work to be done, Lauter said, but the initial results look very promising and could be used for a kind of machine learning-as-a-service concept, or on specialist devices for medical or financial predictions.
“I’m not in that part of the company’s decision-making process, so can’t guarantee when Microsoft will have a product using this technology,” she said. “But from a research point of view, we are definitely going towards making it available to customers and the community.” ®