Large enterprises are moving more and more resources to the cloud. The reasons are obvious: lower cost, simpler maintenance, less hardware, global access. A growing number of actors in the payment space are also eyeing their HSMs and wondering whether they can, or should, be moved to the cloud too. With the newest crop of payment HSMs, the answer to ‘can’ is “Yes”. But what about ‘should’? Because HSMs handle data critical to almost every payment transaction, what considerations and caveats do merchants, acquirers and issuers need to examine? As a company immersed in the payment environment, GEOBRIDGE can assist those who are thinking about migrating their HSMs to the cloud.
Enterprises, especially those that manage sites in multiple geographical locations, have the most to gain by centralizing and virtualizing their HSM resources:
- Fewer appliances required: instead of having multiple HSM instances, organizations can build virtualized clusters. This allows condensing multiple low-throughput appliances into a smaller number of faster devices. Often, this does not even require new hardware: HSM throughput can be upgraded with a new license.
- Shared access: with the HSMs network-accessible, multiple applications have simultaneous access to the centralized HSM farm, instead of requiring their own physical appliances.
- Simplified DR planning: typically, only two physical clusters are required, in geographically separate locations. HSMs in each site are networked in such a way that if one site goes down, the other can seamlessly pick up the traffic.
- Reduced maintenance: with fewer devices comes less maintenance, and correspondingly lower cost.
- Centralized, remote management: the latest HSMs sport network-based remote management, allowing operations teams to manage, configure and update their devices without requiring physical access. With fewer sites and devices, management is streamlined. This means fewer expensive trips to the data centers.
- Easier compliance: fewer physical locations means fewer sites need to be audited. And because remote management means fewer data center visits, auditors have smaller logs to review.
- Comprehensive metrics: some HSM manufacturers (e.g. Thales) provide software packages that allow managers to collect usage statistics both for individual devices and in the aggregate. This gives a more accurate picture of both average and peak throughput, and makes long-term planning and cost-effective equipment acquisition decisions easier.
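To make the consolidation and metrics points above concrete, here is a minimal Python sketch of how per-device usage statistics might feed a sizing estimate. It does not use any vendor's monitoring API. The function names, the sample figures, the 30% headroom and the two-device minimum are all illustrative assumptions.

```python
import math
from statistics import mean


def summarize_throughput(samples_by_device):
    """Summarize transactions-per-second samples.

    samples_by_device maps a device name to a list of samples; every
    list is assumed to cover the same measurement intervals, in order.
    """
    per_device = {
        name: {"average": mean(samples), "peak": max(samples)}
        for name, samples in samples_by_device.items()
    }
    # Combined load per interval: add up what every device handled
    # during that same interval.
    combined = [sum(interval) for interval in zip(*samples_by_device.values())]
    aggregate = {"average": mean(combined), "peak": max(combined)}
    return per_device, aggregate


def devices_needed(peak_tps, device_capacity_tps, headroom=0.3, min_devices=2):
    """Estimate how many consolidated devices cover the combined peak.

    headroom and min_devices are assumptions for illustration; real
    sizing depends on the HSM model, license and redundancy policy.
    """
    required_tps = peak_tps * (1 + headroom)
    return max(min_devices, math.ceil(required_tps / device_capacity_tps))


if __name__ == "__main__":
    # Hypothetical samples from three low-throughput appliances.
    samples = {
        "site-a": [40, 55, 120],
        "site-b": [30, 90, 60],
        "site-c": [10, 20, 25],
    }
    per_device, aggregate = summarize_throughput(samples)
    print(per_device)
    print(aggregate)
    print(devices_needed(aggregate["peak"], device_capacity_tps=250))
```

In this made-up data the individual peaks do not coincide, so the combined peak (205) is lower than the sum of the per-device peaks (235). That is the kind of detail aggregate statistics can reveal when deciding how much consolidated capacity to license.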
No solution is without drawbacks, however. Virtualizing physical security devices brings with it some restrictions and complexities:
- Complicated physical access: many companies, wishing to avoid the cost of buying and maintaining their own data centers, opt to lease rack space in a colocation facility (co-lo). This works well for many applications, but HSMs present a challenge because of their physical security requirements. Co-lo operators must therefore be carefully vetted in terms of the physical access restrictions they enforce and the access logs they capture. Best security practice (and auditors!) will require a log of everyone who has had access to the HSMs; co-lo operators must commit to providing this level of access authentication and logging.
- Network security requirements: in a virtual HSM cloud, multiple applications will be accessing the HSMs simultaneously. Protecting this traffic means that each application must establish a TLS connection with every HSM it will access. This requires:
  - A full Public Key Infrastructure (PKI) implementation to provide the necessary X.509 certificates. In this automated environment, all entities must mutually authenticate each other. That means that every device and application must possess an authentication key pair and a CA-signed certificate.
  - With some HSMs, the certificate for every application that will access the HSM must be presented first, so the HSM can build a whitelist of authorized applications.
  - The certificates for all root and sub-CAs used to validate application certificates must be installed in all HSMs used by those applications.
  - Since working certificates typically expire after 1-3 years, operations teams must develop a plan to rotate and replace keys and certificates periodically.
- Potentially degraded throughput: because the HSMs are no longer a local resource, packets are handled by more switches and routers. This increases the risk of delays and lost data. TLS session establishment and packet encryption and decryption add further overhead. Applications using cloud HSMs must be sufficiently robust to account for occasional delays while network infrastructure recovers lost or damaged packets.
- Remote management security: remote HSM managers are a huge boon to teams managing multiple HSMs in geographically diverse locations. But this capability also comes with a cost: additional PKI, often proprietary and usually separate from the application PKI. HSMs often employ physical tokens (e.g., smartcards or USB devices) that the managers must use to gain remote access; these tokens must be initially purchased and personalized, then securely stored. All this requires sufficient budget, careful planning and accurately documented procedures.
- Restricted network debugging: application development teams often rely on the ability to capture network traffic when debugging programs and when onboarding new customers. This capability disappears once the traffic is encrypted. Obviously, TLS is not required in a test environment, but a developer’s worst nightmare is an application that works perfectly in test but fails in production. TLS removes the possibility of debugging live, “on-the-wire” network traffic.
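The mutual-authentication and certificate-rotation requirements listed above can be sketched from the application side using Python's standard `ssl` module. This is a rough illustration under stated assumptions, not a description of any particular HSM's interface. The file paths, the function names and the 90-day rotation window are hypothetical, and how an HSM accepts TLS connections is vendor-specific.

```python
import ssl
from datetime import datetime, timedelta, timezone


def build_client_context(ca_file, cert_file, key_file):
    """Build a TLS context for an application that connects to an HSM.

    ca_file:   PEM bundle of the root/sub-CA certificates that signed
               the HSM's certificate (hypothetical path).
    cert_file: the application's own CA-signed certificate.
    key_file:  the application's private authentication key.
    """
    # Trust only the given CA bundle; server certificate and hostname
    # verification are enabled by default for this purpose.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Present the application's certificate so the server can
    # authenticate the client as well (mutual TLS).
    context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


def expiry_of(not_after):
    """Convert a 'notAfter' string, in the format returned by
    SSLSocket.getpeercert(), into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)


def needs_rotation(not_after, now, window_days=90):
    """Return True if the certificate expires within window_days of now.

    The 90-day default is an illustrative assumption, not a standard.
    """
    return expiry_of(not_after) - now <= timedelta(days=window_days)
```

An operations team could run a check like `needs_rotation` against every application and HSM certificate on a schedule, so that replacements are issued well before the 1-3 year lifetime runs out. The client context is only half of the picture: the HSM side must also be configured with the relevant CA certificates and, where required, the application's certificate.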
GEOBRIDGE has assisted several large processors and issuers with their cloud migrations. Enterprise customers that are considering moving their HSMs to the cloud can take advantage of GEOBRIDGE’s expertise. We will work with you to review your current environment, plan your migration, assist in PKI setup and provide bespoke documentation for your operations teams. If desired, we will come onsite and assist with the one-time setup steps. Let GEOBRIDGE help you avoid the pitfalls and minimize the risks inherent in a project of this magnitude and complexity. We can help take the guesswork out of the upgrade effort. Let GEOBRIDGE be your partner as you plan your next infrastructure uplift.