There were about two dozen main requirements. Some are basic: hosting the platform in two data centers, a console for resource management, API access, pay-as-you-go billing with a granularity of no more than one hour, and support for automation tools such as Terraform. Others did not exactly surprise us, but not every customer asks for them. One example is the requirement that the provider own the building in which the cloud data center operates.
That one is easy enough to understand. This customer has apparently also read up on the history of the Russian colocation market, or one of their clients has already gotten stuck somewhere abroad. Everything else is fairly standard. The requirement that the data center be in Moscow (also on the list) exists so that an admin can visit in person and so that replication requests are fast. The most important point after the two data centers is detailed SLA metrics. As I said, each SLA item was what worried them most.
Here, one of the most important points for the customer was that the provider offer exactly three support lines. The first line exists always and everywhere. The second line usually exists too, but the requirements for it tend to be rather vague. Then there is a third line, the one that actually develops the features. And nothing is outsourced, as small providers sometimes do: only the provider's own employees work on the project. A large customer project gets a separate dedicated project team rather than a service team, and this is recorded in the documents.
- Three levels of technical support for the platform: the first line resolves incidents at the platform level (hardware, virtualization); the second line resolves problems in the customer's infrastructure hosted on the cloud platform (OS, DBMS, and other application software); the third line brings in the cloud platform's developers and/or vendors to solve problems.
- First-line technical support available 24x7x365.
- Mandatory knowledge of Russian and English by specialists at all support levels.
- The ability to file incident tickets by e-mail or by calling technical support.
- The ability to file incident tickets by calling technical support.
- The response time of technical support specialists to an incident is from 10 to 15 minutes, depending on the priority of the request (the supplier is obliged to record a detailed description of the priorities of incidents in the service contract).
- The time to resolve an incident is from 90 to 240 minutes depending on the priority of the request (the supplier is obliged to record a detailed description of the priorities of incidents in the service contract).
- A mandatory dedicated project team consisting of an account manager, a project manager, a technical architect, and engineers.
- The ability to use various means of communication between the supplier team and the customer team to resolve issues faster (for example, Telegram, WhatsApp, etc.).
- The project team roster recorded in the signed contract for cloud platform services. The list should include the full names, mobile phone numbers, and e-mail addresses of everyone involved on both the customer and supplier sides.
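To make the SLA figures above concrete, here is a minimal sketch of how an incident could be checked against response and resolution deadlines. The two priority names and their mapping to the 10/15-minute and 90/240-minute limits are assumptions for illustration; the requirements only state the ranges and leave the actual priority definitions to the service contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical mapping: priority -> (response limit, resolution limit).
# The real priority definitions are fixed in the service contract.
SLA_TARGETS = {
    "high": (timedelta(minutes=10), timedelta(minutes=90)),
    "low": (timedelta(minutes=15), timedelta(minutes=240)),
}


@dataclass
class Incident:
    priority: str
    opened: datetime
    first_response: datetime
    resolved: datetime


def check_sla(incident: Incident) -> dict:
    """Return whether the response and resolution deadlines were met."""
    response_limit, resolve_limit = SLA_TARGETS[incident.priority]
    return {
        "response_ok": incident.first_response - incident.opened <= response_limit,
        "resolution_ok": incident.resolved - incident.opened <= resolve_limit,
    }


opened = datetime(2019, 1, 21, 3, 0)
incident = Incident(
    priority="high",
    opened=opened,
    first_response=opened + timedelta(minutes=8),
    resolved=opened + timedelta(minutes=120),
)
print(check_sla(incident))  # {'response_ok': True, 'resolution_ok': False}
```

In this example the first response arrives in 8 minutes, within the assumed 10-minute limit, but the fix takes 120 minutes and misses the assumed 90-minute limit.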
- The system for accounting consumed resources must comply with the "Rules for the Application of Automated Settlement Systems" approved by Order No. 73 of the Ministry of Information Technologies and Communications of Russia dated 02.07.2007.
- The provider must hold a current certificate confirming that its information security management system complies with ISO/IEC 27001:2013 for the provision of data center outsourcing and virtual data center services.
- A current PCI DSS v3.2 certificate for the cloud platform.
- The PCI DSS 3.2 compliance certificate should cover IT support, physical security, system services security, physical equipment, networks, and storage.
- Tier III Design, Tier III Facility, and Tier III Operational Sustainability certificates for the data center.
- Computing resources (virtual cores, RAM) should be allocated on a guaranteed basis, so that the customer's virtual servers located on the same physical node cannot affect one another.
- The cloud platform should provide the ability to change the amount of computing resources without re-creating the VM.
- Guaranteed placement of VMs on different physical nodes must be possible.
- The cloud platform should provide a choice of cluster (data center) when launching a VM.
- The cloud platform should provide the ability to create virtual disks of different performance (IOPS) through a web management interface and API.
- The cloud platform should provide the ability to change disk performance on the fly.
- Disk resources must be available with performance guarantees, as measured by the number of IOPS per disk.
- Guaranteed disk performance should scale up to 100,000 IOPS.
- The cloud platform should provide the ability to migrate data between disk resources of different performance on the fly, without interrupting the service.
- The cloud platform should allow isolated network environments to be set up that are inaccessible to other customers of the cloud platform.
- The isolated network environments of the cloud platform should allow the customer to manage the network addressing and routing of its IT infrastructure.
- The cloud platform should support connecting customers' external dedicated communication links.
- The cloud platform should allow external IP addresses to be assigned to and removed from virtual servers.
- The cloud platform should provide a fault-tolerant external connection at a speed of at least 40 Gbit/s.
- The cloud platform must have built-in DNS and DHCP services.
- The cloud platform must provide IPSec VPN connections.
- The cloud platform should provide fault-tolerant Internet access that does not depend on any single provider and aggregates at least four providers.
- The bandwidth between VMs within the same data center should be at least 10 Gbit/s.
- L2 connectivity between virtual infrastructures deployed in different data centers.
- The cloud platform must have an API compatible with Amazon S3.
- Object storage should use a protocol that allows any amount of data to be stored and retrieved at any time from anywhere on the Internet.
- For fault tolerance, the data storage system should be distributed across at least two provider sites.
- The storage system should be able to expand as files are added.
- Object storage must support versioning.
- Each object in the storage must be replicated between provider sites. A single failure of any object storage component should have no impact on the quality of service.
- Ability to work with the storage via HTTPS.
- Support for access control lists (ACLs) and policies.
- Support for object lifecycle policies that govern the lifetime of objects.
- Support for server-side encryption.
- Support for static websites and custom domain names for websites, such as mysite.ru.
- The fault tolerance level of the storage service is at least 99.99%.
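A figure like 99.99% is easier to judge once it is converted into allowed downtime. The sketch below does that arithmetic, assuming the "fault tolerance level" is meant as service availability over a calendar period; the function name and the 30-day and 365-day periods are illustrative choices, not terms from the requirements.

```python
def downtime_budget_minutes(availability_percent: float, period_days: int) -> float:
    """Maximum downtime, in minutes, that still meets the availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


# Assumed reporting periods: a 30-day month and a 365-day year.
print(f"{downtime_budget_minutes(99.99, 30):.2f} min per month")   # 4.32 min per month
print(f"{downtime_budget_minutes(99.99, 365):.2f} min per year")   # 52.56 min per year
```

Under these assumptions the storage service may be unavailable for a little over four minutes a month, which is why a single component failure must not affect the service at all.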
- The separation of the customer’s information environment within the cloud platform into several independent virtual networks should be ensured.
- Access to virtual networks should be controlled by port and protocol using a free built-in firewall.
- It must be possible to join the virtual platform's servers into a single virtual private network (VPN) with the customer's physical or virtual servers located at a remote site or data center.
- Access to the cloud platform's management API should be designed so that security is not compromised even when insecure transport protocols are used.
- The HTTPS protocol must be used to access the cloud platform's management API. Certificates must be signed by trusted certificate authorities.
- Access to virtual Linux/UNIX servers should be via the SSH protocol using passwordless key authentication. The virtual platform should provide the ability to manage authentication keys (creation and deletion), as well as a mechanism, available from inside the VM, for delivering public keys to the VM at boot.
- The organization of secure access to the servers of the IT system should be implemented using an IPsec VPN connection.
- A firewall must be built into the virtual platform, configured separately for each virtual network, as well as for virtual networks of isolated cloud environments.
- Penetration test results that are no more than one year old.
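The requirement that API security hold up even over an insecure transport is commonly met by signing each request with a shared secret, so the secret itself never travels over the wire. The requirements do not say which mechanism the provider uses; the sketch below is one possible HMAC-SHA256 scheme, and the function names, the signed fields, and the example path are all assumptions.

```python
import hashlib
import hmac


def sign_request(secret_key: bytes, method: str, path: str,
                 timestamp: str, body: bytes = b"") -> str:
    """Sign the request metadata and a hash of the body with HMAC-SHA256."""
    body_hash = hashlib.sha256(body).hexdigest()
    message = "\n".join([method.upper(), path, timestamp, body_hash]).encode()
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key: bytes, method: str, path: str,
                   timestamp: str, body: bytes, signature: str) -> bool:
    """Recompute the signature on the server side and compare in constant time."""
    expected = sign_request(secret_key, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)


secret = b"demo-secret"  # hypothetical key, shared out of band
signature = sign_request(secret, "POST", "/v1/vms", "2019-01-21T03:00:00Z", b'{"cpu": 4}')

# The untouched request verifies; a request with a modified path does not.
print(verify_request(secret, "POST", "/v1/vms", "2019-01-21T03:00:00Z", b'{"cpu": 4}', signature))
print(verify_request(secret, "POST", "/v1/disks", "2019-01-21T03:00:00Z", b'{"cpu": 4}', signature))
```

An eavesdropper who sees the request cannot alter it or forge a new one without the key. Replay protection would additionally require the server to reject stale timestamps, which this sketch does not implement.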
- The backup service should be managed by the customer independently through a web-based management interface.
- The web interface should allow setting a backup schedule for individual servers, as well as backing them up and restoring them manually.
- The data backup service must be metered and billed by usage, namely per gigabyte of protected data per month.
- The data backup service should be able to back up common corporate system and application software. Software agents installed on protected servers should be free of charge.
- Backup management through the web interface and through a software agent.
- Use of elastic S3 file storage for storing backup copies.
- Use deduplication.
- The cloud platform should allow VMs to be logically divided into groups with the option of separate billing.
- Payment only for the volume actually used.
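The billing requirements (pay-as-you-go with a granularity of no more than one hour, plus separate billing per VM group) can be illustrated with a short sketch. The rates, the record fields, and the choice to round each usage interval up to a full hour are assumptions made for the example; a real provider may well meter more finely.

```python
import math
from collections import defaultdict

# Hypothetical hourly rates in arbitrary currency units.
RATE_PER_VCPU_HOUR = 1.0
RATE_PER_GB_RAM_HOUR = 0.5


def bill_by_group(usage_records: list) -> dict:
    """Sum per-group charges, rounding each usage interval up to whole hours."""
    totals = defaultdict(float)
    for rec in usage_records:
        billed_hours = math.ceil(rec["seconds"] / 3600)
        hourly_cost = rec["vcpus"] * RATE_PER_VCPU_HOUR + rec["ram_gb"] * RATE_PER_GB_RAM_HOUR
        totals[rec["group"]] += billed_hours * hourly_cost
    return dict(totals)


usage = [
    {"group": "prod", "vcpus": 4, "ram_gb": 16, "seconds": 5400},  # 1.5 h, billed as 2 h
    {"group": "test", "vcpus": 2, "ram_gb": 4, "seconds": 1800},   # 0.5 h, billed as 1 h
    {"group": "prod", "vcpus": 2, "ram_gb": 8, "seconds": 7200},   # exactly 2 h
]
print(bill_by_group(usage))  # {'prod': 36.0, 'test': 4.0}
```

Each group gets its own total, so different departments or projects of the same customer can be invoiced separately.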
Source: https://habr.com/ru/post/437194/