Comment
In SAFE, the “Remote Access” segment is part of “Internet Access”, but in this series of articles we will consider it separately.
The standard set of equipment in this segment for an enterprise network is
Remark 1
In this series of articles, when I talk about firewalls, I mean NGFW (next-generation firewalls).
Remark 2
I omit the various L2 / L1 solutions and L2-over-L3 overlays needed to provide L1 / L2 connectivity, and limit myself to questions at L3 and above. L1 / L2 issues were partly addressed in the chapter “Cleaning and Documentation”.
If you did not find a firewall in this segment, do not rush to conclusions.
Example 1. Delay
On the Internet it makes no sense to worry about delays on the order of 1 millisecond. Therefore, delay in this segment cannot be a factor that limits the use of a firewall.
Example 2. Performance
In some cases this factor can still be significant, so you may have to let some traffic (for example, load balancer traffic) bypass the firewall.
Example 3. Reliability
This factor still needs to be taken into account, but, given the unreliability of the Internet itself, it matters less for this segment than for the data center.
So, suppose your service runs over http / https (with short sessions). In this case you can use two independent boxes (without HA) and, if one of them has a problem, shift all traffic to the second by means of routing.
Or you can use firewalls in transparent mode and, if they fail, let traffic bypass the firewalls while the problem is being resolved.
Therefore, most likely only price can be the factor that forces you to give up firewalls in this segment.
Important!
There is a temptation to combine this firewall with the data center firewall (to use one firewall for both segments). This is possible in principle, but keep in mind that the “Internet Access” firewall stands at the forefront of your defense and “takes on” at least part of the malicious traffic, so you have to account for the increased risk that this firewall will be knocked out. That is, by using the same devices in these two segments, you significantly reduce the availability of your data center segment.
As usual, depending on the service the company provides, the design of this segment can vary greatly, and you can choose different approaches depending on your requirements.
Example
If you are a content provider with a CDN (see, for example, a series of articles), you may not want to build tens or even hundreds of points of presence with separate devices for routing and for filtering traffic. That would be expensive and may simply be redundant.
For BGP you do not need dedicated routers at all; you can use open-source tools such as Quagga. So perhaps all you need is a server or several servers, a switch, and BGP.
In this case your server or servers can act not only as CDN servers but also as routers. Of course, there are still many details (for example, how to provide load balancing), but it is feasible, and we have successfully applied this approach for one of our partners.
You can have several data centers with full protection (firewalls, DDoS protection services provided by your Internet providers) and dozens or hundreds of “simplified” points of presence with only L2 switches and servers.
But what about the protection in this case?
Let's look at, for example, the recently popular DNS amplification DDoS attack. Its danger is that it generates a large volume of traffic that simply “clogs” all of your uplinks to 100%.
Here is what we have with our design:
- If you use anycast, the traffic is distributed among your points of presence. If your total bandwidth is measured in terabits, this by itself protects you from uplink overflow (and there have in fact been several attacks with malicious traffic on the order of a terabit recently).
- If some uplinks do get clogged, you simply take that site out of service (stop announcing the prefix).
- You can also increase the share of traffic served from your “full-fledged” (and therefore protected) data centers, thereby taking a significant portion of the malicious traffic away from the unprotected points of presence.
A small note on this example: if you deliver a sufficient share of your traffic through IXs, this also reduces your exposure to such attacks.
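The reasoning in this example can be sketched as a small decision function. This is only an illustration under stated assumptions: the site names, capacities, and the idea that anycast delivers a fixed share of the attack to each site are all hypothetical, and a real decision would be based on live traffic measurements.

```python
from dataclasses import dataclass


@dataclass
class PoP:
    name: str            # hypothetical site label
    uplink_gbps: float   # total uplink capacity of the site
    attack_share: float  # assumed fraction of attack traffic that anycast delivers here
    protected: bool      # True for a "full" data center with provider DDoS protection


def plan_announcements(pops, attack_gbps):
    """Return {site name: action} for a volumetric attack of attack_gbps."""
    plan = {}
    for pop in pops:
        load = attack_gbps * pop.attack_share
        if load < pop.uplink_gbps:
            # Uplinks are not clogged: anycast distribution alone is enough.
            plan[pop.name] = "keep announcing"
        elif pop.protected:
            plan[pop.name] = "keep announcing, rely on provider filtering"
        else:
            # Unprotected site with saturated uplinks: take it out of service.
            plan[pop.name] = "withdraw prefix"
    return plan


# Hypothetical topology: one protected data center and two simplified sites.
pops = [
    PoP("dc-1", uplink_gbps=400, attack_share=0.5, protected=True),
    PoP("pop-a", uplink_gbps=40, attack_share=0.3, protected=False),
    PoP("pop-b", uplink_gbps=40, attack_share=0.2, protected=False),
]
plan = plan_announcements(pops, attack_gbps=150)
```

With these made-up numbers only "pop-a" receives more attack traffic than its uplinks can carry, so only its prefix is withdrawn.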
Example 1
If you are an exchange and your customers care about time intervals shorter than a millisecond, then, of course, the Internet is out of the question.
Example 2
If you are a gaming company and tens of milliseconds matter to you, then, of course, connectivity is very important to you.
Example 3
You also need to understand that, due to the properties of TCP, the data transfer rate within a single TCP session also depends on RTT (round-trip time). CDNs are built, among other things, to solve this problem by bringing content distribution servers closer to the consumers of that content.
The study of connectivity is a separate, interesting topic that deserves its own article or series of articles and requires a good understanding of how the Internet works.
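The dependence of single-session TCP throughput on RTT can be made concrete with the usual upper bound: a sender can have at most one window of data in flight per round trip, so throughput cannot exceed window / RTT. The sketch below assumes a 65,535-byte window (the maximum without TCP window scaling) and two illustrative RTT values; real stacks with window scaling and real loss rates will behave differently.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-session TCP throughput: one window per round trip."""
    rtt_s = rtt_ms / 1000.0
    return window_bytes * 8 / rtt_s / 1_000_000


WINDOW = 65_535  # bytes; assumption: no TCP window scaling

for rtt in (30, 80):  # illustrative RTT values in milliseconds
    limit = max_tcp_throughput_mbps(WINDOW, rtt)
    print(f"RTT {rtt} ms -> at most {limit:.1f} Mbit/s per session")
```

Under these assumptions the bound drops from roughly 17 Mbit/s at 30 ms to roughly 6.5 Mbit/s at 80 ms, which is why moving content closer to the consumer helps.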
Example
I will give just one small example.
Suppose your data center is in Moscow and you have a single uplink, Rostelecom (AS12389). In this case (single-homed) you do not need BGP, and you most likely use an address pool from Rostelecom as your public addresses.
Suppose you provide some service, have a fair number of customers in Ukraine, and they complain about long delays. While investigating, you found that the IP addresses of some of them are in the network 37.52.0.0/21.
Running traceroute, you saw that the traffic goes through AS1299 (Telia), and ping gave you an average RTT of 70 to 80 milliseconds. You can also see this on Rostelecom's looking glass.
With whois (on ripe.net or a local utility) you can easily determine that the block 37.52.0.0/21 belongs to AS6849 (Ukrtelecom).
Next, on bgp.he.net you see that AS6849 has no relationship with AS12389 (they are neither customers nor uplinks of each other, and they do not peer). But if you look at the list of peers for AS6849, you will see, for example, AS29226 (Mastertel) and AS31133 (Megafon).
Having found the looking glasses of these providers, you can compare the path and RTT. For Mastertel, for example, RTT will be about 30 milliseconds.
So, if the difference between 80 and 30 milliseconds is significant for your service, then perhaps you need to think about connectivity: get your own AS number and address pool from RIPE, and connect additional uplinks and / or create points of presence at IXs.
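Two steps of this investigation are easy to script: checking which complaining customers fall into the prefix in question, and comparing RTT samples collected over the current path and over a candidate path. The sketch below uses only the Python standard library; the individual customer addresses and the RTT samples are made up for illustration (chosen to match the 70 to 80 ms and about 30 ms figures in the text), and collecting real samples from looking glasses is left out.

```python
import ipaddress
from statistics import mean

CUSTOMER_PREFIX = ipaddress.ip_network("37.52.0.0/21")  # block from the example


def in_customer_prefix(ip: str) -> bool:
    """True if the address belongs to the prefix under investigation."""
    return ipaddress.ip_address(ip) in CUSTOMER_PREFIX


def rtt_gain_ms(current_samples, candidate_samples) -> float:
    """Average RTT saved by switching from the current path to the candidate."""
    return mean(current_samples) - mean(candidate_samples)


# Hypothetical measurements in milliseconds.
via_current_uplink = [72.0, 78.0, 75.0]
via_candidate_peer = [29.0, 31.0, 30.0]

complaining = ["37.52.5.10", "37.52.8.1"]  # arbitrary illustrative addresses
affected = [ip for ip in complaining if in_customer_prefix(ip)]
gain = rtt_gain_ms(via_current_uplink, via_candidate_peer)
```

A /21 covers 37.52.0.0 through 37.52.7.255, so only the first of the two sample addresses is counted as affected.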
Example
Suppose you have several uplinks, but only one of the providers can offer you this protection. If all the traffic goes through one provider, what happens to the connectivity that we briefly discussed a little earlier?
In this case you will have to partly sacrifice connectivity during an attack. But
- This is only for the duration of the attack. When an attack starts, you can manually or automatically reconfigure BGP so that traffic goes only through the provider that gives you the “umbrella”. After the attack ends, you can return routing to its previous state.
- It is not necessary to move all traffic. If, for example, you see that no attack is coming through some uplinks or peerings (or that the traffic is insignificant), you can continue to announce prefixes with competitive attributes toward those BGP neighbors.
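The two points above amount to a per-neighbor rule during an attack: always announce toward the provider with protection, and keep announcing toward other neighbors only while the attack traffic arriving through them is insignificant. A minimal sketch of that rule follows; the neighbor names, the traffic figures, and the 100 Mbit/s threshold are hypothetical, and pushing the resulting plan into an actual BGP configuration is out of scope here.

```python
def announcements_during_attack(neighbors, attack_mbps, threshold_mbps=100.0):
    """Decide per BGP neighbor whether to keep announcing prefixes.

    neighbors:   {neighbor name: True if this provider offers DDoS protection}
    attack_mbps: {neighbor name: attack traffic currently arriving through it}
    """
    plan = {}
    for name, protected in neighbors.items():
        dirty = attack_mbps.get(name, 0.0) > threshold_mbps
        if protected or not dirty:
            plan[name] = "announce"
        else:
            plan[name] = "withdraw"
    return plan


# Hypothetical neighbors and measurements.
neighbors = {"provider-a": True, "provider-b": False, "ix-peer": False}
attack_mbps = {"provider-a": 9000.0, "provider-b": 4000.0, "ix-peer": 5.0}
attack_plan = announcements_during_attack(neighbors, attack_mbps)

# After the attack there is no significant attack traffic anywhere,
# so the same rule restores the previous state.
normal_plan = announcements_during_attack(neighbors, {})
```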
Example 1
Suppose that you have covered yourself with an “umbrella” against DDoS using one of the providers, and that this provider uses Arbor to filter traffic and filters at the edge of its network.
The bandwidth that Arbor can “process” is limited, and the provider, of course, cannot permanently pass the traffic of all the partners who have ordered this service through the filtering equipment. Therefore, under normal conditions the traffic is not filtered.
Suppose a SYN flood attack begins. Even if you have ordered a service that automatically switches traffic to filtering when an attack occurs, this does not happen instantly. For a minute or more you remain under attack, which can lead to equipment failure or service degradation. In this case, limiting traffic on the border router will mean that some TCP sessions are not established during this time, but it will save your infrastructure from larger problems.
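To show what such a limit does to a burst of SYN packets, here is a minimal token-bucket policer in Python. It is a model of the idea only, not router configuration: the rate, the burst size, and the evenly spaced arrival pattern are assumptions chosen for illustration.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets beyond the configured rate are dropped."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate_pps = rate_pps    # sustained packets per second allowed
        self.burst = burst          # bucket depth: packets allowed in one burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous packet, seconds

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, never above the bucket depth.
        refill = (now - self.last) * self.rate_pps
        self.tokens = min(float(self.burst), self.tokens + refill)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Assumed scenario: 50,000 SYNs spread evenly over one second
# against a policer set to 10,000 packets per second.
policer = TokenBucketPolicer(rate_pps=10_000, burst=1_000)
passed = sum(policer.allow(i / 50_000) for i in range(50_000))
dropped = 50_000 - passed
```

In this scenario the policer lets through approximately the burst plus one second of the sustained rate (about 11,000 packets) and drops the rest, so some legitimate sessions fail to establish, but the equipment behind the router sees a bounded load.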
Example 2
An abnormally large number of SYN packets is not necessarily the result of a SYN flood attack. Let's assume that you provide a service in which you can have about 100 thousand simultaneous TCP connections (in one data center).
Suppose that, as a result of a short-term problem at one of your main providers, half of the sessions were “kicked”. If your application is designed so that it “without thinking twice” tries to re-establish the connection immediately (or after an interval that is the same for all sessions), you will receive at least about 50 thousand SYN packets almost simultaneously.
If, for example, an ssl / tls handshake with a certificate exchange has to run on top of these sessions, then in terms of resource exhaustion on your load balancer this will be a much stronger “DDoS” than a simple SYN flood. It would seem that load balancers should cope with such an event, but, unfortunately, we have run into exactly this problem.
And, of course, a policer on the border router will save your equipment in this case too.
The third level of DDoS / DoS protection is the settings of your firewall.
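The reconnect storm from Example 2 can also be softened on the application side. The article does not describe a specific fix, so the following is an assumption: a common technique is exponential backoff with random jitter, so that clients do not all reconnect at the same moment. The function name and parameters below are hypothetical.

```python
import random


def reconnect_delay(attempt: int, base_s: float = 1.0, cap_s: float = 60.0,
                    rng=random) -> float:
    """Backoff with full jitter: uniform delay in [0, min(cap, base * 2**attempt)]."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Simulate 50,000 dropped clients, each picking a delay with an 8-second ceiling.
rng = random.Random(42)  # fixed seed only to make the simulation repeatable
delays = [reconnect_delay(attempt=3, rng=rng) for _ in range(50_000)]

# Count reconnect attempts per one-second slot and take the busiest slot.
per_second = [0] * 8
for d in delays:
    per_second[min(int(d), 7)] += 1
busiest_second = max(per_second)
```

Instead of about 50 thousand SYN packets arriving at once, the attempts are spread over eight seconds, a few thousand per second, which is far easier on a load balancer that has to complete a TLS handshake for each of them.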
Advice
Try to give the firewall as little work as possible by filtering as much as you can on the first two lines of defense. Here is why.
Has it ever happened to you that, while generating traffic to check, for example, how resistant your servers' operating system is to DDoS attacks, you accidentally “killed” your firewall, loading it to 100 percent with traffic of ordinary intensity? If not, maybe that is only because you have not tried?
In general, a firewall, as I said, is a complex thing. It works well with known vulnerabilities and tested solutions, but if you send it something unusual, just some garbage or packets with incorrect headers, then with some probability, and not such a small one (in my experience), you can put even top-end equipment into a stupor. Therefore, at stage 2, use ordinary ACLs (at the L3 / L4 level) to allow into your network only the traffic that should enter it.
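The logic of such an L3 / L4 ACL is simple enough to model in a few lines: rules are checked in order, the first match wins, and anything unmatched is denied. The sketch below is a Python model, not router syntax; the single rule and the 203.0.113.0/24 documentation prefix are hypothetical placeholders for your own published services.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str              # "permit" or "deny"
    proto: str               # "tcp", "udp" or "any"
    dst: str                 # destination prefix in CIDR notation
    dst_port: Optional[int]  # None matches any port


# Hypothetical policy: only HTTPS toward the service prefix is let in.
ACL = [
    Rule("permit", "tcp", "203.0.113.0/24", 443),
]


def evaluate(acl, proto: str, dst_ip: str, dst_port: int) -> str:
    """First matching rule wins; unmatched traffic hits the implicit deny."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in acl:
        if rule.proto not in ("any", proto):
            continue
        if addr not in ipaddress.ip_network(rule.dst):
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return "deny"


https_verdict = evaluate(ACL, "tcp", "203.0.113.10", 443)
dns_verdict = evaluate(ACL, "udp", "203.0.113.10", 53)
```

Everything that is not explicitly expected, such as the UDP packet in the second call, is dropped before it ever reaches the firewall.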
Source: https://habr.com/ru/post/436230/