
How to take control of network infrastructure. Part three: network security (continued)

This is the second part of the chapter “Network Security” (which is itself the third part of the series “How to take control of the network infrastructure”). In the first part of this chapter we looked at some aspects of network security in the Data Center segment. This part focuses on the “Internet Access” segment.


Internet access


Security is undoubtedly one of the most complex topics in the world of data networks. As before, without claiming depth or completeness, I will consider here some fairly simple but, in my opinion, important questions, the answers to which, I hope, will help raise the security level of your network.

When auditing this segment, pay attention to the following aspects:

  • design
  • BGP setup
  • DOS / DDOS protection
  • firewall traffic filtering


Design


As an example of the design of this segment for an enterprise network, I would recommend the guide from Cisco within the SAFE model.

Of course, solutions from other vendors may seem more attractive to you (see the Gartner quadrant for 2018), but without urging you to follow this design in every detail, I still find it useful to understand the principles and ideas behind it.
Note

In SAFE, the “Remote Access” segment is part of “Internet Access”. But in this series of articles we will consider it separately.
The standard set of equipment in this segment for an enterprise network is


Remark 1

In this series of articles, when I talk about firewalls, I mean NGFW.
Remark 2

I leave out the various L2 / L1 solutions and L2-over-L3 overlays needed to provide L1 / L2 connectivity and limit myself to questions of L3 and above. L1 / L2 issues were partly addressed in the chapter “Cleaning and Documentation”.
If you did not find a firewall in this segment, then you should not rush to conclusions.

Let us begin, as in the previous part, with a question: is a firewall necessary in this segment in your case?

I can say that this seems to be the most justified place for using firewalls and for applying complex traffic filtering algorithms. In Part 1, we mentioned 4 factors that can prevent the use of firewalls in the data center segment. But here they are not so significant.
Example 1. Latency

For the Internet, there is no point in worrying about delays of around 1 millisecond. Latency therefore cannot be a factor that limits the use of a firewall in this segment.
Example 2. Performance

In some cases this factor can still be significant. You may therefore have to let some traffic (for example, load balancer traffic) bypass the firewall.
Example 3. Reliability

This factor still has to be taken into account, but given the unreliability of the Internet itself, it matters less for this segment than for the data center.

So, suppose your service lives over http / https (with short sessions). In this case you can use two independent boxes (without HA) and, if one of them has a problem, use routing to shift all traffic to the other.

Or you can use firewalls in transparent mode and, if they fail, let traffic bypass them while the problem is being fixed.
So, most likely, price is the only factor that might force you to give up firewalls in this segment.
Important!

There is a temptation to combine this firewall with the data center firewall (to use one firewall for both segments). In principle this is possible, but you need to understand that the “Internet Access” firewall stands at the forefront of your defense and takes on at least part of the malicious traffic, so you must account for the increased risk that this firewall will be taken out of service. In other words, by using the same devices in both segments, you significantly reduce the availability of your data center segment.
As always, the design of this segment can vary greatly depending on the service the company provides, and you can choose different approaches depending on your requirements.
Example

If you are a content provider with a CDN network (see, for example, a series of articles), you may not want to build tens or even hundreds of points of presence with separate devices for routing and for traffic filtering. That would be expensive, and it may simply be unnecessary.

For BGP you do not need dedicated routers at all; you can use open-source tools such as Quagga. So perhaps all you need is a server or several servers, a switch, and BGP.

In this case, your server or several servers can play the role of not only the CDN server, but also the router. Of course, there are still a lot of details (for example, how to ensure balancing), but this is realizable, and we have successfully applied this approach for one of our partners.

You can have several data centers with full protection (firewalls, DDOS protection services provided by your Internet providers) and dozens or hundreds of “simplified” points of presence with only L2 switches and servers.

But what about the protection in this case?

Let's look, for example, at the recently popular DNS Amplification DDOS attack. Its danger is that it generates a large amount of traffic that simply “clogs” your uplinks to 100%.

What do we have in the case of our design?

  • If you use anycast, the traffic is distributed among your points of presence. If your total bandwidth is measured in terabits, this by itself protects your uplinks from being saturated (although, in fact, several attacks with malicious traffic on the order of a terabit have been seen recently).
  • If some uplinks are saturated nevertheless, you simply take that site out of service (stop announcing the prefix).
  • You can also increase the share of traffic served from your “full-fledged” (and therefore protected) data centers, thus moving a significant portion of the malicious traffic away from unprotected points of presence.
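The reasoning in the list above can be sketched as a small calculation. The site names, uplink capacities and traffic shares below are hypothetical, and the model assumes that attack traffic splits across anycast sites in fixed proportions; real anycast catchments are decided by BGP routing and shift whenever a prefix is withdrawn.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    uplink_gbps: float   # total uplink capacity of the site (assumed value)
    share: float         # fraction of anycast traffic the site attracts (assumed)
    protected: bool      # True for a "full" data center behind DDOS protection


def sites_to_withdraw(sites, attack_gbps):
    """Return names of unprotected sites whose uplinks the attack would fill."""
    total_share = sum(s.share for s in sites)
    overloaded = []
    for s in sites:
        # Attack traffic reaching this site under the fixed-share assumption.
        load = attack_gbps * s.share / total_share
        if not s.protected and load >= s.uplink_gbps:
            overloaded.append(s.name)
    return overloaded


sites = [
    Site("dc-full-1", uplink_gbps=400, share=0.4, protected=True),
    Site("pop-a", uplink_gbps=100, share=0.3, protected=False),
    Site("pop-b", uplink_gbps=40, share=0.2, protected=False),
    Site("pop-c", uplink_gbps=40, share=0.1, protected=False),
]

# A 250 Gbit/s attack: pop-a receives about 75, pop-b about 50, pop-c about 25.
print(sites_to_withdraw(sites, attack_gbps=250))
```

With these made-up numbers only pop-b exceeds its uplink capacity, so it is the candidate for withdrawing the prefix; its share would then move to the remaining sites, which this sketch does not model.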

And a small note on this example: if you deliver a large enough share of your traffic through IXs, this also reduces your exposure to such attacks.

BGP Setup


There are two topics here: connectivity and the BGP configuration itself.


We have already talked a bit about connectivity in part 1. The point is that traffic to your customers should take the best path. Optimality is not always about delay, but low latency is usually its main indicator. For some companies this matters more, for others less. It all depends on the service you provide.
Example 1

If you are an exchange and your customers care about time intervals of less than a millisecond, then, of course, the Internet is out of the question altogether.
Example 2

If you are a gaming company, and tens of milliseconds are important to you, then, of course, connectivity is very important to you.
Example 3

You also need to understand that, due to the properties of the TCP protocol, the data transfer rate within one TCP session also depends on RTT (Round Trip Time). CDN networks are also built to solve this problem, bringing the content distribution servers closer to the consumer of this content.
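A rough illustration of this dependence: a sender can have at most one window of unacknowledged data in flight per round trip, so the throughput of a single session is bounded by window / RTT. The sketch below assumes a 65 535-byte window (the classic maximum without window scaling) and ignores packet loss and slow start; the RTT values are illustrative.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP session's throughput: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


WINDOW = 65_535  # bytes; assumed window size, no window scaling

for rtt_ms in (10, 30, 80):
    mbps = max_tcp_throughput_mbps(WINDOW, rtt_ms / 1000)
    print(f"RTT {rtt_ms:>3} ms -> at most {mbps:.1f} Mbit/s")
```

Under these assumptions, halving the RTT doubles the ceiling for a single session, which is one reason content servers are moved closer to consumers.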
Studying connectivity is an interesting topic in its own right; it deserves a separate article or series of articles and requires a good understanding of how the Internet works.

Useful resources:

ripe.net
bgp.he.net
Example

I will give just one small example.

Suppose your data center is in Moscow and you have a single uplink, Rostelecom (AS12389). In this case (single-homed) you do not need BGP, and you most likely use addresses from Rostelecom's pool as your public addresses.

Suppose that you provide a certain service, that you have a considerable number of customers from Ukraine, and that they complain about high latency. While investigating, you found that the IP addresses of some of them are in the network 37.52.0.0/21.

Running traceroute, you saw that the traffic goes through AS1299 (Telia), and ping gave you an average RTT of 70 - 80 milliseconds. You can also see this on Rostelecom's looking glass.

With the whois utility (on ripe.net or a local utility) you can easily determine that the block 37.52.0.0/21 belongs to AS6849 (Ukrtelecom).
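Checking which of the complaining customers actually fall into this block is easy to script. Below is a minimal sketch using Python's standard ipaddress module; the customer addresses are made up for illustration.

```python
import ipaddress

PREFIX = ipaddress.ip_network("37.52.0.0/21")  # block from the example above


def in_prefix(addresses, prefix=PREFIX):
    """Return the addresses that fall inside the given prefix."""
    return [a for a in addresses if ipaddress.ip_address(a) in prefix]


# Hypothetical addresses of customers who complained about latency.
complaints = ["37.52.3.17", "37.52.7.200", "37.52.8.1", "95.31.10.5"]

print(in_prefix(complaints))
```

A /21 covers 37.52.0.0 through 37.52.7.255, so only the first two addresses match.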

Further, going to bgp.he.net you see that AS6849 has no relationship with AS12389 (they are neither clients nor uplinks to each other, nor do they have peering). But if you look at the list of peers for AS6849, then you will see, for example, AS29226 (Mastertel) and AS31133 (Megafon).

By finding these providers' looking glasses, you can compare the path and the RTT. For Mastertel, for example, the RTT will be about 30 milliseconds.

So, if the difference between 80 and 30 milliseconds is essential for your service, then perhaps you need to think about connectivity, get your AS number, your address pool in RIPE, and add additional uplinks and / or create points of presence on IXs.

With BGP you not only get the chance to improve connectivity, you also gain redundancy for your Internet connection.

This document provides guidelines for configuring BGP. Although these recommendations were developed from provider “best practice”, they are certainly useful (unless your BGP setup is quite elementary) and should in fact be part of the hardening we discussed in the first part.

DOS / DDOS protection


DOS / DDOS attacks have become an everyday reality for many companies. In fact, you are being attacked, in one form or another, quite often. The fact that you have not noticed it only means that no targeted attack has been organized against you yet, and that the protection you use, perhaps without even knowing it (various protections built into operating systems), is enough to keep the degradation of the service minimal for you and your customers.

There are Internet resources that, based on logs from equipment, draw beautiful attack maps in real time.

Here you can find links to them.

My favorite map is the one from Check Point.

DDOS / DOS protection is usually layered. To understand why, you need to know what types of DOS / DDOS attacks exist (see, for example, here or here).

That is, we have three types of attacks:

  • volumetric attacks
  • protocol attacks
  • application attacks


While you can defend against the last two types of attacks using, for example, firewalls, they will not protect you from attacks aimed at “overflowing” your uplinks (unless, of course, the total capacity of your Internet links is measured in terabits, and preferably well beyond a single terabit).

Therefore, the first line of defense is protection against “volumetric” attacks, and it is your provider or providers who must give you this protection. If you have not realized this yet, then so far you have simply been lucky.
Example

Suppose you have several uplinks, but only one of the providers can offer you this protection. But if all traffic goes through one provider, what happens to the connectivity we briefly discussed earlier?

During the attack you will indeed have to sacrifice some connectivity. But:

  • This is only for the duration of the attack. When an attack starts, you can manually or automatically reconfigure BGP so that traffic goes only through the provider that gives you the “umbrella”. After the attack ends, you can return routing to its previous state.
  • You do not have to move all traffic. If, for example, you see that no attack traffic arrives through some uplinks or peerings (or that it is insignificant), you can keep announcing prefixes with competitive attributes toward those BGP neighbors.
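The two points above can be sketched as a per-neighbor decision. The neighbor names, traffic figures and threshold are hypothetical, and “withdraw” here stands for whatever your routing policy does to stop attracting traffic through a neighbor (withdrawing the prefix or making the announcement less attractive, for example with AS path prepending); the actual router configuration is not shown.

```python
def announcement_plan(attack_mbps_by_neighbor, scrubbing_neighbor, threshold_mbps):
    """Decide, per BGP neighbor, whether to keep announcing prefixes during an attack.

    The neighbor that provides DDOS protection always keeps the announcement.
    Other neighbors keep it only while the attack traffic arriving through
    them stays below the threshold.
    """
    plan = {}
    for neighbor, attack_mbps in attack_mbps_by_neighbor.items():
        keep = neighbor == scrubbing_neighbor or attack_mbps < threshold_mbps
        plan[neighbor] = "announce" if keep else "withdraw"
    return plan


# Hypothetical measurements of attack traffic per neighbor, in Mbit/s.
observed = {"provider-a": 9000, "provider-b": 7000, "ix-peering": 20}

print(announcement_plan(observed, scrubbing_neighbor="provider-a", threshold_mbps=100))
```

In this made-up situation the peering on the IX keeps its announcement, so customers reached through it do not lose connectivity during the attack.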

You can also outsource protection against “protocol attacks” and “application attacks” to partners.
Here you can read a good study (translation). The article is two years old, but it will give you an idea of the approaches you can use to defend against DDOS attacks.

In principle, you could stop here and outsource your protection completely. This approach has its advantages, but there is also an obvious downside: depending, again, on what your company does, this may be a matter of business survival. And entrusting such things to third-party organizations ...

Therefore, let's consider how to organize the second and third lines of defense (as an addition to the protection from the provider).

So, the second line of defense is filtering and policers at the entrance to your network.
Example 1

Suppose you have covered yourself with a DDOS “umbrella” from one of your providers. Suppose this provider uses Arbor to filter traffic and applies filters at the edge of its network.

The bandwidth that Arbor can “process” is limited, and the provider, of course, cannot constantly pass the traffic of all customers who have ordered this service through the filtering equipment. Therefore, under normal conditions, traffic is not filtered.

Suppose a SYN flood attack begins. Even if you have ordered a service that automatically diverts traffic to filtering when an attack is detected, this does not happen instantly. For a minute or more you remain under attack, which can lead to equipment failure or service degradation. In this case, limiting traffic on the border router means that some TCP sessions will not be established during this time, but it will save your infrastructure from larger problems.
Example 2

An abnormally large number of SYN packets can be not only the result of a SYN flood attack. Let's assume that you provide a service in which you can have about 100 thousand TCP connections at the same time (in one data center).

Suppose that, because of a short-term problem with one of your main providers, half of the sessions are dropped. If your application is designed so that it immediately, “without thinking twice” (or after an interval that is the same for all sessions), tries to re-establish the connection, you will receive at least about 50 thousand SYN packets almost simultaneously.

If, for example, an SSL / TLS handshake, which involves a certificate exchange, has to run on top of these sessions, then in terms of resource exhaustion on your load balancer this is a much stronger “DDOS” than a simple SYN flood. One would expect load balancers to cope with such an event, but ... unfortunately, we have run into exactly this problem.

And, of course, a policer on the border router will save your equipment in this case too.
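What a policer does in both examples can be illustrated with a token bucket. This is only a sketch of the logic: real border routers implement policing in hardware and are configured through their own CLI, and the rate and burst values below are arbitrary assumptions.

```python
class TokenBucketPolicer:
    """Single-rate policer: packets beyond the rate plus burst are dropped."""

    def __init__(self, rate_pps: float, burst: int):
        self.rate_pps = rate_pps      # sustained packets per second
        self.burst = burst            # bucket depth in packets
        self.tokens = float(burst)    # start with a full bucket
        self.last = 0.0               # timestamp of the previous packet, seconds

    def allow(self, now: float) -> bool:
        # Refill tokens for the time elapsed since the previous packet.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# 50 000 SYN packets arrive at the same instant (t = 0),
# as in the reconnect storm from Example 2.
policer = TokenBucketPolicer(rate_pps=10_000, burst=5_000)
passed = sum(policer.allow(now=0.0) for _ in range(50_000))
print(passed)  # 5000: only the configured burst gets through
```

The remaining 45 000 connection attempts are dropped and have to be retried by the clients, which is the price paid for keeping the load balancers and firewalls behind the router alive.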
The third level of DDOS / DOS protection is the settings of your firewall.

Here you can stop attacks of both the second and the third type. In general, anything that reaches the firewall can be filtered here.
Advice

Try to give the firewall as little work as possible by filtering as much as you can at the first two lines of defense. Here is why.

Has it ever happened to you that, while generating traffic to test, for example, how resistant your servers' operating system is to DDOS attacks, you accidentally “killed” your firewall, loading it to 100 percent, with traffic of quite ordinary intensity? If not, perhaps it is only because you have not tried?

In general, a firewall, as I said, is a complex thing. It works well with known vulnerabilities and tested solutions, but if you send it something unusual, just some garbage or packets with malformed headers, then with some probability, and not such a small one in my experience, you can send even top-end equipment into a stupor. Therefore, at stage 2, using ordinary ACLs (at the L3 / L4 level), allow only the traffic that should enter your network.
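The idea of “allow only what should enter” amounts to a short permit list with an implicit deny at the end. Here is a minimal sketch of that matching logic; the prefixes come from the documentation range 203.0.113.0/24, and the permitted services are assumptions for illustration, not a recommended policy.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    protocol: str                 # "tcp" or "udp"
    dst: ipaddress.IPv4Network    # destination prefix inside your network
    dst_port: Optional[int]       # None matches any port


# Hypothetical allowlist: a public web service and one DNS server.
ACL = [
    Rule("tcp", ipaddress.ip_network("203.0.113.0/24"), 443),
    Rule("tcp", ipaddress.ip_network("203.0.113.0/24"), 80),
    Rule("udp", ipaddress.ip_network("203.0.113.53/32"), 53),
]


def permitted(protocol: str, dst_ip: str, dst_port: int, acl=ACL) -> bool:
    """Return True if a packet matches any permit rule; everything else is denied."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in acl:
        if (rule.protocol == protocol
                and addr in rule.dst
                and rule.dst_port in (None, dst_port)):
            return True
    return False  # implicit deny, as at the end of a router ACL


print(permitted("tcp", "203.0.113.10", 443))  # True
print(permitted("udp", "203.0.113.10", 123))  # False
```

Everything that fails this stateless check never reaches the firewall, which then only has to inspect traffic for services you actually publish.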

Firewall traffic filtering


We continue to talk about the firewall. Keep in mind that DOS / DDOS attacks are only one type of cyber attack.

In addition to DOS / DDOS protection, a firewall can offer something like the following list of features:


You decide what you need from this list.

To be continued

Source: https://habr.com/ru/post/436230/