Notes to Flow-Based Traffic Analysis System Design

CESNET technical report number 14/2004
also available in PDF, PostScript, and XML formats.

Tom Kosnar
7.12. 2004

1   Notes to Flow-Based Traffic Analysis System Design

1.1   Abstract

Traffic monitoring is a necessary and important part of contemporary network management and administration. Flow-based measurements are the major source of traffic information at the network level. Our object of interest is a high-speed wide area network with a non-trivial (possibly dynamically changing) topology that carries a wide spectrum and a significant amount of traffic. Network administrators and managers ask many essentially different questions in everyday practice. We have tried to design, and later build, an experimental multi-purpose flow-based monitoring system that should help to answer most of them. Although the architecture of flow-based monitoring systems is similar from a general point of view, networks differ from each other, user communities change their habits, and the purposes of measurement may also differ slightly. Sharing personal experience therefore makes sense in this area, and there are still many topics to be discussed over and over. So here are some of my notes in brief.

1.2   Design and Discussion Outlets

The system will be built with open-source software; both the abilities and the limitations of open-source tools were kept in mind. It is expected to run on conventional hardware. No specific hardware extensions should be needed, although a more robust I/O architecture may be required, which implies server-class systems. The design does not address flow record creation. Flow records or flow record streams are expected to be the input data of the system.

1.3   User requirements

Users feel that they know what information they need about what is going on - but once they can access it, they usually do not care about it unless they are in trouble. Then they discover that the information that would be useful differs from what they requested earlier. The next time they are in trouble, they discover that yet other information is needed. What is the result? Most of the available systems can hardly answer all user questions because they are based on pre-configured or even hard-coded aggregations, top lists and other statistics, while the essential information - the primary flow records - is thrown away too early. My experience after several years of monitoring traffic is:

These groups of requirements should be covered by the designed system.

[Figure]

Figure 1: Approximate estimated distribution of user requests (empirical evidence) and corresponding aggregation progress.

1.4   Accessing traffic information

The manner in which the primary information - the flow records - is stored must enable aggregated or non-aggregated selections according to arbitrary conditions over arbitrary subsets of flow fields, across time ranges of hours, with acceptable response time.

To analyze what exactly happened, it is almost always necessary (especially in the security area) to produce different aggregations while the selection condition remains the same; the subset of flow fields to be selected may remain the same too. Therefore the following strategy of information retrieval was chosen:

[Figure]

Figure 2: Two phase work style.
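
One possible reading of the two-phase work style can be sketched as follows: the selection condition is evaluated once and its result is kept, and any number of aggregations are then run over that stored selection. The table layout, column names and sample values below are assumptions made for illustration only; SQLite stands in for whatever relational storage is actually used.

```python
import sqlite3

# Hypothetical flow table; the column names are illustrative, not from the report.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE flows (ts INTEGER, src_ip TEXT, dst_ip TEXT, "
    "dst_port INTEGER, octets INTEGER)"
)
conn.executemany(
    "INSERT INTO flows VALUES (?, ?, ?, ?, ?)",
    [
        (100, "10.0.0.1", "192.0.2.1", 80, 1500),
        (110, "10.0.0.2", "192.0.2.1", 80, 500),
        (120, "10.0.0.1", "192.0.2.1", 443, 200),
        (130, "10.0.0.3", "198.51.100.7", 22, 900),
        (5000, "10.0.0.1", "192.0.2.1", 80, 700),  # outside the time range
    ],
)

# Phase 1: evaluate the (possibly expensive) selection condition once
# and keep the matching flow records in a working table.
conn.execute("CREATE TEMP TABLE selected AS SELECT * FROM flows WHERE 0")
conn.execute(
    "INSERT INTO selected SELECT * FROM flows "
    "WHERE dst_ip = ? AND ts BETWEEN ? AND ?",
    ("192.0.2.1", 0, 3600),
)

# Phase 2: run different aggregations over the same selection.
by_source = conn.execute(
    "SELECT src_ip, SUM(octets) AS bytes FROM selected "
    "GROUP BY src_ip ORDER BY bytes DESC"
).fetchall()
by_port = conn.execute(
    "SELECT dst_port, SUM(octets) AS bytes FROM selected "
    "GROUP BY dst_port ORDER BY bytes DESC"
).fetchall()
```

The second phase never touches the full flow table again, so changing the grouping costs only a scan of the (usually much smaller) selection.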

1.5   Where to collect flows from?

Primary flow record sources are usually active network devices (routers) or passive probes listening on specific lines or connected to interfaces with mirrored traffic. Under some conditions a per-source analysis of flow records is necessary. For instance, source address spoofing is characteristic of various "favorite" DoS and DDoS attacks. Networks for which source address policy routing is not or cannot be set up emit traffic for which address-based analysis gives wrong results. The only option is to follow the interface identifiers in the flow records and carry out the analysis step by step - from one flow record source to the next. On the other hand, things like specific summary analysis or accounting should be done over several flow record sources, which are then understood more or less (depending on real conditions, network architecture, purposes and so on) as a single source. What does this mean for the system design, its architecture and the monitoring strategy? In general it is highly recommended to run flow measurement and export on all devices acting as AS gateways and backbone gateways (if you are a WAN operator, of course) - briefly, on all gates to and from your infrastructure. If that is not possible or leads to an extreme number of sources (unacceptable costs), try to find an optimal borderline in your topology between your infrastructure core and its real border (even a single flow source may be sufficient in specific network architectures - some campus networks, for example). In any case, the system itself must be able to behave and to be configured both as flow-source based and as network-cloud oriented - both styles at the same time.

[Figure]

Figure 3: Possible area of flow measurement.

So we can expect 1 to N permanent primary flow record sources. Their exported flow records will be the essential information base for the traffic analysis. Some post-processing (discussed later) is also expected in this case. Besides that, we must admit additional flow sources from time to time (targeted attacks, tuning). The number and purpose of flow sources will probably vary over time. Therefore the system will be designed as distributed and scalable. This helps to overcome a lack of resources and computing power and also helps to keep the running system in a clear logical architecture. On the other hand, system management and configuration may become confusing when the architecture is distributed, especially when high flexibility is expected. Therefore centralized configuration and interactive management via a dedicated, web-delivered user interface was the final decision.

1.6   Flow sampling

With regard to the distributed architecture and an acceptable load on each component, the absolute flow rate from a single source has to be taken into account. It depends on the traffic structure, the traffic volume and a variety of parameters on the network devices or probes. The basic parameters define how long created flows live in memory (usually the number of flows to keep in memory, the time to live for active and inactive flows, and similar implementation-specific ones) and the percentage of IP datagrams used for flow generation - the sampling mechanism. The sampling mechanism has the power to keep the rate of exported flow records within an acceptable range. On network devices, sampling is understood at the IP datagram layer. In current implementations it is provided mostly as 1 from N with a fixed offset, or as 1 from N on average with a random offset within a defined range. Both styles generally make it possible to estimate the real amount of traffic, but unfortunately they cannot guarantee an acceptable flow rate under specific conditions (DDoS and DoS attacks, smurfs, port scans, for instance). Adaptive sampling could ensure it, but implementations are lacking, and you would lose the extrapolation parameter needed to estimate the real traffic amount unless the extrapolated values (octets, packets) or a weighted average of the sampling values became part of the exported data. I decided to incorporate sampling at the flow layer into the design. This option can help to keep the corresponding flow collector in a stable state in any situation. Setting more aggressive parameters for flow export improves the quality of the extrapolated values (sensitivity to the sampling offset is not so high when it is fixed), but we cannot forget that, unlike IP datagram sampling, this is sampling of aggregates.

[Figure]

Figure 4: IP datagram and flow record sampling difference.
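
The two sampling styles and the extrapolation step can be illustrated with a minimal sketch. The function names and the sample data are hypothetical, and the random variant assumes that "random offset within a defined range" means one randomly chosen item per window of N.

```python
import random

def sample_fixed(items, n, offset=0):
    """1 from N with a fixed offset: keep the item at position `offset` of every N."""
    return [item for i, item in enumerate(items) if i % n == offset]

def sample_random(items, n, rng):
    """1 from N on average: keep one randomly chosen item from each window of N."""
    kept = []
    for start in range(0, len(items), n):
        window = items[start:start + n]
        kept.append(rng.choice(window))
    return kept

def extrapolate(sampled_octets, n):
    """Estimate the real traffic volume by scaling with the sampling rate N."""
    return sum(sampled_octets) * n

# IP datagram layer: 1000 hypothetical datagrams of 100 octets each.
packets = [100] * 1000
fixed = sample_fixed(packets, 10)
rand = sample_random(packets, 10, random.Random(1))

# Flow layer: the sampled units are aggregates of very different size,
# so the estimate depends strongly on which records happen to be kept.
flows = [1000, 10, 1000, 10]  # octets per flow record, true total 2020
estimate_offset_0 = extrapolate(sample_fixed(flows, 2, offset=0), 2)
estimate_offset_1 = extrapolate(sample_fixed(flows, 2, offset=1), 2)
```

With uniform datagrams both styles recover the true volume exactly, while the flow-level example deliberately uses an extreme size mix to show why sampling of aggregates needs more care.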

1.7   Classification and Filtering

We accepted that for some specific purposes flow records from several sources may be treated as if they came from a single source. What are these specific purposes? It depends, but their typical common attribute is that they target some selected part of the traffic, which can later be analyzed according to the user requests described above. There are basically two major mechanisms in use - filtering and extraction from ordered short-term aggregations (top lists). From the time point of view this means persistent behavior while processing incoming flows and, optionally, scheduled actions to provide longer-term statistics - so-called flow post-processing.

Persistent filtering of incoming flows should separate information about traffic we know we will be interested in, so that we do not spend resources and time selecting it from all stored data again and again. At the flow record level it is more accurate to say that we primarily clone records and optionally store them in separate data sets - we insist on keeping the possibility to store the flows from each source one by one. The conditions may be complex, and we have to assume that several conditions may be defined to select data for one purpose. Some conditions may belong to many filter definitions. In such a case another mechanism can be useful - classification. The flow record format may be extended, and a condition match may result in filling in the appropriate additional field. This mechanism is suitable for things like accounting because it translates technical identifiers into an independent, administratively defined logical system. It seems that there should be separate classification fields for both sides of the transmission represented by the flow record data. Each set of classification fields may be considered a hierarchy (accounting, traffic structuring). In summary: flow records that match conditions may first be classified. Extended flow records that match conditions (including those on the added flow record fields) should be cloned. Their further processing is the same as for flows coming from primary sources, but with the knowledge that they belong to the corresponding filter definition. Optional accounting is a special case of this - a classified flow record is processed according to the validity of the corresponding classification fields.

[Figure]

Figure 5: Incoming flows classification and filtering, cloning record.
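
The classify-then-clone sequence summarized above might look roughly like the following sketch. The record layout, the prefix-to-label table and the filter definitions are all hypothetical; the real system may use a different record format and condition language.

```python
from dataclasses import dataclass, replace
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass(frozen=True)
class Flow:
    src_ip: str
    dst_ip: str
    octets: int
    src_class: Optional[str] = None  # classification field, source side
    dst_class: Optional[str] = None  # classification field, destination side
    filter_id: Optional[str] = None  # set on clones only

# Hypothetical classification table: address prefix -> administrative label.
CLASSES = {
    ip_network("10.1.0.0/16"): "org-a",
    ip_network("10.2.0.0/16"): "org-b",
}

def classify_address(address):
    ip = ip_address(address)
    for network, label in CLASSES.items():
        if ip in network:
            return label
    return None

def classify(flow):
    """Fill the classification fields for both sides of the transmission."""
    return replace(
        flow,
        src_class=classify_address(flow.src_ip),
        dst_class=classify_address(flow.dst_ip),
    )

# Hypothetical filter definitions: identifier -> condition over the extended record.
FILTERS = {
    "from-org-a": lambda f: f.src_class == "org-a",
    "big-flows": lambda f: f.octets >= 1_000_000,
}

def process(flow):
    """Classify first, then clone the extended record once per matching filter."""
    extended = classify(flow)
    clones = [
        replace(extended, filter_id=filter_id)
        for filter_id, condition in FILTERS.items()
        if condition(extended)
    ]
    return extended, clones

extended, clones = process(Flow("10.1.2.3", "192.0.2.1", 2_000_000))
```

The primary record is returned untouched by the filters, so per-source storage stays possible, while each clone carries the identifier of the filter definition it belongs to.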

1.8   Short-term Aggregation

Persistent short-term aggregation of incoming flows may itself be useful to reduce the number of flow records to process when we do not want to use all flow fields for further processing and/or do not insist on exact time information. Persistent short-term aggregation with ordering is a technique used, for example, to discover possible sources of DoS attacks immediately (hand in hand with flow sampling, of course) - it is not so meaningful for the system we are trying to design.

[Figure]

Figure 6: Incoming flows aggregation using flow record field mask.
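
Aggregation with a field mask can be sketched as follows: only the fields named in the mask form the grouping key, and the counters of all records sharing that key are summed. The field names and sample records are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical key fields of a flow record, in a fixed order.
FIELDS = ("src_ip", "dst_ip", "proto", "src_port", "dst_port")

def aggregate(flows, mask):
    """Sum octets and packets per combination of the fields kept by the mask."""
    totals = defaultdict(lambda: [0, 0])
    for flow in flows:
        key = tuple(flow[name] for name in FIELDS if name in mask)
        totals[key][0] += flow["octets"]
        totals[key][1] += flow["packets"]
    return {key: tuple(counters) for key, counters in totals.items()}

flows = [
    {"src_ip": "10.0.0.1", "dst_ip": "192.0.2.1", "proto": 6,
     "src_port": 40000, "dst_port": 80, "octets": 1500, "packets": 3},
    {"src_ip": "10.0.0.1", "dst_ip": "192.0.2.1", "proto": 6,
     "src_port": 40001, "dst_port": 80, "octets": 500, "packets": 1},
    {"src_ip": "10.0.0.2", "dst_ip": "192.0.2.1", "proto": 6,
     "src_port": 40002, "dst_port": 80, "octets": 100, "packets": 1},
]

# Masking out the source port merges the first two records into one.
aggregated = aggregate(flows, {"src_ip", "dst_ip", "proto", "dst_port"})
```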

1.9   Flow Post-processing

Flow post-processing should keep the essential information about traffic, as requested in the particular case, for a long time. Post-processing aggregations usually rely on traffic information stored as the result of primary flow processing, and that is the approach we will follow. This implies that operations like on-the-fly classification and filtering are replaced with advanced data retrieval from storage. Typical aggregations consist of selecting data according to conditions, grouping, ordering and storing the most significant parts.

[Figure]

Figure 7: Flow processing and post-processing.

1.10   Specific Aggregations

Regardless of the implementation, it seems useful to take into account (while working on the system design) two similar aggregation techniques that give better information value. The first one is based on sequential masking of fields in a pre-aggregated ordered list of records and on-the-fly secondary aggregation of the result. The following example may explain this better. Let us assume we want to create top lists of the most significant data sources for a one-hour range from stored flow records. A standard solution for retrieving the data, for example in the SQL world, may look like this:

select SOURCE_IP_ADDRESS, sum(OCTETS) as BYTES ...
from SOME_TABLE 
where ...some..condition... 
group by SOURCE_IP_ADDRESS 
order by BYTES desc ...

If a limit were set, we could store such a result directly. But especially for short-range aggregations it may be very interesting to have the destination address, protocol and source port number for, let us say, the top N, at least the protocol and source port number for the next M, and the protocol for the next X. The next Y will contain the source IP address only, and the rest can be an anonymous byte count. Following the SQL example, the query would now be:

select SOURCE_IP_ADDRESS, PROTO, SOURCE_PORT, DESTINATION_IP_ADDRESS, sum(OCTETS) as BYTES ...
from SOME_TABLE
where ...some..condition...
group by SOURCE_IP_ADDRESS, PROTO, SOURCE_PORT, DESTINATION_IP_ADDRESS
order by BYTES desc ...

This creates the pre-aggregated list, grouped over all the fields involved. The secondary aggregation has to take care of records deep in the source list that may belong to the top records of the result after some record fields are masked. We will use M=N=X=2 to keep the example and the corresponding diagram clear.

[Figure]

Figure 8: Post-processing aggregation example.

We must walk through the whole pre-aggregated list step by step - but if we want to keep traffic summaries, we have to do so anyway. No record is counted twice or more, and the information value is higher. In real conditions parameters like M, N and X should not be hard-coded but configurable for each post-processing train.
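
A minimal sketch of one way to implement the sequential masking is given below. It assumes that at each level the remaining records are re-aggregated over a shorter key, the top groups are stored, and only the leftover records move on to the next level; the key order (source address, protocol, source port, destination address) is chosen so that masking is a simple prefix cut. The sample data and the value Y=2 are hypothetical.

```python
from collections import defaultdict

def sequential_masking(records, levels):
    """records: (key_tuple, octets) pairs from the pre-aggregated list.
    levels: (key_length, count) pairs, from most to least detailed.
    Returns the stored groups and the anonymous byte count of the rest."""
    remaining = list(records)
    result = []
    for key_length, count in levels:
        # Secondary aggregation over the masked (shortened) key.
        groups = defaultdict(int)
        for key, octets in remaining:
            groups[key[:key_length]] += octets
        ranked = sorted(groups.items(), key=lambda item: item[1], reverse=True)
        kept = dict(ranked[:count])
        result.extend(ranked[:count])
        # Records absorbed by a stored group must not be counted again.
        remaining = [(k, o) for k, o in remaining if k[:key_length] not in kept]
    anonymous = sum(octets for _, octets in remaining)
    return result, anonymous

# Pre-aggregated list, keyed by (source IP, protocol, source port, destination IP).
records = [
    (("A", 6, 80, "X"), 500),
    (("B", 6, 80, "Y"), 400),
    (("A", 6, 80, "Z"), 300),
    (("C", 17, 53, "X"), 250),
    (("C", 17, 53, "Y"), 200),
    (("D", 6, 22, "X"), 100),
    (("D", 6, 443, "Y"), 90),
    (("E", 1, 0, "X"), 50),
    (("F", 6, 25, "X"), 10),
    (("G", 6, 25, "X"), 5),
    (("H", 6, 25, "X"), 3),
]

# N = M = X = 2 as in the text; Y = 2 is an additional assumption.
levels = [(4, 2), (3, 2), (2, 2), (1, 2)]
result, anonymous = sequential_masking(records, levels)
```

In this data set source C never reaches the top two with a single record, but after the destination address is masked its two records together outrank the leftover record of A, which is exactly the case the secondary aggregation has to handle.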

The second technique is similar - it only adds secondary grouping. Let us assume that we add a classification field to the previous example - "organization ID". The number of such IDs is relatively low. It would be nice to get a global top list similar to the one in the previous example (including the "organization ID" field), followed by smaller top lists - one for each distinct "organization ID" found in the pre-aggregated results. That ensures that we keep information for each "organization ID" - even for one whose traffic was very low - and that such traffic does not end up as anonymous bytes only. In general it means that we can provide controlled sub-aggregations for each distinct combination of selected flow record fields.
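
The secondary grouping can be sketched as a global top list plus one small top list per distinct value of the grouping field. The field names, limits and sample records are hypothetical.

```python
from collections import defaultdict

def top_n(records, limit):
    """Return the `limit` records with the highest byte count."""
    return sorted(records, key=lambda r: r["bytes"], reverse=True)[:limit]

def grouped_top_lists(records, group_field, global_limit, group_limit):
    """Global top list plus a smaller top list for each distinct group value."""
    by_group = defaultdict(list)
    for record in records:
        by_group[record[group_field]].append(record)
    per_group = {g: top_n(rs, group_limit) for g, rs in by_group.items()}
    return top_n(records, global_limit), per_group

records = [
    {"org": "A", "src_ip": "10.1.0.1", "bytes": 900},
    {"org": "A", "src_ip": "10.1.0.2", "bytes": 800},
    {"org": "A", "src_ip": "10.1.0.3", "bytes": 700},
    {"org": "B", "src_ip": "10.2.0.1", "bytes": 5},
]
global_top, per_org = grouped_top_lists(records, "org", 2, 2)
```

Organization B does not appear in the global top list at all, yet its own top list still keeps its single low-volume record instead of letting it dissolve into an anonymous remainder.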

1.11   Flow Processing Summary

After discussing these topics, a summarized view of the flow processing part is needed. The input flow processing architecture is extended with one module that has not been mentioned yet - replication. Its function is obvious - it resends incoming flow record datagrams one by one to other flow receivers. The flow record format is extended in the parsing module. Regardless of classification, we would also like to keep information about the current sampling rate and the flow source (an internal identifier) in each record. This is desirable when a mixture of flows originating from many sources is processed by a common mechanism.

[Figure]

Figure 9: Processing incoming flows from primary flow source.
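
The replication step and the record extension done in the parsing module are simple enough to sketch in a few lines. The function names and the dictionary-based record are assumptions; `sock` is expected to be an already created UDP socket (for example `socket.socket(socket.AF_INET, socket.SOCK_DGRAM)`).

```python
def replicate(datagram, receivers, sock):
    """Resend one raw flow export datagram, unchanged, to every other receiver."""
    for address in receivers:
        sock.sendto(datagram, address)

def extend_record(record, source_id, sampling_rate):
    """Return a copy of a parsed flow record with the bookkeeping fields added."""
    extended = dict(record)
    extended["source_id"] = source_id          # internal flow source identifier
    extended["sampling_rate"] = sampling_rate  # sampling rate valid for this record
    return extended
```

Carrying the source identifier and the sampling rate in every record is what lets a common mechanism extrapolate correctly when flows from differently sampled sources are mixed.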

What was not described above is how the flow records matching some filter definition will be processed ("Matching Flow Records Processing" in the diagram). How are they specific? Their format was extended with additional fields, and they are bound to the identifiers of the corresponding filter definitions. When flow processing devices are distributed, each flow source may export its flows to a separate collector. But any filter definition may need to be applied on several collectors, and we want to keep the result in a single data set somewhere in the system. That means we would have to transport some data anyway. Why not do it immediately after cloning the flow record when it matches at least one filter definition? With a caching mechanism that waits a while for additional records to export, this may be very efficient. Flow records exported by the filtering module may be processed in the same way as the primary ones, and when data sets are created in the storage module, the filter definition identifier may have the same meaning as the flow source identifier had before.

[Figure]

Figure 10: Processing flows exported by the filtering module.
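
The caching mechanism that waits a while for additional records before exporting can be sketched as a small batching buffer. The class name, the limits and the `send` callback are hypothetical; in this simplified form the age of the batch is only checked when a new record arrives, so a real collector would also need a timer to flush idle batches.

```python
import time

class BatchExporter:
    """Collect cloned flow records and export them in batches."""

    def __init__(self, send, max_records=30, max_delay=1.0, clock=time.monotonic):
        self._send = send                # callable that transports one batch
        self._max_records = max_records  # flush when this many records wait
        self._max_delay = max_delay      # or when the oldest waits this long (s)
        self._clock = clock
        self._buffer = []
        self._first_added = None

    def add(self, record):
        if not self._buffer:
            self._first_added = self._clock()
        self._buffer.append(record)
        full = len(self._buffer) >= self._max_records
        stale = self._clock() - self._first_added >= self._max_delay
        if full or stale:
            self.flush()

    def flush(self):
        """Export whatever is buffered; do nothing if the buffer is empty."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()
```

A collector would typically create one exporter per target data set and call `flush()` once more on shutdown so that no matching records are lost.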

The diagram differs significantly from what was said before. After implementing and testing experimental parts of the system, I decided to remove the classification and filtering modules from the processing of previously filtered flows. There are basically two reasons. First, practice shows that almost all requests we thought would need two-phase (or even multi-phase) filtering can be solved by single-step filtering with optional classification support. The second reason is technical: I could not find any simple way (configuration structure) to tell the system that multiple loops inside the generic distributed filtering mechanism may be required in some cases and not in others.

1.12   Flow Storage

Last, but not least. The storage architecture is the deciding factor for system flexibility, scalability and extensibility, especially when the system should be distributed. We decided above that we would like to be able to view a once-selected set of flow records using different flow field subsets in aggregations, and to run the primary selection in the same way too. We also expect to use rather complicated conditions to retrieve stored data while post-processing flow records. This leads to a relational database (with other reasons in mind too). Primary flow storage should be source based - each source and each filter definition should have a separate data set (we must be able to identify it), regardless of how many tables it consists of. When flow records are stored on a per-source basis, the most important thing is the time granularity of a single unit of the data set - the table. Contemporary routers may export flow records at peak rates of around 200,000-300,000 per second. On the other hand, some specific filter definition may match one flow per minute. The difference is obvious. It implies "tunable" parameters - configurable for each source and changeable at any time during operation as needed - this is the only way to achieve efficient data storage and later retrieval.

[Figure]

Figure 11: Input processing - time granularity of data tables example.
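
The per-source time granularity can be illustrated by a function that maps a record timestamp to the table it belongs to. The naming scheme, the source identifiers and the granularity values are assumptions made for this sketch only.

```python
from datetime import datetime, timezone

def table_name(source_id, timestamp, granularity_s):
    """Name of the table holding records of one source for one time slot.

    timestamp is in seconds since the epoch; granularity_s is the length
    of a single table's time slot in seconds, configurable per source."""
    slot_start = timestamp - timestamp % granularity_s
    stamp = datetime.fromtimestamp(slot_start, tz=timezone.utc)
    return f"flows_{source_id}_{stamp:%Y%m%d_%H%M%S}"

# Hypothetical per-source settings: a busy router gets 5-minute tables,
# a rarely matching filter definition gets one table per day.
GRANULARITY = {"router1": 300, "filter42": 86400}

ts = int(datetime(2004, 12, 7, 10, 17, 30, tzinfo=timezone.utc).timestamp())
busy_table = table_name("router1", ts, GRANULARITY["router1"])
quiet_table = table_name("filter42", ts, GRANULARITY["filter42"])
```

Because the slot is derived from the timestamp alone, changing the granularity of a source during operation only affects which table new records go to; tables already written keep their names.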

In general we would like to offer two types of data for retrieval from the time point of view. The first type should consist of flow records that are aggregated as little as possible, either from primary sources or resulting from filtering. In this case the time granularity should be set according to the real flow rate. The second type should contain aggregated data that keep the essential information (per purpose) for a long time. In this case the time granularity should be set according to the purpose - the time ranges and record counts of the tables on one side and the typical time range of user requests on the other side are the parameters that should be considered. The number of aggregation steps may also matter - in a generic architecture aggregation may be recursive. In our case we expect to provide a single aggregation step while post-processing data, with separately configurable aggregations based on the techniques mentioned above.

[Figure]

Figure 12: Post-processing - generic aggregation example.
