<?xml version="1.0" encoding="utf-8"?>
<!-- $Id: netflow.xml,v 1.2 2005/12/19 10:02:36 celeda Exp $ -->
<!DOCTYPE zprava SYSTEM "techrep.dtd">

<zprava cislo="33/2005" jazyk="en">
<nazev>Software for NetFlow Monitoring Adapter</nazev>
<autor>Pavel Čeleda, Milan Kováčik, Radek Krejčí, Jaroslav Kysela, Petr Špringl</autor>
<datum>December 20, 2005</datum>

<h1>Abstract</h1>

  <p>The objective of this technical report is to describe the software
    activities and tools developed during the work on the NetFlow probe. The
    development process follows hardware/software co-design principles: we
    use the hardware to speed up the time-critical parts of IP flow
    monitoring. The software addresses current problems in the generation
    and export of NetFlow v9 datagrams.</p>

  <!-- ================================================================ -->

<h1>Introduction</h1>

  <p>The development started as part of the GN2 project with the objective of
    creating an autonomous network monitoring probe capable of generating data
    in the NetFlow version 9 format. At the beginning the work was hardware
    oriented. The COMBO6 hardware accelerator with an interface card
    (COMBO-4MTX or COMBO-4SFP) forms the basis of our hardware <cite
    href="CoHW"/>. The hardware behaves like a programmable NIC (<i>Network
    Interface Card</i>). The program for the NIC is written in VHDL
    (<i>VHSIC Hardware Description Language</i>) and we call it firmware.
    Firmware is software that is embedded in a hardware device. The dividing
    line between the hardware and software worlds is the communication
    interface (PCI bus), which provides data to the upper software layers in
    the PC.</p>

   <p>The hardware description of the NetFlow probe can be found in <cite
     href="Zad04"/>, <cite href="Zad05"/>. This technical report focuses on the
     software development done during 2005, which accompanied the hardware
     development. The main aim was to support reading of flow records from
     the probe and the generation of NetFlow v9 datagrams <cite
     href="RFC3954"/>. NetFlow data are typically acquired and exported by IP
     routers. <a href="#fig1">Figure 1</a> shows an installation overview of
     the hardware-accelerated NetFlow probe which we use for IP flow
     monitoring.</p>

   <obr id="fig1" src="overview">NetFlow probe overview</obr>

   <p>The NetFlow probe works as a T-splitter and is fully transparent to
     network traffic. IP flows are monitored by the NIC firmware, preprocessed
     and forwarded to the upper software layers. The user-space application
     <prikaz>flowexporter(1)</prikaz> reads data from the hardware, produces
     UDP NetFlow v9 datagrams and sends them to a collector.</p>

  <!-- ================================================================ -->

<h1>NetFlow Linux driver for COMBO6</h1>

   <p>The NetFlow Linux driver for COMBO6 supports the latest 2.4 and 2.6 Linux
     kernels. The driver sources are configured standalone, but a configured
     Linux kernel source tree must be present for a successful compilation.
     A configuration script is provided to make the driver configuration
     easy.</p>

     <obr id="fig2" src="sw_layers">NetFlow probe software layers</obr>

   <p>The NetFlow driver defines special <i>ioctl()</i> calls which manage
     device enumeration, subscription and the ring buffer. The ring buffer
     contains the NetFlow records and the queues for all connected
     applications. The contents of the NetFlow records are shared among all
     applications (with read-only access) and mapped into user space using
     the <i>mmap()</i> syscall. The record descriptors (containing the status
     and a pointer to the contents) are also available through the
     <i>mmap()</i> syscall. All these data are read-only.</p>

   <p>The data are accessed through <i>mmap()</i> rather than the standard
     <i>read()</i> syscall in order to eliminate an extra copy. This method is
     called zero-copy, because applications have direct access to the data
     held by the driver. Because modern CPUs have an MMU (<i>Memory Management
     Unit</i>), these memory pages are protected (read-only access for
     applications). The driver only maintains the queues of waiting NetFlow
     records for the applications. In addition, placing the data arbiter code
     in the kernel makes the data flow robust: if one application crashes,
     the other applications are not affected.</p>
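
   <p>The zero-copy idea can be tried without the probe. The following is a
     minimal sketch, not the driver's interface: a temporary file stands in
     for the ring buffer, and <i>struct demo_record</i>,
     <i>make_demo_buffer()</i>, <i>map_records()</i> and
     <i>total_bytes()</i> are hypothetical names introduced only for this
     illustration. It shows records being read in place through a read-only
     <i>mmap()</i> mapping instead of being copied with <i>read()</i>.</p>

<pre><![CDATA[
```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical fixed-size record; the real layout is defined by the firmware. */
struct demo_record {
    uint32_t packets;
    uint32_t bytes;
};

/* Stand-in for the driver: fill a temporary file with `count` records and
 * return a descriptor for it, or -1 on failure. */
int make_demo_buffer(size_t count)
{
    FILE *f = tmpfile();
    int fd = -1;

    if (f == NULL)
        return -1;
    for (size_t i = 0; i < count; i++) {
        struct demo_record r = { (uint32_t)(i + 1), (uint32_t)(64 * (i + 1)) };
        if (fwrite(&r, sizeof r, 1, f) != 1) {
            fclose(f);
            return -1;
        }
    }
    if (fflush(f) == 0)
        fd = dup(fileno(f));    /* the duplicate keeps the file alive */
    fclose(f);
    return fd;
}

/* Map `count` records read-only; returns NULL on failure. */
const struct demo_record *map_records(int fd, size_t count)
{
    assert(count > 0);
    void *p = mmap(NULL, count * sizeof(struct demo_record),
                   PROT_READ, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Read the records in place: no copy into an application buffer. */
uint64_t total_bytes(const struct demo_record *recs, size_t count)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < count; i++)
        sum += recs[i].bytes;
    return sum;
}

/* Release the mapping created by map_records(). */
void unmap_records(const struct demo_record *recs, size_t count)
{
    munmap((void *)recs, count * sizeof(struct demo_record));
}
```
]]></pre>

   <p>Because the mapping is created with <i>PROT_READ</i> only, a write
     through the returned pointer would fault, which is the kind of
     protection the MMU provides in the paragraph above.</p>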

  <p>The ring buffer is allocated when the driver is loaded. The ring buffer
    size can be specified via a kernel module parameter or a kernel command
    line parameter at boot. Another mechanism (e.g. the proc filesystem)
    could later be used to modify the ring buffer size at runtime.</p>

  <!-- ................................................................ -->

    <h2>Data flow control (from the application view)</h2>

      <p>An application starts and stops data capture using a dedicated
        ioctl. Note that other applications might still be running after a
        stop, so packets in the ring buffer can still be overwritten even
        after the application stops capturing. The real hardware stop occurs
        when all applications are disconnected from the driver. A lock-next
        ioctl is used to obtain the first or the next packet from the FIFO
        (<i>First In First Out</i>) list maintained for the given
        application. At the end of communication, the remaining lock should
        be released with an unlock ioctl. The lock-next ioctl unlocks the
        current packet automatically, so there is no need to call a
        lock-next, unlock sequence every time; the standard use is: open,
        subscribe, lock-next, lock-next, &ldots;, unlock, close. Access to
        unlocked areas is permitted, but not very useful, because the driver
        or hardware might overwrite the data at any time.</p>
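
      <p>The lock-next and unlock semantics described above can be modelled
        in a few lines. This is a toy model under stated assumptions, not the
        driver code: <i>struct app_queue</i>, <i>QUEUE_LEN</i> and the
        <i>queue_*()</i> functions are hypothetical names, and the real
        ioctl numbers and argument structures are not given in this
        report.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8       /* assumed capacity of the toy queue */
#define NO_RECORD (-1)    /* "nothing locked" / "queue empty" marker */

/* Per-application view: a FIFO of ring-buffer indices plus at most one lock. */
struct app_queue {
    int fifo[QUEUE_LEN];
    size_t head;          /* position of the next record to hand out */
    size_t count;         /* records queued but not yet handed out */
    int locked;           /* currently locked record, or NO_RECORD */
};

void queue_init(struct app_queue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    q->locked = NO_RECORD;
}

/* Driver side: queue a newly filled record; returns 0, or -1 if full. */
int queue_push(struct app_queue *q, int record)
{
    if (q->count == QUEUE_LEN)
        return -1;
    q->fifo[(q->head + q->count) % QUEUE_LEN] = record;
    q->count++;
    return 0;
}

/* lock-next: implicitly release the current lock, then lock the next record
 * in FIFO order. Returns its index, or NO_RECORD when the queue is empty. */
int queue_lock_next(struct app_queue *q)
{
    q->locked = NO_RECORD;
    if (q->count == 0)
        return NO_RECORD;
    q->locked = q->fifo[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->count--;
    return q->locked;
}

/* unlock: release the remaining lock at the end of communication. */
void queue_unlock(struct app_queue *q)
{
    q->locked = NO_RECORD;
}
```
]]></pre>

      <p>In this model only one explicit <i>queue_unlock()</i> is needed, at
        the very end, which mirrors the open, subscribe, lock-next, &ldots;,
        unlock, close sequence.</p>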

  <!-- ................................................................ -->

    <h2>Library libcsflow</h2>

       <p>The libcsflow library was designed as a middle layer between NetFlow
         applications and the driver. It hides the specific syscalls and
         provides a standard interface in the C language. A further
         improvement will be to move more common code, such as NetFlow record
         parsing, from the NetFlow applications into this library.</p>

     <h3>The C interface</h3>

     <pre>
/*
 * Device handler management
 */

int     csflow_open(csflow_device_t **dev, const int card, const int interface);
int     csflow_close(csflow_device_t **dev);
int     csflow_fd(csflow_device_t *dev, int *fd, short *events);

/*
 * Start / stop functions
 */

int     csflow_start(csflow_device_t *dev);
int     csflow_stop(csflow_device_t *dev);

/*
 * Flow record / timestamp management
 */

int     csflow_bufinfo(csflow_device_t *dev, u_int32_t *entries, u_int32_t *entry_size);
int     csflow_lock(csflow_device_t *dev, u_int32_t count, u_int32_t *locked);
int     csflow_getptr(csflow_device_t *dev, u_int32_t idx, void **ptr);
u_int64_t csflow_tstamp32_uptime(csflow_device_t *dev, u_int32_t tstamp32);
u_int64_t csflow_get_init_uptime(csflow_device_t *dev);
u_int64_t csflow_get_current_uptime(csflow_device_t *dev);
       </pre>
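
     <p>A possible calling sequence for this interface is sketched below.
       The report gives only the prototypes, so several things here are
       assumptions: that the functions return 0 on success, that
       <i>csflow_lock()</i> locks up to <i>count</i> records and reports how
       many it locked, and that <i>csflow_getptr()</i> takes an index within
       the locked range. To keep the sketch compilable without the library,
       the <i>csflow_*</i> functions are replaced by mock stand-ins that
       serve three fake records; <i>drain_records()</i> is a hypothetical
       helper.</p>

<pre><![CDATA[
```c
#include <stdlib.h>
#include <sys/types.h>

/* --- Mock stand-ins, NOT the real libcsflow; they only mimic the prototypes. --- */
#define MOCK_COUNT 3
typedef struct csflow_device {
    u_int32_t next;     /* first record not yet handed out */
    u_int32_t base;     /* first record of the current lock */
    u_int32_t locked;   /* number of records in the current lock */
} csflow_device_t;
static u_int32_t mock_records[MOCK_COUNT] = { 100, 200, 300 };

int csflow_open(csflow_device_t **dev, const int card, const int interface)
{
    (void)card; (void)interface;
    *dev = calloc(1, sizeof **dev);
    return *dev != NULL ? 0 : -1;
}
int csflow_close(csflow_device_t **dev) { free(*dev); *dev = NULL; return 0; }
int csflow_start(csflow_device_t *dev) { (void)dev; return 0; }
int csflow_stop(csflow_device_t *dev) { (void)dev; return 0; }
int csflow_lock(csflow_device_t *dev, u_int32_t count, u_int32_t *locked)
{
    u_int32_t left = MOCK_COUNT - dev->next;
    dev->base = dev->next;
    dev->locked = count < left ? count : left;
    dev->next += dev->locked;
    *locked = dev->locked;
    return 0;
}
int csflow_getptr(csflow_device_t *dev, u_int32_t idx, void **ptr)
{
    if (idx >= dev->locked)
        return -1;
    *ptr = &mock_records[dev->base + idx];
    return 0;
}

/* --- Usage sketch: open, start, lock and read records, stop, close. --- */
/* Returns the sum of the (mock) record values, or -1 on error. */
long drain_records(int card, int interface)
{
    csflow_device_t *dev = NULL;
    u_int32_t locked = 0;
    long sum = 0;

    if (csflow_open(&dev, card, interface) != 0)
        return -1;
    if (csflow_start(dev) != 0) {
        csflow_close(&dev);
        return -1;
    }
    /* Ask for two records at a time until nothing is left. */
    while (csflow_lock(dev, 2, &locked) == 0 && locked > 0) {
        for (u_int32_t i = 0; i < locked; i++) {
            void *rec;
            if (csflow_getptr(dev, i, &rec) == 0)
                sum += *(u_int32_t *)rec;
        }
    }
    csflow_stop(dev);
    csflow_close(&dev);
    return sum;
}
```
]]></pre>

     <p>A real application would presumably wait on the descriptor obtained
       from <i>csflow_fd()</i> with <i>poll()</i> instead of stopping when no
       record is available; the mock simply runs out of data.</p>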

  <!-- ================================================================ -->

<h1>Hardware start-up</h1>

  <p>After the kernel drivers are loaded, the NetFlow probe firmware must be
    loaded into the COMBO cards (mother and interface card) and initialized.
    This means booting the *.mcs (<i>Intel PROM format</i>) files, which are
    the result of compiling VHDL for the FPGAs (<i>Field-Programmable Gate
    Array</i>). The <prikaz>csboot(1)</prikaz> tool is used for this.</p>

     <obr id="fig4" src="combo_hw">Combo6 mother card with Combo-4SFP</obr>
     
  <p>At this point the firmware is loaded into the COMBO cards but it is not
    yet running. It must be initialized and started. Firmware initialization
    consists of several steps:

    <ul>
      <li>UHDRV initialization,</li>
      <li>HGEN initialization,</li>
      <li>HFE program loading.</li>
    </ul></p>

    <p>These operations can be performed with the
      <prikaz>netflowctl(1)</prikaz> tool.</p>

  <!-- ................................................................ -->
  
  <h2>UHDRV initialization</h2>

    <p>UHDRV (<i>Unified Header Driver</i>) is a hardware unit which is an
      interface between UH_FIFO, HGEN_FIFO and STATFIFO. UHDRV decides
      whether an incoming record from UH_FIFO is sent on for further
      processing: records with a bad CRC (<i>Cyclic Redundancy Check</i>) and
      non-IP (<i>Internet Protocol</i>) records do not continue. UHDRV also
      chooses which fields of the incoming records are masked, which is used
      for a kind of aggregation.</p>

    <p>It is necessary to write a program to UHDRV which tells it which
      fields have to be masked.</p>

  <!-- ................................................................ -->

  <h2>HGEN initialization</h2>

    <p>HGEN (<i>Hash Generator</i>) is a hardware unit that generates a hash
      of a flow. For good results, the hash function has to be initialized
      with a random 64-bit value.</p>

  <!-- ................................................................ -->

  <h2>HFE program loading</h2>

    <p>HFE (<i>Header Field Extractor</i>) is a hardware unit intended for
      analyzing input packets. It is a processor based on the RISC
      (<i>Reduced Instruction Set Computer</i>) architecture, controlled by a
      specific instruction set. HFE reads packet data from the input buffer,
      analyses the control information in the packet headers and produces
      specific data structures.</p>

    <p>Before HFE can be used, a program must be loaded into the processor
      memory. The program controls the generation of these data
      structures.</p>

  <!-- ................................................................ -->

  <h2>Configuration of NetFlow probe</h2>

    <p>The <prikaz>netflowctl(1)</prikaz> tool is used to configure the
      NetFlow probe. There are several main properties to set and configure:
      <ul>
        <li>active timeout,</li>
        <li>inactive timeout,</li>
        <li>sample and hold,</li>
        <li>&ldots;</li>
      </ul>
    </p>

    <dl>
    <dt>Active timeout</dt>
      <dd>The active timeout is used to release flows that last longer than
        the specified timeout.</dd>
      
    <dt>Inactive timeout</dt>
      <dd>The inactive timeout determines how long a flow is kept in memory
        while the device sees no further packets of that flow.
      </dd>
     
    <dt>Sample and hold</dt>
      <dd>Sampling of incoming packets is the easiest way to guarantee the
        measured bandwidth. It also helps to decrease the number of new flows
        during attacks in which every incoming packet belongs to a new flow.

        The sample and hold method is quite similar to input sampling, with
        the following twist. As with ordinary sampling, each packet is
        sampled with a certain probability. If a packet is chosen and the
        flow it belongs to is not in the flow memory, a new item is created.
        However, once an item has been created for a flow, every subsequent
        packet belonging to that flow updates the item, unlike in sampled
        NetFlow.
      </dd>
    </dl> 
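
    <p>The sample and hold rule can be written down compactly. The sketch
      below is an illustration only, not the firmware algorithm: the table
      size, <i>struct flow_item</i>, <i>table_find()</i> and
      <i>sample_and_hold()</i> are hypothetical, and the random draw is
      passed in as the <i>sampled</i> argument so that the behaviour is
      deterministic. The threshold and the constant or variable sampling
      types of the real probe are not modelled.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 16   /* assumed size of the toy flow memory */

struct flow_item {
    uint32_t flow_id;
    uint32_t packets;
    int used;
};

struct flow_table {
    struct flow_item items[TABLE_SIZE];
};

/* Return the item tracking `flow_id`, or NULL if the flow is not tracked. */
struct flow_item *table_find(struct flow_table *t, uint32_t flow_id)
{
    assert(t != NULL);
    for (size_t i = 0; i < TABLE_SIZE; i++)
        if (t->items[i].used && t->items[i].flow_id == flow_id)
            return &t->items[i];
    return NULL;
}

/* Process one packet. `sampled` is the outcome of the per-packet random
 * draw (nonzero = packet selected). Returns 1 if the packet was counted. */
int sample_and_hold(struct flow_table *t, uint32_t flow_id, int sampled)
{
    struct flow_item *item = table_find(t, flow_id);

    if (item != NULL) {         /* hold: tracked flows count every packet */
        item->packets++;
        return 1;
    }
    if (!sampled)               /* untracked flow, packet not selected */
        return 0;
    for (size_t i = 0; i < TABLE_SIZE; i++) {
        if (!t->items[i].used) {        /* sample: start tracking the flow */
            t->items[i].used = 1;
            t->items[i].flow_id = flow_id;
            t->items[i].packets = 1;
            return 1;
        }
    }
    return 0;                   /* flow memory full, packet is dropped */
}
```
]]></pre>

    <p>Once a flow has an item, the <i>sampled</i> argument no longer
      matters for it, which is the difference from plain packet sampling
      described above.</p>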

  <!-- ................................................................ -->

  <h2>How to use <prikaz>netflowctl(1)</prikaz> for configuration of NetFlow probe</h2>
  
  <p>Description of user interface of <prikaz>netflowctl(1)</prikaz>.</p>

    <p>Firmware initialization.</p>
<pre>         -c init                initializes UHDRV, loads the program for HFE
                                and initializes HGEN.
</pre>
    <p>If you do not want to use the default values, you can specify your own
      with the following parameters.</p>
	    
<pre>
        -u uhdrv_file           file for initialization of UHDRV, uhdrv_file can
                                be generated by gen_mem tool
        -g init_value_high      32 high bits of initialization value for HGEN
        -i init_value_low       32 low bits of initialization value for HGEN
        -e hfe_prog             file with binary program for HFE
</pre>
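
        <p>Since the 64-bit HGEN initialization value is passed as two 32-bit
          halves, the split can be sketched as follows. <i>struct
          hgen_seed</i>, <i>split_seed()</i> and <i>join_seed()</i> are
          hypothetical helpers; how <prikaz>netflowctl(1)</prikaz> parses the
          numbers on its command line is not covered here.</p>

<pre><![CDATA[
```c
#include <stdint.h>

/* The two halves as they would be given to -g (high) and -i (low). */
struct hgen_seed {
    uint32_t high;
    uint32_t low;
};

/* Split a 64-bit initialization value into its high and low 32 bits. */
struct hgen_seed split_seed(uint64_t seed)
{
    struct hgen_seed s;
    s.high = (uint32_t)(seed >> 32);
    s.low = (uint32_t)(seed & 0xFFFFFFFFu);
    return s;
}

/* Inverse operation: rebuild the 64-bit value from the two halves. */
uint64_t join_seed(struct hgen_seed s)
{
    return ((uint64_t)s.high << 32) | s.low;
}
```
]]></pre>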
        <p>The inactive timeout can be set to a value in seconds with the parameters</p>
<pre>        -c inact_timeout -v value -b</pre>

        <p>The active timeout can be set to a value in seconds with the parameters</p>
<pre>        -c act_timeout -v value -b</pre>
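
        <p>How the two timeouts interact can be illustrated with a small
          decision function. This is a sketch of the definitions given
          earlier, not the firmware logic: <i>struct flow_times</i>,
          <i>enum expiry</i> and <i>check_expiry()</i> are hypothetical, and
          the exact boundary handling (strictly greater than the timeout) is
          an assumption.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Timestamps of a flow, in seconds. */
struct flow_times {
    uint32_t first_seen;    /* arrival of the first packet of the flow */
    uint32_t last_seen;     /* arrival of the most recent packet */
};

enum expiry { FLOW_KEEP, FLOW_EXPIRE_INACTIVE, FLOW_EXPIRE_ACTIVE };

/* Decide whether a flow should be released at time `now`. */
enum expiry check_expiry(struct flow_times f, uint32_t now,
                         uint32_t active_timeout, uint32_t inactive_timeout)
{
    assert(f.first_seen <= f.last_seen && f.last_seen <= now);

    if (now - f.last_seen > inactive_timeout)
        return FLOW_EXPIRE_INACTIVE;    /* no packet seen for too long */
    if (now - f.first_seen > active_timeout)
        return FLOW_EXPIRE_ACTIVE;      /* flow has lasted too long */
    return FLOW_KEEP;
}
```
]]></pre>

        <p>With an active timeout of 30 seconds and an inactive timeout of 10
          seconds, a flow that keeps receiving packets is released by the
          active timeout shortly after its 30th second, while a flow that
          falls silent is released by the inactive timeout.</p>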

        <p>Sample and hold can be set by parameters</p>
<pre>        -c sample_hold -v "threshold" -s "sampling_rate" -t "type"
                "type" is 0 - for constant sampling,
                          1 - for variable sampling
</pre>
  
     <p>The whole design can be reset with the parameter</p> 
<pre>        -c reset</pre>
     <p>New initialization values can be set with the parameters -u, -g, -i
      and -e; otherwise the default values are used.</p>

  <!-- ................................................................ -->
  
  <h2>How to configure NetFlow probe</h2>
  
    <p>A set of shell scripts can be used to perform all the operations
      necessary for configuring and starting up the NetFlow probe. A detailed
      description is available in <cite href="NetHowTo"/>.</p>
    <ul>
    <li><prikaz>netflow_ph1_modules</prikaz> - loads kernel modules,</li>
    <li><prikaz>netflow_ph1</prikaz> - hw booting, configuration and
      <prikaz>flowexporter(1)</prikaz> start,</li>
    <li><prikaz>netflow_ph1_log</prikaz> - hw monitoring and logging.</li>
    </ul>
    
    <obr id="fig5" src="netflow_ph1">NetFlow probe initialization</obr>
	    
    <p><prikaz>netflow_ph1(1)</prikaz> detects which COMBO cards are installed
      in the computer and loads the right firmware into the FPGAs accordingly
      (see <a href="#fig5">Figure 5</a>, steps 2 and 3). The script then calls
      <prikaz>netflowctl(1)</prikaz> to initialize the firmware and start it
      up (step 4). Finally, the active and inactive timeouts are set (step 5)
      according to the parameters of the script.</p>

    <pre>
Usage: netflow_ph1 [options]
 -a    active timeout in seconds
 -c    collector
 -h    print this help
 -i    inactive timeout in seconds
 -p    port (0,1)
 -r    set timeouts and run flowexporter (skips card boot)

Parameter '-c' is mandatory.

Example: netflow_ph1 -p 0 -a 30 -i 10 -c collector.liberouter.org:60000
         netflow_ph1 -p 0 -a 15 -i 5 -c collector.liberouter.org:60000 -r
    </pre>
    
    <p>Flowexporter is now running and exporting data to the collector.</p>

  <!-- ================================================================ -->
  
<h1>Flowexporter</h1>

  <p>Flowexporter is a basic NetFlow v9 data exporting tool. It is written in
    C to support the COMBO6 card with the NetFlow extension. It communicates
    with a collector over IPv4 using UDP.</p>

  <p>Command line options:</p>
  <pre>
Usage: flowexporter [-dh] [-i card:interface -n host:port -t num]
 -d                   run as a daemon
 -h                   display this help message
 -i card:interface    specify netflow card and interface number
 -n host:port         send packets to host on port
 -t num               set the count of data packets between two template sendings            
                      (collector refresh timeout)
 -v                   program version
   </pre>

  <!-- ................................................................ -->
 
  <h2>Structure and functions</h2>

     <p>Flowexporter consists of three basic modules and one supervising
       module. The basic modules create templates
       (<i>flowexporter_configure()</i>), read records from the COMBO6 card
       (<i>flowexporter_system()</i>), connect to a collector and assemble
       NetFlow v9 packets (<i>flowexporter_exporter()</i>). The supervising
       module parses command line options, calls certain functions from basic
       modules and handles some events (errors, system signals).</p>

     <p>The algorithm of flowexporter is very simple. Once all modules are
       configured, flowexporter enters an infinite loop:</p>

<pre>
    loop_begin
        if &lt;refreshing timeout&gt; then
            refresh the collector;

        read a netflow record;

        if &lt;exporting packet not too big&gt; then
            copy record into packet;
        else
          begin
            send the packet to the collector;
            create a new exporting packet;
            copy the record into the new packet;
          end

        if &lt;any error occurred or signal caught&gt; then
            exit;
    loop_end
</pre>

    <p>To be more precise, let us follow a pending flow record through the
      flowexporter.</p>

    <p>After the <i>poll()</i> call succeeds, a flow record is picked from
      the driver's ring buffer. The record is cast to a structure describing
      its members so that the flowexporter can work with the flow
      record.</p>
  
    <p>Due to the differences between members of certain records, eight
      basic types of records can be distinguished:</p>
   
    <ul>
      <li>ipv4 : tcp, udp, icmp, other</li>
      <li>ipv6 : tcp, udp, icmp, other</li>
    </ul>
  
    <p>Each of these types has its own NetFlow template.</p>
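
    <p>The eight record types can be seen as a small lookup from IP version
      and transport protocol. The sketch below is illustrative: the index
      numbering, <i>enum l4_class</i> and <i>template_index()</i> are
      hypothetical, and treating ICMPv6 (protocol 58) as the "icmp" class
      for IPv6 is an assumption about the real classification. The protocol
      numbers themselves are the standard IANA values.</p>

<pre><![CDATA[
```c
#include <stdint.h>

/* IANA assigned protocol numbers. */
#define PROTO_ICMP    1
#define PROTO_TCP     6
#define PROTO_UDP    17
#define PROTO_ICMPV6 58

enum l4_class { L4_TCP, L4_UDP, L4_ICMP, L4_OTHER };

/* Map a record to one of eight template indices:
 * 0..3 = IPv4 tcp/udp/icmp/other, 4..7 = IPv6 tcp/udp/icmp/other.
 * Returns -1 for an unknown IP version. */
int template_index(uint8_t ip_version, uint8_t protocol)
{
    enum l4_class c;

    if (ip_version != 4 && ip_version != 6)
        return -1;

    switch (protocol) {
    case PROTO_TCP:
        c = L4_TCP;
        break;
    case PROTO_UDP:
        c = L4_UDP;
        break;
    case PROTO_ICMP:        /* ICMP only counts as "icmp" over IPv4 */
        c = (ip_version == 4) ? L4_ICMP : L4_OTHER;
        break;
    case PROTO_ICMPV6:      /* ICMPv6 only counts as "icmp" over IPv6 */
        c = (ip_version == 6) ? L4_ICMP : L4_OTHER;
        break;
    default:
        c = L4_OTHER;
        break;
    }
    return (ip_version == 6 ? 4 : 0) + (int)c;
}
```
]]></pre>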

    <obr id="fig6" src="ethereal">Exported NetFlow v9 UDP datagram</obr>
    
    <p>After the cast, the flow record is classified as one of these types.
      The classification allows the flowexporter to prepare the record for
      sending to the collector. This is done by removing unused structure
      members according to the NetFlow template and putting the record into
      the right flowset. A flowset is the part of an export packet that
      stores records of the same type.</p>

    <p>The operation here is very simple: if the record belongs to the
      flowset currently being assembled, it is just added. If not, the
      current flowset is terminated and a new one is created to
      store the record.</p>
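
    <p>A consequence of this rule is that the number of flowsets depends on
      the order in which record types arrive. A minimal sketch, with
      <i>count_flowsets()</i> as a hypothetical helper and record types
      represented as plain integers:</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Count the flowsets produced for a sequence of record types when the
 * current flowset is terminated every time the record type changes. */
size_t count_flowsets(const int *types, size_t n)
{
    size_t flowsets = 0;

    assert(types != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* The first record, or a record of a different type than the
         * previous one, opens a new flowset. */
        if (i == 0 || types[i] != types[i - 1])
            flowsets++;
    }
    return flowsets;
}
```
]]></pre>

    <p>For example, the type sequence 0, 0, 1, 1, 0 yields three flowsets,
      although only two distinct types occur. The sketch ignores packet
      boundaries, which also end a flowset.</p>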

    <p>From here on, the record can be called a flowset record.</p>

    <p>Records are sent to the collector once the packet currently being
      built is full. Flowexporter packets are 1200 bytes long, so if adding
      a record would exceed the packet size, the packet is sent and a new one
      is created to carry that record later on.</p>
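
    <p>The size check can be sketched as follows. This is a simplified
      model, not the flowexporter source: <i>struct export_packet</i>,
      <i>packet_add()</i> and <i>packet_flush()</i> are hypothetical, the
      NetFlow v9 packet and flowset headers are left out, the 1200 bytes are
      treated as an upper limit, and sending is replaced by a counter.</p>

<pre><![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PACKET_SIZE 1200    /* packet length stated in the text */

struct export_packet {
    unsigned char data[PACKET_SIZE];
    size_t used;            /* bytes filled so far */
    unsigned sent;          /* packets handed to the (stand-in) sender */
};

/* Stand-in for the UDP send: count the packet and start an empty one. */
void packet_flush(struct export_packet *p)
{
    if (p->used == 0)
        return;             /* nothing to send */
    p->sent++;
    p->used = 0;
}

/* Append one record; if it does not fit, send the current packet first
 * and place the record into the new one. */
void packet_add(struct export_packet *p, const void *rec, size_t len)
{
    assert(len <= PACKET_SIZE);
    if (p->used + len > PACKET_SIZE)
        packet_flush(p);
    memcpy(p->data + p->used, rec, len);
    p->used += len;
}
```
]]></pre>

    <p>With 500-byte records, the third call to <i>packet_add()</i> triggers
      a send of the first two records and starts a new packet with the
      third.</p>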
  
    <obr id="fig7" src="collector">NetFlow data processed by FTAS
      collector<cite href="Kos14"/><cite href="Kos15"/></obr>
  
    <!-- ================================================================ -->

<h1>Package system</h1>

  <p>The package system is a tool that simplifies the preparation of
    distribution packages. These packages contain the files necessary to
    start working with COMBO cards. External testers can simply plug in the
    card, install the package and start using the card. The first such
    package was created in the summer, but it was created entirely by hand.
    Based on the experience with this package, we decided to develop a tool
    which would be able to build packages automatically.</p>

  <p>The package system is a shell script that copies the necessary files
    from the CVS (<i>Concurrent Versions System</i>) tree and from the
    project machines. The package system is universal for all current
    Liberouter projects and can be easily adapted to any new project. This
    universality ensures a unified form of all distribution packages. The
    contents of a specific package depend on the project database file (each
    project has its own database file). The database file tells the package
    system script which files should be copied from CVS and, optionally,
    what changes have to be performed (<prikaz>cvs update</prikaz> or
    removal of some subdirectories). The contents of the database file are
    not limited to paths in the CVS tree; e.g. MCS files are stored
    separately on the project machines. This is the reason why the script
    can be run only on the project machines.</p>

  <p>However, some files stored in the CVS tree can contain data (usually
    paths) which are correct in CVS but incorrect in the package structure
    (e.g. because the package does not contain all directories). This
    problem is solved by modified files which are stored separately in the
    CVS tree. The script copies these files at the end and overwrites all
    incorrect files. This is also a way to add files to the package which
    should not be stored in the CVS (typically the main README or the
    release notes for the package). The last operation is the generation of
    the distribution package (archive file, .tgz).</p>

  <h2>Package contents</h2>

    <p>The package must contain all files necessary for card initialization
      (<prikaz>csboot(1), csbus(1), csid(1), etc.</prikaz>). The most
      important files are the MCS files, which contain the programs for the
      FPGAs used on the COMBO cards. Kernel drivers are the next vital part
      of the package. The drivers provide the means of communication between
      the COMBO cards and the operating system (and the applications running
      there). There is a range of user-space applications which can be part
      of the package, depending on the given project. All these tools (and
      drivers) are distributed as source code.</p>

    <p>Package installation is handled by the build system, which makes the
      installation simple for users. The build system performs automatic
      compilation of all necessary tools and, optionally, their installation
      into the system. Drivers are compiled separately, but the compilation
      is not complicated (it is very similar to the build system) and it is
      well described in the README file. At this point the user has all the
      necessary tools prepared on the machine but still cannot use the card.
      This step is covered by scripts which are specific to the particular
      project. Generally these scripts have to boot the firmware into the
      COMBO cards, initialize the design and attach the driver. The next
      activity depends on the particular project (interfaces can be
      configured or some application can be run).</p>
 
  <h2>Preparation of new package</h2>

    <p>The preparation of a new package begins with modification of the
      database file for the appropriate project: lines describing the files
      that the package will contain must be added. These lines must contain
      information about the package version and, optionally, about changes
      which have to be performed (<prikaz>cvs update</prikaz> or removal of
      a subdirectory). Once the database describes the whole directory
      structure of the new package, we have to prepare the files which are
      not in CVS (or are incorrect there). This includes the main README
      file and the files describing changes since the last package version
      (release notes, the RELNOTES file). At this point the script has to be
      told that the new package exists, which means adding general
      information about the package to the list of packages. The file with
      this list is stored next to the package script and contains the names
      of all known packages, the paths to the database files and the
      firmware version. The firmware version is described in more detail in
      the database file (path to the MCS files on the project machine). The
      package name is usually composed of the project name and the package
      version.</p>
      
     <obr id="fig8" src="pkg_system">NetFlow package system</obr>

     <p>Preparing a new version of an existing package is simpler, because
      most files are already prepared from the previous version and only the
      changes must be defined. Our policy says that a new package must be
      released at least every three months. If no new features were added
      during the last three months, this has to be announced on the clients'
      web. Usually, however, there are some changes in the tools which are
      part of the package. In this case we simply use the package system to
      create the latest version of the package with up-to-date versions of
      the changed tools (stored in the CVS).</p>

    <p>Before releasing a package we test it on computers with a basic
      installation of the supported operating systems. This testing checks
      that the package works, i.e. that the build system and the scripts for
      booting the COMBO card function properly.</p>

    <p>After testing is finished (the testing period is usually one week),
      the new package is released. Distribution to our customers (external
      testers) is provided through our website. For this purpose we have a
      clients' web (at <a href="http://www.liberouter.org/clients">
      http://www.liberouter.org/clients</a>).  After a simple registration,
      anybody can download the required version of the package.</p>
 
  <!-- ================================================================ -->

<h1>Conclusion</h1>

  <p>We succeeded in reading flow records from the NetFlow probe hardware and
    generating NetFlow v9 datagrams. The probe has been successfully tested
    during the last three months. IP flows from a real university backbone
    have been acquired and NetFlow v9 datagrams sent to the FTAS
    (<i>Flow-Based Traffic Analysis System</i>) collector. The NetFlow
    package (hw and sw) was delivered to foreign testers in Holland.</p>

  <p>The implemented software solution is based on several basic entities
    wired together with shell scripts. The development of the software was
    closely connected with progress in the hardware implementation. We have
    developed several tools to support different hardware requirements and
    configurations. To improve this situation, future releases will focus on
    the following topics:</p>

   <ul>
     <li>integration of NetFlow framework tools and creation of NetFlow probe
       hardware manager (system daemon), which will be responsible for common
       configuration (FPGA and HFE booting, hardware setup, etc.) and
       providing shared configuration interface to other user space
       applications.</li>

     <li>enhancement of supported features by the <prikaz>flowexporter(1)</prikaz>
       <ul>
         <li>support for Cisco NetFlow v5,</li>
         <li>software flow filtering based on IP ranges,</li>
         <li>support for Netopeer configuration utility (XML configuration),</li>
         <li>IPFIX (<i>Internet Protocol Flow Information Export</i>),</li>
         <li>flow anonymization,</li>
         <li>&ldots;</li>
       </ul>
     </li>
     <li>support for new hardware features and platforms which will be added.</li>
   </ul>

 <!-- ================================================================ -->

  <seznamknih>
    <kniha id="CoHW">
      Liberouter Project. <i>Description of COMBO cards.</i>
      http://www.liberouter.org/hardware.php
    </kniha>
    <kniha id="NetHowTo">
      Liberouter Project. <i>NetFlow probe HOWTO.</i>
      http://www.liberouter.org/netflow/userdoc.php
    </kniha>
    <kniha id="Kos14">
      Košňar, T. <i>Notes to Flow-Based Traffic Analysis System Design.</i>
      CESNET Technical Report 14/2004.
    </kniha>
    <kniha id="Kos15">
      Košňar, T. <i>Flow-Based Traffic Analysis System - Architecture Overview.</i>
      CESNET Technical Report 15/2004.
    </kniha>
    <kniha id="RFC3954">
      Network Working Group. <i>RFC3954 - Cisco Systems NetFlow Services Export Version 9.</i>
    </kniha>
    <kniha id="Zad04">
      Žádník, M. <i>Overview of NetFlow Monitoring Adapter.</i>
      CESNET Technical Report 8/2004.
    </kniha>
    <kniha id="Zad05">
      Žádník, M. <i>NetFlow probe firmware design.</i>
      http://www.liberouter.org/netflow/design.php
    </kniha>
  </seznamknih>
</zprava>

