Improving Reliability of a Streaming System
CESNET
technical report number 24/2006
Miloš Wimmer
8.1.2007
1 Introduction
CESNET provides live high quality broadcasting of Czech Radio stations over the Internet. To support that purpose, we have created a system for broadcasting Ogg Vorbis-encoded audio streams at 128 kb/s and 224 kb/s. The actual solution relies on dividing the task between servers producing the ogg streams and dedicated broadcasting servers. Only free software is used - namely the GNU/Linux operating system, vlc and ices stream producers, and the icecast streaming server.
Audio broadcasting systems need not only to meet sound quality requirements but also to maintain high reliability. The system has to stay available in the event of unexpected server failures, and also during the maintenance that every server requires and that makes it temporarily unavailable. The system's availability may be increased by introducing redundancy and robustness combined with intelligent software.
2 System Description
Every streaming system may be divided into four major parts - one that acquires and processes the input signal, then a producer encoding the input and creating a digital stream which is transported to the streaming server, the streaming server which distributes the stream to individual clients, and finally the above-mentioned clients receiving the streams and playing the audio at users' workstations.
Our Internet broadcasting system employs three encoding servers to create compressed streams, and two streaming servers serving clients wishing to receive the broadcast. Our solution has been mostly inspired by the fact that the requirements of the encoding and streaming process differ significantly. The encoding server needs a lot of computing power since it has to be able to encode multiple inputs at various bitrates simultaneously. Any possible lack of processor time or other resources could seriously impact the resulting stream causing disturbances, outages, or increasing latency.
All servers included in our streaming system run Debian GNU/Linux. Audio encoding is carried out by the oggenc and ices applications; streaming is provided by Icecast. All these tools are developed by the Xiph.org project under the GNU license, which guarantees the right to use them freely.
2.1 Input Sources
Czech Radio stations 1, 2, 3, 6 and Region are being received from a DVB-S and DVB-T digital broadcast. Regional stations Regina and Plzeň are being received from standard analogue FM broadcasting, while the newest D-dur, Rádio Česko and Leonardo stations are being supplied in the form of ogg streams by the Czech Radio itself (same as station 1, 2, and 3 backup streams).
Ogg Vorbis streams are being encoded at 48 kHz, 16 bits and 2 channels with variable bitrates of 128 and 224 kb/s. We have chosen the 48-kHz sampling frequency not only because it represents a minor quality improvement as compared to 44.1 kHz but also because the DVB streams are being broadcast at the same frequency. 128-kb/s streams are intended for demanding listeners, while 224-kb/s streams are meant for listeners requiring the highest available quality of sound. Low-bitrate streams are provided by the Czech Radio's own technical resources.
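To put the stream parameters above into perspective, the following minimal sketch computes the uncompressed PCM bitrate for 48 kHz, 16 bits and 2 channels and compares it with the two nominal Ogg Vorbis bitrates. The function names are illustrative only, and since the streams are variable-bitrate, the ratios are approximate.

```python
# Uncompressed PCM bitrate for the parameters given in the text.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Return the raw PCM bitrate in kb/s (1 kb/s = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def compression_ratio(pcm_kbps, stream_kbps):
    """Ratio of the raw PCM bitrate to the nominal bitrate of the encoded stream."""
    return pcm_kbps / stream_kbps


raw = pcm_bitrate_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
for nominal in (128, 224):
    ratio = compression_ratio(raw, nominal)
    print(f"{nominal} kb/s stream: about {ratio:.1f}:1 against {raw:.0f} kb/s PCM")
```

The raw signal amounts to 1536 kb/s, so the 128-kb/s stream corresponds to a reduction of roughly 12:1 and the 224-kb/s stream to roughly 7:1.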
2.2 Producers
Server tun2.cesnet.cz located at the University of West Bohemia is the primary encoding server for the streaming system. It runs on a Dell PowerEdge 1800 fitted with two 3.4-GHz Intel P4 Xeon processors, 1 GB of RAM, an Intel PRO/1000 Fibre network interface, a Hauppauge WinTV NOVA-S-CI DVB-S tuner, and a Sound Blaster Audigy sound card. This server produces audio streams for Czech Radio stations 1, 2, 3, 6, and Region, which are received through the DVB-S receiver. It also processes the Czech Radio Plzeň station, which is received through an analogue tuner connected to the server's sound card. All output streams with variable bitrates (VBR) of 128 and 224 kb/s are transferred to two Icecast servers running at amp1.cesnet.cz and amp2.cesnet.cz. Besides that, the unmodified DVB-S streams are sent as MPEG-TS streams to assigned multicast addresses.
The remaining encoding servers are located in the CESNET server room in Prague. Server tun3.cesnet.cz runs on a Dell PowerEdge 1950 fitted with two 3-GHz Intel Dual Core Xeon processors and 2 GB of RAM. It is not equipped with any receivers, as it only processes the original MPEG-TS streams carrying the audio signal of Czech Radio stations 1, 2, and 3 as produced by tun1.cesnet.cz. Its output streams are identical to those generated by tun2.cesnet.cz and are sent to the very same streaming servers - amp1.cesnet.cz and amp2.cesnet.cz.
Server tun1.cesnet.cz runs on a Dell PowerEdge server equipped with two 2.4-GHz Intel P4 Xeon processors, 1 GB of RAM, a Hauppauge WinTV-NOVA-T DVB-T card, and a Sound Blaster Audigy sound card. The server uses its DVB-T card to receive Czech Radio stations 1, 2, and 3 and sends the resulting MPEG-TS streams to predefined multicast group addresses (these streams are received - among others - by the tun3.cesnet.cz server). Besides that, tun1.cesnet.cz produces two Ogg Vorbis streams carrying the Czech Radio Regina station, which is acquired by processing a signal received by an analogue tuner connected to the server's sound card.
The tun2.cesnet.cz and tun1.cesnet.cz encoding servers need to process several inputs received by the DVB cards simultaneously. That is why we use the vls server application to communicate with the card, receive all streams broadcast in a given package (CSlink in our case), separate them, and forward them as individual MPEG-TS streams for further processing, such as broadcasting to multicast or unicast addresses.
In our system, the vls server sends unmodified MPEG-TS content of Czech Radio stations 1, 2, 3, 6 and Region to predefined multicast group addresses located within the CESNET network. This allows users who are able to receive the CESNET multicast to listen to the original streams without any modifications. It is advisable to use the vlc client to process such streams. Vlc is an advanced software product developed by the VideoLAN project under the free GNU license, and it supports all major contemporary operating systems. It offers a number of features and supports numerous data formats. Vlc may be used both as a player and as a streaming server.
MPEG-TS streams produced by the vls application are being received by vlc clients at tun2.cesnet.cz and tun3.cesnet.cz. Vlc clients decode them to generate a pure PCM signal, which is subsequently encoded by the oggenc tool producing ogg 128-kb/s and 224-kb/s streams. The resulting streams are sent to the ices application and then transported to the amp1.cesnet.cz and amp2.cesnet.cz servers.
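The report does not give the actual command lines used in this chain. As a hedged illustration of the encoding step only, the sketch below builds an oggenc invocation that reads raw PCM from standard input with the parameters stated in the text. The helper name `oggenc_command` is hypothetical; the options used (`--raw`, `--raw-rate`, `--raw-bits`, `--raw-chan`, `--bitrate`) are standard oggenc options, but whether the author used exactly these is an assumption.

```python
import shlex


def oggenc_command(bitrate_kbps, sample_rate_hz=48000, bits=16, channels=2):
    """Build an oggenc argument list that encodes raw PCM read from stdin.

    The trailing "-" selects stdin as the input; in that case oggenc writes
    the Ogg Vorbis stream to stdout, so it can be piped to the next stage.
    """
    return [
        "oggenc",
        "--raw",                          # input is headerless PCM
        f"--raw-rate={sample_rate_hz}",   # sampling frequency in Hz
        f"--raw-bits={bits}",             # bits per sample
        f"--raw-chan={channels}",         # number of channels
        f"--bitrate={bitrate_kbps}",      # nominal bitrate in kb/s
        "-",
    ]


for kbps in (128, 224):
    print(shlex.join(oggenc_command(kbps)))
```

Keeping the command as a list rather than a single string makes it straightforward to hand to `subprocess` without shell quoting issues.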
To make the matter clearer, let us discuss the difference between the MPEG-TS streams and the ogg streams. MPEG-TS streams broadcast by the vls application running at tun1.cesnet.cz or tun2.cesnet.cz are not converted in any way. They are forwarded as the original streams received by the DVB receivers (MPEG-TS coding, 48-kHz sampling frequency, 192-kb/s bitrate). The vls server sends any given stream to a single pre-set address, which makes it impossible for multiple unicast clients to connect and receive the signal. On the other hand, it is possible to send the stream to a multicast address, making it available to all multicast-enabled clients able to access the multicast network, i.e., the limited number of CESNET users.
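The distinction between a unicast and a multicast destination can be checked programmatically. The following small sketch uses Python's standard `ipaddress` module; both addresses are hypothetical examples, not the addresses used in the CESNET network.

```python
import ipaddress


def delivery_mode(destination):
    """Classify a destination IP address as 'multicast' or 'unicast'."""
    address = ipaddress.ip_address(destination)
    return "multicast" if address.is_multicast else "unicast"


print(delivery_mode("239.1.2.3"))   # hypothetical multicast group address
print(delivery_mode("192.0.2.10"))  # hypothetical single-host address
```

A stream sent to the first kind of address can be received by any host that joins the group, while a stream sent to the second kind reaches exactly one receiver.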
Ogg streams being transported from the encoding servers to the streaming servers pass through a transcoding process changing the format from MPEG-TS to PCM and subsequently to Ogg Vorbis (48-kHz sampling frequency, bitrate of 128 or 224 kb/s). Given the high quality of the resulting stream, the impact of the conversion is almost imperceptible for any regular listener.
2.3 Streaming Servers
The first streaming server - amp1.cesnet.cz - is located in the CESNET server room in Prague and runs on common Dell PowerEdge 350 hardware fitted with one 1-GHz P-III processor and 1 GB of RAM. The other streaming server - amp2.cesnet.cz - is located at the University of West Bohemia. It runs on a Dell PowerEdge 1950 equipped with a 2.3-GHz Intel Dual Core Xeon processor and 2 GB of RAM.
Both streaming servers (amp1.cesnet.cz and amp2.cesnet.cz) are connected to a Linux Virtual Server cluster making them available to clients through a shared virtual IP address matching the radio.cesnet.cz domain name. Individual listeners may connect to that address. Should one of the streaming servers become unavailable, it is automatically removed from the server pool and clients are automatically redirected to the other server.
Both streaming servers run the Icecast application receiving the Ogg Vorbis streams produced by the encoding servers and making them available to all clients over http.
Clients usually connect to addresses such as http://radio.cesnet.cz:8000/given_streams_mount_point, for example http://radio.cesnet.cz:8000/cro3.ogg.
The URLs are published on the Czech Radio live broadcasting website, on internet radio index servers, and on the streaming servers' home pages.
A full list of streams broadcast through our system is shown at http://radio.cesnet.cz/.
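The address scheme described above can be sketched as follows. The helper names `stream_url` and `looks_like_ogg` are hypothetical; the host, port and mount point come from the example URL in the text, and the check relies on the fact that every Ogg page starts with the four-byte capture pattern `OggS`.

```python
def stream_url(mount_point, host="radio.cesnet.cz", port=8000):
    """Return the HTTP URL of a mount point on the streaming server."""
    return f"http://{host}:{port}/{mount_point.lstrip('/')}"


def looks_like_ogg(first_bytes):
    """Return True if the data starts with the Ogg page capture pattern."""
    return first_bytes[:4] == b"OggS"


print(stream_url("cro3.ogg"))
```

A client-side check could read the first four bytes of the HTTP response body, for example with `urllib.request.urlopen(url).read(4)`, and pass them to `looks_like_ogg`.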
2.4 Clients
Clients are software applications run by end users - listeners - on their computers. Clients can read and process streams distributed by the streaming servers. Processing the stream by a client usually means decoding it into a PCM signal and playing it on a sound card.
Ogg Vorbis streams are currently supported by most clients. Popular players include vlc, winamp, xmms, zinf, foobar2000, and qcf.
The topology of the whole system is shown in Figure.
3 Improving System Reliability
To achieve high system reliability, it is necessary to ensure that all its components are available at all times.
At the source level, this is done primarily by reading the input signal for the three most important stations (Czech Radio 1, 2, and 3) simultaneously from three independent sources - digital satellite broadcast (DVB-S), digital terrestrial broadcast (DVB-T), and a back-up low-bitrate ogg stream provided by Czech Radio.
At the producer (encoding server) level, reliability is achieved by setting up two servers in geographically separate locations. Each producer also sends its output to two independent streaming servers, amp1.cesnet.cz and amp2.cesnet.cz. Besides that, the encoding servers run monitoring tools that watch the whole encoding process and are ready to restart the whole chain of tools participating in the generation of an affected stream. Furthermore, the streaming server configuration relies on so-called fallback technology, which allows the streaming server to detect broken streams and transfer all affected users to a backup stream without breaking their connection.
Streaming server reliability is further increased by introducing redundancy - i.e., running multiple servers found in geographically distant locations connected to an LVS virtual cluster.
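The effect of redundancy on availability can be illustrated with a short calculation. This is only a sketch: the per-server availability of 0.99 is a hypothetical figure, not a measurement from our system, and the formula assumes that server failures are independent, which geographically separated servers only approximate.

```python
from math import prod


def parallel_availability(availabilities):
    """Availability of a group that works as long as at least one member works.

    Assumes independent failures: the group is down only when every
    member is down at the same time.
    """
    return 1 - prod(1 - a for a in availabilities)


single = 0.99  # hypothetical availability of one server
print(f"one server:  {single:.4f}")
print(f"two servers: {parallel_availability([single, single]):.4f}")
```

Under these assumptions, a second server reduces the expected unavailability by two orders of magnitude.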
3.1 Fallback
Fallback technology employed by the streaming servers is used for transferring user connections between individual streams without breaking established connections, i.e., without breaking the stream being received.
Level 1 and level 2 back-up streams are defined as fallbacks in the icecast server's configuration file. The server is thus instructed to switch all clients to the appropriate back-up stream if the primary stream is lost. After the primary stream reappears, the icecast server switches its clients back to it. The following section of the icecast.conf configuration file is used to achieve that purpose:
    <!-- Level 1 fallback stream definition -->
    <!-- z-cro1.ogg is a back-up for cro1.ogg -->
    <mount>
        <mount-name>/cro1.ogg</mount-name>
        <fallback-mount>/z-cro1.ogg</fallback-mount>
        <fallback-override>1</fallback-override>
    </mount>

    <!-- Level 2 back-up stream zz-cro1.ogg -->
    <!-- (provided by the Czech Radio icecast server) -->
    <relay>
        <server>195.113.180.42</server>
        <port>8000</port>
        <mount>/cro1_mid.ogg</mount>
        <local-mount>/zz-cro1.ogg</local-mount>
        <relay-shoutcast-metadata>0</relay-shoutcast-metadata>
    </relay>

    <!-- Level 2 fallback stream definition -->
    <!-- zz-cro1.ogg is a back-up for z-cro1.ogg -->
    <mount>
        <mount-name>/z-cro1.ogg</mount-name>
        <fallback-mount>/zz-cro1.ogg</fallback-mount>
        <fallback-override>1</fallback-override>
    </mount>
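The two-level fallback chain defined above can be modelled in a few lines. The sketch below is not how icecast is implemented internally; it only mirrors the decision the configuration describes, namely which mount feeds a listener of /cro1.ogg for a given set of live sources. The names `FALLBACKS` and `resolve_mount` are hypothetical.

```python
# Fallback relations taken from the configuration above.
FALLBACKS = {
    "/cro1.ogg": "/z-cro1.ogg",
    "/z-cro1.ogg": "/zz-cro1.ogg",
}


def resolve_mount(requested, live_mounts, fallbacks=FALLBACKS):
    """Follow the fallback chain until a mount with a live source is found.

    Returns the mount that would feed the listener, or None if every
    mount in the chain is down.
    """
    seen = set()
    mount = requested
    while mount is not None and mount not in seen:
        if mount in live_mounts:
            return mount
        seen.add(mount)  # guards against a misconfigured fallback loop
        mount = fallbacks.get(mount)
    return None


print(resolve_mount("/cro1.ogg", {"/cro1.ogg", "/z-cro1.ogg", "/zz-cro1.ogg"}))
print(resolve_mount("/cro1.ogg", {"/z-cro1.ogg", "/zz-cro1.ogg"}))
print(resolve_mount("/cro1.ogg", {"/zz-cro1.ogg"}))
```

Re-evaluating the function whenever the set of live mounts changes also reproduces the switch back to the primary stream once it reappears, which is the behaviour requested by the fallback-override setting.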
3.2 LVS
There are several products available for setting up clusters of two or more Linux computers. The most advanced ones are probably Keepalived and Heartbeat. Clusters are used to increase service availability. Internal monitoring tools may watch either the server's IP connectivity (ping) or the accessibility of actual services. Should one of the servers be found inaccessible, it is automatically removed from the cluster and can rejoin it only after it comes back on-line. A typical set-up relies on a so-called LVS (Linux Virtual Server) Router monitoring the status of individual servers connected to the cluster. The LVS Router also serves as a redirector, accepting external connection requests and assigning them to actual servers in the cluster. The LVS Router does not provide the service itself; it only serves as a service broker.
Considering the low number of faults, we did not want to include a specialized server to act as an LVS Router. Such a server would require additional maintenance, and should it fail, it would render all the streaming servers inaccessible. That would force us to introduce LVS Router redundancy, so that the streaming component would consist of two streaming servers and two LVS servers. This was considered unacceptable, so we implemented another solution, which is not typical but allows us to achieve similar functionality without employing additional servers.
We have used free software tools developed by the Keepalived project. These tools communicate with the Linux kernel's LVS layer and monitor server availability at Layer 3, Layer 4, and Layer 5/7. We rely primarily on their ability to transfer IP addresses between servers connected to the cluster. Servers amp1.cesnet.cz [195.113.161.81] and amp2.cesnet.cz [195.113.161.77] naturally use their own unique IP addresses. Both servers run a keepalived daemon, which makes sure that one of them (preferably the one designated as MASTER) is always assigned another, shared address [195.113.161.70] matching the radio.cesnet.cz DNS name. Which server is currently in use depends on the configuration of the keepalived daemon as well as on streaming server accessibility. One server (amp1.cesnet.cz in our case) always serves as the MASTER, which means that whenever it is available, it is assigned the virtual address [195.113.161.70] and uses two addresses at once. The other server (amp2.cesnet.cz) plays the BACKUP role and is only accessible through its static (real) IP address. If the MASTER server fails, the virtual IP address is assigned to the BACKUP server. Server availability is monitored directly by the keepalived daemon through the VRRP protocol. After the MASTER server becomes available again, the virtual IP address is assigned back to it. Clients connecting to the radio.cesnet.cz resource, i.e., the cluster's virtual address, can thus receive the stream from the secondary server in case the primary one fails.
The above method allows us to overcome server failures, yet it is not able to deal with situations where only certain services provided by the server fail. To achieve such functionality, we would have to use a full-blown LVS Router. In our case, a similar result is achieved by running a simple monitoring script on each of the servers. It checks the status of the streaming service and in case it decides it is not accessible, it stops the keepalived daemon. This causes the IP address to switch to the other server making the clients use the other server. After the streaming service starts working again, the monitor starts the keepalived daemon. In case this happens on the MASTER server, the virtual IP address is automatically assigned back.
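The report does not list the monitoring script itself. A minimal sketch of the logic described above might look as follows, under several assumptions: the service check is a plain TCP connection to the local Icecast port (8000, as in the stream URLs), and the function names `service_is_up` and `next_action` are hypothetical. The commands that actually start and stop keepalived are site-specific and are deliberately left out.

```python
import socket


def service_is_up(host="127.0.0.1", port=8000, timeout=3.0):
    """Return True if a TCP connection to the streaming service succeeds.

    This only proves that the port accepts connections; a stricter check
    could request a known mount point over HTTP.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def next_action(service_up, keepalived_running):
    """Decide what the monitor should do with the keepalived daemon.

    Returns "stop" when the service is down but keepalived still runs,
    "start" when the service is back but keepalived is stopped,
    and None when nothing needs to change.
    """
    if not service_up and keepalived_running:
        return "stop"
    if service_up and not keepalived_running:
        return "start"
    return None


print(next_action(service_up=False, keepalived_running=True))
print(next_action(service_up=True, keepalived_running=False))
```

Separating the decision from the check keeps the logic easy to test; a periodic loop would call `service_is_up()`, pass the result to `next_action()`, and run the corresponding start or stop command for the keepalived daemon.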
Naturally, ogg streams generated by the encoding servers are broadcast to the static addresses of both streaming servers.
The keepalived.conf file for amp1 looks as follows:
    global_defs {
        notification_email {
            wimmer@amp1.cesnet.cz
        }
        notification_email_from root@amp1.cesnet.cz
        smtp_server 195.113.144.234
        smtp_connect_timeout 30
        router_id LVS_DEVEL
    }
    vrrp_instance eth {
        state MASTER
        interface eth0
        virtual_router_id 123
        priority 100
        authentication {
            auth_type PASS
            auth_pass xxx.password
        }
        virtual_ipaddress {
            195.113.161.70
        }
    }
The keepalived.conf file for amp2.cesnet.cz contains the following:
    global_defs {
        notification_email {
            wimmer@amp1.cesnet.cz
        }
        notification_email_from root@amp2.cesnet.cz
        smtp_server 195.113.144.234
        smtp_connect_timeout 30
        router_id LVS_DEVEL
    }
    # VRRP instance definition
    vrrp_instance eth {
        state BACKUP
        interface eth0
        virtual_router_id 123
        priority 50
        authentication {
            auth_type PASS
            auth_pass xxx.password
        }
        virtual_ipaddress {
            195.113.161.70
        }
    }
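The outcome of the MASTER/BACKUP arrangement configured in the two files above can be summarized by a simple rule: among the servers whose keepalived daemon is running, the one with the highest priority holds the virtual address. The sketch below models only that outcome; the real VRRP protocol also involves periodic advertisements and timers, which are omitted. The names `PRIORITIES` and `vip_holder` are hypothetical.

```python
# Priorities taken from the two keepalived.conf files above.
PRIORITIES = {"amp1.cesnet.cz": 100, "amp2.cesnet.cz": 50}


def vip_holder(priorities, alive):
    """Return the alive server with the highest priority, or None.

    `alive` is the set of servers whose keepalived daemon is running.
    """
    candidates = {name: prio for name, prio in priorities.items() if name in alive}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


print(vip_holder(PRIORITIES, {"amp1.cesnet.cz", "amp2.cesnet.cz"}))
print(vip_holder(PRIORITIES, {"amp2.cesnet.cz"}))
```

When amp1.cesnet.cz drops out of the set, for instance because the monitoring script stopped its keepalived daemon, the address moves to amp2.cesnet.cz, and it moves back as soon as amp1.cesnet.cz returns.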
4 Conclusion
Our study aimed at designing and implementing a solution for improving streaming system reliability.
We have achieved increased reliability by introducing redundancy at both the encoding and streaming server levels, by increasing the robustness of individual components of the system, and by employing intelligent software - namely fallback and LVS clustering technology.
The whole system now consists of three encoding servers and two streaming servers. It can withstand simultaneous failure of two encoding servers and one streaming server. Both failure and recovery are handled automatically without the need for human intervention.