Integrating applications with Shibboleth Service Provider

CESNET technical report number 22/2007

Ivan Novakov
6.12.2007

Keywords: shibboleth, service provider, federation, SAML, identity management

1   Abstract

The objective of this paper is to describe how to integrate applications with the Shibboleth infrastructure. It covers the most important aspects and provides appropriate examples. It is intended for developers and network administrators with basic knowledge of Shibboleth and of web servers (Apache, IIS) in general.

2   Introduction

2.1   Federations and Shibboleth

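Federated identity management allows user information to be shared among different organizations on the basis of mutual trust. Federations serve as a uniform authentication and authorization layer between the users and the services. The main building blocks of a federation are identity providers, which authenticate users and release their attributes, and service providers, which protect resources and consume that information.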

Shibboleth is an Internet2 project implementing a federation infrastructure based on SAML (Security Assertion Markup Language), an XML standard for exchanging authentication and authorization data. Shibboleth is currently designed for web applications only (accessed from a web browser), since it takes advantage of some specific features of the HTTP protocol.

2.2   How it works

When an unauthenticated user requests a resource managed by a Shibboleth-enabled service provider, the service provider redirects the user to his home identity provider for authentication. Upon successful authentication, a SAML assertion is created and sent back to the service provider, which may then let the user access the resource. The service provider may also ask the identity provider for the user's attributes, which can be used for authorization or just for informational purposes.

2.3   Authentication and authorization

The Shibboleth Service Provider is implemented as a web server module (Apache, IIS) and a standalone daemon. The daemon carries out all the actual work (SAML processing, session management, authorization), while the module just intercepts requests and serves as an interface between the web server and the daemon. Based on its configuration, the module protects certain locations in much the same way as basic authentication in Apache does. An incoming request is passed on to a secured location only if it meets the requirements configured for that location.

The module supplies the secured application with the user's identity and attributes, as well as some additional information, for example which identity provider the user belongs to and which authentication method has been used.

3   Integration - common aspects

The Shibboleth infrastructure acts as a layer between the user and the resource. The application does not handle authentication; it just receives the user's identity and attributes from Shibboleth. That makes the development of Shibboleth-enabled applications straightforward.

3.1   Attributes

Generally, the Shibboleth Service Provider gives the underlying application access to the user's attributes by setting them as additional HTTP headers. The attributes to be processed by the Shibboleth Service Provider are configured in the Attribute Acceptance Policy (AAP) configuration file, usually AAP.xml in the main configuration directory. A typical attribute configuration looks as follows:

<AttributeRule Name="urn:mace:dir:attribute-def:mail" 
               Header="Shib-InetOrgPerson-mail"
               Alias="email">

    <AnySite>
      <AnyValue/>
    </AnySite>
</AttributeRule>

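where Name is the name under which the identity provider sends the attribute, Header is the HTTP header in which the service provider exports the value to the application, and Alias is a short name that can be used to refer to the attribute in access control rules.

It is also possible to restrict attributes to certain origins (identity providers) or to certain values only, by using the appropriate <SiteRule> and <Value> elements. For more information on this, see the Internet2 wiki pages. Attributes that are not defined in the AAP.xml file or do not meet the rule requirements are ignored.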

The application can then read the attribute values from the headers. The Apache web server makes them accessible through environment variables, so they can be easily retrieved by applications written in almost any programming language (PHP, Perl, C/C++, Java). For example, the attribute configured above is accessible in PHP as follows:

<?php
$email = $_SERVER['HTTP_SHIB_INETORGPERSON_MAIL'];
?>
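The variable name used above follows the usual CGI convention: the header name is upper-cased, hyphens become underscores and the prefix HTTP_ is added. A minimal sketch of a helper built on that convention is shown below. The function names shibServerKey and shibAttribute are illustrative and not part of Shibboleth, and the server array is passed in explicitly so that the code can be exercised without a web server.

```php
<?php
/**
 * Convert a header name from AAP.xml (e.g. "Shib-InetOrgPerson-mail")
 * to the key under which PHP exposes it in $_SERVER.
 */
function shibServerKey(string $header): string
{
    return 'HTTP_' . strtoupper(str_replace('-', '_', $header));
}

/**
 * Return the attribute value, or $default if the header is missing or empty.
 */
function shibAttribute(array $server, string $header, ?string $default = null): ?string
{
    $key = shibServerKey($header);
    return (isset($server[$key]) && $server[$key] !== '') ? $server[$key] : $default;
}

// Typical call inside a request:
// $email = shibAttribute($_SERVER, 'Shib-InetOrgPerson-mail');
```

An attribute mapped to REMOTE_USER is a special case: it is read directly from $_SERVER['REMOTE_USER'], without the HTTP_ prefix.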

Finally, Shibboleth may provide access to the raw SAML assertions associated with the request. In order to allow this, the exportAssertion attribute should be set to true for the current application in the main configuration file (shibboleth.xml):

<Host name="sp.example.org"
      requireSession="true"
      exportAssertion="true"/>

The assertions are exported as a base64-encoded HTTP header value, which may exceed the maximum header length allowed by the web server. Fortunately, newer versions of both Apache and IIS provide directives for raising that limit (LimitRequestFieldSize in Apache).
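The following sketch shows how an application might decode such a header value. The name of the header is not given in this paper; the usage comment assumes HTTP_SHIB_ATTRIBUTES, which should be checked against the actual installation. The function name decodeExportedAssertion is illustrative, and the final check is only a cheap sanity test, not XML or signature validation.

```php
<?php
/**
 * Decode a base64-encoded SAML assertion exported by the service provider.
 * Returns the decoded XML text, or null if the value is missing or invalid.
 */
function decodeExportedAssertion(?string $headerValue): ?string
{
    if ($headerValue === null || $headerValue === '') {
        return null;
    }
    // Strict mode rejects characters outside the base64 alphabet.
    $xml = base64_decode($headerValue, true);
    if ($xml === false) {
        return null;
    }
    // Cheap sanity check: the decoded text should start with an XML tag.
    $trimmed = ltrim($xml);
    if ($trimmed === '' || $trimmed[0] !== '<') {
        return null;
    }
    return $xml;
}

// Typical call inside a request (header name is an assumption):
// $assertion = decodeExportedAssertion($_SERVER['HTTP_SHIB_ATTRIBUTES'] ?? null);
```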

3.2   Authentication

In the Shibboleth infrastructure, the authentication process itself does not reveal the identity of the user. The service provider issues an authentication request to the identity provider, which returns a positive answer if authentication succeeds. This makes it possible to withhold sensitive information from applications that do not need it. However, many applications require some sort of persistent identifier to be able to recognize users across sessions or to store users' preferences, roles and permissions locally.

That identity may be based on a user attribute. Typically, it is the eduPersonPrincipalName attribute, but it can be any other custom attribute. Usually, the value of that "identity" attribute is mapped to the REMOTE_USER environment variable, for example:

<AttributeRule Name="urn:mace:dir:attribute-def:eduPersonPrincipalName" 
               Scoped="true" 
               Header="REMOTE_USER" 
               Alias="scoped_user">

    <AnySite>
        <Value Type="regexp">^[^@]+$</Value>
    </AnySite>
</AttributeRule>

The Scoped property states that the value is supplemented with information about the user's origin, the scope. The scope is associated with the identity provider. For example, user "joe" coming from an identity provider with scope "example.org" will have the REMOTE_USER variable set to "joe@example.org". The purpose is to keep users' identifiers unique across organizations.

Once the REMOTE_USER variable is set, the application can simply read the user's identity from it, for example:

<?php
$identity = null;
// !empty() is false for both a missing and an empty variable
if (!empty($_SERVER['REMOTE_USER'])) {
  $identity = $_SERVER['REMOTE_USER'];
}
else {
  // Error, identity attribute is missing
}
?>
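An application that wants to treat users from different organizations differently may need the two parts of a scoped identifier separately. A minimal sketch, assuming the value has the form user@scope as described above; the function name splitScopedIdentity is illustrative.

```php
<?php
/**
 * Split a scoped identifier such as "joe@example.org" into user and scope.
 * Returns null when the value does not have exactly that shape.
 */
function splitScopedIdentity(string $value): ?array
{
    $parts = explode('@', $value);
    if (count($parts) !== 2 || $parts[0] === '' || $parts[1] === '') {
        return null;
    }
    return ['user' => $parts[0], 'scope' => $parts[1]];
}

// Typical call inside a request:
// $identity = splitScopedIdentity($_SERVER['REMOTE_USER'] ?? '');
```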

3.3   Authorization

While authentication is handled outside the service provider, the authorization decisions fall on the service itself. There are several ways to handle authorization on a Shibboleth-enabled service provider.

3.3.1   Apache authorization

The simplest solution is to define authorization as directives in the configuration of the web server. In Apache, these directives can be placed in the main configuration, enclosed in the relevant <Files>, <Directory> or <Location> sections, or in a separate .htaccess file placed in the directory to be protected. The standard Require directive is used:

Require entity-name value [value] ...

In that case, entity-name stands for an attribute alias, defined for each attribute in the AAP.xml configuration file (the Alias property of the <AttributeRule> element). For example, the following configuration grants access to students or staff only:

<Directory /secure>
    AuthType shibboleth
    ShibRequireSession On
    Require affiliation student staff
</Directory>

Multiple Require directives may be defined. Unless the ShibRequireAll directive is set, access is granted if any one of the requirements is met.

3.3.2   Shibboleth authorization

In fact, authorization directives defined in the web server's configuration are simply passed to Shibboleth for processing. The advantage is that the standard Apache configuration interface is used. However, the Apache directive Require is not flexible enough to define more complex rules. Such complex rules can be defined in the XML-based Shibboleth configuration. The syntax supports nested logical expressions with the AND, OR and NOT operators. These rules can be defined in the main configuration file shibboleth.xml or in a separate file in a special <AccessControl> element. Example for in-line configuration in shibboleth.xml:

<RequestMapProvider 
  type="edu.internet2.middleware.shibboleth.sp.provider.NativeRequestMapProvider">
<RequestMap applicationId="default">

<Host name="sp.example.org">
  <Path name="secure" authType="shibboleth" requireSession="true">
    <AccessControl>
      <AND>
        <OR>
          <Rule require="affiliation">member@example.org</Rule>
          <Rule require="affiliation">member@otherexample.org</Rule>
        </OR>
        <Rule require="entitlement">urn:mace:example.org:exampleEntitlement</Rule>
      </AND>
    </AccessControl>
  </Path>
</Host>

</RequestMap>
</RequestMapProvider>

To put the access control definitions in a separate file, the <AccessControlProvider> element should be placed in the main configuration file, pointing to the external file:

<Host name="sp.example.org">
  <Path name="secure" authType="shibboleth" requireSession="true">

    <AccessControlProvider 
      uri="/var/www/secure/.shibacl.xml" 
      type="edu.internet2.middleware.shibboleth.sp.provider.XMLAccessControl"/>

  </Path>
</Host>

The syntax in the external file .shibacl.xml is similar to the in-line configuration:

<?xml version="1.0" encoding="UTF-8"?>
<AccessControl xmlns="urn:mace:shibboleth:target:config:1.0">
  <AND>
    <OR>
      <Rule require="affiliation">member@example.org</Rule>
      <Rule require="affiliation">member@otherexample.org</Rule>
    </OR>
    <Rule require="entitlement">urn:mace:example.org:exampleEntitlement</Rule>
  </AND>
</AccessControl>
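To make the semantics of nested rules concrete, the following sketch evaluates the same rule in PHP. It is only an illustration of the logic, not Shibboleth's implementation. The array layout (op, rules, rule, require, value) is an assumption made for this example, and each rule matches a single exact value.

```php
<?php
/**
 * Evaluate a nested rule against the user's attributes.
 * $attributes maps an alias to a list of values, e.g.
 * ['affiliation' => ['member@example.org']].
 */
function evaluateRule(array $rule, array $attributes): bool
{
    switch ($rule['op']) {
        case 'AND': // every child rule must hold
            foreach ($rule['rules'] as $child) {
                if (!evaluateRule($child, $attributes)) {
                    return false;
                }
            }
            return true;
        case 'OR': // at least one child rule must hold
            foreach ($rule['rules'] as $child) {
                if (evaluateRule($child, $attributes)) {
                    return true;
                }
            }
            return false;
        case 'NOT':
            return !evaluateRule($rule['rule'], $attributes);
        case 'RULE': // the attribute must contain the required value
            $values = $attributes[$rule['require']] ?? [];
            return in_array($rule['value'], $values, true);
        default:
            throw new InvalidArgumentException('Unknown operator: ' . $rule['op']);
    }
}

// The rule from the configuration above, expressed as a PHP array.
$exampleRule = ['op' => 'AND', 'rules' => [
    ['op' => 'OR', 'rules' => [
        ['op' => 'RULE', 'require' => 'affiliation', 'value' => 'member@example.org'],
        ['op' => 'RULE', 'require' => 'affiliation', 'value' => 'member@otherexample.org'],
    ]],
    ['op' => 'RULE', 'require' => 'entitlement', 'value' => 'urn:mace:example.org:exampleEntitlement'],
]];
```

A member of either organization is granted access only if he also holds the entitlement; membership alone is not sufficient.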

3.3.3   Custom authorization

Authorization may be - and most probably will be - embedded in the application itself. The attribute values are accessible from the environment, so the application can read them and make additional authorization decisions. For example, roles based on attribute values may be assigned to be used in the application's own ACL system.

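<?php
// default role for users without a matching affiliation
$userRole = 'guest';
// the attribute may be missing, so check before comparing
if (isset($_SERVER['HTTP_SHIB_EP_AFFILIATION'])
    && $_SERVER['HTTP_SHIB_EP_AFFILIATION'] === 'member@example.org') {
  $userRole = 'member';
}
?>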
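A direct comparison like this works only when the attribute carries a single value. The sketch below handles attributes with several values. It assumes that the values arrive in one header separated by semicolons, with a literal semicolon escaped by a backslash; this should be verified against the service provider version in use. The function names, role names and the staff@example.org value are illustrative.

```php
<?php
/**
 * Split a multi-valued attribute header into individual values.
 * Assumes ";" separates values and "\;" is an escaped literal semicolon.
 */
function shibAttributeValues(?string $raw): array
{
    if ($raw === null || $raw === '') {
        return [];
    }
    // Split on semicolons that are not preceded by a backslash.
    $values = preg_split('/(?<!\\\\);/', $raw);
    // Unescape literal semicolons inside each value.
    return array_map(static function (string $v): string {
        return str_replace('\\;', ';', $v);
    }, $values);
}

/**
 * Map affiliation values to an application role.
 */
function roleFromAffiliations(array $affiliations): string
{
    if (in_array('staff@example.org', $affiliations, true)) {
        return 'editor';
    }
    if (in_array('member@example.org', $affiliations, true)) {
        return 'member';
    }
    return 'guest';
}

// Typical call inside a request:
// $role = roleFromAffiliations(shibAttributeValues($_SERVER['HTTP_SHIB_EP_AFFILIATION'] ?? null));
```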

3.4   Sessions

Upon successful authentication, a session is established at the service provider. The user is granted access to the protected resource for the duration of that session. After the session expires, the service provider issues a new authentication request to the identity provider. The identity provider keeps its own login session for the user, and as long as that session is valid, the user is not asked for credentials again when a service provider issues an authentication request. The sessions at both the service provider and the identity provider are limited in time. Their duration depends on local policies and on the specific needs of the applications.

3.4.1   Common use

For applications or resources where authentication is always required, it is appropriate to enforce session creation for every request. In the Apache configuration, this is done with the ShibRequireSession directive, for example:

<Directory /secure>
  AuthType shibboleth
  ShibRequireSession On
  ...
</Directory>

The same can be achieved in the Shibboleth configuration. To let Shibboleth decide how to handle sessions, the following Apache configuration is needed:

<Directory /secure>
  AuthType shibboleth
  Require shibboleth
  ...
</Directory>

In Shibboleth configuration, it is then possible to use the requireSession attribute in the <Path> element in shibboleth.xml:

<Host name="sp.example.org">
  <Path name="secure" authType="shibboleth" requireSession="true">
    ...
  </Path>
</Host>

This way of handling sessions is appropriate mostly for protecting static content or simple applications that do not use their own sessions.

3.4.2   "Lazy sessions"

Many applications allow anonymous access. Users can visit the site and optionally log in to access protected content. In that case it is not appropriate to enforce session creation. Instead, sessions need to be created on demand. Shibboleth provides this functionality in the form of so-called "lazy sessions".

To achieve that, it is necessary to switch off the session requirement in Shibboleth by setting the requireSession attribute to "false" (see the previous section). To invoke the session creation process (usually when the user hits the "Login" button), the application has to "ask" the Shibboleth SessionInitiator for a session. Instead of implementing a specific API for that task, Shibboleth takes advantage of standard HTTP redirects and uses a special virtual location. By redirecting the user to that location, the application triggers session creation.

That virtual location may be, for example:

/Shibboleth.sso/WAYF/myFed?target=https://sp.example.org/

The /Shibboleth.sso portion of the path is the value of the handlerURL attribute in the <Sessions> element:

<Sessions lifetime="7200" timeout="3600" 
          checkAddress="true" consistentAddress="true"
          handlerURL="/Shibboleth.sso" handlerSSL="true" 
          idpHistory="true" idpHistoryDays="7"
          cookieProps="; path=/; secure">

  ...
</Sessions>

The /WAYF/myFed part is a specific location set in the <SessionInitiator> element:

<SessionInitiator
    isDefault="true" id="myfed" 
    Location="/WAYF/myFed"
    Binding="urn:mace:shibboleth:sp:1.3:SessionInit"
    wayfURL="https://www.myfed.org/wayf/"
    wayfBinding="urn:mace:shibboleth:1.0:profiles:AuthnRequest"/>

The target query parameter indicates the URL the user should be redirected to after performing successful authentication.
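The pieces described above can be assembled into the login URL as in the following sketch. Since the target is itself a URL, it is percent-encoded here before being placed in the query string. The function name sessionInitiatorUrl is illustrative.

```php
<?php
/**
 * Build the URL that triggers session creation ("lazy session" login).
 * $handlerUrl and $location correspond to the handlerURL and Location
 * attributes in shibboleth.xml.
 */
function sessionInitiatorUrl(string $handlerUrl, string $location, string $target): string
{
    // Join the two path parts with exactly one slash and encode the target.
    return rtrim($handlerUrl, '/') . '/' . ltrim($location, '/')
        . '?target=' . rawurlencode($target);
}

// Typical use when the user hits the "Login" button:
// header('Location: ' . sessionInitiatorUrl('/Shibboleth.sso', '/WAYF/myFed', 'https://sp.example.org/'));
// exit;
```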

"Lazy sessions" are identical to the normal required sessions. However, the application should properly detect session expiration and perform appropriate actions, for example - re-establish the session or treat the user as anonymous.

3.4.3   Application sessions

Relying on Shibboleth sessions has some major disadvantages. Their lifetime is fixed in shibboleth.xml, which may be inconvenient for some applications, especially those that use their own session management. For example, users who work with an application frequently may find it annoying to have to log in again every hour. Another problem occurs when the session expires while the user is filling in a form. After posting the form, he is redirected to the identity provider for authentication and the submitted content is lost.

On the other hand, extending the session duration is not a good solution; the problems will occur less often, but they will still be there. The optimal solution is to let the application handle its own session while still taking advantage of Shibboleth's authentication and attribute exchange functionality. This can be achieved by placing only the login page of the application under Shibboleth protection. When a user needs to be authenticated, the application redirects him to that login page. The standard Shibboleth authentication procedure is triggered, and if the user authenticates successfully and meets the requirements, a Shibboleth session is created. The application can easily detect that, establish its own session and store the user data (extracted from the user's attributes) in it. The user is then redirected back and treated as authenticated for as long as the application session is valid.

For example, if the application is located at http://www.example.org/app/, the login page location could be http://www.example.org/app/login/. The path /app/login/ should be set up to require Shibboleth authentication in the web server configuration. Then a simple script index.php located in the /app/login/ directory is required, for example:

<?php
// the application session must be started before $_SESSION is used
session_start();

// first we need to check if the authentication process has been triggered
if (isset($_SERVER['HTTP_SHIB_IDENTITY_PROVIDER'])) {

  // then we check for username
  if (!empty($_SERVER['REMOTE_USER'])) {
    // if present, we save the username into the session
    $_SESSION['user']['username'] = $_SERVER['REMOTE_USER'];
    // ... as well as some other attributes (null if not released)
    $_SESSION['user']['email'] = $_SERVER['HTTP_SHIB_INETORGPERSON_MAIL'] ?? null;
    $_SESSION['user']['name'] = $_SERVER['HTTP_SHIB_PERSON_COMMONNAME'] ?? null;
    // ...

    // perform redirect back to the application and stop the script
    header('Location: http://www.example.org/app/');
    exit;
  }
  else {
    // handle error - username not found
  }

}
else {
  // handle error - no shibboleth session
}
?>

The application itself should check whether the user has been authenticated (whether there is a session with a username stored in it). Unauthenticated users should be redirected to the /app/login/ location:

<?php
session_start();

// check if authenticated
if (!isset($_SESSION['user']['username'])) {
  // redirect the user to perform login and stop further processing
  header('Location: http://www.example.org/app/login/');
  exit;
}
?>

The login script should handle error states properly, otherwise an endless redirection loop may occur between the application and the script.
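One possible safeguard against such a loop is to count consecutive login attempts in the application session and to stop redirecting after a few of them. The following is a minimal sketch under that assumption. The function names, the login_attempts session key and the limit of three attempts are illustrative choices, not part of Shibboleth.

```php
<?php
/**
 * Decide whether the application may send the user to the login page again.
 * Counts consecutive attempts in the session array and refuses after $max.
 */
function mayRedirectToLogin(array &$session, int $max = 3): bool
{
    $attempts = $session['login_attempts'] ?? 0;
    if ($attempts >= $max) {
        return false; // caller should show an error page instead
    }
    $session['login_attempts'] = $attempts + 1;
    return true;
}

/**
 * Reset the counter after a successful login.
 */
function loginSucceeded(array &$session): void
{
    unset($session['login_attempts']);
}

// Typical use in the application, after session_start():
// if (!isset($_SESSION['user']['username'])) {
//     if (mayRedirectToLogin($_SESSION)) {
//         header('Location: http://www.example.org/app/login/');
//         exit;
//     }
//     // otherwise display an error page
// }
```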

3.5   Error handling

If an error occurs while Shibboleth is processing a request, an error page is displayed to the user. The error page is generated from HTML templates that are shipped with Shibboleth and usually located in the main configuration directory (depending on the configuration). These templates should be customized to suit the application's needs and appearance, for example by adding the application's logo, contact information or links to help pages.

The templates are configured in the main configuration file (shibboleth.xml) through the <Errors> element:

<Errors session="/etc/shibboleth/sessionError.html"
        metadata="/etc/shibboleth/metadataError.html"
        rm="/etc/shibboleth/rmError.html"
        access="/etc/shibboleth/accessError.html"
        ssl="/etc/shibboleth/sslError.html"
        supportContact="root@localhost"
        logoLocation="/shibboleth-sp/logo.jpg"
        styleSheet="/shibboleth-sp/main.css" />

There are specific templates for the different types of errors. A simple template language allows dynamic information to be inserted. The parser looks for tags of the form:

<shibmlp tag-name>

The tag-name strings may refer either to XML attributes of the <Errors> element (for example - supportContact) or specific tags described below:

requestURL

The URL associated with the request.

errorType

The type of error.

errorText

The actual error message.

errorDesc

A textual description of the error intended for human consumption.

originContactName

The contact name for the identity provider provided by that site's metadata.

originContactEmail

The contact email address for the identity provider provided by that site's metadata.

originErrorURL

The URL of an error handling page for the identity provider provided by that site's metadata.

It is also possible to use a simple limited form of conditional checking, based on the presence or the absence of data stored in a specific tag:

<shibmlpif tag-name> arbitrary markup </shibmlpif>
<shibmlpifnot tag-name> arbitrary markup </shibmlpifnot>

3.6   Logout functionality

Currently, Shibboleth does not provide global logout functionality. It is possible to perform a local logout at the service provider, but as long as the login session at the identity provider is active, the user can re-enter the application without any interaction, because the necessary redirects are performed transparently.

However, in some cases, especially when lazy sessions or a Shibboleth-protected login page are used, it may be convenient to have a local logout. The easiest way to perform a local logout is to use a virtual location provided by Shibboleth. When that location is accessed, the current session at the service provider is destroyed. The location is configured in the main configuration file shibboleth.xml in the appropriate <Sessions> element, for example:

<Sessions lifetime="7200" timeout="3600" checkAddress="true" 
          consistentAddress="true" handlerURL="/Shibboleth.sso" 
          handlerSSL="true" idpHistory="true" idpHistoryDays="7"
          cookieProps="; path=/; secure">

  <!-- ... -->

  <md:SingleLogoutService Location="/Logout" 
                          Binding="urn:mace:shibboleth:sp:1.3:Logout"/>

</Sessions>

If the name of the server is www.example.org, the complete logout URL according to the configuration is as follows (HTTPS is used because handlerSSL is set to "true"):

https://www.example.org/Shibboleth.sso/Logout
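An application with its own session would normally discard that session before sending the user to the logout location. A minimal sketch follows; the function names are illustrative, and the scheme is passed in explicitly because it depends on how the handler is configured (the handlerSSL attribute).

```php
<?php
/**
 * Build the local logout URL from the values in the <Sessions> element.
 */
function localLogoutUrl(string $host, string $handlerUrl, string $location, bool $ssl): string
{
    $scheme = $ssl ? 'https' : 'http';
    return $scheme . '://' . $host
        . rtrim($handlerUrl, '/') . '/' . ltrim($location, '/');
}

/**
 * Remove all application data from the session array.
 */
function clearApplicationSession(array &$session): void
{
    $session = [];
}

// Typical use in a logout script, after session_start():
// clearApplicationSession($_SESSION);
// session_destroy();
// header('Location: ' . localLogoutUrl('www.example.org', '/Shibboleth.sso', '/Logout', true));
// exit;
```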

At present, the only real global logout is to close the web browser, which destroys the stored cookies that identify the individual sessions. Real global logout has been announced for the next major version of Shibboleth, 2.x.

4   Adapting existing applications

While developing Shibboleth-enabled applications from scratch may be simple and straightforward, adapting existing code may be more or less tricky, depending on its complexity, structure and design. However, experience suggests that the majority of existing web applications and services can be adapted without much trouble. Proprietary closed-source applications are an exception, although workarounds exist even for them.

4.1   Authentication

A Shibboleth-enabled application does not handle authentication itself; it just "receives" the identity from Shibboleth as an environment variable, in most cases the REMOTE_USER variable (depending on the AAP.xml configuration). So the only thing that needs to be done is to bypass the application's own authentication mechanisms and set the identity directly. It is also important to choose the user attribute that best suits the application to carry the user's identity. Its value should be unique for each user. Usually the eduPersonPrincipalName, eduPersonTargetedID or email attribute is used.

Some applications support different authentication back-ends. Mostly, this means that the application implements a uniform generic authentication interface with different authentication adapters. Those applications are the most convenient to adapt, because Shibboleth authentication may be implemented as just another authentication adapter.

The most common authentication methods involve users submitting their credentials (usually a username and password) on a special "login" page. With Shibboleth authentication, the application requires no user input. In that respect it is similar to Apache basic authentication. Applications supporting basic authentication only have to check the REMOTE_USER environment variable and usually do not have their own user management. That makes them easy to adapt to Shibboleth.
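As an illustration of the adapter approach, the sketch below defines a hypothetical generic interface and a Shibboleth adapter for it. AuthAdapter stands in for whatever interface the application already has. Neither name comes from Shibboleth or from any particular application, and the server array is passed in explicitly to keep the code testable.

```php
<?php
/**
 * Generic authentication interface (hypothetical).
 */
interface AuthAdapter
{
    /** Return the authenticated user's identifier, or null if there is none. */
    public function authenticate(array $server): ?string;
}

/**
 * Adapter that trusts the identity supplied by the Shibboleth module.
 */
final class ShibbolethAuthAdapter implements AuthAdapter
{
    /** Name of the variable that carries the identity. */
    private string $identityKey;

    public function __construct(string $identityKey = 'REMOTE_USER')
    {
        $this->identityKey = $identityKey;
    }

    public function authenticate(array $server): ?string
    {
        // A missing or empty variable means no identity was supplied.
        $value = $server[$this->identityKey] ?? '';
        return $value !== '' ? $value : null;
    }
}

// Typical call inside a request:
// $identity = (new ShibbolethAuthAdapter())->authenticate($_SERVER);
```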

4.2   Authorization and user management

Many applications store users and their attributes in local storage, define user groups, assign user roles and permissions, and so on. Often, these applications implement related functionality, such as automatic user registration, forgotten password retrieval or user attribute editing. In a Shibboleth-enabled application, these functions should be disabled, as the application itself does not manage users' identities. However, the internal relations involving users and their attributes (groups, roles, permissions) should be preserved.

A Shibboleth-enabled application should be ready to handle user creation properly. Users may be created manually by the administrator. However, a better way is to create them automatically upon their first login. It will probably be necessary to set some of the internal attributes to values taken from Shibboleth (name, email, contact information), so proper attribute mapping should be implemented as well. It is also necessary to prevent users from editing these "external" values. To keep them up to date, it is a good idea to refresh them each time the user logs in.

In some cases, it may be possible to base the application's authorization mechanisms on attributes from Shibboleth. Mostly, that involves implementing fairly simple logic that assigns roles depending on the values of selected Shibboleth attributes.
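The provisioning steps described above might look like the following sketch. The user storage is an in-memory array standing in for the application's real storage, the default role is an arbitrary choice, and the header names are the ones used in the earlier examples of this paper; all of these are assumptions for illustration.

```php
<?php
/**
 * Create the local user on first login and refresh the externally
 * managed fields on every login. $users is keyed by the identity attribute.
 * Returns the user record, or null when no identity was supplied.
 */
function provisionUser(array &$users, array $server): ?array
{
    $id = $server['REMOTE_USER'] ?? '';
    if ($id === '') {
        return null; // no identity, nothing to provision
    }
    if (!isset($users[$id])) {
        // First login: create the record with application defaults.
        $users[$id] = ['username' => $id, 'roles' => ['user']];
    }
    // Fields owned by the identity provider are overwritten on each login;
    // locally managed data such as roles is left untouched.
    $users[$id]['email'] = $server['HTTP_SHIB_INETORGPERSON_MAIL'] ?? null;
    $users[$id]['name']  = $server['HTTP_SHIB_PERSON_COMMONNAME'] ?? null;
    return $users[$id];
}
```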

5   Conclusion

Developing Shibboleth-enabled applications is fairly easy, although it requires some knowledge of the Shibboleth infrastructure and especially of how the Shibboleth Service Provider handles incoming requests. Adapting an existing application, on the other hand, requires a good understanding of how the application works and how it handles authentication, authorization, session management and user management.

Shibboleth is probably the most widespread federation infrastructure. Numerous applications have already been adapted and many major web service providers support it. A complete list of applications and services that have been Shibboleth-enabled is available online.

6   Other resources
