Integrating applications with Shibboleth Service Provider
CESNET
technical report number 22/2007
Ivan Novakov
6.12.2007
Keywords: shibboleth, service provider, federation, SAML, identity management
1 Abstract
The objective of this paper is to describe how to integrate applications with the Shibboleth infrastructure. It covers the most important aspects and provides appropriate examples. It is intended for developers and network administrators with basic knowledge of Shibboleth and of web servers (Apache, IIS) in general.
2 Introduction
2.1 Federations and Shibboleth
Federated identity management allows sharing user information among different organizations on the basis of mutual trust. Federations serve as a uniform authentication and authorization layer between the users and the services. The main building blocks of a federation are:
- identity providers - store user identities, perform user authentication and provide user attributes
- service providers - offer services or resources
Shibboleth is an Internet2 project implementing a federation infrastructure based on SAML (Security Assertion Markup Language), an XML-based standard for exchanging authentication and authorization data. Shibboleth is currently designed for web applications only (accessed from a web browser), since it takes advantage of some specific features of the HTTP protocol.
2.2 How it works
When a resource managed by a Shibboleth-enabled service provider is requested by an unauthenticated user, the service provider redirects the user to the user's home identity provider for authentication. Upon successful authentication, a SAML assertion is created and sent back to the service provider, which may then let the user access the resource. The service provider may also ask the identity provider for the user's attributes, which can be used for authorization or just for informational purposes.
2.3 Authentication and authorization
The Shibboleth Service Provider is implemented as a web server module (Apache, IIS) and a standalone daemon. All of the functionality (SAML processing, session management, authorization) is carried out by the daemon, while the module just intercepts requests and serves as an interface between the web server and the daemon. Based on its configuration, the module protects certain locations in a way similar to basic authentication in Apache. Incoming requests are passed to a secured location for further processing only if they meet the requirements configured for that location.
The module supplies the secured application with the user's identity and attributes as well as some additional information, for example which identity provider the user belongs to and which authentication method has been used.
3 Integration - common aspects
The Shibboleth infrastructure acts as a layer between the user and the resource. The application does not handle authentication; it just receives the user's identity and attributes from Shibboleth. That makes the development of Shibboleth-enabled applications very easy.
3.1 Attributes
Generally, the Shibboleth Service Provider gives the underlying application access to the user's attributes by setting them as additional HTTP headers. The attributes to be processed by the Shibboleth Service Provider are configured in the Attribute Acceptance Policy (AAP) configuration file, usually AAP.xml in the main configuration directory. A typical attribute configuration looks as follows:
<AttributeRule Name="urn:mace:dir:attribute-def:mail"
Header="Shib-InetOrgPerson-mail"
Alias="email">
<AnySite>
<AnyValue/>
</AnySite>
</AttributeRule>
where
- Name is the attribute name as provided by the identity provider
- Header is the name of the HTTP header that will carry the attribute value
- Alias is a more convenient shorter name to be used in Apache configuration (see below)
By using the appropriate <SiteRule> and <Value> elements, it is also possible to accept an attribute only from certain origins (identity providers) or only with certain values. For more information, see the Internet2 wiki pages. Attributes that are not defined in the AAP.xml file or that do not meet the rule requirements are ignored.
The application can then read the attribute values from the headers. The Apache web server makes the headers accessible through environment variables, so they can easily be retrieved by an application written in almost any programming language (PHP, Perl, C/C++, Java). For example, the attribute configured above is accessible in PHP as follows:
<?php $email = $_SERVER['HTTP_SHIB_INETORGPERSON_MAIL']; ?>
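An attribute may carry more than one value. The following is a minimal sketch of a helper that returns all values of an attribute as an array. The function name shib_attribute_values is hypothetical, and the sketch assumes that multiple values arrive in a single header separated by semicolons; this should be verified against the deployed Service Provider version.

```php
<?php
// Hypothetical helper: returns all values of one Shibboleth attribute header.
// Assumption: multiple values are delivered in one header, separated by ';'.
function shib_attribute_values(array $server, string $key): array
{
    if (!isset($server[$key]) || $server[$key] === '') {
        return array();
    }
    // Split on ';', trim whitespace and drop empty fragments.
    $values = array_map('trim', explode(';', $server[$key]));
    $values = array_filter($values, function ($value) {
        return $value !== '';
    });
    return array_values($values);
}

// Example with a hand-made array standing in for $_SERVER.
$server = array(
    'HTTP_SHIB_INETORGPERSON_MAIL' => 'joe@example.org; joe.smith@example.org',
);
$emails = shib_attribute_values($server, 'HTTP_SHIB_INETORGPERSON_MAIL');
```

In a real script the first argument would simply be $_SERVER. Passing the array explicitly keeps the helper easy to test outside the web server.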
Finally, Shibboleth may provide access to the raw SAML assertions
associated with the request. In order to allow this, the
exportAssertion attribute should be set to true for
the current application in the main configuration file
(shibboleth.xml):
<Host name="sp.example.org"
requireSession="true"
exportAssertion="true"/>
The assertions are exported as a base64-encoded HTTP header value,
which may exceed the maximum header length allowed by the web
server. Fortunately, newer versions of both Apache and IIS provide
settings for raising that limit
(LimitRequestFieldSize in Apache).
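Once the header value reaches the application, it has to be decoded before the XML can be inspected. The sketch below only illustrates that step. The function name decode_exported_assertion is hypothetical, and the name of the header that carries the assertion is deliberately left to the caller, since it depends on the Service Provider version and configuration.

```php
<?php
// Hypothetical helper: decodes a base64-encoded assertion header value.
// Returns the decoded XML text, or null if the value is not valid base64.
function decode_exported_assertion(string $encoded): ?string
{
    // Strict mode makes base64_decode() return false on invalid characters.
    $xml = base64_decode($encoded, true);
    if ($xml === false || $xml === '') {
        return null;
    }
    return $xml;
}

// Example with a stand-in value; a real header holds a full SAML document.
$sample = base64_encode('<Assertion/>');
$decoded = decode_exported_assertion($sample);
```

The decoded string can then be handed to any XML parser. The sketch does not validate signatures or the structure of the assertion.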
3.2 Authentication
In the Shibboleth infrastructure, the authentication process itself does not reveal the identity of the user. The service provider issues an authentication request to the identity provider, which returns a positive answer if the authentication succeeds. This keeps sensitive information away from applications that do not need it. However, many applications require some sort of persistent identity to be able to track users between sessions or to store users' preferences, roles and permissions locally.
That identity may be based on a user attribute. Typically, it is
the eduPersonPrincipalName attribute, but it can be any other
custom attribute. Usually, the value of that "identity" attribute is
mapped to the REMOTE_USER environment variable, for
example:
<AttributeRule Name="urn:mace:dir:attribute-def:eduPersonPrincipalName"
Scoped="true"
Header="REMOTE_USER"
Alias="scoped_user">
<AnySite>
<Value Type="regexp">^[^@]+$</Value>
</AnySite>
</AttributeRule>
The Scoped property states that the value should be
supplemented with information about the user's origin - the scope. The
scope is associated with the identity provider. For example, user
"joe" coming from an identity provider with scope "example.org" will
have the REMOTE_USER variable set to "joe@example.org". The
purpose is to keep users' identifiers unique among different
organizations.
Once the REMOTE_USER variable is set, the application can easily
pick up the user's identity, for example:
<?php
$identity = null;
// !empty() is false for both a missing and an empty value
if (!empty($_SERVER['REMOTE_USER'])) {
    $identity = $_SERVER['REMOTE_USER'];
}
else {
    // Error, identity attribute is missing
}
?>
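Because a scoped identity has the form "user@scope", an application that needs the two parts separately (for example, to group users by home organization) can split the value. The following is a minimal sketch; the function name split_scoped_identity is hypothetical.

```php
<?php
// Hypothetical helper: splits "user@scope" into its two parts.
// Returns null if the value has no '@' or if either part is empty.
function split_scoped_identity(string $identity): ?array
{
    // Use the last '@' so that the scope is always the trailing part.
    $pos = strrpos($identity, '@');
    if ($pos === false || $pos === 0 || $pos === strlen($identity) - 1) {
        return null;
    }
    return array(
        'user'  => substr($identity, 0, $pos),
        'scope' => substr($identity, $pos + 1),
    );
}

$parts = split_scoped_identity('joe@example.org');
```

The full scoped value, not the bare user part, should remain the key under which the application stores the user, since only the scoped form is unique across organizations.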
3.3 Authorization
While authentication is handled outside the service provider, authorization decisions are up to the service itself. There are several ways to handle authorization on a Shibboleth-enabled service provider.
3.3.1 Apache authorization
The simplest solution is to define authorization as directives in
the configuration of the web server. In Apache, these directives can
be placed in the main configuration, enclosed in the relevant
<Files>, <Directory> or
<Location> sections, or in a separate .htaccess
file placed in the directory to be protected. The standard
Require directive is used:
Require entity-name value [value] ...
In that case, entity-name stands for an attribute alias,
defined for each attribute in the AAP.xml configuration file
(the Alias property of the <AttributeRule>
element). For example, the following configuration grants access to
students or staff only:
<Directory /secure>
AuthType shibboleth
ShibRequireSession On
Require affiliation student staff
</Directory>
Multiple Require directives may be defined. Unless
the ShibRequireAll directive is set, access is granted
if any one of the requirements is met.
3.3.2 Shibboleth authorization
In fact, authorization directives defined in the web server's
configuration are simply passed to Shibboleth for processing. The
advantage is that the standard Apache configuration interface is
used. However, the Apache directive Require is not flexible
enough to define more complex rules. Such complex rules can be defined
in the XML-based Shibboleth configuration. The syntax supports nested
logical expressions with the AND, OR and NOT operators. These rules
can be defined in the main configuration file shibboleth.xml
or in a separate file in a special <AccessControl>
element. Example for in-line configuration in
shibboleth.xml:
<RequestMapProvider
type="edu.internet2.middleware.shibboleth.sp.provider.NativeRequestMapProvider">
<RequestMap applicationId="default">
<Host name="sp.example.org">
<Path name="secure" authType="shibboleth" requireSession="true">
<AccessControl>
<AND>
<OR>
<Rule require="affiliation">member@example.org</Rule>
<Rule require="affiliation">member@otherexample.org</Rule>
</OR>
<Rule require="entitlement">urn:mace:example.org:exampleEntitlement</Rule>
</AND>
</AccessControl>
</Path>
</Host>
</RequestMap>
</RequestMapProvider>
To put the access control definitions in a separate file, the
<AccessControlProvider> element should be placed in the
main configuration file, pointing to the external file:
<Host name="sp.example.org">
<Path name="secure" authType="shibboleth" requireSession="true">
<AccessControlProvider
uri="/var/www/secure/.shibacl.xml"
type="edu.internet2.middleware.shibboleth.sp.provider.XMLAccessControl"/>
</Path>
</Host>
The syntax in the external file .shibacl.xml is similar to the in-line configuration:
<?xml version="1.0" encoding="UTF-8"?>
<AccessControl xmlns="urn:mace:shibboleth:target:config:1.0">
<AND>
<OR>
<Rule require="affiliation">member@example.org</Rule>
<Rule require="affiliation">member@otherexample.org</Rule>
</OR>
<Rule require="entitlement">urn:mace:example.org:exampleEntitlement</Rule>
</AND>
</AccessControl>
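To make the semantics of the nested operators concrete, the following sketch evaluates a rule tree equivalent to the example above against a set of user attributes. It is only an illustration in PHP, not Shibboleth's implementation. The function name evaluate_rule and the array-based rule format are hypothetical.

```php
<?php
// Illustrative evaluator for nested AND/OR/NOT access rules.
// The array format (op, children, require, value) is invented for this sketch.
function evaluate_rule(array $node, array $attributes): bool
{
    switch ($node['op']) {
        case 'RULE':
            // Satisfied when the user holds the required attribute value.
            $held = isset($attributes[$node['require']]) ? $attributes[$node['require']] : array();
            return in_array($node['value'], $held, true);
        case 'NOT':
            return !evaluate_rule($node['children'][0], $attributes);
        case 'AND':
            foreach ($node['children'] as $child) {
                if (!evaluate_rule($child, $attributes)) {
                    return false;
                }
            }
            return true;
        case 'OR':
            foreach ($node['children'] as $child) {
                if (evaluate_rule($child, $attributes)) {
                    return true;
                }
            }
            return false;
        default:
            throw new InvalidArgumentException('Unknown operator: ' . $node['op']);
    }
}

// Same logic as the XML example: (affiliation A OR affiliation B) AND entitlement.
$rules = array('op' => 'AND', 'children' => array(
    array('op' => 'OR', 'children' => array(
        array('op' => 'RULE', 'require' => 'affiliation', 'value' => 'member@example.org'),
        array('op' => 'RULE', 'require' => 'affiliation', 'value' => 'member@otherexample.org'),
    )),
    array('op' => 'RULE', 'require' => 'entitlement', 'value' => 'urn:mace:example.org:exampleEntitlement'),
));

$user = array(
    'affiliation' => array('member@example.org'),
    'entitlement' => array('urn:mace:example.org:exampleEntitlement'),
);
$granted = evaluate_rule($rules, $user);
```

A user who holds one of the two affiliations but lacks the entitlement is rejected, because the outer AND requires both branches to hold.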
3.3.3 Custom authorization
Authorization may be - and most probably will be - embedded in the application itself. The attribute values are accessible from the environment, so the application can read them and make additional authorization decisions. For example, roles based on attribute values may be assigned to be used in the application's own ACL system.
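<?php
$userRole = null;
// assumption: multiple attribute values arrive separated by semicolons
$affiliations = isset($_SERVER['HTTP_SHIB_EP_AFFILIATION'])
    ? explode(';', $_SERVER['HTTP_SHIB_EP_AFFILIATION'])
    : array();
if (in_array('member@example.org', $affiliations)) {
    $userRole = 'member';
}
?>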
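When several attribute values map to application roles, a small lookup table keeps the mapping in one place. The sketch below is one possible approach. The function name roles_from_affiliations, the role names and the mapping itself are hypothetical, and the semicolon separator for multiple values is an assumption.

```php
<?php
// Hypothetical helper: maps affiliation values to application roles.
// Assumption: the header holds one or more values separated by ';'.
function roles_from_affiliations(string $headerValue, array $roleMap): array
{
    $roles = array();
    foreach (explode(';', $headerValue) as $affiliation) {
        $affiliation = trim($affiliation);
        if (isset($roleMap[$affiliation])) {
            // Use the role as the key so that each role appears only once.
            $roles[$roleMap[$affiliation]] = true;
        }
    }
    return array_keys($roles);
}

// Illustrative mapping; real values depend on the federation's conventions.
$roleMap = array(
    'member@example.org' => 'member',
    'staff@example.org'  => 'editor',
);
$roles = roles_from_affiliations(
    'student@example.org;staff@example.org;member@example.org',
    $roleMap
);
```

Values that have no entry in the table are ignored, so unknown affiliations never grant a role by accident.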
3.4 Sessions
Upon successful authentication, a session is established at the service provider. The user is granted access to the protected resource for the duration of that session. After the session expires, the service provider issues a new authentication request to the identity provider. The identity provider keeps its own login session for the user, and as long as that session is valid, the user is not asked for credentials again when a service provider issues an authentication request. The sessions at both the service provider and the identity provider are limited in time. Their duration depends on local policies and the specific needs of the applications.
3.4.1 Common use
For applications or resources where authentication is always
required, it is appropriate to enforce session
creation for every request. In the Apache configuration that is done
through the ShibRequireSession directive, for example:
<Directory /secure>
AuthType shibboleth
ShibRequireSession On
...
</Directory>
The same can be achieved in the Shibboleth configuration. To let Shibboleth decide how to handle sessions, the following Apache configuration is needed:
<Directory /secure>
AuthType shibboleth
Require shibboleth
...
</Directory>
In Shibboleth configuration, it is then possible to use the
requireSession attribute in the <Path> element
in shibboleth.xml:
<Host name="sp.example.org">
<Path name="secure" authType="shibboleth" requireSession="true">
...
</Path>
</Host>
This way of handling sessions is appropriate mostly for protecting static content or simple applications that do not maintain their own sessions.
3.4.2 "Lazy sessions"
Many applications allow anonymous access. Users can visit the site and optionally log in to access protected content. In that case it is not appropriate to enforce session creation. Instead, sessions need to be created on demand. Shibboleth provides this functionality as so-called "lazy sessions".
To achieve that, it is necessary to switch off the session
requirement in Shibboleth by setting the requireSession
attribute to "false" (see previous section). To invoke the session
creation process (usually when the user hits the "Login" button), the
application has to "ask" the Shibboleth SessionInitiator for a
session. Instead of providing a specific API for that task,
Shibboleth takes advantage of standard HTTP redirects and uses
a special virtual location. By redirecting the user to that location, the application
triggers session creation.
That virtual location may be, for example:
/Shibboleth.sso/WAYF/myFed?target=https://sp.example.org/
The /Shibboleth.sso portion of the path is the value of the
handlerURL attribute in the <Sessions>
element:
<Sessions lifetime="7200" timeout="3600"
checkAddress="true" consistentAddress="true"
handlerURL="/Shibboleth.sso" handlerSSL="true"
idpHistory="true" idpHistoryDays="7"
cookieProps="; path=/; secure">
...
</Sessions>
The /WAYF/myFed part is a specific location set in the
<SessionInitiator> element:
<SessionInitiator
isDefault="true" id="myfed"
Location="/WAYF/myFed"
Binding="urn:mace:shibboleth:sp:1.3:SessionInit"
wayfURL="https://www.myfed.org/wayf/"
wayfBinding="urn:mace:shibboleth:1.0:profiles:AuthnRequest"/>
The target query parameter indicates the URL the user
should be redirected to after performing successful
authentication.
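Since the target value is itself a URL placed inside a query string, it should be URL-encoded when the application builds the link. The following is a minimal sketch; the function name session_initiator_url is hypothetical, and the handler path and location are taken from the example configuration above.

```php
<?php
// Hypothetical helper: builds the lazy-session login URL.
// $handlerUrl and $location mirror handlerURL and the SessionInitiator Location.
function session_initiator_url(string $handlerUrl, string $location, string $target): string
{
    return rtrim($handlerUrl, '/') . '/' . ltrim($location, '/')
        . '?target=' . rawurlencode($target);
}

$loginUrl = session_initiator_url('/Shibboleth.sso', '/WAYF/myFed', 'https://sp.example.org/');
// $loginUrl is "/Shibboleth.sso/WAYF/myFed?target=https%3A%2F%2Fsp.example.org%2F"
```

The application's "Login" button can then link to $loginUrl or redirect to it with a Location header.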
Once created, "lazy sessions" are identical to normal required sessions. However, the application has to detect session expiration itself and react appropriately, for example by re-establishing the session or by treating the user as anonymous.
3.4.3 Application sessions
Relying on Shibboleth sessions has some major disadvantages. Their lifetime is fixed in shibboleth.xml, which may be inconvenient for some applications, especially those that have their own session management. For example, users who work with an application frequently may find it annoying to log in again every hour. Another problem occurs when the session expires while the user is completing a form. After posting the form, the user is redirected to the identity provider for authentication and the submitted content is lost.
On the other hand, extending the session duration is not a good solution: the problems will occur less often, but they will still be there. The optimal solution is to let the application handle its own session while still taking advantage of Shibboleth's authentication and attribute exchange functionality. This can be achieved by placing only the login page of the application under Shibboleth protection. When a user needs to be authenticated, the application redirects the user to that login page. The standard Shibboleth authentication procedure is triggered, and if the user authenticates successfully and meets the requirements, a Shibboleth session is created. The application can easily detect that, establish its own session and store the user data (extracted from the user's attributes) in it. The user is then redirected back and treated as authenticated as long as the application session is valid.
For example, if the application is located at http://www.example.org/app/, the login page could be located at http://www.example.org/app/login/. The path /app/login/ should be set up in the web server configuration to require Shibboleth authentication. A simple script index.php is then placed in the /app/login/ directory, for example:
<?php
// the application session must be started before $_SESSION is used
session_start();
// first we need to check if the authentication process has been triggered
if (isset($_SERVER['HTTP_SHIB_IDENTITY_PROVIDER'])) {
    // then we check for username
    if (isset($_SERVER['REMOTE_USER'])) {
        // if present, we save the username into the session
        $_SESSION['user']['username'] = $_SERVER['REMOTE_USER'];
        // ... as well as some other attributes
        $_SESSION['user']['email'] = $_SERVER['HTTP_SHIB_INETORGPERSON_MAIL'];
        $_SESSION['user']['name'] = $_SERVER['HTTP_SHIB_PERSON_COMMONNAME'];
        // ...
        // perform redirect back to the application and stop the script
        header('Location: http://www.example.org/app/');
        exit;
    }
    else {
        // handle error - username not found
    }
}
else {
    // handle error - no shibboleth session
}
?>
The application itself should check whether the user has been authenticated (whether there is a session with a username stored in it). Unauthenticated users should be redirected to the /app/login/ location:
<?php
session_start();
// check if authenticated
if (!isset($_SESSION['user']['username'])) {
    // redirect the user to perform login and stop processing the request
    header('Location: http://www.example.org/app/login/');
    exit;
}
?>
The login script should handle error states properly, otherwise an endless redirection loop may occur between the application and the script.
3.5 Error handling
If an error occurs while Shibboleth is processing the request, an error page is displayed to the user. The error page is created from HTML templates that are shipped with Shibboleth and are usually located in the main configuration directory (depending on the configuration). These templates should be customized to suit the application's needs and appearance. For example, it may be useful to add the application's logo, contact information, links to help pages etc.
The templates are configured in the main configuration file
(shibboleth.xml) through the <Errors>
element:
<Errors session="/etc/shibboleth/sessionError.html"
metadata="/etc/shibboleth/metadataError.html"
rm="/etc/shibboleth/rmError.html"
access="/etc/shibboleth/accessError.html"
ssl="/etc/shibboleth/sslError.html"
supportContact="root@localhost"
logoLocation="/shibboleth-sp/logo.jpg"
styleSheet="/shibboleth-sp/main.css" />
There are specific templates for the different types of errors. A simple template language allows dynamic information to be inserted. The parser looks for tags of the following form:
<shibmlp tag-name />
The tag-name strings may refer either to XML attributes of
the <Errors> element (for example -
supportContact) or specific tags described below:
- requestURL - The URL associated with the request.
- errorType - The type of error.
- errorText - The actual error message.
- errorDesc - A textual description of the error intended for human consumption.
- originContactName - The contact name for the identity provider provided by that site's metadata.
- originContactEmail - The contact email address for the identity provider provided by that site's metadata.
- originErrorURL - The URL of an error handling page for the identity provider provided by that site's metadata.
It is also possible to use a simple limited form of conditional checking, based on the presence or the absence of data stored in a specific tag:
<shibmlpif tag-name> arbitrary markup </shibmlpif>
<shibmlpifnot tag-name> arbitrary markup </shibmlpifnot>
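The substitution rules described above can be illustrated with a small stand-alone renderer. This is only a sketch of the described semantics written in PHP, not the parser used by Shibboleth. The function name render_error_template is hypothetical, and details such as whitespace handling or escaping of inserted values are assumptions.

```php
<?php
// Illustrative renderer for the template tags described in the text.
// Not Shibboleth's parser; it only mimics the documented behaviour.
function render_error_template(string $template, array $tags): string
{
    // Keep <shibmlpif> blocks only when the tag has data.
    $template = preg_replace_callback(
        '~<shibmlpif\s+(\w+)\s*>(.*?)</shibmlpif>~s',
        function ($m) use ($tags) {
            return !empty($tags[$m[1]]) ? $m[2] : '';
        },
        $template
    );
    // Keep <shibmlpifnot> blocks only when the tag has no data.
    $template = preg_replace_callback(
        '~<shibmlpifnot\s+(\w+)\s*>(.*?)</shibmlpifnot>~s',
        function ($m) use ($tags) {
            return empty($tags[$m[1]]) ? $m[2] : '';
        },
        $template
    );
    // Replace the remaining simple tags with their values (empty if unknown).
    return preg_replace_callback(
        '~<shibmlp\s+(\w+)\s*/?>~',
        function ($m) use ($tags) {
            return isset($tags[$m[1]]) ? $tags[$m[1]] : '';
        },
        $template
    );
}

$template = '<p><shibmlp errorType/></p>'
    . '<shibmlpif supportContact>Contact: <shibmlp supportContact/></shibmlpif>'
    . '<shibmlpifnot originErrorURL>No error page known.</shibmlpifnot>';
$html = render_error_template($template, array(
    'errorType'      => 'Session Error',
    'supportContact' => 'root@localhost',
));
```

Here the conditional block for supportContact is kept because a value is present, while the shibmlpifnot block is kept because originErrorURL is absent.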
3.6 Logout functionality
Currently, Shibboleth does not provide global logout functionality. It is possible to perform a local logout at the service provider, but as long as the login session at the identity provider is active, the user can re-enter the application without any interaction, because the necessary redirects are performed transparently.
However, in some cases, especially when lazy sessions or a
Shibboleth-protected login page are used, it may be convenient to have
a local logout. The easiest way to perform a local logout is to use a
virtual handler location provided by Shibboleth. Upon accessing that
location, the current session at the service provider is
destroyed. That location is configured in the main configuration file
shibboleth.xml in the appropriate <Sessions> element,
for example:
<Sessions lifetime="7200" timeout="3600" checkAddress="true"
consistentAddress="true" handlerURL="/Shibboleth.sso"
handlerSSL="true" idpHistory="true" idpHistoryDays="7"
cookieProps="; path=/; secure">
<!-- ... -->
<md:SingleLogoutService Location="/Logout"
Binding="urn:mace:shibboleth:sp:1.3:Logout"/>
</Sessions>
If the name of the server is www.example.org, the complete logout URL according to this configuration (with handlerSSL set to "true", the handler is accessed over HTTPS) is then:
https://www.example.org/Shibboleth.sso/Logout
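An application that keeps its own session would typically clear that session first and then send the browser to the Shibboleth logout location. The following is a minimal sketch under that assumption. The function names shib_logout_url and application_logout are hypothetical, the default paths mirror the example configuration, and the $ssl flag is meant to follow the handlerSSL setting.

```php
<?php
// Hypothetical helper: builds the absolute URL of the local logout handler.
function shib_logout_url(
    string $host,
    string $handlerUrl = '/Shibboleth.sso',
    string $location = '/Logout',
    bool $ssl = true
): string {
    $scheme = $ssl ? 'https' : 'http';
    return $scheme . '://' . $host . rtrim($handlerUrl, '/') . '/' . ltrim($location, '/');
}

// Hypothetical application logout: clears the application's own session data
// and returns the URL the browser should be redirected to.
function application_logout(array &$session, string $host): string
{
    $session = array();            // drop the application's session data
    return shib_logout_url($host); // the caller sends the Location header
}

// Example with a plain array standing in for $_SESSION.
$session = array('user' => array('username' => 'joe@example.org'));
$redirect = application_logout($session, 'www.example.org');
```

In a real script the first argument would be $_SESSION, followed by header('Location: ' . $redirect) and exit.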
At present, the only way to perform a real global logout is to close the web browser, which destroys the stored cookies that identify the individual sessions. A real global logout has been announced for the next major version of Shibboleth - 2.x.
4 Adapting existing applications
While developing Shibboleth-enabled applications from scratch may be simple and straightforward, adapting existing code may be more or less tricky depending on its complexity, structure and design. However, experience shows that the majority of existing web applications and services can be adapted without much trouble. The exception is proprietary closed source applications, although workarounds exist for adapting even those.
4.1 Authentication
A Shibboleth-enabled application does not handle authentication
itself; it just "receives" the identity from Shibboleth as an
environment variable, in most cases the REMOTE_USER variable
(depending on the AAP.xml configuration). So the only thing
that needs to be done is to bypass the application's own authentication
mechanisms and set the identity directly. It is also important to
choose the user attribute that best suits the application to carry
the user's identity. Its value should be unique for each user. Usually the
eduPersonPrincipalName, eduPersonTargetedID or
email attribute is used.
Some applications support different authentication back-ends. In most cases this means that the application implements a uniform generic authentication interface with different authentication adapters. Such applications are the most convenient to adapt, because Shibboleth authentication can be implemented as just another authentication adapter.
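As an illustration of the adapter approach, the sketch below shows what a Shibboleth adapter might look like in an application that exposes a generic authentication interface. The interface AuthAdapter and the class ShibbolethAuthAdapter are hypothetical; a real application defines its own interface, and the adapter has to follow it.

```php
<?php
// Hypothetical generic interface an application might define for its back-ends.
interface AuthAdapter
{
    // Returns the authenticated user's identifier, or null if there is none.
    public function authenticate(array $server): ?string;
}

// Hypothetical adapter: takes the identity from the variable set by Shibboleth.
final class ShibbolethAuthAdapter implements AuthAdapter
{
    private $identityKey;

    public function __construct(string $identityKey = 'REMOTE_USER')
    {
        $this->identityKey = $identityKey;
    }

    public function authenticate(array $server): ?string
    {
        // No credentials are checked here; Shibboleth has already done that.
        $identity = isset($server[$this->identityKey]) ? $server[$this->identityKey] : '';
        return $identity !== '' ? $identity : null;
    }
}

// Example with a hand-made array standing in for $_SERVER.
$adapter = new ShibbolethAuthAdapter();
$identity = $adapter->authenticate(array('REMOTE_USER' => 'joe@example.org'));
```

The rest of the application keeps calling the generic interface and does not need to know that the identity came from Shibboleth.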
The most common authentication methods involve users submitting
their credentials (usually username and password) at a special "login"
page. In Shibboleth authentication, the application requires no user input. In that
respect it is similar to Apache basic authentication. Applications
supporting basic authentication only have to check the REMOTE_USER
environment variable and usually do not have their own user
management. That makes them easy to adapt to Shibboleth.
4.2 Authorization and user management
Many applications store users and their attributes locally, define user groups, and assign user roles, permissions etc. Often, these applications implement related functionality, such as automatic user registration, forgotten password retrieval, user attribute editing etc. In a Shibboleth-enabled application, these features should be disabled, as the application itself does not manage users' identities. However, the internal relations involving users and their attributes (groups, roles, permissions) should be preserved.
A Shibboleth-enabled application should be ready to handle user creation properly. Users may be created manually by the administrator. However, a better way is to create them automatically upon their first login. Probably, it will be necessary to set some of the internal attributes with values taken from Shibboleth (name, email, contact info), so proper attribute mapping should be implemented as well. It is also necessary to prevent users from editing these "external" values. To keep them up-to-date, it is a good idea to refresh them each time the user logs in.
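The automatic creation and refresh of users can be sketched as follows. The example uses a plain array in place of the application's user storage. The function name provision_user, the record layout and the default role are hypothetical.

```php
<?php
// Hypothetical provisioning step run on every login.
// $users stands in for the application's user storage (e.g. a database table).
function provision_user(array &$users, string $identity, array $externalAttributes): array
{
    if (!isset($users[$identity])) {
        // First login: create the local record with application defaults.
        $users[$identity] = array('roles' => array('user'), 'external' => array());
    }
    // Refresh the externally managed values on every login;
    // locally managed data such as roles is left untouched.
    $users[$identity]['external'] = $externalAttributes;
    return $users[$identity];
}

$users = array();
provision_user($users, 'joe@example.org', array('email' => 'joe@example.org', 'name' => 'Joe'));
// An administrator later changes a locally managed value.
$users['joe@example.org']['roles'] = array('admin');
// On the next login the external values are refreshed, the roles are kept.
$record = provision_user($users, 'joe@example.org', array('email' => 'j.smith@example.org', 'name' => 'Joe Smith'));
```

Keeping the external values in a separate part of the record makes it straightforward to exclude them from the user's own profile editing form.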
In some cases, it may be possible to base application's authorization mechanisms on attributes from Shibboleth. Mostly, that involves implementing more or less simple logic responsible for assigning roles, depending on the values of selected Shibboleth attributes.
5 Conclusion
Developing Shibboleth-enabled applications is fairly easy, although it requires some knowledge of the Shibboleth infrastructure and especially of how the Shibboleth Service Provider handles incoming requests. Adapting an existing application, on the other hand, requires a good understanding of how the application works and how it handles authentication, authorization, sessions and user management.
Shibboleth is probably the most widespread federation infrastructure. Numerous applications have already been adapted and many major web service providers support it. The complete list of applications and services that have been Shibboleth-enabled is available online.