UTLogin Stability Project Overview

8/9/2017 - As of June 14, 2017, Action 1 has been completed. Action 2 has completed the Planning and Requirements phases and is now focused on developing the Design. Regular reporting of UTLogin availability has begun as an ongoing action.

UTLogin provides centralized authentication (single sign-on) services for more than 250 campus systems through a combination of Web Policy Agents (WPAs) installed on on-campus servers and SAML federation with off-campus systems. UTLogin processes more than 55 million authentication requests annually.

A marked increase in UTLogin service interruptions and system instability began in the summer of 2016. Although mitigations and fixes have been implemented to address each issue, new issues with different causes continue to appear. The Identity and Access Management (IAM) team believes the overall root cause of the ongoing instability is a combination of three major factors:

  • Maintenance complexity and support issues related to customizations and non-standard configuration of the base OpenAM vendor product implemented within UTLogin
  • Aging UTLogin system components that are at or near end of life
  • An increase in the number and complexity of the sites being protected by UTLogin

Goals and Scope

The IAM team will focus on simplifying and standardizing the UTLogin environment. UTLogin system components will be upgraded to current, well-supported versions. During this upgrade, customizations and non-standard configurations of OpenAM will be removed. Specifically, native OpenAM capabilities will be used for whitelist filtering and brute-force attack defenses. UTLogin’s current dependency on TED will also be removed. Reliance on external dependencies such as DNS and load balancing services will be reduced to the bare minimum. The authentication policy model will be simplified while preserving, if possible, the ability for CSUs to maintain their own policies.

Expert OpenAM consultants will be engaged to review UTLogin requirements and the design for the updated UTLogin environment. They will also provide cost and schedule proposals for deploying the new environment, with options for accelerating development, testing, and implementation of the new environment.

Scope

  • Action 1: Stabilize Current UTLogin Environment – Keep the current environment as stable as possible by putting the system in a “critical fix only” mode and limiting unproductive investment of time in the current environment.
  • Action 2: Simplify & Standardize UTLogin Environment – Upgrade system components to current supported versions, remove customizations and non-standard configurations of the base OpenAM product, minimize external dependencies, and review and simplify the authentication policy model.
  • Action 3: Measure & Report Progress – Monitor key performance indicators (KPIs) and report progress toward improving stability to UTLogin stakeholders.

Timeline

Action 1 (Complete): As of June 8, 2017, the IAM team disabled the self-service Realm Policy Agent and put the existing UTLogin environment in a “critical fix only” mode. Efforts will now focus on Action 2.

Action 2 (In Progress): The Planning and Requirements phases have been completed. The UTLogin team has developed the system requirements and is currently working on system design and prototyping. Work is also underway to procure expert OpenAM advisory services to assist the team. The implementation timeline will be developed once the design is complete.

Action 3 (Ongoing): KPIs have been identified and are being published on a weekly basis (see: UTLogin Stability Report). Monthly status updates will be provided outlining incidents, KPIs, and project status.

For more information regarding UTLogin Stabilization efforts, please visit: https://iamservices.utexas.edu/utlogin-stability-roadmap-june-2017/