Making Skies Safer: Applying analytics to aviation passenger prescreening systems
by Laura A. McLay, Sheldon H. Jacobson and John E. Kobza
The terrorist events on Sept. 11, 2001, will forever alter the way our nation views aviation security. The article by Barnett (2001) in OR/MS Today highlighted numerous important questions and issues surrounding the events of that day and how air travel has been and will continue to be affected. Since then, aviation security systems have undergone significant changes, though the analysis of such systems continues to lag well behind their actual operation. Operations research provides a unique set of methodologies and tools for designing and analyzing aviation security systems, since the foundation of operations research is based on applying analytical methods to optimally allocate and use scarce assets in making better informed decisions.
The purpose of this article is to provide a brief survey of aviation security system applications that have been used or are well positioned to benefit from operations research modeling and analysis techniques. The research efforts discussed apply operations research methodologies to address problems in the area of passenger prescreening, an important and highly visible aspect of aviation security operations. Three specific issues are highlighted: identifying performance measures, analyzing how passenger prescreening systems can fail or succeed, and designing effective passenger screening systems.
Over the past several years, there have been numerous changes to all aspects of aviation security systems, all designed to prevent a reoccurrence of the events on Sept. 11, 2001. Some of the changes include reinforcing cockpit doors, expanding the federal air marshal program, allowing only ticketed passengers to enter the enplane side of airport terminals, using bomb-sniffing dogs and screening all checked baggage for explosives.
Many of the changes implemented have been politically driven – they have been a direct result of the “kneejerk” emotional response to Sept. 11, rather than from any coordinated, systematic analysis and planning. For example, within two months after the attacks, the United States Congress mandated 100-percent screening of checked baggage by a federally certified screening device or procedure by Dec. 31, 2002, as part of the Aviation and Transportation Security Act. Prior to Sept. 11, only a small fraction of checked baggage was screened in this manner. The rapid deployment of explosive detection devices in order to meet this deadline resulted in several billion dollars being invested before any type of systematic analysis of baggage screening security systems was performed. Operations research provides methodologies that can be used to determine how taxpayer dollars can be optimally spent and how security system assets can be optimally used.
Passenger Screening and Prescreening
There are two basic approaches to passenger screening: uniform screening and selective screening. From the introduction of passenger screening in the early 1970s until 1998, a uniform screening strategy was used, whereby all passengers were screened in the same manner. During this period, passengers were screened by metal detectors, and their carry-on baggage was screened by X-ray machines. The main argument for uniform screening is that all passengers should receive the highest level of screening since anyone could pose a threat. In contrast, a selective screening strategy targets additional security resources on a few passengers perceived as being of higher risk. The main argument for selective screening is that directing expensive security assets toward fewer passengers may be more cost-effective since most passengers do not pose a threat to the system.
Passenger screening systems can be designed to detect items that are a threat or passengers who are a threat. Through the use of X-ray machines and metal detectors, the passenger screening systems currently being used in the United States focus on detecting items that are a threat. Although this does not prevent terrorists from boarding airplanes, detecting threat items removes the tools that can be used to stage an attack. The Transportation Security Administration (TSA) has pursued the notion of detecting passengers who are a threat by coupling selective screening systems with a passenger prescreening system, an automated computer system that performs a risk assessment of each passenger prior to their arrival at the airport. If such a system is used, how the passengers are screened at the airport is a function of their assessed risk.
In 1998, a selective screening system was implemented that used a computer-aided passenger prescreening system (CAPPS) that selected passengers for additional screening. CAPPS was designed to eradicate human bias in the risk assessment decision-making process. Those passengers who were cleared of being a security risk were labeled nonselectees, while those who could not be cleared of being a security risk were labeled selectees. The main screening difference between these two classes of passengers is that checked bags of selectees were screened for explosives. Although the exact information used by CAPPS is classified, reports in the popular press indicate that it used information provided at the point of ticket purchase, including demographic and flight information, frequent flyer status of the passenger, and how the passenger purchased their ticket.
CAPPS has been in use since 1998. After Sept. 11, aviation security moved in the direction of uniform screening with the enactment into law of the 100-percent checked baggage screening mandate, which eliminated the distinction between selectees and nonselectees. The TSA revisited selective screening policies through the development of CAPPS II, a refinement of CAPPS. However, on July 14, 2004, the TSA announced that CAPPS II would not be implemented due to privacy concerns, despite having invested $100 million in its development. Shortly thereafter, the TSA announced plans to replace CAPPS II with Secure Flight, a passenger prescreening system akin to CAPPS II, which partitions passengers into three risk classes: selectees, nonselectees and a third class of passengers who are not allowed to fly. This third group is extremely small and is, in part, based on FBI watchlists.
Cost-benefit analyses of different baggage screening strategies provide a method of assessing and comparing the value of such approaches. Virta et al. (2003) perform an economic analysis capturing the tradeoffs of using explosive detection systems (EDSs) to screen only selectee baggage versus screening both selectee and non-selectee baggage (i.e., the 100-percent baggage screening mandate). They conclude that the marginal increase in security per dollar spent is significantly lower for the 100-percent baggage-screening mandate than when only selectee bags are screened. Jacobson et al. (2005) incorporate deterrence into this model (one of the indirect benefits of screening both selectee and nonselectee baggage), based on a remark by the inspector general of the United States Department of Transportation, and conclude that the cost effectiveness of the 100-percent baggage screening mandate depends on the degree to which it can reduce the underlying threat level.
Barnett et al. (2001) perform a large-scale experiment at several commercial airports in the United States to estimate the costs and disruptions associated with a positive passenger baggage matching policy (PPBM). Under PPBM, unaccompanied checked baggage is removed from aircraft on originating flights. PPBM can be applied to all or a portion of checked baggage. The findings of Barnett et al. (2001) counter predictions by the airlines that using PPBM would be expensive and result in widespread delays when used on all checked baggage. They found that on average, one in seven flights experienced a delay, with each such delay averaging approximately seven minutes.
Identifying Performance Measures
Based on the number of aviation security changes that have been implemented since Sept. 11, 2001, and the fierce political and public debate surrounding these changes, it has become apparent that it is a challenge to define what good aviation security is. Identifying performance measures of interest is not only important for long-term planning of security systems, but also for efficiently managing day-to-day operations and effectively managing security systems in transition. These performance measures can be incorporated into various types of passenger screening problems, including applications in discrete optimization models, applied probability models, cost benefit analyses and risk assessments.
Since Sept. 11, 2001, much of the interest in passenger screening systems has been limited to reducing the false clear rate — the conditional probability that there is no alarm response for a threat passenger or bag. An alternative is to reduce the false alarm rate — the conditional probability that there is an alarm response for a nonthreat passenger or bag. The false clear and false alarm rates cannot be simultaneously minimized (Kobza and Jacobson 1997). For example, if all passengers were allowed to board their flights with no screening, the false alarm rate would be 0 percent while the false clear rate would be 100 percent.
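The two rates defined above can be computed directly from screening outcome counts. The following is a minimal sketch in Python; the class name, field names and the counts for the no-screening scenario are illustrative assumptions, not data from any of the studies cited here.

```python
from dataclasses import dataclass


@dataclass
class ScreeningOutcomes:
    """Hypothetical outcome counts for a batch of screened passengers or bags."""
    threat_alarm: int      # threats that triggered an alarm
    threat_clear: int      # threats that did not trigger an alarm
    nonthreat_alarm: int   # nonthreats that triggered an alarm
    nonthreat_clear: int   # nonthreats that did not trigger an alarm


def false_clear_rate(o: ScreeningOutcomes) -> float:
    """P(no alarm | threat)."""
    threats = o.threat_alarm + o.threat_clear
    if threats == 0:
        raise ValueError("no threats in the sample; rate is undefined")
    return o.threat_clear / threats


def false_alarm_rate(o: ScreeningOutcomes) -> float:
    """P(alarm | nonthreat)."""
    nonthreats = o.nonthreat_alarm + o.nonthreat_clear
    if nonthreats == 0:
        raise ValueError("no nonthreats in the sample; rate is undefined")
    return o.nonthreat_alarm / nonthreats


# The extreme case from the text: nobody is screened, so nothing ever alarms.
no_screening = ScreeningOutcomes(threat_alarm=0, threat_clear=5,
                                 nonthreat_alarm=0, nonthreat_clear=995)
print(false_alarm_rate(no_screening), false_clear_rate(no_screening))
```

With no alarms at all, the false alarm rate is 0 and the false clear rate is 1, which is the trade-off the example in the text describes.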
Since the vast majority of passengers are not threats, most alarms are in fact false alarms. A system with a low false clear rate may have a large false alarm rate, which can be very expensive, since there must be secondary screening procedures in place to resolve such alarms. In rare cases, the bomb squad must inspect a suspect bag or an airport terminal must be shut down for several hours, resulting in millions of dollars in losses to the airlines for a single false alarm incident.
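The claim that most alarms are false alarms follows from Bayes' rule when threats are rare. The sketch below illustrates the arithmetic; the function name and the numerical inputs (one threat per million bags, a 10-percent false clear rate and a 5-percent false alarm rate) are hypothetical values chosen only for illustration.

```python
def share_of_alarms_that_are_false(threat_prevalence: float,
                                   false_clear_rate: float,
                                   false_alarm_rate: float) -> float:
    """P(nonthreat | alarm), by Bayes' rule."""
    # Probability that a random item is a threat AND alarms.
    true_alarms = threat_prevalence * (1.0 - false_clear_rate)
    # Probability that a random item is a nonthreat AND alarms.
    false_alarms = (1.0 - threat_prevalence) * false_alarm_rate
    total_alarms = true_alarms + false_alarms
    if total_alarms == 0:
        raise ValueError("the system never alarms; the share is undefined")
    return false_alarms / total_alarms


# Hypothetical inputs: threats are one in a million.
share = share_of_alarms_that_are_false(threat_prevalence=1e-6,
                                       false_clear_rate=0.10,
                                       false_alarm_rate=0.05)
print(f"{share:.6f}")
```

Under these assumed inputs, nearly every alarm is a false alarm, even though the device itself catches 90 percent of threats. This is why the cost of resolving alarms is driven almost entirely by the false alarm rate.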
Other performance measures deal with passenger screening systems in transition. When CAPPS was used to determine which checked baggage was screened for explosives between 1998 and 2001, there was an insufficient number of baggage screening devices available in many of the nation’s airports to screen all selectee bags for explosives. This partial baggage-screening problem has not been made obsolete by the 100-percent baggage-screening mandate following Sept. 11. It models any such scenario when a new screening technology has been partially deployed and is used under a selective screening system and, because of limited capacity, not all selectees can be screened by the new technology. These performance measures focus on the types of risk that can be reduced by a single screening technology or a series of screening devices working together in a system. There may be other types of risks on a flight that are not considered by these performance measures.
Fully utilizing baggage-screening devices is one possible performance measure for the partial baggage-screening problem. Intuitively, it is desirable to screen additional checked bags such that the new screening devices are being used up to their capacity. Jacobson et al. (2003) introduce two alternate performance measures that capture risk across a set of flights and incorporate them into discrete optimization models. The measures are considered for a set of flights carrying both selectee and nonselectee baggage. A flight is said to be covered if all the selectee bags on it have been screened and cleared. One measure considers the total number of covered flights. Optimizing over this measure minimizes the number of flights that may be subject to a particular risk. Another measure considers the total number of passengers on covered flights. Optimizing over this measure minimizes the total number of passengers on flights that may be subject to a particular risk. Note that by optimizing over these measures, the utilization of the baggage screening devices is indirectly maximized. Depending on which measure is chosen, however, the security of the system can be optimal in two distinct ways, putting either fewer flights at risk or fewer passengers at risk.
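A toy instance shows how the two measures can select different sets of flights. The sketch below is a brute-force illustration under stated assumptions, not the formulation of Jacobson et al. (2003): the flight data, the capacity of six bags and the function name `best_cover` are all hypothetical, and a flight counts as covered only if every one of its selectee bags fits within the screening capacity.

```python
from itertools import combinations

# Hypothetical flights: name -> (selectee bags, passengers on board).
FLIGHTS = {"A": (4, 100), "B": (3, 60), "C": (3, 50), "D": (6, 200)}
CAPACITY = 6  # hypothetical number of bags the new devices can screen


def best_cover(flights, capacity, value):
    """Enumerate every subset of flights and keep the feasible one of highest value.

    A subset is feasible if all selectee bags on its flights fit within capacity.
    `value` maps a flight's (bags, passengers) tuple to its contribution.
    """
    best_subset, best_value = (), 0
    names = sorted(flights)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            bags = sum(flights[name][0] for name in subset)
            if bags > capacity:
                continue  # cannot screen every selectee bag on these flights
            total = sum(value(flights[name]) for name in subset)
            if total > best_value:
                best_subset, best_value = subset, total
    return best_subset, best_value


# Measure 1: number of covered flights.
most_flights = best_cover(FLIGHTS, CAPACITY, lambda flight: 1)
# Measure 2: number of passengers on covered flights.
most_passengers = best_cover(FLIGHTS, CAPACITY, lambda flight: flight[1])
print(most_flights, most_passengers)
```

In this instance, covering the most flights means screening flights B and C, while covering the most passengers means devoting all capacity to flight D. Both choices use the devices at full capacity, yet they leave different flights and passengers exposed. Enumeration is practical only for tiny examples; realistic instances call for the discrete optimization models discussed in the text.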
Analyzing Selective Passenger Screening Systems
Aviation security professionals have expressed concern over the actual effectiveness of selective screening systems like Secure Flight in preventing attacks, given the variety of ways in which such systems can fail. Three research efforts are highlighted to illustrate how operations research tools such as risk analysis, algorithm design and applied probability can be used to analyze the flaws in selective screening systems.
A weakness of any selective screening system is that it may be possible to game it through extensive trial-and-error sampling. At present, passengers are aware of whether they have been classified as selectees or nonselectees each time they travel (most notably, by an indicator on their boarding pass, as well as by the additional screening attention they receive at the security checkpoint). Terrorists can exploit this information to determine how they are most likely to be classified as nonselectees by flying on a number of flights and effectively sampling the characteristics that result in a nonselectee classification. Therefore, terrorists do not need to understand how the prescreening system works; they merely need to be able to manipulate the prescreening system to get the desired result (i.e., be classified as nonselectees). Chakrabarti and Strauss (2002) present this strategy as the “Carnival Booth” algorithm, which demonstrates how a system using prescreening may be less secure than systems that employ random searches.
Another weakness of any selective screening system is its dependence on passenger information to accurately assess passenger risk. The specific details underlying the currently used selective screening system are classified. Moreover, it is not clear how well such a system will identify terrorists as selectees compared with random screening. It is also a challenge to accurately assess whether a selective screening system has been effective, since terrorist attacks are rare events, and how terrorists behaved in the past may not be predictive of how terrorists will behave in the future.
Barnett (2004) uses risk analysis, applied probability and data mining to analyze these issues regarding prescreening systems. He concludes that using a prescreening system such as Secure Flight may improve aviation security under a particular set of circumstances, namely, if it does not reduce the screening intensity for nonselectee passengers, if it increases the screening intensity for selectees, and if the fraction of passengers identified as selectees does not decrease. For all these reasons, Barnett (2004) recommends that Secure Flight be transitioned from a security centerpiece to one of many components in future aviation security systems.
The TSA developed the Registered Traveler Program to use in conjunction with Secure Flight. The program is designed to avoid “wasting” security resources on extremely low-risk passengers. To enroll in the Registered Traveler Program, a passenger must pass a voluntary background check and submit biometric information for identity verification when traveling. Once part of the program, these passengers undergo expedited screening in designated security lanes. Barnett (2003) outlines several potential problems with such a program, and suggests that in the worst-case scenario, the Registered Traveler Program improves screening efficiency without improving the ability to positively identify terrorists. A pilot of the Registered Traveler Program is currently being tested at airports throughout the United States.
These weaknesses of selective screening systems raise the question of whether to spend security dollars on improving intelligence or on building more effective screening technologies. McLay et al. (2005c) explore this issue by performing a cost-benefit analysis using concepts from applied probability and optimization. In their analysis, more effective (though more expensive) screening technologies are considered for screening selectee baggage, given a range of accuracy levels for a prescreening system in assessing passenger risk. Several selective screening scenarios are identified that are preferable to screening all passenger baggage with EDSs, because they reduce the number of successful attacks with only moderate cost increases. They conclude that, when the proportion of passengers classified as selectees is small, the accuracy of the prescreening system is more critical for reducing the number of successful attacks than the effectiveness of the baggage screening devices used to screen selectee baggage.
Designing Effective Selective Passenger Screening Systems
Prohibitive costs, long security lines and questionable effectiveness in preventing attacks have impeded passenger screening initiatives. Significant infrastructure changes have been made at several airports to accommodate new screening devices, and passengers have been subjected to long lines in airport lobbies awaiting screening. Passenger screening system designs must consider the potential impact of cost, space, throughput and effectiveness. Three research efforts are highlighted that use operations research methodologies to design selective screening systems.
One solution to this situation focuses on designing multilevel passenger prescreening systems. Multilevel systems are those in which an arbitrary number of classes for screening passengers are considered, rather than the two classes (i.e., selectees and nonselectees) currently being used. A class is a set of procedures using security devices for screening passengers. The nonselectee class, for example, may screen checked baggage with EDSs, passengers with metal detectors and carry-on baggage with X-ray machines.
One way to improve selective screening systems is to use expensive baggage screening technologies with low throughput to screen passengers perceived as higher-risk. This has the potential to be a more cost-effective approach to screen passengers primarily by increasing throughput. Butler and Poole (2002) design a layered approach to screening passengers and baggage instead of the existing TSA policy of 100-percent checked baggage screening using EDSs by considering the economic impact of using different screening technologies. They consider three groups of passengers: lower-risk passengers who have volunteered for extensive background checks, lower-risk passengers about whom little is known and higher-risk passengers. They recommend screening baggage with three layers of baggage screening devices. By weaving passengers through three layers of security devices composed of EDSs, high-throughput backscatter and dual-energy X-ray devices, and hand searches, throughput is increased while the overall false clear rate remains at a level comparable to that of the 100-percent baggage screening mandate. Butler and Poole make similar recommendations for passenger screening. One implication of this screening system is that the resulting improved throughput indirectly decreases space requirements and waiting times in airport lobbies, which is of interest because many airport lobbies were not designed to accommodate extensive screening systems and excessively long waiting lines.
Two multilevel passenger screening problems (that are formulated as discrete optimization models) give insight into how screening devices should be purchased and deployed (McLay et al. 2005a,b). An analysis of a greedy heuristic for the first problem suggests that using only two classes is particularly effective, which supports the two-class paradigm of Secure Flight. For the first problem, each of the classes is defined in terms of its fixed cost (the overhead costs), its marginal cost (the additional cost to screen a passenger) and its false clear rate, with a passenger prescreening system such as Secure Flight used to differentiate passengers. The objective is to minimize the overall false clear rate subject to passenger assignments and budget constraints. The second problem, a complementary problem to the first, considers screening devices that have been purchased and installed. The second problem illustrates how devices shared by multiple classes are used. Each class is defined by the device types it uses, and each device type has an associated capacity (throughput) in a given unit of time. Optimal solutions to examples with more available classes are more sensitive with respect to changes in passenger volume and device capacity. This research suggests that incorporating prescreening systems into discrete optimization models provides insight into efficient selective screening systems.
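The structure of the first problem can be illustrated with a very small instance solved by enumeration. The sketch below is a simplified stand-in, not the model of McLay et al. (2005a): the class names, costs, false clear rates, assessed risk values and budget are hypothetical, and the objective is taken to be the risk-weighted sum of false clear rates over all passengers.

```python
from itertools import product

# Hypothetical classes: name -> (fixed cost, marginal cost per passenger, false clear rate).
CLASSES = {
    "basic": (0, 1, 0.30),
    "standard": (5, 2, 0.15),
    "enhanced": (10, 6, 0.05),
}
RISKS = [0.1, 0.1, 0.2, 0.9]  # hypothetical assessed risk of each passenger
BUDGET = 25                   # hypothetical total budget


def total_cost(assignment):
    """Fixed cost of every class in use plus the marginal cost of each passenger."""
    fixed = sum(CLASSES[name][0] for name in set(assignment))
    marginal = sum(CLASSES[name][1] for name in assignment)
    return fixed + marginal


def weighted_false_clear(assignment):
    """Risk-weighted false clear level: sum of risk times the class's false clear rate."""
    return sum(risk * CLASSES[name][2] for risk, name in zip(RISKS, assignment))


def solve():
    """Try every assignment of passengers to classes; keep the best one within budget."""
    best, best_level = None, float("inf")
    for assignment in product(sorted(CLASSES), repeat=len(RISKS)):
        if total_cost(assignment) > BUDGET:
            continue
        level = weighted_false_clear(assignment)
        if level < best_level:
            best, best_level = assignment, level
    return best, best_level


best_assignment, best_level = solve()
print(best_assignment, total_cost(best_assignment), round(best_level, 3))
```

In this toy instance the budget is best spent on the enhanced class for the two highest-risk passengers and the basic class for the rest; paying the fixed cost of a third class does not help. That outcome is specific to these assumed numbers, and enumeration scales exponentially in the number of passengers, which is why the research cited above turns to heuristics and integer programming.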
Operations research practitioners have the unique opportunity to make a difference in aviation security. New directions in aviation security need not merely be makeshift political solutions for mending complex problems; they can be the result of modeling, analysis and planning. The several ways in which operations research has made an impact on passenger prescreening systems, as illustrated here, show that it has a place in the design and analysis of aviation security systems. However, there are some limitations. First, when doing operations research modeling (or in fact, mathematical modeling of any type), one must often make assumptions that may limit the applicability of the results obtained. Though such assumptions are often based on reasonable and realistic factors, they may pose difficulties in facilitating the transfer of the operations research analysis to decision-makers, since errors can lead to security breakdowns that may place people at an unnecessary risk. Second, operations research models quite often look at an application’s average or mean performance. In aviation security systems, average performance does not always capture the most interesting and salient aspects of such operations, which are often concerned with rare events and events “at the extremes.”
The issues discussed here represent but the tip of the iceberg. There are numerous problems in aviation security that can benefit from operations research methodologies, including improving perimeter access security with respect to airport employees, designing models for cargo screening, analyzing passenger throughput and space associated with security lines, and modeling secondary screening of passengers and their baggage when screening devices give an alarm response, to name just a few. By using operations research methodologies to gain insight into ways to improve aviation security system operations and performance, our field can make a lasting impression on our nation’s security and well-being.
Sheldon H. Jacobson (firstname.lastname@example.org) is a professor at the Department of Mechanical and Industrial Engineering and director of the Simulation and Optimization Laboratory, University of Illinois at Urbana-Champaign. Laura A. McLay (email@example.com) is a Ph.D. candidate at the same department. John E. Kobza (firstname.lastname@example.org) is a professor at the Department of Industrial Engineering, Texas Tech University.
The authors would like to thank Professor Arnold Barnett, George Eastman Professor of Management Science at MIT’s Sloan School of Management, for his insightful comments that resulted in a significantly improved manuscript, as well as his numerous insights into applying operations research methodologies to improve aviation security. The research on aviation security conducted by Professor Jacobson and Professor Kobza has been supported in part by the National Science Foundation (DMI-0114499, DMI-0114046). Professor Jacobson’s research has also been supported in part by the Air Force Office of Scientific Research (FA9550-04-1-0110).
1. A. Barnett, R. W. Shumsky, M. Hansen, A. Odoni, and G. Gosling, 2001, “Safe at Home? An Experiment in Domestic Airline Security,” Operations Research, Vol. 49, pgs. 181-195.
2. A. Barnett, 2001, “The Worst Day Ever,” OR/MS Today, Vol. 28, No. 6, pgs. 28-31.
3. A. Barnett, 2003, “Trust No One at the Airport,” OR/MS Today, Vol. 30, No. 1, pg. 72.
4. A. Barnett, 2004, “CAPPS II: The Foundation of Aviation Security?” Risk Analysis, Vol. 24, pgs. 909-916.
5. V. Butler and R. W. Poole Jr., 2002, “Rethinking Checked-Baggage Screening,” Reason Public Policy Institute, Policy Study No. 297, Los Angeles, Calif.
6. S. Chakrabarti and A. Strauss, 2002, “Carnival Booth: An Algorithm for Defeating the Computer-Aided Passenger Screening System,” First Monday 7, www.firstmonday.org.
7. S. H. Jacobson, J. E. Virta, J. M. Bowman, J. E. Kobza, and J. J. Nestor, 2003, “Modeling Aviation Baggage Screening Security Systems: A Case Study,” IIE Transactions, Vol. 35, pgs. 259-269.
8. S. H. Jacobson, T. Karnani, and J. E. Kobza, 2005, “Assessing the Impact of Deterrence on Aviation Checked Baggage Screening Strategies,” International Journal of Risk Assessment & Management, Vol. 5, No. 1, pgs. 1-15.
9. J. E. Kobza and S. H. Jacobson, 1997, “Probability Models for Access Security System Architectures,” Journal of the Operational Research Society, Vol. 48, pgs. 255-263.
10. L. A. McLay, S. H. Jacobson, and J. E. Kobza, 2005(a), “A Multilevel Passenger Prescreening Problem for Aviation Security,” Technical Report, University of Illinois, Urbana, Ill.
11. L. A. McLay, S. H. Jacobson, and J. E. Kobza, 2005(b), “Integer Programming Models and Analysis for a Multilevel Passenger Screening Problem,” Technical Report, University of Illinois, Urbana, Ill.
12. L. A. McLay, S. H. Jacobson, and J. E. Kobza, 2005(c), “When is Selective Screening Effective for Aviation Security?” Technical Report, University of Illinois, Urbana, Ill.
13. J. E. Virta, S. H. Jacobson, and J. E. Kobza, 2003, “Analyzing the Cost of Screening Selectee and Non-selectee Baggage,” Risk Analysis, Vol. 23, No. 5, pgs. 897-908.