Attachment to UIPL 27-96

MEASURING THE ACCURACY OF DECISIONS TO DENY UI CLAIMS:
A PILOT PROJECT

A. Background

The Department's original Benefit Quality Control (BQC) plan included assessing the accuracy of denied claims with the same thoroughness BQC assesses the accuracy of paid claims. Various internal and external stakeholders strongly urged the Department to implement "Denials QC" to maintain balance in the way QC treated claimant issues. During FY 1987, in a DOL-State pilot, five States investigated the accuracy of claims formally (or officially) denied for monetary, separation, and weekly eligibility reasons. Although this pilot indicated the probable existence of a substantial number of incorrect denial decisions even after the workings of redetermination or appeals processes, resource limitations precluded incorporating denials into the QC framework.

In 1993, the Department of Labor agreed with the Vice President's National Performance Review (NPR) that the Benefits QC program needed to be reexamined. According to the NPR issue paper, in reexamining BQC, the Department should also consider whether BQC should continue to keep its existing focus on paid claims ... or include measurement of decisions to ... deny claims.

The BQC reexamination occurred within the context of a review and restructuring of the overall UI performance measurement and continuous improvement system. A joint Federal-State Performance Enhancement Work Group (PEWG) first proposed a new approach for unifying all UI performance measurement and improvement processes. Within this system it proposed an accuracy measurement program considerably smaller than BQC and offering more flexibility to use telephone, fax, and mail for verifying information. The PEWG recommended that denial as well as approval decisions be assessed.

At the same time, there has been continuing stakeholder interest in assessing denials accuracy. In its 1994 Performance Review of the UIS's Consolidated Financial Statements, the Department of Labor's Office of Inspector General (OIG) recommended the UIS "initiate quality control programs to measure the accuracy of denied initial claim determinations which should be quantified and reported as underpayments in the financial overview." In a March 23, 1995, letter to Deputy Secretary of Labor Thomas Glynn, the United Auto Workers recommended "...benefits quality control should be modified to measure and report wrongful denials of UI benefits." In a March 1995 meeting with DOL officials, representatives of the National Employment Law Project (NELP) said they considered the lack of a means for measuring the accuracy of denials to be a definite deficiency.

Because of the length of time since the original pilot, the need to address various questions it left unanswered, and the urging of the Deputy Secretary of Labor, the UI Service decided to conduct another pilot to test the measurement of denials accuracy before implementing it nationwide. This paper outlines the plans for that pilot test.

The paper is structured as follows. It first reviews how the 1987 pilot project was conducted and what it found. The next section describes the proposed pilot and its timing. There are three appendices. Appendix I provides additional material on the 1987 pilot including an executive summary, an overall assessment, and a synopsis of pilot results. Appendix II shows where denials occur in the UI system and their extent in Calendar 1994. Appendix III describes the Quality Performance Indicator system and Benefits Quality Control program as potentially alternative methodologies for assessing the accuracy of denial decisions.

1. The Original QC Denials Pilot Project

a. Design. During FY 1987, with the technical assistance of Applied Management Sciences, Inc., five States pilot tested assessing the accuracy of formally denied claims using the BQC field-verification approach. The five States were Louisiana, Pennsylvania, Iowa, South Carolina and Washington. (See Appendix I for more information). Three different approaches to defining the universes for the QC activity or selecting the samples were taken in the 5 States:

Approach 1: Separate Cross-Sectional Weekly Denials Samples; BQC Kept Intact. One State defined separate universes of the three types of denial decisions (monetary, separation, nonmonetary-nonseparation). It drew and investigated weekly samples of each. Its BQC program remained unchanged and continued to measure the accuracy of paid claims in its ordinary way.

Approach 2: Separate Cross-Sectional Weekly Samples of Positive and Negative Monetary, Separation, and Nonmon-Nonsep Decisions. In one State, the UI process was disaggregated into its three stages or levels of decisions: monetary, separation, and weekly eligibility. At each level, samples of both positive and negative decisions were drawn and investigated in the "QC way" to determine their accuracy. This model offered an alternative to the current BQC design.

Approach 3: Longitudinal Tracking. Three States tested this approach which, like Approach 2, involves a conceptual redesign of BQC. Each week, a sample of initial claims was drawn and added to a "tracking file." The experience of this cohort was monitored; all denials were investigated as they occurred, beginning with monetaries (monetary ineligibility removed the claim from the file). In addition, a sample of the weeks paid to the claimants remaining in the cohort was investigated each week as the alternative to BQC's method of selecting payments for review.

Although the 5 States used three different methods for selecting samples of denials, once the samples were selected, they all investigated the cases in the same way, following as much as possible the current in-person BQC methodology. In addition to reviewing all pertinent agency records, each investigation involved an interview with the claimant plus contacts with the parties necessary to ascertain the facts on which the denial decision was based.

b. Findings in Brief. The pilot showed that the following percentages of the denial decisions were in error (i.e., should have been approvals):

	Before Appeal/Redet		After Appeal/Redet
	Average	Range	Average	Range
Monetary	23%	10-36%	16%	7-33%
Nonmon, Sep	15%	5-29%	9%	2-25%
Nonmon, Nonsep	14%	7-23%	11%	6-21%

The pilot investigated the correctness of initial decisions before redetermination or appeal, and also noted whether those initial decisions were ultimately reversed. It suggested that, on average, existing appeal or reconsideration processes reverse one fifth to two fifths of erroneous denials; nevertheless, between a tenth and a sixth of denials remain erroneous.
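
The "one fifth to two fifths" figure can be checked against the average error rates in the table above. The following is a minimal sketch, assuming the second pair of columns reports error rates after appeal or redetermination and that the drop in the average rate approximates the share of erroneous denials reversed. The function name `share_reversed` is illustrative, not part of the pilot's methodology.

```python
# Average error rates (percent of denials found erroneous) from the 1987 pilot table.
# Assumption: the second set of averages is measured after appeal/redetermination.
before = {"Monetary": 23, "Nonmon, Sep": 15, "Nonmon, Nonsep": 14}
after = {"Monetary": 16, "Nonmon, Sep": 9, "Nonmon, Nonsep": 11}


def share_reversed(before_pct, after_pct):
    """Fraction of initially erroneous denials corrected by appeal or redetermination."""
    return (before_pct - after_pct) / before_pct


reversal = {kind: share_reversed(before[kind], after[kind]) for kind in before}

for kind, share in reversal.items():
    print(f"{kind}: {share:.0%} of erroneous denials reversed; "
          f"{after[kind]}% of denials remain erroneous")
```

Under these assumptions the reversal shares fall roughly between one fifth and two fifths, and the remaining error rates (9 to 16 percent) fall between about a tenth and a sixth of denials.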

The pilot itself yielded only case error rates; because there is no claim experience to measure, the dollar impact of erroneous denials--benefits lost by claimants--can only be estimated or projected through some form of statistical modelling. The contractor did attempt this, but the number of assumptions required to make such projections makes them of questionable value for guiding decisions.

Agency Responsibility. States applied the usual BQC action, cause and responsibility codes to denial errors. As the summary shows, the agencies were assigned either partial or total responsibility (the contractor's report did not differentiate between the two) for about three quarters of erroneous nonmon denials:

	Average	Range
Monetary	27%	20-33%
Nonmon, Sep	73	47-100
Nonmon, Nonsep	71	48-93

Agency Actions. The pilot showed that the agencies' actions on erroneous denials could further be indicated as follows:

	Monetaries Range (Ave)	Separations Range (Ave)	Non-Seps Range (Ave)
Issue Undetectable w/existing proced.	12-43 (25)	0-54 (22)	0-62 (28)
Already Resolving	29-82 (55)	17-41 (31)	8-33 (23)
Made Wrong Decision	0-10 (4)	22-53 (32)	15-61 (29)
Insuf. Follow-thru	0-27 (10)	0-14 (7)	0-47 (12)
Issue not Detected-- Procedures not used	0-16 (6)	3-12 (7)	5-15 (9)

Although the range of estimates was quite large (in part because these estimates are based on actual errors detected, the number of which was frequently quite small), on average existing procedures were unable to detect about one quarter of the issues. In one quarter to one half of the cases, the agency was already working on a correction. In about a third of the nonmonetary denial errors, the agency had the necessary information but made the wrong decision. In the remaining sixth of cases, the agency failed to follow through on information it had or missed the issue because it failed to follow its own workable procedures.

B. The New Denied Claims Accuracy Pilot

1. Why a New Pilot? Several of the factors leading the Department to measure the accuracy of denied claims have been outlined above. The Department has decided to do so using some variant of the "Benefits QC Approach" along the lines explored in the 1987 pilot test. Instead of attempting immediate nationwide implementation, it seems prudent to set the stage with another pilot. It has been nearly ten years since the original pilot. It is not known whether conditions have changed substantially since then in ways that need to be taken into account in the design of a nationwide approach. In addition, questions remain unanswered from the original pilot, and information is needed to guide implementation of a national measurement effort.

2. Issues to be Addressed. At least the following questions need to be resolved. Some relate to the design of the pilot and involve policy issues; others depend on questions of fact and only pilot findings will illuminate them:

a. What is the most satisfactory sampling design? Three different approaches were tried in 1987. Drawing three separate cross-sectional samples of denial decisions (Approach 1) and investigating them alongside BQC was conceptually and practically the simplest (if for no other reason, it did not involve redesigning BQC). The Department intends to repilot using Approach 1 unless, in the detailed design phase, some reason can be shown for not using it.

b. What are the resource implications of investigating denials accuracy? Answering this question requires answers to the following:

What should be the sample sizes? Costs aside, sample sizes are driven by the need for precision in estimating.

How long does a denials investigation take? The 1987 pilot concluded that, on average, a denials investigation according to the BQC in-person protocol required about 60 percent as much time as a payment investigation. Since that time, however, the BQC protocol has changed. In 1993 an "alternative methodology" was instituted; in March 1996 the Department proposed allowing States complete flexibility to use phone-mail-fax methods. How the investigations are conducted (using the present mix of in-person and phone-mail-fax contacts, all phone-mail-fax, or relying more or less on in-person methods) will be a major determinant; the other is the extent of the denials investigation itself.

In addition to changes in the "BQC" methodology, it has since been recommended that, in view of the existing Quality Performance Index (QPI) review of nonmonetary determinations, the BQC approach should only be used for monetaries. This recommendation would have States use the QPI method and instrument, instead of the BQC methodology, to review samples of Sep and Nonsep denials. If this recommendation were accepted, it would result in a less costly records-only review; the parties to the adjudication would not be recontacted.

If the BQC approach is used, what should be the scope of the investigation for nonmonetary denials? Should a separation investigation focus only on the circumstances of the separation, or include a review of monetary eligibility; and should a nonmon-nonsep be limited to that particular issue, or include the monetary and/or separation eligibility as well?

c. How should sampling and sample investigations allow for State processes for (a) redeterminations and (b) appeals? When should samples of denials be drawn, and when should they be investigated?

d. It is assumed at this point that the basic measures of interest are the rates of correct and incorrect denial decisions of the three main types. (That is, this pilot will not attempt to estimate or project dollar impacts of erroneous denials.) In addition to these basic outcome measures, however, what information is critical to gather in the course of the denials investigation? What elements may be dropped from the present Benefits QC data collection instrument (DCI); what should be added?

e. What is the appropriate means, and level of detail, for obtaining information on the cost of investigating the various types of denial issues?

3. Overview of the Pilot

a. Guiding Policy Framework. This pilot design reflects the following policy decisions:

The Department, following the recommendation of the State-Federal group which developed the UI Performs design, is committed to developing an approach to measuring denials accuracy;

There is a need or demand to measure the accuracy of denied claims in the same way--the extent to which denials decisions are being made in accordance with fully-informed application of State law and policy--as paid claims;

The recommendation to use the QPI instrument for reviewing nonmonetary denials accuracy is a reasonable one and it makes sense to ascertain whether a QPI review would be a satisfactory proxy for the more extensive, and more expensive, full BQC verification;

Denials data will be verified the same way as paid claims data. This means that, unless OMB vetoes the proposal to allow "full flexibility" in the use of phone-mail-fax for verifying paid claims data, the same flexibility will be allowed investigators determining the accuracy of denied claims.

In brief, these imply that this is an operational pilot, not a feasibility or cost-benefit study. The only benefit-cost aspect concerns the cost-effectiveness tradeoff regarding use of the QPI vs. Benefits QC methodology in assessing nonmon denials accuracy.

b. Objectives. The pilot thus will seek to achieve the following objectives:

Assess the current range of denials errors and establish desirable precision/sample sizes;

Determine how much the BQC field-verification methodology adds to the QPI methodology in assessing the accuracy of nonmonetary denials;

Determine the cost of measuring denials by both the BQC and QPI methodologies;

Identify the principal reasons for denials errors and ensure that these reasons are coded so that all States can identify and eliminate or mitigate them;

Identify operational obstacles to measuring denials errors consistently.

c. Overview. Briefly, the pilot will involve 5 participating States, if enough qualified States volunteer in response to a system-wide announcement. Each pilot State will designate a coordinator who will assist National Office (NO), Regional Office (RO), and contractor (PRAMM Consultants, Inc.) staff in refining the design and outline materials. The contractor will produce the operational instructions resulting from this collaboration. Contractor staff, together with Department staff, will draw up the specifications for the universes of denials ("transactions files") from which samples will be drawn. State ADP staff will be responsible for doing the mainframe programming to create the transactions files. Additionally, the Department will draft specifications for the sample selection programs; it will seek to have two States (one COBOL, one COBOL II or alternative) actually write the selection programs as separately-funded deliverables under the cooperative agreement, for export to other pilot States. NO technical support staff will write the database programs as variations of the programs already written for BQC and RQC applications; State ADP staff will install and maintain them. After training of the participating State staff (presumably drawn from existing BQC ranks) and completion of programming, States will begin sampling and investigating, which will continue for up to a year. Federal and contractor staff will monitor each State twice during the project. After completion of the investigations, States will complete case databases, which the Department will pick up electronically. The Department will provide copies to the contractor for the evaluation. States will be asked to review and comment on the draft evaluation report. There will be a debriefing meeting to discuss preliminary findings from that report, and the discussion will help shape the final report and its recommendations.

d. Selected Aspects of the Design of the Pilot Test.

It is anticipated that the pilot will proceed as follows:

Sampling. Each week, each participating State will select samples of monetary, separation, and nonmon-nonsep denials for investigation. The upper and lower limits on the weekly sample pull will be set by the contractor and DOL, similar to the BQC sampling rules. Details on the extent of data which can be downloaded are to be worked out.

Investigations. The pilot coordinator/supervisor will assign the denials for investigation. The monetary denials will be investigated using a BQC-like protocol, which will involve review of all pertinent agency records, an interview with the claimant and contacts with the base-period employers to ascertain the correct wages, hours of work, weeks of work, etc., as prescribed by State law. Nonmonetary denials (both separations and nonseparations) will be reviewed as follows: (a) The nonmonetary adjudication and all pertinent agency records will be reviewed and coded using the current QPI instrument. Using that review instrument and applicable agency data alone, an agency specialist other than the QC-trained investigator will determine whether the original adjudication reached the correct eligibility decision. On the basis of this information, the applicable data will be coded into a data record maintained on the SUN machine. (b) The QC-trained investigator will conduct a BQC-type review of the case, involving claimant interview and appropriate contacts with employers and/or third parties to determine the eligibility decision that accords with a fully-informed application of State law and policy. Both the QPI data and the field-verification data will be coded into a data record maintained on the SUN machine.

Time Data. Investigators and supervisors will be expected to record the time required to complete investigations and review cases, as specified by PRAMM. The time measurement instrument may ask for time to be recorded either on a case-specific ("job ticket") basis, or else on how the staff used their time during the day (time log). Time-recording will be kept to an absolute minimum.

Case Reviews. As now required under the BQC program, the supervisor will be responsible for reviewing all cases before they are considered final. In addition, a sample of cases may be reviewed by Federal or PRAMM monitors during site visits.

Data Pickup and Integrity. DOL will pick up the data electronically on a periodic basis for transmission to the contractor. The contractor may contact State staff occasionally in the course of "scrubbing" the data.
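
The weekly sampling step described under "Sampling" above can be sketched as follows. This is only an illustration under stated assumptions: the pilot's selection programs are to be written in COBOL to DOL specifications that do not yet exist, the weekly target and the lower and upper limits shown here are hypothetical, and the claim identifiers are invented stand-ins for records in the three transactions files.

```python
import random


def draw_weekly_sample(transactions, target, lower, upper, rng):
    """Draw a simple random sample of one week's denials.

    The requested size is clamped to [lower, upper] and can never exceed
    the number of denials that actually occurred that week.
    """
    size = max(lower, min(upper, target))
    size = min(size, len(transactions))
    return rng.sample(transactions, size)


# Hypothetical transactions files for one week: claim IDs by denial type.
week = {
    "monetary": [f"M{i:03d}" for i in range(40)],
    "separation": [f"S{i:03d}" for i in range(25)],
    "nonmon-nonsep": [f"N{i:03d}" for i in range(3)],
}

rng = random.Random(1996)  # fixed seed so the draw can be reproduced for review

# Hypothetical limits: target 4 cases per type per week, never fewer than 2 or more than 6.
samples = {
    kind: draw_weekly_sample(ids, target=4, lower=2, upper=6, rng=rng)
    for kind, ids in week.items()
}

for kind, picked in samples.items():
    print(kind, len(picked), "of", len(week[kind]), "denials selected")
```

Keeping one universe per denial type, as in Approach 1, means each sample can be drawn independently of the others and of the existing BQC paid-claims sample.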

e. Length of the Pilot and Level of State Effort.

1. Length of Time. The pilot is expected to involve sampling over a one-year period beginning about December 1996 or January 1997. Up to 3 months after the last sample is drawn may be required to complete verifications and coding and case review. The State pilot coordinator will be expected to be available for refining the design, training, etc., from the execution of the pilot agreement (approximately 6 months before sampling begins) until review of the contractor's draft evaluation report and a debriefing meeting are completed. Projected dates are shown below.

2. Investigator Effort. Although the sample sizes and time to complete each case have yet to be determined, preliminary estimates are that samples of about 200 each of monetary, separation, and nonseparation denials will be drawn. Based on data from the telephone pilot and the 1987 Denials Quality Control pilots, it would appear that two investigator staff years will suffice.

3. Programming Effort. State programming staff will have to develop the transactions files for the three types of denials, following specifications developed by DOL, PRAMM and the pilot coordinators. Two States will be sought to write the sample-selection software, according to specifications developed by DOL. In addition, State staff will have to install and maintain all programs.
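
The preliminary figure of about 200 cases per denial type can be related to the precision question raised earlier. The sketch below assumes simple random sampling, the normal approximation for a proportion, and a 95 percent confidence level; none of these is specified in the pilot design. It uses the 1987 average error rates before appeal or redetermination as planning values, and the function names are illustrative.

```python
import math

Z_95 = 1.96  # normal critical value for a two-sided 95% interval (assumed level)


def margin_of_error(p, n, z=Z_95):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def required_sample_size(p, margin, z=Z_95):
    """Smallest n whose normal-approximation margin of error is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# 1987 average error rates before appeal/redetermination, used as planning values.
planning_rates = {"Monetary": 0.23, "Nonmon, Sep": 0.15, "Nonmon, Nonsep": 0.14}

for kind, p in planning_rates.items():
    moe = margin_of_error(p, 200)
    n_needed = required_sample_size(p, 0.05)
    print(f"{kind}: +/-{moe:.1%} at n=200; n={n_needed} for +/-5 percentage points")
```

Because the required sample size grows with the inverse square of the margin, halving the margin of error roughly quadruples the number of investigations, which is why precision targets and investigator staff years have to be settled together.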

f. Resources Provided Through the Cooperative Agreement. Resources for the pilot coordinator and coordinator's travel will be provided through a cooperative agreement between DOL and the pilot State. In addition, each State will be provided with $15,000 through the pilot agreement to cover the costs of programming effort to develop the transaction files and install and maintain the sampling and database software. Two States will receive additional funding to cover the cost of writing sample selection programs according to DOL specifications. These programs will initially be shared with the other pilot States and eventually provided to all States doing denials.

g. Resources Provided Through the Grants Allocation. States will be expected to provide UI-qualified investigators for the pilot, and cover any travel entailed. Pilot States will receive an allocation of two investigator positions each for use during the developmental pilot. These resources will be provided to the State through the new UI Performs allocation.

h. Application and Selection. All States interested in participating in a pilot to determine the accuracy with which they are denying benefit claims are encouraged to apply. In applying, States must indicate:

the name and qualifications of their proposed candidate for pilot coordinator;
their ability to identify the universes of monetary and nonmonetary denial actions from automated records for the transaction files;
their ability to build transactions files and implement COBOL programs, and maintain both;
their interest and capability to write BQC-type (COBOL) sample selection programs;
their commitment to make UI-qualified investigators available for the pilot; and
whether their UI law has features which might affect the difficulty or extent of denials, e.g., the existence of an alternative base-year law.

In selecting from among applicants, the Department will attempt to achieve reasonable balance among geography, size of States, and law features.

i. Projected Time Schedule for the Pilot Effort.

Recruitment and Selection of States	6/96-8/96
Execution of Pilot agreements	September 1996
Refinement of Design	Oct.-Nov. 1996
Programming	Aug.-Nov. 1996
Training	January 1997
Samples drawn	Mar 97 - Feb 98
Investigations completed	May 1998
Draft Evaluation Report	September 1998
Debriefing Meeting	October 1998
Final Evaluation Report	November 1998
