
US Data Release Notes

This page contains release notes for each of Emsi’s quarterly US dataruns. The release notes outline any major methodology changes or fixes included with each datarun.


Emsi updates core LMI data four times per year with the latest data available from each dataset. These updates are called Dataruns. These are the target release dates for the next four dataruns:

  • 2021.3: 7/2/21
  • 2021.4: 10/1/21
  • 2022.1: 1/21/22
  • 2022.2: 4/15/22 (will be adjusted when QCEW release date is published)

The Bureau of Labor Statistics’ Quarterly Census of Employment and Wages (QCEW), released quarterly, kicks off a new Emsi datarun. You can view the QCEW release schedule here (note that Emsi uses the Full Data Release, not the News Release). As soon as a new quarter of QCEW is released, Emsi’s Data team downloads it and re-flows all Emsi data to incorporate all data source updates that happened since the last datarun. This incorporation of new data sources is then released as a new datarun.

To read about updates for postings and profiles data, see this article.

Upcoming Changes
This section outlines major changes coming with the next datarun that users should be aware of.

Upcoming Changes: 2021.3


Current Release Notes

This section lists the changes and updates introduced in the latest datarun.


Correction: Military Occupation Percentiles

Earnings percentiles for military-only occupations (55-9999) in the Non-QCEW class of worker were incorrectly based on only the latest earnings year, even though data was available for all earnings years. This release corrects that error.

Census Tract Industry and Occupation Data is Final

After making some final alterations to the methodology, Emsi is removing the experimental status of census tract industry and occupation data. For more information on the methodology, see this article.

Population Demographics: New Cohort Model

At long last, Emsi has replaced the cohort model that projects US population counts by cohort. With better source data and more robust methodology, Emsi is able to more accurately project the demographic makeup of the population of the United States. For a more detailed description of Emsi’s new cohort model methodology, see this article. The largest methodological differences are listed below:

  • Emsi now runs the cohort model at the census tract level and then sums census tracts to aggregate up to counties, states, and nation. Previously, Emsi used the latest year of historical tract data to disaggregate county data to the tract level.
  • Emsi now uses local birth and death data from the CDC to calculate local birth and death rates, as well as local net in- or out-migration rates. Previously, Emsi used county cohort population data to regionalize national birth and death rates from the CDC.
  • Emsi now uses linear regression to estimate the birth, death, and migration rates used by the cohort model. Previously, Emsi used an average of all historical rates, which largely ignored local birth, death, and migration trends for individual cohorts.
  • Emsi now uses published state-level population counts by single-age from the Census Bureau’s Population Estimates program to estimate population counts by single-age in addition to their county-level population counts by age-group. Previously, Emsi did not incorporate this data, but because a cohort model requires single-age data to operate, Emsi was estimating single-age data from the age-group data using naive breakouts and multiple rounds of smoothing algorithms.
  • Because of the improvements in methodology, Emsi is also able to publish demographic data for single ages in addition to the previously available five-year age groups.
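Two of the ideas above, trend-based rates and bottom-up aggregation, can be illustrated with a minimal Python sketch. It is not Emsi's implementation: the function names and the sample tract identifiers are hypothetical, and the regression is a plain least-squares line of rate on year. The only outside fact relied on is that an 11-digit census tract GEOID begins with the 5-digit state and county FIPS code.

```python
from collections import defaultdict


def historical_average(rates):
    """Earlier style of estimate: a flat average that ignores any trend."""
    return sum(rates) / len(rates)


def linear_trend_forecast(years, rates, target_year):
    """Fit an ordinary least-squares line (rate on year) and extrapolate it."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(rates) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, rates))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * target_year


def sum_tracts_to_counties(tract_population):
    """Aggregate tract counts upward; the first 5 GEOID digits identify the county."""
    county_totals = defaultdict(float)
    for geoid, population in tract_population.items():
        county_totals[geoid[:5]] += population
    return dict(county_totals)


# Hypothetical declining birth rate for one cohort in one tract.
years = [2015, 2016, 2017, 2018, 2019]
rates = [0.014, 0.013, 0.012, 0.011, 0.010]
flat = historical_average(rates)                   # ignores the decline
trend = linear_trend_forecast(years, rates, 2020)  # continues the decline

# Hypothetical tract GEOIDs (county prefixes 18179 and 18001).
counties = sum_tracts_to_counties(
    {"18179040100": 100, "18179040200": 150, "18001030100": 50}
)
```

In this sample the flat average stays at the middle of the historical range while the fitted line keeps following the decline, which is the behavior the regression change is meant to capture.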

Occupational Earnings Percentiles Minor Improvement

When calculating earnings for occupations with small employment in metropolitan divisions in older years of the OES occupational wages time series, we now use earnings percentiles from the parent Metropolitan Statistical Area (MSA) for that occupation, rather than using national earnings for that occupation.

State Minimum Wage Data Source Change

Emsi now uses the Bureau of Labor Statistics (BLS) Occupational Employment Statistics (OES) program as the source for our state-level minimum wage data, instead of the Department of Labor (DOL). The primary reason for the change is that Emsi’s minimum wage data is used for unsuppressing OES occupational wages, and the DOL’s minimum wage data was sometimes inconsistent with that used by the BLS.

Dataset Chart

This section lists major sources and what “vintage” of each source was used in the last four dataruns. Click the hyperlinked name to read more about the source and what role it plays in Emsi data.

Bolded items indicate an update.

| Name | Source | 2021.2 Datarun | 2021.1 Datarun | 2020.4 Datarun | 2020.3 Datarun |
|---|---|---|---|---|---|
| | | Expected Release 4/20/21 | Released 2/1/21 | Released 10/7/20 | Released 7/27/20 |
| Quarterly Census of Employment and Wages (QCEW) | BLS | 2020Q3 | 2020Q2 | 2020Q1 | 2019Q4 |
| Occupational Employment Statistics (OES) | BLS | 2020 | 2019 | 2019 | 2019 |
| National Ind/Occ Employment Matrix (NIOEM) | BLS | 2019-2029 | 2019-2029 | 2018-2028 | 2018-2028 |
| Employment Projections Tables (EP) | BLS | 2019-2029 | 2019-2029 | 2018-2028 | 2018-2028 |
| Consumer Expenditure Survey (CEX) | BLS | 2019 | 2019 | 2018 | 2018 |
| State Personal Income / Local Area Personal Income (SPI/LPI) | BEA | 2019 | 2019 | 2018 | 2017 Revised |
| Make & Use Tables (MUTs) | BEA | 2019 | 2019 | 2018 | 2018 |
| National Income and Product Accounts (NIPA) | BEA | 2020Q4 | 2020Q3 | 2020Q2 | 2020Q1 |
| Gross Domestic Product by State (GSP) | BEA | 2019 | 2019 | 2017 Revised | 2017 Revised |
| American Community Survey (ACS) | Census | 2019 | 2019 | 2018 | 2018 |
| County Business Patterns (CBP) | Census | 2018 | 2018 | 2018 | 2016 |
| ZIP Code Business Patterns (ZBP) | Census | 2016* | 2016 | 2016 | 2016 |
| Nonemployer Statistics (NES) | Census | 2018 | 2018 | 2018 | 2018 |
| Current Population Survey (CPS) | Census | 2019 | 2019 | 2019 | 2018 |
| State and Local Finances (Census of Government, CoG) | Census | 2018 | 2018 | 2017 | 2017 |
| Population Estimates (PopEst) | Census | 2019 | 2019 | 2019 | 2018 |
| Origin-Destination Employment Statistics (LODES) | Census | 2018 | 2017 | 2017 | 2017 |
| Quarterly Workforce Indicators (QWI) | Census | 2020Q4 | 2020Q3 | 2020Q2 | 2020Q1 |
| Railroad Retirement Board (RRB) | Railroad Retirement Board | 2019/2018 | 2019/2018 | 2019/2018 | 2018/2017 |
| Occupational Information Network (O*NET) | US Dept. of Labor, Education & Training Administration | 25.0** | 25.0 | 25.0 | 24.3 |
| Crime By County | Federal Bureau of Investigation (FBI) | 2019 | 2019 | 2017 | 2017 |
| Completions | IPEDS | 2019 | 2019 | 2019 | 2019 |
| Enrollments | IPEDS | 2019 | 2018 | 2018 | 2018 |
| Birth/Death Rates | Center for Disease Control (CDC) | 2018 | 2018 (Birth), 2017 (Death) | 2018 (Birth), 2017 (Death) | 2018 (Birth), 2017 (Death) |
| Migration | Internal Revenue Service (IRS) | 2018 | 2018 | 2018 | 2018 |
| Cost of Living | Council for Community & Economic Research (C2ER) | 2020 | 2020 | 2020 | 2019 |
| Patents | US Patent & Trademark Office | 2015*** | 2015 | 2015 | 2015 |

* 2018 ZBP has been released but makes use of a new Census suppression methodology which suppresses data far more heavily than in the past. Emsi is currently working on new methodology that will allow us to calculate ZIP-level employment differently.
** O*NET 25.1 has been released. This is the first release to use O*NET codes tied to the new 2018 SOC codes. Emsi is currently working to update O*NET classification taggers to be compatible with the 25.1 release. When this is finished, all Emsi data (postings, profiles, core LMI) will be updated to the 25.1 classification.
*** US Patents are currently only available via API.

Older Release Notes

This section contains release notes for past dataruns.


Correction: Emsi Population Projections Adjustments

Emsi has communicated in the past that we adjust our population demographics projections to the published national projections from the Census Bureau. While reviewing and improving the code, we realized that we have been inadvertently publishing the pre-adjustment data. This release fixes that bug.

Changes to the “Non-QCEW Employees” Class of Worker

In an effort to continuously improve our core LMI industry data, this quarter we revisited and improved the processes that create Emsi industry data for employees not included in the Quarterly Census of Employment and Wages (QCEW). The Bureau of Economic Analysis (BEA) releases employment estimates as part of the State and Local Area Personal Income (SPI/LPI) programs that include these “non-covered” employees. The majority of Emsi’s non-QCEW employment estimates are calculated by subtracting QCEW employment from total SPI employment. We reviewed the SPI methodology documentation and identified ways to improve our methodology to more accurately take into account the differences in coverage between SPI/LPI and QCEW. In summary, Emsi’s methodology changes result in:

  • Increased coverage. We increased the coverage of non-QCEW employees by roughly 1.1% (1.7M) by including more industries in the non-QCEW class of worker through the subtraction of QCEW covered employment from SPI total employment. The largest addition was approximately 1M employees in the Private Households industry (NAICS 814110), which is largely not covered by unemployment insurance. The added employment in most other industries is due to Emsi’s inclusion of adjustments made by the BEA to account for under-reporting of economic activity in most industries.
  • Decreased double-counting of covered employees. Before including QCEW in its estimates, SPI first distributes all uncategorized QCEW employment proportionally to other industries. Emsi now does a similar proportional distribution before subtracting QCEW from SPI, to create more accurate estimates. The previous methodology double-counted a small number of employees in most non-QCEW industries. This also affects some data in the “Extended Proprietors” class of worker.
  • More reasonable earnings. Earnings for non-QCEW employees are also calculated by subtracting QCEW from SPI, but prior to this release, some of those earnings were unreasonably low. We now place upper and lower bounds on non-QCEW earnings, derived from the observed earnings per job in QCEW. Overall, the earnings per job for non-QCEW employees increased ten percent.
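The three changes above can be pictured with a small Python sketch. It is illustrative only: the function names, the zero floor on the residual, and the bound ratios are assumptions, since the release notes do not say how the bounds are derived from QCEW earnings per job.

```python
def distribute_uncategorized(qcew_by_industry, uncategorized):
    """Spread uncategorized QCEW employment across industries in proportion
    to each industry's categorized employment."""
    total = sum(qcew_by_industry.values())
    return {
        industry: employment + uncategorized * employment / total
        for industry, employment in qcew_by_industry.items()
    }


def non_qcew_employment(spi_total, qcew_covered):
    """Residual: SPI total employment minus QCEW covered employment.
    Flooring at zero is an assumption made for this sketch."""
    return {
        industry: max(spi_total[industry] - qcew_covered.get(industry, 0.0), 0.0)
        for industry in spi_total
    }


def bound_earnings_per_job(residual_epj, qcew_epj, lower_ratio, upper_ratio):
    """Clamp residual earnings per job to a band around QCEW earnings per job.
    The ratios are hypothetical parameters."""
    lower = qcew_epj * lower_ratio
    upper = qcew_epj * upper_ratio
    return min(max(residual_epj, lower), upper)


# Hypothetical figures for two industries.
qcew = distribute_uncategorized({"A": 300.0, "B": 100.0}, uncategorized=40.0)
residual = non_qcew_employment({"A": 350.0, "B": 110.0}, qcew)
bounded = bound_earnings_per_job(5000.0, 40000.0, lower_ratio=0.5, upper_ratio=1.5)
```

Distributing the uncategorized employment first shrinks the residual, which is how the proportional step reduces double-counting of covered employees.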

Changes to QCEW source data

QCEW recently announced a change to the methodology they use to estimate employment and earnings for non-responding establishments. Under the current methodology, QCEW takes the missing establishment’s growth rate from this same time last year and applies that to the previous month. They’ve begun to rethink this, especially in light of COVID, and have decided to begin calculating missing establishments based on the growth rates of similar establishments within the same time frame.

Beyond this, they’ve also mentioned two other improvements they’d like to make: (1) implementing a methodology change that will immediately identify employers who have ceased operations and (2) using benefit claim counts as a supplement to existing manual data review.
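To make the change in how non-responding establishments are estimated more concrete, here is a rough Python sketch of the two approaches as described above. It is not the BLS procedure: the function names are hypothetical, and pooling peer employment to get a single growth ratio is an assumption.

```python
def impute_from_own_history(prev_month_emp, last_year_prev_month, last_year_same_month):
    """Earlier approach: reuse the establishment's own growth rate from the
    same month one year ago and apply it to the previous month."""
    if last_year_prev_month == 0:
        return prev_month_emp  # no usable history; carry the value forward
    return prev_month_emp * last_year_same_month / last_year_prev_month


def impute_from_peers(prev_month_emp, peers):
    """Newer approach: apply the current growth of similar establishments.
    peers is a list of (previous_month_emp, current_month_emp) pairs."""
    prev_total = sum(prev for prev, _ in peers)
    current_total = sum(current for _, current in peers)
    if prev_total == 0:
        return prev_month_emp
    return prev_month_emp * current_total / prev_total


# Hypothetical establishment: it grew 10% at this time last year,
# but similar establishments are shrinking 25% this month.
old_estimate = impute_from_own_history(100, 50, 55)
new_estimate = impute_from_peers(100, [(200, 150), (100, 75)])
```

In a sudden downturn the two estimates diverge sharply, which is the situation the COVID period exposed.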

Population Educational Attainment

Emsi improved the model that creates educational attainment data for the population aged 25 and above. This is done by breaking out final Emsi population demographics data by seven educational attainment categories using breakouts from 2009-2019 5yr ACS and 2000 Census data.

  • We removed old ACS 3yr tabulations with different definitions, which were muddying the breakouts. We now rely solely on Census 2000 and ACS 5yr tabulations.
  • When a breakout cannot be found for a given year, we now search for breakouts in the closest year rather than automatically jumping back to Census 2000 breakouts (Census 2010 and later do not collect educational attainment data, because they consciously leave educational attainment up to the ACS).
  • We now allow breakouts that have data for at least four of the seven educational attainment categories. We previously only used breakouts if they had all seven categories, which was unreasonable for certain demographics in certain counties.
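The nearest-year search and the four-of-seven rule can be sketched as follows. This is a minimal illustration, not Emsi's code: the data layout (a dict of category shares per year, with None for missing categories), the function names, and the tie-break toward the earlier year are all assumptions.

```python
def is_usable(breakout, min_categories=4):
    """A breakout is usable when at least min_categories of its
    educational attainment categories have data."""
    available = sum(1 for share in breakout.values() if share is not None)
    return available >= min_categories


def nearest_breakout(breakouts_by_year, target_year):
    """Return (year, breakout) for the usable breakout closest to target_year,
    or None when no usable breakout exists."""
    usable_years = [
        year for year, breakout in breakouts_by_year.items() if is_usable(breakout)
    ]
    if not usable_years:
        return None
    # Ties are broken toward the earlier year (an arbitrary choice here).
    best = min(usable_years, key=lambda year: (abs(year - target_year), year))
    return best, breakouts_by_year[best]


# Hypothetical breakouts with seven categories labeled c1..c7.
categories = ["c1", "c2", "c3", "c4", "c5", "c6", "c7"]
breakouts = {
    2000: {c: 1 / 7 for c in categories},                                # complete
    2012: {c: (0.2 if i < 3 else None) for i, c in enumerate(categories)},  # 3 of 7
    2016: {c: (0.2 if i < 5 else None) for i, c in enumerate(categories)},  # 5 of 7
}
match = nearest_breakout(breakouts, 2013)
```

For target year 2013 the 2012 breakout is skipped because it has only three categories, and the search lands on 2016 rather than falling all the way back to 2000.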

Occupation Characteristics

Emsi publishes data from the BLS Employment Projections (EP) program for certain national occupation characteristics. EP is now using the same occupation definition as OES, which transitioned to 2018-based occupation codes earlier this year. This update makes a few national occupation datapoints more accurate/relevant to Emsi SOC:

  • Typical education, experience, and on-the-job training for each occupation
  • Percent of each occupation at each educational attainment level
  • Occupational separation rates (one of the key ingredients of Emsi Occupational Openings)


We stopped using national tabulations from the Current Population Survey (CPS) for unemployment breakouts in favor of searching for the nearest good breakout from the DOL’s Characteristics of the Insured Unemployed (CIU) monthly state unemployment data.

Occupation by Residence Employment

Emsi uses a ZIP Code to Census Tract mapping from Housing and Urban Development to create ZIP Code level occupation by place of residence employment. We improved our usage of the mapping, so occupation by place of residence data is more accurate and more closely matches published data from LODES.

Staffing and Occupation Data Tuning

We tuned our expansion and unsuppression algorithms for OES staffing and occupation data to produce fewer unreasonable data points, and cluster employment and earnings data around more likely places.

New Datasets in Core LMI API (API Only)

We’ve added a few new datasets to the Core LMI API. See the changelog for more information.


Correction: Historical Regional Occupation data for regionalizing Historical OES Staffing Patterns

In a previously released methodology document, Emsi claimed to have begun using historical regional OES occupation data to regionalize historical national OES staffing patterns. This was incorrect – Emsi had begun using historical national OES staffing patterns with the 2018.3 release, but did not begin using historical regional OES occupation data to regionalize those staffing patterns until the 2020.4 release. As a result, users should expect to see more consistency between published occupation data from OES and Emsi’s staffing and occupation data.

Changes to the “QCEW Employees” Class of Worker

In an effort to continually improve our core LMI industry data, this quarter we revisited and improved the processes that create Emsi industry data based on the Quarterly Census of Employment and Wages. In summary, the improvements will be visible as more reasonable earnings estimates, fewer unreasonably small employment numbers (less than one job), and fewer jobs and earnings arbitrarily added where QCEW published zeros for certain establishments. Due to these changes, users should not expect historical data to match previous Emsi dataruns except where that historical data was completely disclosed by QCEW. There may also be slight differences between published annual data from QCEW and Emsi’s employment estimates. The specifics:

  • Newer unsuppression algorithms. We began using our newer unsuppression algorithms for QCEW data, resulting in more accurate and more reasonable employment and earnings estimates. The newer unsuppression algorithms allow us to tighten constraints and keep the unsuppressed data clustered more tightly around disclosed QCEW data, resulting in fewer unreasonably small employment estimates. Consequently, we also now exclude establishments that do not have employment during the quarter they are reported in QCEW. Because of this, users may see slight differences in establishment counts between Emsi and published QCEW.
  • Better starting point for the unsuppression. We also modified the processes that prepare QCEW data for unsuppression to make better use of other datasets (CBP, IPEDS, NCES CCD) as the initial educated guesses for suppressed values before the unsuppression begins. These improvements were primarily focused on state and local government owned establishments (including schools and hospitals), although private data also saw some improvements through the revitalization of our processes for County Business Patterns data: newer unsuppression algorithms, making better use of published data, fixing inconsistencies in raw 2001 CBP data, and incorporating 2017 and 2018 CBP data which was previously impossible because of the Census Bureau’s new disclosure avoidance policies.
  • Published QCEW Files. Emsi is no longer using QCEW annual files because they have more suppressions than the quarterly files (and thus less information for us to work with). Because of this, users may notice that our estimates differ slightly from published QCEW annual numbers.
  • Differences in how we standardize definitions. QCEW does not publish all of its data in a consistent set of NAICS or county definitions. Emsi has always standardized the definitions, but with this release we have adjusted the process that follows the standardizations to favor reasonability over accuracy. Because this algorithm works both top-down and bottom-up on the dataset, aggregation errors can change the total national employment by up to 50 jobs (maximum error at more detailed areas and industries is much lower, more in the realm of ⅓ job).

ZIP Breakouts for State and Local Government Higher Education

We improved our usage of staff employment data from IPEDS to better break out county data to ZIP codes for state and local government higher education.

Experimental Census Tract Industry and Occupation Data

We created experimental industry and occupation datasets at the Census Tract level to provide better geographical granularity. The methodology is still undergoing improvements and the data has not yet been included in our tools, but the data is available via API and datapulls. If you would like advanced access to help us validate the data, please contact your customer service representative.

Accuracy of the Social Accounting Matrix

We fixed a bug in the Commodity Flow Survey unsuppression and improved accuracy in the gravity flows model of Emsi’s Social Accounting Matrix (SAM) Input-Output model.

New Occupation Classification (Emsi SOC 2019)

Emsi’s occupation classification (SOC) is being updated to match the reduced level of detail available in the 2019 release of OES. This reduction in detail serves as preparation for OES adopting the official 2018 SOC (to be released spring 2022 with OES 2021 data). For more information on the timeline for the SOC 2018 transition, see Emsi’s SOC transition article. For more information on how the SOC 2018 transition affects 2019 OES data, see Question 10 of OES’s FAQ.

In addition, because of funding constraints, the OES has been reducing their survey sample size, which will reduce the accuracy of the data and underscores the need for Emsi’s modeling to make detailed estimates available.

“The OES sample has been reduced in recent survey panels. The May 2019 OES survey panel had a sample of approximately 183,000 establishments. The November 2017, May 2018, and November 2018 survey panels each had a sample of approximately 186,000 establishments. The May 2017 panel sample consisted of approximately 195,000 establishments, and the November 2016 panel sample consisted of approximately 202,000 establishments” (see the end of “Technical Notes for May 2019 OES Estimates”, from BLS).

Occupational Openings

In order to match BLS published separation rates to the new aggregate occupation codes in Emsi SOC 2019, we sum total employment and total separations for the aggregated occupations and recalculate the rate. Prior to this data release, separation rates for aggregate occupations were calculated by averaging the detailed rates. Separation rates are used in Emsi’s calculation of openings. For reference, Openings = Growth + Separations.
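The difference between averaging detailed rates and recalculating a pooled rate is easy to see in a short Python sketch. The function names and sample numbers are hypothetical; the openings formula is the one stated above.

```python
def averaged_rate(details):
    """Earlier approach: simple average of the detailed separation rates.
    details is a list of (employment, separations) pairs."""
    return sum(separations / employment for employment, separations in details) / len(details)


def pooled_rate(details):
    """Current approach: total separations divided by total employment."""
    total_employment = sum(employment for employment, _ in details)
    total_separations = sum(separations for _, separations in details)
    return total_separations / total_employment


def openings(growth, separations):
    """Openings = Growth + Separations."""
    return growth + separations


# Two hypothetical detailed occupations being combined into one aggregate code:
# a large one with a 10% rate and a small one with a 30% rate.
details = [(1000, 100), (100, 30)]
simple = averaged_rate(details)  # treats both occupations equally
pooled = pooled_rate(details)    # weights by employment
aggregate_openings = openings(growth=50, separations=130)
```

The simple average lets the small occupation pull the rate upward, while the pooled rate stays close to the rate of the occupation that holds most of the employment.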

New Micropolitan Statistical Area

Emsi updated to the Census Bureau’s March 2020 MSA delineations. This update includes only one change, a new micropolitan statistical area: Bluffton, Indiana (MSA 14160), which consists of Wells County, Indiana (FIPS 18179).

Reduced Volatility in Historical Occupation Earnings

We improved the methodology for estimating suppressed occupational employment and earnings. This improvement targeted areas of low employment and volatile earnings. Previously, we used disclosed data from prior and subsequent years to form an initial estimate (seed) for suppressed values, and then allowed our algorithms to adjust those seeds to match known disclosed data. The constraints imposed by the disclosed data were not enough to keep the seeds from changing drastically and causing volatile earnings across the time series. We resolved this by adjusting the seeds to match the known national occupation change, then imposing tighter constraints on those particular seeds so our algorithms could not adjust them more than ten percent.

In short, we started with better guesses and constrained our models to stay within ten percent of that original guess. The effect of this change is a noticeable reduction in volatility for occupational earnings associated with low-confidence data points.
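As a rough illustration of the two steps (a better seed, then a ten percent band around it), consider the Python sketch below. The function names are hypothetical, and scaling a neighboring year's disclosed value by the national change ratio is an assumption about what "adjusting the seeds to match the known national occupation change" looks like in practice.

```python
def seed_from_neighbor_year(neighbor_value, national_change_ratio):
    """Build a seed by moving a disclosed value from a neighboring year
    along with the national change for that occupation."""
    return neighbor_value * national_change_ratio


def constrain_to_seed(estimate, seed, tolerance=0.10):
    """Keep a model-adjusted estimate within +/- tolerance of its seed."""
    lower = seed * (1 - tolerance)
    upper = seed * (1 + tolerance)
    return min(max(estimate, lower), upper)


# Hypothetical suppressed hourly earnings value.
seed = seed_from_neighbor_year(neighbor_value=19.0, national_change_ratio=20.0 / 19.0)
unconstrained = 30.0                              # where a loose model might drift
constrained = constrain_to_seed(unconstrained, seed)  # pulled back toward the seed
```

An estimate already inside the band passes through unchanged, so the constraint only affects the low-confidence points that previously swung widely.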

ZIP Breakout for State and Local Government

We fixed a bug in our ZIP code methodology that was inadvertently using total employment instead of state and local government employment to break out the state and local government industries (902999, 903999) from counties to ZIP codes.

Extensive changes to our Extended Proprietor class of worker

We extensively reviewed and revised our methodology for our Extended Proprietors class of worker, improving our unsuppression of BEA proprietors employment and earnings by using newer algorithms and by making better use of NES and QCEW to reduce the volatility present in the data and isolate earnings from income. Users of Extended Proprietors data may notice differences in how earnings are distributed between industries and areas. Accordingly, users of the Social Accounting Matrix (SAM) and Input-Output (I-O) model may notice some changes to industry multipliers.

New Datasets in Core LMI API (API Only)

  • BLS Current Employment Statistics (CES) Emsi now maintains an API that serves up unmodeled state and national industry employment data from the CES.
  • BLS Occupational Employment Statistics (OES) Emsi now maintains an API that serves up unmodeled state and national occupational employment and earnings percentiles from OES.
  • BLS Employment Projections (EP) Emsi now maintains an API that serves up unmodeled national industry by occupation employment (NIOEM) from the EP. It currently has historical data for 2018 and projected data for 2028.

IPEDS Institution Updates (API Only)

We added indicators to IPEDS institution definitions for Historically Black Colleges/Universities (HBCU) and Hispanic Serving Institutions (HSI). The goal is to add these indicators to reports in our Analyst tools; however, at the time of this datarun release, they are only available in the API.

Growth/Openings Calculation Bug

We fixed a bug in how we calculated growth for occupational employment. Growth is part of the calculation for openings, so openings were also affected. Previously, if employment was zero for a year and greater than zero for the following year, growth was incorrectly calculated to be zero. We’ve corrected the calculation so growth now equals the employment in the following year. Nationwide for all occupations, the amount of growth we were previously dropping was between 6,000 and 16,000 jobs for historical years and between 500 and 3,000 for projected years. The maximum percentage error was just under 0.6%.
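The behavior described above can be reproduced in a few lines of Python. This sketch assumes growth is the simple year-over-year change in employment, which the release note does not spell out; the function names are hypothetical, and the buggy version only mimics the described symptom, not the actual cause.

```python
def growth_buggy(current, following):
    """Mimics the old behavior: zero base-year employment produced zero growth."""
    if current == 0:
        return 0
    return following - current


def growth_fixed(current, following):
    """Corrected behavior: growth from a zero base equals the following
    year's employment, since following - 0 == following."""
    return following - current


# Hypothetical occupation that first appears in a region in the following year.
dropped = growth_fixed(0, 12) - growth_buggy(0, 12)  # jobs the old logic missed
```

Both versions agree whenever base-year employment is nonzero, which is why the overall error stayed small.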

QCEW Supplemental Earnings

We improved our process for estimating supplemental earnings for QCEW industry data to reduce the volatility in areas where our BEA source data was suppressed. The new methodology relies more heavily on the parent geography’s supplemental earnings data and thus will more closely match the source data.

Veterans Population Bug (API only)

We fixed a bug in 2009 veterans population data that was overstating the veteran population by roughly 5x. We pull veterans population counts from the Census Bureau’s ACS API, and failed to notice that the variable definitions changed between 2009 and 2010. Only the latest year (currently 2018) of veterans population data is shown in our tools, so only API users may notice a change.

Geography Constraint Changes in Core LMI API (API only)

We’ve simplified the interface for accessing different levels of geography. For example, instead of querying a separate endpoint – Emsi.US.Industry.ZIP – for ZIP code industry data, users can simply prepend “ZIP” to geography constraints sent to Emsi.US.Industry. MSAs are also available by prepending “MSA” to the geography constraint. For more details and a full list of datasets that now support this simplified interface, see

Distance Completions for 2012

We neglected to include 2012 IPEDS distance completions data when we originally began publishing this data in 2013. Thanks to an observant client pointing this out, we now include 2012 data in our distance completions dataset.

Extended Proprietors Updates

We made some minor modifications to the methodology used to estimate our Extended Proprietors (Class of Worker 4) data to improve its quality and reliability. The immediate effect of the changes is minor: industry earnings per job increased by 0.2%.

IRS Migration Data Improvements

The underlying data behind our population migration data is based on IRS tax exemptions, which do not always represent a person. Following a recommendation from the IRS, we now multiply the number of exemptions by 0.9 to better estimate the actual number of people migrating between counties.

Occupation Data Changes (API only)

We’re working hard to unify related datasets to simplify the API user experience. Part of that effort is visible in this release. API users can now access ZIP code and MSA data directly from the Emsi.US.Occupation endpoint by prefixing ZIP codes with “ZIP” and MSA codes with “MSA”.  Nation, state, and county FIPS codes do not need to be prefixed and will work exactly as they did before (and don’t worry, we didn’t get rid of Emsi.US.Occupation.ZIP yet).
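A client-side helper for the prefixing rule might look like the Python sketch below. Only the prefixes themselves ("ZIP" and "MSA", with no prefix for nation, state, and county FIPS codes) come from the text above; the helper name, the level labels, and the sample codes are hypothetical, and the sketch does not show how a constraint is actually sent to the API.

```python
# Prefix to prepend for each geography level; FIPS-based levels take none.
GEOGRAPHY_PREFIXES = {
    "zip": "ZIP",
    "msa": "MSA",
    "nation": "",
    "state": "",
    "county": "",
}


def geography_constraint(code, level):
    """Return the geography code with the prefix its level requires."""
    try:
        prefix = GEOGRAPHY_PREFIXES[level]
    except KeyError:
        raise ValueError(f"unknown geography level: {level!r}") from None
    return prefix + code


zip_constraint = geography_constraint("83843", "zip")
msa_constraint = geography_constraint("14160", "msa")
county_constraint = geography_constraint("18179", "county")
```

Keeping the rule in one place means county, state, and nation codes continue to pass through untouched, matching the backward-compatible behavior described above.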

State Minimum Wage Bug Fix

The minimum wage data that informs our occupational earnings models formerly contained the state minimum wage even if it was lower than the mandated federal minimum wage. We’ve modified the minimum wage for such states so that they now match the federal minimum wage. The increased minimum wage will be the lower bound for earnings percentiles in the lowest earning occupations.
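In code, the fix amounts to taking the higher of the two minimums. The Python sketch below assumes a federal minimum of $7.25 per hour (the rate in effect since 2009) and treats the lower bound as a simple clamp on hourly percentiles; the function names and sample values are hypothetical, and the clamp is only one reading of how the bound is applied.

```python
FEDERAL_MINIMUM_WAGE = 7.25  # dollars per hour; assumed for this sketch


def effective_minimum_wage(state_minimum):
    """Use the state minimum unless the federal minimum is higher."""
    return max(state_minimum, FEDERAL_MINIMUM_WAGE)


def floor_percentiles(percentiles, minimum_wage):
    """Apply the minimum wage as a lower bound on hourly earnings percentiles."""
    return [max(value, minimum_wage) for value in percentiles]


# Hypothetical state with a statutory minimum below the federal level.
minimum = effective_minimum_wage(5.15)
floored = floor_percentiles([6.50, 7.00, 8.10, 9.40, 11.00], minimum)
```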

Staffing Pattern Improvements

We made some improvements to our staffing pattern process to improve the loading speed of staffing pattern reports. This change will result in extremely small employment numbers clustering more tightly around disclosed data and being less widely distributed (e.g. there will be more occupations with three jobs and fewer occupations with one job).

Improved ACS Methodology

For certain records, ACS cannot determine a specific industry for a job, so these indeterminate records are placed in less-detailed “catchall” categories. Emsi now distributes these values among the industries represented by the catchall instead of dropping the records. This will result in 50k-80k jobs, depending on the year, being added to self-employed.

Improved Earnings Estimations

Emsi has reduced the volatility of earnings in occupation data for areas with small employment. Year over year data should be more continuous and less erratic as a result.

Occupation Hires and Separations Improvements

Prior to this release, Emsi estimated occupation hires and separations using a combination of industry data from QWI and Emsi employment staffing patterns, implicitly assuming that hire and separation rates within an industry were the same across all occupations. Emsi’s new methodology estimates occupation hires and separations by combining QWI industry hires and separations with national occupation separation rates from the BLS and regional staffing pattern growth and decline. This change will not take effect immediately in Emsi’s tools, but will be released and messaged sometime later in the quarter.

Staffing Pattern Area Changes

In order to improve the quality and stability of their estimates, the BLS reduced the level of geography detail available in the 2018 release of Occupational Employment Statistics (OES):

Consolidation of some nonmetropolitan areas. Some nonmetropolitan areas published in the May 2017 estimates have been combined to form larger nonmetropolitan areas. The May 2018 estimates contain data for 134 nonmetropolitan areas, compared with 167 nonmetropolitan areas in the May 2017 estimates.

Elimination of metropolitan division data. OES no longer publishes data for the metropolitan divisions within the 11 large metropolitan areas that are further broken down into divisions. Data for these 11 areas are available at the Metropolitan Statistical Area (MSA) only.

This change does not affect the level of geography detail available to Emsi clients, though detailed regions that fall within the same consolidated OES region may look more similar to one another.

Historical Occupational Wage Estimates

Emsi is now publishing historical occupational wage estimates back to 2005. Current year occupational wage estimates are more stable and accurate as a result. This data is currently accessible only via Emsi’s LMI data API, but will appear in Emsi’s tools in the near future.

Self Employed Earnings

Emsi sources self-employed earnings from the American Community Survey public use microdata. Over the past year, Emsi conducted a complete methodology review of ACS usage and decided to make these changes:

Five-year sample window: Because ACS is a survey with limited sample size, it is standard practice to combine five years of data to improve coverage and stability. When Emsi began using ACS data, ACS three-year tabulations were the standard, and five-year tabulations were relatively new; consequently, Emsi created an ad hoc combination of years (1,3,5,9) to increase sample size. Not seeing any advantage to this ad hoc methodology over the standard, Emsi now follows the example of official ACS tabulations and uses a five-year window. Total self-employed employment is roughly identical to what it was previously, but how those jobs are distributed across industries and occupations will be slightly affected. On average, staffing patterns have 5% more occupations than before.

Improved earnings outlier removal: Another drawback to the limited sample size of surveys is that outliers can disproportionately affect the quality of estimates, especially in granular tabulations. In this release, Emsi improved the methodology for ACS earnings outlier removal to prevent earnings from being skewed low (because of older years of data) and to allow removal of low-earning outliers (only high outliers could be detected in Emsi’s previous methodology). The end result is that self-employed industry and occupation earnings are, on average, 30% higher.

Improved occupation earnings percentiles estimation: Prior to this release, Emsi estimated percentiles after tabulating ACS microdata, assuming that it was valid to aggregate weighted percentiles. In 2017, Emsi stopped aggregating weighted earnings percentiles elsewhere in US data, but neglected to fix the initial creation of self-employed earnings percentiles. Consequently, these percentiles were artificially flat (skewed toward the average). The new methodology correctly builds percentiles from the original microdata, resulting in more reasonable percentile curves.
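Why aggregating percentiles flattens the curve can be shown with a small Python sketch. It is not Emsi's code: the weighted-percentile rule used here (the smallest value whose cumulative weight reaches the requested share) is one common definition among several, and the function names and microdata records are made up.

```python
def weighted_percentile(values, weights, q):
    """Smallest value whose cumulative weight reaches share q of total weight."""
    pairs = sorted(zip(values, weights))
    threshold = q * sum(weight for _, weight in pairs)
    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]


def aggregate_percentiles(groups, q):
    """Flawed shortcut: weight-average the percentile of each group."""
    total_weight = sum(sum(weights) for _, weights in groups)
    return sum(
        weighted_percentile(values, weights, q) * sum(weights)
        for values, weights in groups
    ) / total_weight


# Two hypothetical groups of survey records: (earnings, person weights).
group_a = ([10, 12, 14], [1, 1, 1])
group_b = ([40, 80, 200], [1, 1, 1])

shortcut_median = aggregate_percentiles([group_a, group_b], 0.5)
pooled_median = weighted_percentile(
    group_a[0] + group_b[0], group_a[1] + group_b[1], 0.5
)
```

Here the shortcut reports a median that no pooled record is near, pulled toward the mean of the group medians, while the pooled calculation reads the percentile directly off the combined microdata.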

Earnings for Military-Only Occupation

See the changes to Self Employed Earnings; Emsi’s earnings for the military-only occupation come from ACS. On average, earnings are 5% higher and percentile curves are no longer skewed toward the average.

Local Absorption in the Input-Output Model

Emsi adjusted the methodology in the input-output model that regulates local absorption of goods produced in region.

Gross Regional Product Bug

We discovered a bug in our process that estimates gross regional product for years where the detailed regional estimates in BEA’s GDP by state lag behind national estimates from the BEA’s National Income and Product Accounts (NIPA). When regional data lags behind national data, we fill in the missing regional detail using previous-year regional data and current NIPA data. Previously, Emsi worked under the assumption that Gross Domestic Income (GDI) was an alternative way to calculate GDP and should be equal to it, so we used GDI data from NIPA to bring forward the regional estimates. However, GDI is not equivalent to GDP in BEA estimates, despite the definition of the term, so we were incorrectly using GDI when we should have been using GDP. With this release, Emsi is now correctly using GDP data to bring forward the regional GDP estimates. This bug caused the most recent two years of GDP to be approximately 2% higher than they should have been at all levels of detail. This also affected the US SAM Input-Output model’s induced multipliers.
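
The notes above say only that missing regional detail is filled in from previous-year regional data and current NIPA data. One common way to do that, assumed here purely for illustration, is to scale last year’s regional values by the national growth ratio. The function name and all figures below are hypothetical.

```python
def bring_forward(prev_regional, prev_national, curr_national):
    """Scale last year's regional values by the national growth ratio."""
    growth = curr_national / prev_national
    return {region: value * growth for region, value in prev_regional.items()}


# Hypothetical values, in billions of dollars.
prev_regional_gdp = {"Region A": 80.0, "Region B": 20.0}
national_gdp = {"prev": 100.0, "curr": 104.0}
national_gdi = {"prev": 100.0, "curr": 106.0}  # GDI can diverge from GDP

# Corrected approach: carry regional GDP forward with national GDP.
with_gdp = bring_forward(prev_regional_gdp, national_gdp["prev"], national_gdp["curr"])
# Buggy approach: carry regional GDP forward with national GDI.
with_gdi = bring_forward(prev_regional_gdp, national_gdi["prev"], national_gdi["curr"])

for region in prev_regional_gdp:
    print(region, with_gdp[region], with_gdi[region])
```

Under these made-up inputs, the GDI-based estimates come out higher than the GDP-based ones for every region, which mirrors the direction of the bug described above.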

Military Employment Bug

We discovered a minor bug in the process that estimates non-QCEW employment for federal government military. In data prior to 2009, military employment was missing from some counties affected by county redefinitions.

Government Hires and Separations Bug

We discovered a minor bug in the process that estimates hires and separations (sourced from QWI) for industry and occupation workforce demographics. Hires and separations were erroneously zero for government industries and related occupations in a few state-year combinations where QWI lacked sufficient data. We now (correctly) use ratios from less detailed data to fill in those holes. This data is only visible to API clients and users of the hires portion of the job-postings report in Analyst.

Incorporated Self-Employed and Unemployed Employees from ACS

We discovered an error in how we use the Incorporated Self-Employed (ISE) class of worker from American Community Survey (ACS) data. ISE represents individuals who are employees of their own corporations, and thus are included in QCEW (Emsi CoW 1). In some cases, Emsi was previously treating these incorporated self-employed individuals as CoW 3 instead of CoW 1. Emsi’s self-employed class of worker (CoW 3) no longer includes these individuals. See below for a summary of the effects on final data.

We also began making use of data from ACS respondents who were unemployed at the time the survey was collected but had worked as an employee within the last year. Similar to how annualized QCEW jobs are reported, we down-weight unemployed ACS records based on the number of weeks the person worked (e.g. – a person who works three months of the year is counted as a quarter of a job). These new data feed into:

Workforce demographics (detailed industry and occupation data): Emsi uses ACS data to supplement and help unsuppress Quarterly Workforce Indicators (QWI) data, which is in turn combined with final Emsi industry and occupation data to estimate workforce demographic breakouts (e.g. – “We know how many plumbers work in X region, how many of them are white males?”). Previously, we were estimating a small portion of the demographic breakouts, where estimates were not available from QWI, using the wrong subsection of the US economy; the breakouts for CoW 1 and 2 mistakenly did not make use of ISE and the breakouts for CoW 3 and 4 mistakenly did make use of ISE. However, because the primary data source for demographic breakouts is QWI not ACS, the effects on final data are minor.

Staffing Patterns for QCEW and Non-QCEW (Emsi CoW 1 and 2): ACS data is used to create staffing patterns for industries that are not included in OES staffing patterns. These ACS staffing patterns were mistakenly created from all employee records from ACS except ISE and the currently unemployed. Again, since ACS is not the primary dataset from which Emsi creates staffing patterns, the effects of the bug are limited to industries that are missing from OES: Agriculture (NAICS 111000, 112000, 113110, 113210, 115310, 114111, 114112, 114119, 114210) and Private Households (NAICS 814110). The differences in the ACS-based staffing patterns will be minor, since most employees were correctly included. Note that this also slightly affects Emsi’s final occupation data, which is calculated from industry and staffing data.
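
The down-weighting of unemployed ACS records by weeks worked, described earlier in this section, can be sketched as follows. The function name, the 52-week year, and the clamping of out-of-range values are assumptions made for illustration, not Emsi’s actual code.

```python
WEEKS_PER_YEAR = 52  # assumed length of a full working year


def annualized_job_weight(person_weight, weeks_worked):
    """Scale a survey weight by the share of the year the person worked."""
    # Clamp to the valid range so bad inputs cannot produce negative
    # weights or weights above the original survey weight.
    weeks = min(max(weeks_worked, 0), WEEKS_PER_YEAR)
    return person_weight * weeks / WEEKS_PER_YEAR


# A person who worked three months (13 weeks) counts as a quarter of a job.
print(annualized_job_weight(1.0, 13))
```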

MSA Delineation Changes

In September 2018, the Census Bureau released new delineation files for Micro- and Metropolitan Statistical Areas. The 138 changes are fairly extensive and affect most of the major metropolitan areas in the United States. While county definitions did not change, 56 counties are now included in MSAs that were not previously, 42 counties have been removed from MSAs, and 40 counties have moved to new or different MSAs. If you have previously viewed data at the MSA level and are noticing large changes between current and previous data for your MSA, this is likely the cause. For more detail, contact your customer service representative.

Occupational Employment Statistics

Emsi continues to do research and preparation for creating a time series out of OES data. The objectives are to provide more accurate historical occupation employment estimates and begin producing historical occupation earnings. The research is still ongoing on both fronts, but various changes have been implemented to lay the groundwork for these future improvements. First, we tightened controls on occupational employment estimates, so they are now clustered more tightly around disclosed OES data. Second, we improved the algorithms that estimate top-coded earnings (earnings that OES suppresses because they are above $100/hour). Third, we began applying our occupational earnings aggregation methodology to the earnings unsuppression process. On the whole, the accuracy of unsuppressed earnings estimates did not change, but percentile wage curves are more reasonable and much more statistically defensible.

Changes in QCEW in South Carolina

QCEW is the foundation of Emsi’s employment and earnings data. The BLS has reported that QCEW data in South Carolina “are showing unusual movements which may be a result of a change in reporting.” For more information, see:

New Benchmark Data from the BEA

For the past five or more years, the Social Accounting Matrix (SAM) and Input-Output (I-O) data have had at their core the 2007 benchmark make and use tables from the BEA. This year, the BEA published 2012 benchmark files, and Emsi incorporated them alongside the previously existing 2007 benchmark files. Because this update accounts for five years of data updates, we expect to see substantial changes throughout Emsi’s SAM and I-O data and models.

None. There were no classification updates or major methodology changes.

New Emsi Occupation Classification (Emsi SOC 2017)

Emsi’s occupation classification (SOC) is being updated to match the reduced level of detail available in the 2017 release of OES. See for a summary of the changes. Where the BLS aggregated detailed occupations to the broader “parent” occupation, Emsi copied the parent code’s data down to a new Emsi-specific detailed code ending with an “8”, so clients can still browse occupation data by detail level. Data will no longer be available for the old detailed SOC codes. For more details or if you have questions, please contact your customer service representative.

Historical Staffing Patterns

Emsi is now using historical staffing patterns based on historical Occupational Employment Statistics (BLS) data. All QCEW and Non-QCEW historical staffing and occupation data is affected. OES advises against using historical OES data as a time series for reasons here: Emsi took those factors into account and made a significant effort to smooth the volatility between historical OES releases. In the near future, we hope to extend the use of historical OES data beyond national staffing patterns to regional employment and earnings.

Military Staffing Pattern

Due to lack of good, detailed military occupation data, Emsi previously lumped all military employment (Non-QCEW, industry 901200) into a single occupation (55-9999). In response to various requests from clients, we now use a national staffing pattern from the American Community Survey (Census) to break out military employment between military-specific occupations (55-9999) and occupations that are shared with civilians (e.g., pilots, mechanics). In 2016, approximately 44% of military employment was in military-specific occupations and 56% was in occupations shared with civilians. This change more closely reflects the design of the official SOC classification and also clears up an equivocation on the definition of “military” that was causing some confusion about earnings (Emsi was using earnings for military-specific occupations for all of military employment).

Non-QCEW (Emsi Class of Worker 2) Employment

We discovered a long-standing bug in one of our Non-QCEW Employees processes that was underestimating employment in a handful of industries:

  • NAICS 524 – Insurance Carriers and Related Activities
  • NAICS 531 – Real Estate
  • NAICS 54 – Professional, Scientific, and Technical Services

Users should notice a 2x or 3x (depending on the industry) increase in Non-QCEW employment and total earnings for these industries and their corresponding occupations (via the staffing pattern). Occupations in SOC 41 will be the most affected by the increase in employment.

Local Absorption in the I-O model

Emsi uses data from the Census Bureau’s Commodity Flow Survey (CFS) supplemented with proprietary data to create industry-specific transportation impedance estimates. We recently discovered that CFS is too general to create accurate impedance estimates for some industries, so we opted for using only the proprietary data in those cases. Consequently, users should expect to see changes in local absorption (demand satisfied in region) for the following industry sectors: 211, 213, 425, 44-45 (except 454), 4542, 45439, 48-49 (except 493), 51 (except 511), 551111, 551112. Retail industries, particularly, will have much higher local absorption.

Occupation by Residence for Government Jobs

Emsi uses industry commuting data from Census’s Longitudinal Origin-Destination Employment Statistics (LODES) dataset to create occupation data by place of residence. Previously we made use of the job type JT02 (private jobs) and assumed no commuting for government jobs, so that the place of work and place of residence were the same. Emsi now uses JT00 (all jobs) as well as JT02 to create commuting patterns for government jobs. This also improves the accuracy of the regional Social Accounting Matrix.

Metropolitan Statistical Areas (MSA) Definitions

This release uses the latest MSA delineations from August 2017 as provided by the Census Bureau (previous releases used the July 2015 delineations). The only change was that Twin Falls, ID was changed from a micropolitan statistical area to a metropolitan statistical area.

Input-Output Transaction (Z) Matrix

Emsi now defines national inter-industry transactions as significant only if they sum to more than $500K. This matches the BEA’s functional definition, since they round their transaction matrix to the nearest $1M. Emsi previously had no minimum value on national inter-industry transactions, allowing scores of insignificant transactions to clutter and add noise to the matrix. This change will have minimal visible effects, since the primary goal is to remove insignificant transactions and reduce noise in industry supply chains.
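
A minimal sketch of this kind of significance filter is shown below. It assumes transactions arrive as (selling industry, buying industry, dollars) records and that the $500K test applies to the total for each industry pair; the record layout, function name, and sample values are hypothetical.

```python
from collections import defaultdict

SIGNIFICANCE_THRESHOLD = 500_000  # dollars, per the release note


def significant_transactions(records, threshold=SIGNIFICANCE_THRESHOLD):
    """Sum dollars per (seller, buyer) pair and keep pairs above the threshold."""
    totals = defaultdict(float)
    for seller, buyer, dollars in records:
        totals[(seller, buyer)] += dollars
    # Strictly greater than: a pair that sums to exactly the threshold is dropped.
    return {pair: total for pair, total in totals.items() if total > threshold}


# Hypothetical records between made-up industries A, B, and C.
records = [
    ("A", "B", 300_000),
    ("A", "B", 250_000),
    ("A", "C", 499_999),
    ("B", "C", 500_000),
]
print(significant_transactions(records))
```

Only the A-to-B pair survives here, because its two records sum to $550,000; the other pairs do not exceed the threshold.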

North American Industry Classification System (NAICS) 2017

Emsi switched from using NAICS 2012 to NAICS 2017. Summary of the changes:

  • 4-digit: 3 removed, 3 new, 0 combined, 2 recoded, 1 split
  • 5-digit: 10 removed, 7 new, 7 combined, 1 recoded, 2 split
  • 6-digit: 28 removed, 20 new, 13 combined, 11 renamed, 4 split

Occupational Information Network (O*NET) Version 22.0

With this release we updated from O*NET version 21.3 to version 22.0, to include updated Knowledge, Skill, and Abilities for 100 occupations. This will slightly impact the compatibility indices of occupation pairs.

Non-QCEW (Class of Worker 2) Earnings

We fixed a bug in the code that was causing earnings to be skewed low. In addition to normal data changes, earnings in this class of worker are approximately 1.5% higher than previous releases.

ZIP Code Population Demographics

We improved the usage of the probability matrices used to disaggregate county-level population demographics to ZIP codes, resulting in more accurate estimates for approximately 25% of the nation. Rural areas are the most improved.

ZIP Code Industry Employment Estimates

We improved our estimation of ZIP code industry employment with a more nuanced use of ZBP (ZIP Code Business Patterns, Census). Previously, if a county’s industry was not found in ZBP, we would default to using USPS business delivery statistics to distribute that industry’s employment to all ZIP codes in the county. With the new methodology, if a county’s industry isn’t found in ZBP, we search for similar industries in ZBP before defaulting to USPS data. The use of USPS dropped from 20% to 0.5%, significantly increasing the accuracy of ZIP code employment estimates.

This change affects all ZIP code employment data (industry and occupation) and does not target any specific counties or industries. However, the most significant changes will be in rural areas and in regions where an industry’s employment is small. Changes will appear as a more focused distribution of data within the ZIP codes in each county.
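
The fallback order described in this section can be sketched roughly as follows. The notes do not say how “similar industries” are chosen, so this example assumes, purely for illustration, that similarity means a shared NAICS prefix. The data structures, function names, and sample codes are all hypothetical.

```python
def normalize(counts):
    """Convert per-ZIP counts into shares that sum to 1."""
    total = sum(counts.values())
    return {zip_code: n / total for zip_code, n in counts.items()}


def zip_shares(county, naics, zbp, usps):
    """Return (ZIP -> share, source label) for one county industry.

    zbp maps (county, NAICS code) to per-ZIP counts; usps maps county to
    per-ZIP business delivery counts. Both layouts are assumptions.
    """
    code = naics
    # Try the exact industry first, then progressively broader NAICS prefixes.
    while len(code) >= 2:
        counts = zbp.get((county, code))
        if counts:
            return normalize(counts), "ZBP " + code
        code = code[:-1]
    # Last resort: distribute by USPS business delivery statistics.
    return normalize(usps[county]), "USPS"


# Made-up county "C1" with two ZIP codes.
zbp = {("C1", "7225"): {"Z1": 30, "Z2": 10}}
usps = {"C1": {"Z1": 1, "Z2": 1}}

print(zip_shares("C1", "722511", zbp, usps))  # falls back to ZBP parent 7225
print(zip_shares("C1", "111110", zbp, usps))  # no ZBP match, uses USPS
```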

Local Absorption in Emsi’s Social Accounting Matrix (I-O)

We slightly changed the constraints on the SAM’s bi-proportional algorithm to further maximize local absorption. This fixes some anomalies uncovered in the 2017.3 release where local absorption was too low.

Unemployment Data

Unemployment data has been brought up to date and the methodology adjusted to reflect current conditions more closely. The new methodology averages the past 12 months of unemployment to form a figure free from seasonal volatility. This change has been introduced with the 2017.4 data release; because of this, clients should expect to see differences in unemployment between the 2017.4 data release and prior releases.
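
A trailing 12-month average of this kind might look like the sketch below. Whether the average is taken over counts or rates is not stated above, so the example simply averages whatever monthly figures it is given; the function name and sample values are hypothetical.

```python
def trailing_average(monthly_values, window=12):
    """Average the most recent `window` monthly values (oldest first in the list)."""
    if len(monthly_values) < window:
        raise ValueError("need at least %d months of data" % window)
    recent = monthly_values[-window:]
    return sum(recent) / window


# Made-up monthly unemployment figures, oldest first; the first entry
# falls outside the 12-month window and is ignored.
monthly = [9.0] + [5.0] * 6 + [4.0] * 6
print(trailing_average(monthly))
```

Because each month carries equal weight, a single seasonal spike moves the result by only one twelfth of its size.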

Industry and Occupation (2-digit) Unemployment

We changed and simplified our methodology for estimating unemployment by 2-digit industry and occupation to more accurately reflect the data from the US Department of Labor’s Characteristics of the Insured Unemployed and rely less heavily on survey-based unemployment data from the Current Population Survey. This change allows us to publish monthly unemployment estimates, which may be available soon; in the meantime, we will publish only the latest month of data. Note that the new monthly data may be less stable when compared to our previous methodology, which was annualized.


We changed how we calculate both replacements and growth. For details, see

Balance-of-State (BOS) ZIP codes

QCEW uses a BOS county code to track employment where no county was reported. To make Emsi’s ZIP code and county data more comparable, Emsi is introducing BOS ZIP codes that match the BOS county codes. Where “SS” is the state FIPS code, BOS county codes are formatted as “SS999” and BOS ZIP codes are formatted as “001SS.” As with BOS county codes, BOS ZIP codes will represent all data for a state that was not able to be tagged with a valid ZIP code. They will appear in ZIP industry, occupation, and demographics data.
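
The two code formats can be expressed as a short Python sketch. The function names and the input validation are illustrative assumptions; only the “SS999” and “001SS” patterns come from the description above.

```python
def _check_state_fips(state_fips):
    """Require a two-digit state FIPS code given as a string, e.g. "06"."""
    if len(state_fips) != 2 or not state_fips.isdigit():
        raise ValueError("state FIPS code must be a two-digit string")
    return state_fips


def bos_county_code(state_fips):
    """Balance-of-state county code: state FIPS followed by 999."""
    return _check_state_fips(state_fips) + "999"


def bos_zip_code(state_fips):
    """Balance-of-state ZIP code: 001 followed by state FIPS."""
    return "001" + _check_state_fips(state_fips)


print(bos_county_code("16"), bos_zip_code("16"))
```

For Idaho (state FIPS 16), this yields the county code 16999 and the ZIP code 00116.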

Estimates Derived from Quarterly Workforce Indicators (QWI)

We fixed a bug in our methodology for estimating industry and occupation employment by demographics that caused the wrong year of national-average QWI data to be used as an input when an entire state was missing from QWI. Wyoming employment breakouts were affected for Emsi’s 2016.3-2017.1 releases.

Occupation by Place of Residence Data

Emsi’s 2017.1 release was the first to include occupation by place of residence data. One of the known limitations with that initial release was the absence of data in “999” or balance-of-state counties. This only affected state and national sums, and given that balance-of-state counties account for only a fraction of each state’s total employment and that most users will not be using place of residence data at those geography levels, the missing data should not have caused any major problems. Regardless, in Emsi’s 2017.2 release, the data for 999 counties are included and the state and national totals by place of residence should better reflect reality.

Occupational Earnings Percentiles

State and national occupational earnings percentiles for employee classes of worker (1 and 2) are no longer based on job-weighted aggregations of county estimates. Instead, earnings percentiles at the state and national levels are preserved from OES. The result is more accurate earnings percentiles at the state and national levels. Average hourly earnings are unaffected.

Social Accounting Matrix (SAM) and Input-Output (I-O) Years

The data is now for 2015-2016.

Occupation by Place of Residence Data

We now have occupational employment data by place of residence derived from Emsi industry and staffing data and Census LODES data.

ZIP Code Industry and Occupation Data

We improved our method of breaking out Emsi county-level data to the ZIP code level, making better use of USPS Delivery Statistics and Census’s ZIP Code Business Patterns (ZBP) and adding and making use of the following datasets:

  • Four new tabulations of ACS five-year ZCTA employment estimates
  • NCES Common Core of Data and IPEDS employment
  • Railroad Retirement Board ZIP code employment
  • IPEDS Employees by Assigned Position

Earnings for Self-Employed and Extended Proprietors

We improved our methodology for estimating Self-Employed and Extended Proprietors earnings, improving accuracy in both industry and occupation data. While some differences may be noticeable, most will not exceed the normal variability between releases.

Employment for Extended Proprietors

We improved our methodology for producing employment estimates for Extended Proprietors, slightly lowering employment counts across the nation. In certain circumstances, our previous methodology double-counted a small number of Self-Employed workers as Extended Proprietors.

State Projections

We reduced the level of detail we accept from state-published industry projections due to their varying quality and currency, effectively weighting Emsi and BLS industry projections more heavily.

Current Employment Statistics

Due to recent improvements in QCEW and Emsi release schedules, we no longer use CES to inform our current year industry employment estimates.