
Emsi Job Titles Methodology

In recent months, Emsi has expanded our library of job titles, growing from around 5,400 titles to over 75,000. While roles are often categorized in helpful ways through SOC or O*NET, those sources do not reflect the niche and constantly evolving nature of jobs in today’s economy. By expanding Emsi Titles to include the language used by employers in job postings, we gain a real-time view of new and emerging roles in the workforce.


Why expansion is important: 

  1. Granularity: With a larger number of titles, users can now search for specific job titles with increased success. 
  2. Recency: Emsi Titles are updated every few weeks to keep pace with shifting roles in the labor market. 
  3. Market-Alignment: Because Emsi Titles are both granular and current, users get a more accurate picture of how titles are evolving in the labor market. 


How it works: 

To make this transition, we began with roughly 39 million raw job titles. “Raw” titles are those that have been pulled directly from the title field of a posting or profile. For example, suppose a Facebook job posting lists the job title as “Data Science Manager, Messenger.” This would be the raw title. 

Though a computer may treat each of these raw titles as unique, we often find that two or more titles are effectively identical, differing only by a small discrepancy such as an extra space or a period. To solve this problem, we deduplicate the titles, meaning we collapse those near-identical variants into a single, unique title. In some cases, we find job titles with misspellings, acronyms, industry jargon, etc., so we dive deeper into the postings to understand the true function of the job and determine the corresponding Emsi Title. 
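The deduplication step described above can be sketched in a few lines of Python. This is a minimal illustration, not Emsi's actual pipeline: the function names `normalize_title` and `deduplicate` are hypothetical, and the normalization rules (lowercasing, dropping periods, collapsing whitespace) are assumptions based only on the "space or a period" example.

```python
import re
from collections import defaultdict


def normalize_title(raw: str) -> str:
    """Hypothetical normalization key: lowercase, drop periods, collapse whitespace."""
    key = raw.lower().replace(".", "")
    return re.sub(r"\s+", " ", key).strip()


def deduplicate(raw_titles):
    """Group raw titles that share the same normalization key."""
    groups = defaultdict(list)
    for title in raw_titles:
        groups[normalize_title(title)].append(title)
    return dict(groups)


# Illustrative data only: three variants that differ by a period or a space.
raw = ["Sr. Data Analyst", "Sr Data Analyst", "Sr.  Data Analyst ", "Data Engineer"]
deduped = deduplicate(raw)
print(len(raw), "raw titles ->", len(deduped), "unique titles")
```

Misspellings, acronyms, and jargon would not be caught by a rule this simple, which is consistent with the article's note that those cases require a closer look at the posting itself.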

Next, we used 20 million of those raw titles to build a tagging system that tags each posting with the closest title in the system. Following the example above, the raw title would be trimmed to “Data Science Manager.” 
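As a rough illustration of tagging a raw title with the closest known title, the sketch below scores candidates by word overlap (Jaccard similarity). The article does not say how the real tagging system measures closeness, so the scoring method, the `closest_title` function, the `min_score` threshold, and the small `KNOWN_TITLES` list are all assumptions made for this example.

```python
import re

# Hypothetical stand-in for the library of known titles.
KNOWN_TITLES = ["Data Science Manager", "Data Scientist", "Product Manager"]


def tokens(title: str) -> set:
    """Lowercased alphanumeric words in a title, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def closest_title(raw_title: str, known_titles, min_score: float = 0.5):
    """Return the known title with the highest word overlap, or None if too weak."""
    raw_tokens = tokens(raw_title)
    best, best_score = None, 0.0
    for candidate in known_titles:
        cand_tokens = tokens(candidate)
        union = raw_tokens | cand_tokens
        score = len(raw_tokens & cand_tokens) / len(union) if union else 0.0
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= min_score else None


tag = closest_title("Data Science Manager, Messenger", KNOWN_TITLES)
print(tag)
```

Here the raw title shares three of its four words with “Data Science Manager,” so that title scores highest, while a raw title with no sufficiently similar candidate is left untagged rather than forced into a poor match.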

With the remaining 16 million or so raw titles, we looked for matching phrases and grouped the titles accordingly. From those groups, we selected 100K titles, each representing its larger group, cleaned them up, and arrived at the 75K Emsi Titles now available in Emsi tools. 
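Grouping titles by matching phrases might look something like the following sketch, which assigns each title to the two-word phrase it shares with the most other titles. The use of two-word phrases, the tie-breaking rule, and the name `group_by_phrase` are assumptions for illustration; the article does not specify how phrases were matched or how representative titles were chosen.

```python
import re
from collections import Counter, defaultdict


def bigrams(title: str):
    """Two-word phrases in a title, lowercased and stripped of punctuation."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return [" ".join(pair) for pair in zip(words, words[1:])]


def group_by_phrase(titles):
    """Assign each title to the phrase it shares with the most other titles."""
    counts = Counter(bg for t in titles for bg in set(bigrams(t)))
    groups = defaultdict(list)
    for t in titles:
        candidates = bigrams(t)
        if not candidates:
            # Single-word titles form their own group.
            groups[t.lower()].append(t)
            continue
        # Most widely shared phrase wins; ties are broken alphabetically.
        key = min(candidates, key=lambda bg: (-counts[bg], bg))
        groups[key].append(t)
    return dict(groups)


# Illustrative data only.
titles = [
    "Registered Nurse - ICU",
    "Registered Nurse II",
    "Travel Registered Nurse",
    "Truck Driver CDL",
    "Local Truck Driver",
]
groups = group_by_phrase(titles)
for phrase, members in groups.items():
    print(phrase, "->", members)
```

In this toy example the shared phrase itself serves as a candidate representative for each group; a real cleanup pass would still need to review and standardize those candidates before they became final titles.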


