---
title: Job Dataset
description: Explore all 60 job dataset fields including column names, data types, descriptions, and fill rates showing data completeness metrics for each field
url: https://theirstack.com/en/docs/datasets/options/job
---

| Column | Type | Description | Fill Rate |
| --- | --- | --- | --- |
| `id` | integer | Unique identifier for the job | - |
| `url` | str | URL of the job posting. If we have a URL that the job board redirects to, where the job was originally posted, that URL is returned here. Otherwise, this field has the same value as the `source_url` field. | - |
| `job_title` | str | Title of the job position | - |
| `date_posted` | date | Date when the job was posted | - |
| `company_name` | str | Name of the company offering the job | - |
| `description` | str | Full description of the job | - |
| `location` | str | Location of the job | - |
| `short_location` | str | Short version of the job location | - |
| `long_location` | str | Long version of the job location | - |
| `state_code` | str | State code of the job location | - |
| `latitude` | float | Latitude of the job location | - |
| `longitude` | float | Longitude of the job location | - |
| `postal_code` | str | Postal code of the job location | - |
| `remote` | boolean | Whether the job is remote | - |
| `hybrid` | boolean | Whether the job is hybrid (partially remote) | - |
| `salary_string` | str | Salary information as a string | - |
| `min_annual_salary` | float | Minimum annual salary | - |
| `min_annual_salary_usd` | float | Minimum annual salary in USD | - |
| `max_annual_salary` | float | Maximum annual salary | - |
| `max_annual_salary_usd` | float | Maximum annual salary in USD | - |
| `avg_annual_salary_usd` | float | Average annual salary in USD | - |
| `salary_currency` | str | Currency of the salary | - |
| `country_codes` | array | List of country codes where the job is available | - |
| `discovered_at` | datetime | Timestamp when the job was discovered by TheirStack. Always equal to or later than the value of the `date_posted` field. Most jobs are discovered within 24 hours of being posted, and some are discovered in the following days. Read more about this [here](https://docs.theirstack.com/docs/data/job/freshness). | - |
| `source_url` | str | URL of the job board where the job was found | - |
| `seniority` | str | Seniority level of the job position. One of the following values: `c_level`, `staff`, `senior`, `mid_level`, `junior` | - |
| `hiring_team` | json | Information about the hiring team | - |
| `company.id` | str | Unique identifier for the company | - |
| `company.name` | str | Name of the company | - |
| `company.domain` | str | Company website domain | - |
| `company.possible_domains` | array | List of possible company domains | - |
| `company.iso2` | str | Two-letter country code where the company is based | - |
| `company.industry_id` | integer | Industry classification ID | - |
| `company.employee_count` | integer | Number of employees at the company | - |
| `company.annual_revenue_usd` | float | Annual revenue in USD | - |
| `company.total_funding_usd` | float | Total funding raised in USD | - |
| `company.funding_stage` | str | Current funding stage (e.g., Series A, Series B) | - |
| `company.last_funding_round_date` | date | Date of the most recent funding round | - |
| `company.founded_year` | integer | Year the company was founded | - |
| `company.yc_batch` | str | Y Combinator batch (if applicable) | - |
| `company.linkedin_id` | str | LinkedIn company ID | - |
| `company.linkedin_url` | str | LinkedIn company URL | - |
| `company.apollo_id` | str | Apollo.io company ID | - |
| `company.is_recruiting_agency` | boolean | Whether the company is a recruiting agency | - |
| `company.is_consulting_agency` | boolean | Whether the company is a consulting agency | - |
| `company.logo_url` | str | URL of the company logo | - |
| `company.annual_revenue_usd_readable` | str | Human-readable annual revenue (e.g., "$1.5M") | - |
| `company.last_funding_round_amount_readable` | str | Human-readable last funding amount | - |
| `company.long_description` | str | Detailed description of the company | - |
| `company.seo_description` | str | SEO-optimized company description | - |
| `company.city` | str | City where the company is headquartered | - |
| `company.postal_code` | str | Postal code of company headquarters | - |
| `company.alexa_ranking` | integer | Alexa ranking of the company website | - |
| `company.publicly_traded_symbol` | str | Stock ticker symbol (if publicly traded) | - |
| `company.publicly_traded_exchange` | str | Stock exchange where the company is listed | - |
| `company.investors` | str | List of company investors | - |
| `company.num_jobs` | integer | Total number of jobs posted by this company | - |
| `company.num_jobs_last_30_days` | integer | Number of jobs posted in the last 30 days | - |
| `keyword_slugs` | array | List of slugs for the technologies mentioned in the job title, description, or URL | - |
| `locations` | array | List of Location objects, as described [here](https://api.theirstack.com/#model/joblocation-output) | - |
| `employment_statuses` | array | Array containing one or more employment status values. Possible values: `temporary`, `full_time`, `internship`, `contract`, `part_time`, `other`, `apprenticeship`, `seasonal`, `volunteer`, `co_founder` | - |
| `workplace_types` | array | Array containing one or more workplace type values. Possible values: `on_site`, `hybrid`, `remote` | - |

## FAQS

#### Is there any way to get only active jobs?

Currently, we scrape each job only once. To offer this, we'd have to re-scrape millions of jobs every day for many days, which would increase our costs by 10x, 20x or even more. Those costs would have to be passed on to customers, and no customers would be willing to pay for that.

So the best solution at the moment is to check yourself whether each job is still active. For example, you could get the markdown of the page with Firecrawl and then pass it to an LLM, asking whether the job is still active.

If you only want jobs that are likely to still be active, keep in mind that the more recent a job is, the more likely it is to still be open. So another recommendation is to keep the date range you pass to the `posted_at` filters recent, such as the past 15 days.

#### Why are some fields less complete?

The completeness of certain fields depends a lot on how our sources structure and share their data.

- **City / exact location**: Many job boards only expose country-level information or make detailed locations optional, so we can’t always infer a precise city.
- **Salary range**: Salary transparency rules vary by region and company, which means this data is simply not present in many postings.

Whenever the data is available in a consistent way, we extract and normalize it. When it isn’t, we prefer to leave the field empty rather than guess.
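
As a rough illustration of the two FAQ answers above, the following Python sketch keeps only recently posted jobs and measures how often a field is filled in a batch of records. It rests on assumptions not stated in this page: records are loaded as flat Python dicts keyed by the column names in the table, `date_posted` is serialized as an ISO `YYYY-MM-DD` string, and empty fields arrive as `None`, an empty string, or an empty list. The helper names and sample records are hypothetical.

```python
from datetime import date, timedelta


def is_filled(value):
    """Treat None, empty strings, and empty lists/dicts as missing.

    Note that False and 0 count as filled: they are real values.
    """
    return value is not None and value != "" and value != [] and value != {}


def fill_rate(jobs, field):
    """Share of records (0.0 to 1.0) in which `field` has a value."""
    if not jobs:
        return 0.0
    return sum(is_filled(job.get(field)) for job in jobs) / len(jobs)


def posted_within(jobs, days, today):
    """Keep jobs whose `date_posted` is at most `days` days before `today`.

    Assumes `date_posted` is an ISO "YYYY-MM-DD" string; records
    without a posting date are dropped.
    """
    cutoff = today - timedelta(days=days)
    return [
        job
        for job in jobs
        if is_filled(job.get("date_posted"))
        and date.fromisoformat(job["date_posted"]) >= cutoff
    ]


# Hypothetical sample records, not real dataset rows.
sample_jobs = [
    {"id": 1, "date_posted": "2024-05-20", "min_annual_salary_usd": 90000.0, "remote": True},
    {"id": 2, "date_posted": "2024-05-01", "min_annual_salary_usd": None, "remote": False},
    {"id": 3, "date_posted": "2024-05-25", "min_annual_salary_usd": None, "remote": False},
]

recent = posted_within(sample_jobs, days=15, today=date(2024, 5, 30))
print([job["id"] for job in recent])  # [1, 3]
print(round(fill_rate(sample_jobs, "min_annual_salary_usd"), 2))  # 0.33
```

Passing `today` explicitly keeps the filter reproducible; in practice you would likely pass `date.today()`. Recency only raises the odds that a posting is still open, so it complements rather than replaces a per-job check.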