Datasets
Access complete jobs, closed jobs, technographics, and company datasets — delivered as Parquet or CSV, with historical coverage and daily or hourly updates via S3.

What are datasets?
Datasets give you direct access to our complete database of jobs, technology usage, and company information. Think of them as your raw data goldmine – perfect when you need maximum flexibility for advanced analytics, machine learning projects, or custom analysis.
We've got three powerful datasets ready for you:
- Jobs Dataset: Access our complete collection of 205M job postings from 195 countries, dating back to 2021. Each job record includes essential details like job title, description, salary, location, company name, company URL, company industry, company size... Check out the complete data dictionary for all available fields.
- Jobs Closed Dataset: A dedicated dataset of job postings that TheirStack has detected as no longer live. Same schema as the Jobs dataset with one key addition: closed_at is always populated, telling you the day each posting went offline. Files are partitioned by close date, at daily and hourly granularity. Use this for historical close-date analysis, building pipelines that react to role fills, or loading closed job data into a warehouse.
- Technographics Dataset: Dive into 47M technology usage signals across 12M companies using 32k different technologies. This isn't just one file – you'll get three comprehensive datasets: the main technographics data (company_id, technology_slug, confidence_score, n_jobs...), detailed company profiles (including domain, country, revenue, and employee count), and a complete technology catalog with descriptions and categories. Check out the complete data dictionary for all available fields.
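As a rough illustration of the close-date analysis mentioned for the Jobs Closed dataset, the sketch below counts how many postings closed on each day. It assumes that closed_at arrives as an ISO 8601 timestamp string and that the title column is named job_title; both are assumptions for this example, so check the data dictionary for the real field names and formats. The sample rows are made up.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def closes_per_day(rows):
    """Count closed postings per calendar day from an iterable of dict rows."""
    counts = Counter()
    for row in rows:
        # Assumption: closed_at is an ISO 8601 timestamp string.
        closed_at = datetime.fromisoformat(row["closed_at"])
        counts[closed_at.date().isoformat()] += 1
    # Return the days in chronological order.
    return dict(sorted(counts.items()))


# Made-up sample rows standing in for a Jobs Closed CSV file.
SAMPLE_CSV = """job_title,closed_at
Data Engineer,2024-03-01T08:15:00
Backend Developer,2024-03-01T17:40:00
Product Manager,2024-03-02T09:05:00
"""

daily = closes_per_day(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(daily)
```

The same function works on any iterable of dict-like rows, so it can be pointed at a real file opened with csv.DictReader.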
Frequency of updates
TheirStack offers three flexible dataset access options to meet your data needs:
- Historical Access: Receive a one-time download link containing all available records at the time of purchase.
- Daily Updates: Get daily download links with new records added to the dataset, including delta files for incremental updates.
- Complete Access: Combine both historical and daily updates for comprehensive data coverage.
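If you go with Complete Access, you will typically load the historical snapshot once and then fold each daily delta into it. Here is a minimal sketch of that idea. It assumes every record carries a unique key (called id here, which is a hypothetical name) and that a delta file holds new or changed records; neither detail is confirmed above, so adapt it to the actual schema.

```python
def apply_deltas(snapshot, deltas, key="id"):
    """Return a new list of records with each delta batch upserted by key.

    snapshot: list of dicts from the one-time historical download.
    deltas:   list of delta batches (each a list of dicts), oldest first.
    key:      hypothetical unique identifier field; adjust to the real schema.
    """
    merged = {record[key]: record for record in snapshot}
    for batch in deltas:
        for record in batch:
            # A later record with the same key replaces the earlier one.
            merged[record[key]] = record
    return list(merged.values())


# Made-up records for illustration only.
historical = [
    {"id": 1, "job_title": "Data Engineer"},
    {"id": 2, "job_title": "Backend Developer"},
]
day_1 = [{"id": 3, "job_title": "Product Manager"}]
day_2 = [{"id": 2, "job_title": "Senior Backend Developer"}]

current = apply_deltas(historical, [day_1, day_2])
print(len(current))
```

Keying the merge on a dictionary keeps the update idempotent: applying the same delta twice leaves the result unchanged.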
Delivery format
Each dataset is delivered as a link to an S3 bucket. You can choose between CSV and Parquet format.
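Since a file can arrive in either format, a small loader that picks the reader from the file extension can be handy. The sketch below assumes the file has already been downloaded to local disk and that it ends in .csv or .parquet. The load_records helper is a hypothetical name for this example, and the Parquet branch relies on the third-party pyarrow package.

```python
import csv
from pathlib import Path


def load_records(path):
    """Load a locally downloaded dataset file into a list of dicts."""
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        with open(path, newline="", encoding="utf-8") as handle:
            # Note: every CSV value comes back as a string.
            return list(csv.DictReader(handle))
    if suffix == ".parquet":
        # Imported lazily so CSV-only users do not need pyarrow installed.
        import pyarrow.parquet as pq

        return pq.read_table(path).to_pylist()
    raise ValueError(f"Unsupported dataset format: {suffix!r}")
```

Parquet keeps column types and is usually much smaller on disk, so it tends to be the better choice for large loads; CSV is easier to inspect by hand.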
Data structure
- Jobs dictionary
- Jobs Closed dictionary
- Companies dictionary
- Technographics dictionary
- Technologies dictionary
How to get the latest dataset link
To get the latest dataset download link, send a GET request to the dataset endpoint.
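The request itself can be sketched with Python's standard library. Everything specific here is an assumption for illustration: the endpoint URL is a placeholder you must replace with the real dataset endpoint, the Bearer token header is a guess at the authentication scheme, and the response is assumed to be JSON. Check the API reference for the actual values.

```python
import json
import urllib.request


def build_dataset_request(endpoint_url, api_key):
    """Build (but do not send) a GET request for the dataset endpoint."""
    return urllib.request.Request(
        endpoint_url,
        method="GET",
        headers={
            # Assumption: the API uses Bearer token authentication.
            "Authorization": f"Bearer {api_key}",
            "Accept": "application/json",
        },
    )


def fetch_dataset_links(endpoint_url, api_key, timeout=30):
    """Send the request and decode the body, assuming a JSON response."""
    request = build_dataset_request(endpoint_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (placeholder URL and key, not real values):
# links = fetch_dataset_links("https://api.example.com/datasets", "YOUR_API_KEY")
```

Keeping request construction separate from sending makes it easy to inspect the headers and URL before any network call happens.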
