jobs.devinl.im

Overview

So this is my first blog post ever (hooray!), and my aim here is to concisely write about the motivation, general architecture, and design decisions behind this site.

First, the idea:

  1. What is it?
    • It’s a site that scrapes jobs directly from company portals.
  2. Why?
    • I’m about to graduate with my Master’s in Data Science, and it’s generally a good idea to apply to recently posted jobs rather than old postings. However, the usual ways of finding fresh postings are quite cumbersome:
      • You can manually check each company’s site every day.
      • Some sites let you set up notifications individually.
      • You can download browser extensions that check for page changes on a timer.
      • I actually found one site that seems to be doing this, JobRadar, but it’s missing a lot of sources.
  3. How?
    • Historically, the hard part of building something like this has been the scrapers. You’d handcraft a scraper for each company’s site, which would eventually break and need updates. As you scale the number of sources, it becomes a never-ending cycle of creating and updating parsers.
    • Instead of doing that, we can now just use LLMs to identify relevant information and extract it into our desired schema -> no more building parsers!
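
To make the "LLM instead of parser" idea a bit more concrete, here’s a minimal sketch in Python. It is not the site’s actual code: the `JobPosting` schema, the prompt wording, and the `extract_jobs` helper are all hypothetical, and the `llm` argument is just any function that takes a prompt string and returns the model’s text reply, so you can plug in whichever provider you like. The point is that the same function works for every company page, and anything the model returns that doesn’t match the schema gets dropped.

```python
import json
from dataclasses import dataclass, fields
from typing import Callable, List


@dataclass
class JobPosting:
    # Hypothetical schema; a real one would likely carry more fields.
    title: str
    location: str
    url: str


PROMPT_TEMPLATE = (
    "Extract every job posting from the page text below.\n"
    "Return only a JSON array of objects with the keys: {keys}.\n\n"
    "Page text:\n{page_text}"
)


def extract_jobs(page_text: str, llm: Callable[[str], str]) -> List[JobPosting]:
    """Ask an LLM to map raw page text onto the JobPosting schema.

    `llm` is an assumed stand-in for a real model call: prompt in, text out.
    """
    keys = [f.name for f in fields(JobPosting)]
    prompt = PROMPT_TEMPLATE.format(keys=", ".join(keys), page_text=page_text)
    raw = llm(prompt)

    # Model output is untrusted, so parse and validate defensively.
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []

    jobs = []
    for item in items:
        # Keep only objects that have every schema key as a string.
        if isinstance(item, dict) and all(isinstance(item.get(k), str) for k in keys):
            jobs.append(JobPosting(**{k: item[k] for k in keys}))
    return jobs


def fake_llm(prompt: str) -> str:
    # Canned reply standing in for a real model, for demonstration only.
    return json.dumps([
        {"title": "Data Scientist", "location": "Remote",
         "url": "https://example.com/jobs/1"},
        {"title": "Missing fields"},  # dropped: does not match the schema
    ])


if __name__ == "__main__":
    for job in extract_jobs("<html>...careers page text...</html>", fake_llm):
        print(job)
```

Swapping `fake_llm` for a real model call is the only per-provider work; there is nothing per-company to maintain, which is the whole appeal compared to handcrafted scrapers.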

…more writing to come :)