Data Engineer

TEAL India, Bangalore · tealindia.in · Full-time employment · Programming

About the job

As a data engineer at TEAL, you'll be taking the plunge into a rich data lake that includes everything from complex geospatial data to court orders to transaction data. You'll be hustling and getting your hands dirty with every part of the data pipeline, always keeping in mind how all of this data will ultimately power a revolutionary real estate risk platform. We will consider you for one of two broad tracks.

In the transformation & synthesis track, your day-to-day will largely include:

  • Writing complex regular expressions and other parsers to extract usable data from messy PDF, HTML, JSON and other files that number in the millions to tens of millions
  • Working with NLP tools, machine translators and language specialists to efficiently translate documents from languages ranging from Persian to Tamil
  • Working with data annotators and QC experts to ensure data quality is at its highest

In the ingestion track, your day-to-day will largely include:

  • Constantly scoping out new data sources to complement existing ones
  • Creating and maintaining distributed web scrapers using Python and RabbitMQ
  • Creating, using and maintaining web scraping libraries so that future scrapers can be built faster

About you

Regardless of what track you work in, we'd love it if you:

  • Are proficient in Python or any other object-oriented language
  • Have worked with large interdisciplinary datasets (millions to hundreds of millions of rows in a SQL database)
  • Are patient and methodical with unstructured and messy data
  • Are always hungry to learn newer and better technologies to make the data ecosystem faster, smoother and less siloed

It is NOT OK for recruiters, HR consultants, and other intermediaries to contact this employer