Data Engineering & Pipeline Management

Implemented responsive and modern front-end app

Challenges

  • The application hosts hundreds of thousands of datasets (free or paid) sourced from thousands of providers.
  • Enable enterprise users, decision scientists, and data analysts to upload their own organizational datasets.
  • Facilitate joins between the uploaded transactional/non-transactional datasets and the other publicly hosted datasets.
  • These joins should execute within a few seconds for a seamless user experience.

Solutions

  • Implemented a responsive, modern frontend app for data scientists using React and Redux.
  • Designed and implemented all middle-tier services, including the APIs and the data access layer, in Python with Django.
  • Wrote an ANSI SQL code generator in Python that takes all user selections, consults the metadata system, and generates the final query that runs on Snowflake.
  • Built search and recommendation systems on Neo4j that help users find features pertinent to their own uploaded datasets.
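
To illustrate the SQL generator described above, here is a minimal sketch of how user selections might be checked against metadata and turned into an ANSI SQL join. It is not the project's actual code: the `METADATA` dictionary stands in for the metadata system, and `JoinSelection`, `build_join_query`, and all table and column names are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the metadata system: table name -> known columns.
METADATA = {
    "uploads.sales": {"zip_code", "order_date", "revenue"},
    "public.weather": {"zip_code", "obs_date", "avg_temp"},
}


@dataclass(frozen=True)
class JoinSelection:
    """What the user picked in the UI (hypothetical shape)."""
    left_table: str
    right_table: str
    keys: tuple      # ((left_column, right_column), ...)
    columns: tuple   # ((table, column), ...)


def _check_column(table: str, column: str) -> None:
    # Only names present in the metadata are allowed into the SQL text,
    # which also keeps arbitrary user input out of the generated query.
    if table not in METADATA:
        raise ValueError(f"unknown table: {table!r}")
    if column not in METADATA[table]:
        raise ValueError(f"unknown column: {table}.{column}")


def build_join_query(sel: JoinSelection) -> str:
    """Generate an ANSI SQL inner join from a validated selection."""
    if not sel.keys or not sel.columns:
        raise ValueError("need at least one join key and one output column")
    if sel.left_table == sel.right_table:
        raise ValueError("self-joins are not handled in this sketch")

    alias = {sel.left_table: "l", sel.right_table: "r"}
    for table, column in sel.columns:
        if table not in alias:
            raise ValueError(f"table not part of the join: {table!r}")
        _check_column(table, column)
    for left_col, right_col in sel.keys:
        _check_column(sel.left_table, left_col)
        _check_column(sel.right_table, right_col)

    select_list = ", ".join(f"{alias[t]}.{c}" for t, c in sel.columns)
    on_clause = " AND ".join(f"l.{lc} = r.{rc}" for lc, rc in sel.keys)
    return (
        f"SELECT {select_list} "
        f"FROM {sel.left_table} AS l "
        f"INNER JOIN {sel.right_table} AS r ON {on_clause}"
    )


selection = JoinSelection(
    left_table="uploads.sales",
    right_table="public.weather",
    keys=(("zip_code", "zip_code"),),
    columns=(("uploads.sales", "revenue"), ("public.weather", "avg_temp")),
)
query = build_join_query(selection)
print(query)
```

The generated string can then be handed to whatever Snowflake client the service uses. Validating every identifier against the metadata before building the query is one simple way to keep the generator from emitting SQL for tables or columns that do not exist.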

Tools & Technologies

NumPy, Django, Redux, React, AWS, Snowflake

Key benefits

  • Snowflake runs complex joins between large datasets, including joins that apply various math functions, within seconds, even when the output runs to billions of rows.
  • It automatically adds clusters as the number of concurrent queries grows with the workload.
  • Data scientists can iterate on their models quickly and reach higher accuracy levels, because they now spend far less time finding the most relevant features.
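
The automatic cluster scaling mentioned in the benefits above matches what Snowflake calls a multi-cluster warehouse. The case study does not give the warehouse settings, so the sketch below only shows how such a warehouse could be declared. The function name, the warehouse name, and every value are assumptions, and multi-cluster warehouses depend on the Snowflake edition in use.

```python
# Subset of Snowflake warehouse sizes accepted by this sketch.
ALLOWED_SIZES = {"XSMALL", "SMALL", "MEDIUM", "LARGE", "XLARGE"}


def multi_cluster_warehouse_ddl(name, size="MEDIUM", min_clusters=1, max_clusters=4):
    """Build a CREATE WAREHOUSE statement for an auto-scaling warehouse.

    All defaults are illustrative, not the project's configuration.
    """
    if not name.isidentifier():
        raise ValueError(f"invalid warehouse name: {name!r}")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size!r}")
    if not 1 <= min_clusters <= max_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} WITH "
        f"WAREHOUSE_SIZE = {size} "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters} "
        "SCALING_POLICY = STANDARD "
        "AUTO_SUSPEND = 300 "
        "AUTO_RESUME = TRUE"
    )


ddl = multi_cluster_warehouse_ddl("analytics_wh")
print(ddl)
```

With a minimum below the maximum, Snowflake starts and stops clusters between those bounds as query concurrency changes, which is the behavior the benefit describes.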