The Impact You’ll Drive
We are looking for a seasoned data engineering professional who is passionate about building scalable, reliable, and high-performance data platforms. The ideal candidate has deep experience designing modern data architectures, developing real-time and batch data pipelines, and working with large-scale distributed systems on cloud platforms. You should be as comfortable collaborating with business stakeholders to translate data requirements into robust data models as you are implementing engineering best practices that drive quality, scalability, and innovation. If you enjoy solving complex data challenges and building the foundation that powers data-driven decision-making, we’d love to hear from you.
The Hats You Will Wear
- Create and maintain optimal data pipeline architecture.
- Extract, transform, and load (ETL) data from multiple sources and formats using big data technologies.
- Develop and maintain robust data warehouses and lakes on AWS/GCP, ensuring the architecture supports efficient data retrieval and storage.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS ‘big data’ technologies.
- Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.
- Author complex SQL queries and design data models to optimize for performance and scalability, addressing product and business requirements.
- Implement real-time data processing pipelines within a microservices architecture, ensuring timely data availability and integrity.
- Collaborate with cross-functional teams (Data Science, Product, Business) to support data infrastructure and analytics initiatives.
- Advocate for and implement best practices in data engineering, including Agile, TDD (Test-Driven Development), and CI/CD (Continuous Integration/Continuous Deployment) to enhance team productivity and product quality.
- Stay abreast of emerging data technologies and evaluate their application to continually improve the data ecosystem within the company.
- Create a conceptual data model to identify key business entities and visualize their relationships.
- Create detailed logical models using business intelligence logic by identifying all the entities, attributes, and their relationships.
- Create a taxonomy/data dictionary to communicate data requirements that are important to business stakeholders.
- Work on acquiring external data sets through APIs and/or WebSockets, and prepare physical data models on top of them.
The Perfect Fit
- 7+ years of experience in a Data Engineer role.
- Experience with big data tools: HDFS/S3, Spark/Flink, Hive, HBase, Kafka/Kinesis, etc.
- Experience with relational SQL and NoSQL databases, including Elasticsearch and Cassandra/MongoDB.
- Experience with data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
- Experience with AWS/GCP/Azure cloud services.
- Experience with stream-processing systems: Spark Streaming, Flink, etc.
- Experience with object-oriented/functional programming languages: Java, Scala, etc.
- Experience building and optimizing ‘big data’ data pipelines, architectures and data sets.
- Strong analytic skills related to working with structured/unstructured datasets.
- Experience building processes that support data transformation, data structures, dimensional modeling, metadata, dependency management, schema registration/evolution, and workload management.
- Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
- Experience supporting and working with cross-functional teams in a dynamic environment.
- Solid understanding of best practices in software development, including Agile methodologies, TDD, and CI/CD processes.
The Problem We’re Solving
Financial institutions today are held back by legacy systems that are slow, rigid, and expensive to scale. Launching or evolving credit, lending, and UPI products often takes months, requires heavy engineering effort, and limits the ability to create personalized customer experiences.
At the same time, customer expectations have changed: speed, flexibility, and tailored financial products are no longer optional. Banks and fintechs need infrastructure that allows them to innovate quickly, adapt continuously, and scale without friction.
This is where we come in.
At Vegapay, we are building modern, configurable fintech infrastructure that enables banks, NBFCs, and enterprises to design, launch, and manage credit and payment programs with ease. Our platform brings together flexibility, speed, and control, helping our partners unlock new growth opportunities and deliver personalized banking experiences at scale.
The Opportunity Ahead
- Own and shape the data and ML strategy for high-impact fintech products
- Work on real-world problems at scale across credit, payments, and financial data
- High ownership, from building pipelines to deploying production-grade ML systems
- Opportunity to explore and apply cutting-edge areas like LLMs, Agentic AI, and advanced ML in production