| Department: | Product Development |
| Location: | |
The Mission
Most data engineering roles are about moving data from A to B. This one is about making 20 years of complex, relational ERP data legible to AI agents so they can reason over financial transactions, inventory movements, and supply chain events without hallucinating.
ECI is rebuilding how enterprise software is built and operated using an AI-native model. The data layer is the foundation everything else runs on. Without a world-class context engine, the agents are guessing. You are the person who makes sure they never have to.
This is a greenfield mandate. You will hire the team, choose the stack, define the architecture, and own the outcome. The CTO is your only direct stakeholder.
What You'll Own
You are not supporting the AI initiative. You are building the infrastructure without which it cannot exist.
Context Architecture & Retrieval
Design and own the retrieval systems that allow AI agents to reason over ERP data with zero hallucinations
Build and scale the vector infrastructure (pgvector, Qdrant, or equivalent) with production-grade embedding and reranking pipelines
Own the hybrid search strategy: semantic retrieval layered on top of SQL-scoped financial data
Drive context window optimization: packing the most relevant financial 'truth' into each LLM call efficiently
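The hybrid search described above can be sketched in miniature: a hard SQL scope first selects only the rows an agent is allowed to see, then a semantic rerank orders them by relevance. This is an illustrative assumption of how the two layers compose, not ECI's actual design; the `gl_entries` schema, the three-dimensional toy embeddings, and the pure-Python cosine are all hypothetical stand-ins for a real embedding model and vector index.

```python
# Hedged sketch: SQL scoping first, semantic reranking second.
# Table/column names (gl_entries, fiscal_period) are illustrative only.
import sqlite3, json, math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE gl_entries (id INTEGER, memo TEXT, fiscal_period TEXT, embedding TEXT)")
rows = [
    (1, "Vendor payment - office supplies", "2024-Q1", [0.9, 0.1, 0.0]),
    (2, "Inventory write-down",             "2024-Q1", [0.1, 0.9, 0.2]),
    (3, "Vendor payment - raw materials",   "2023-Q4", [0.8, 0.2, 0.1]),
]
db.executemany("INSERT INTO gl_entries VALUES (?, ?, ?, ?)",
               [(i, m, p, json.dumps(e)) for i, m, p, e in rows])

def hybrid_search(query_vec, period, k=2):
    # Step 1: hard SQL scope - only rows in the permitted fiscal period.
    scoped = db.execute(
        "SELECT id, memo, embedding FROM gl_entries WHERE fiscal_period = ?",
        (period,),
    ).fetchall()
    # Step 2: semantic rerank within the scoped candidate set.
    ranked = sorted(scoped, key=lambda r: cosine(query_vec, json.loads(r[2])), reverse=True)
    return [(r[0], r[1]) for r in ranked[:k]]

print(hybrid_search([1.0, 0.0, 0.0], "2024-Q1"))
# → [(1, 'Vendor payment - office supplies'), (2, 'Inventory write-down')]
```

Note that row 3 never reaches the reranker: the SQL scope is a correctness boundary, not a performance hint, which is what keeps semantic retrieval from leaking out-of-scope financial data.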
Knowledge Graph & MDM
Lead the Master Data Management strategy: golden-record survivorship, identity resolution, and entity deduplication across ERP entities
Build the knowledge graph that maps relationships between Vendors, Purchase Orders, Invoices, GL Entries, and Inventory so agents understand meaning, not just rows
Own the semantic layer: translate a 500-table legacy schema into a structured, LLM-readable ontology
Define data quality standards and automated validation pipelines that enforce them continuously
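A minimal sketch of the survivorship and deduplication work named above: duplicate vendor records are clustered on a normalized name key, the most recently updated record survives, and missing fields are backfilled from older duplicates. The record shapes, the name-based match key, and the "most recent wins" rule are simplified assumptions; production MDM typically uses probabilistic matching and per-field survivorship rules.

```python
# Hedged sketch of golden-record survivorship for vendor master data.
from collections import defaultdict

records = [
    {"source": "erp_a", "name": "ACME Corp.", "tax_id": "12-345", "updated": "2024-05-01"},
    {"source": "erp_b", "name": "Acme Corp",  "tax_id": None,     "updated": "2024-06-15"},
    {"source": "erp_a", "name": "Globex LLC", "tax_id": "98-765", "updated": "2023-11-20"},
]

def match_key(rec):
    # Identity resolution (simplified): normalize the name into a blocking key.
    return "".join(ch for ch in rec["name"].lower() if ch.isalnum())

def golden_records(recs):
    clusters = defaultdict(list)
    for r in recs:
        clusters[match_key(r)].append(r)
    golden = {}
    for key, dupes in clusters.items():
        # Survivorship: most recently updated record wins,
        # then older duplicates backfill any missing fields.
        dupes.sort(key=lambda r: r["updated"], reverse=True)
        merged = dict(dupes[0])
        for older in dupes[1:]:
            for field, value in older.items():
                if merged.get(field) is None:
                    merged[field] = value
        golden[key] = merged
    return golden

print(golden_records(records))
# Two golden records: the Acme duplicates collapse, with tax_id backfilled.
```

The same cluster-then-merge shape extends naturally to the knowledge graph: each golden record becomes one node, so a Purchase Order never points at two copies of the same vendor.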
Data Platform & Infrastructure
Build the core data platform from scratch: ingestion, transformation, storage, and serving layers
Own the modern data stack (dbt, Airflow or equivalent, Postgres/SQL Server) with an AI-augmented workflow throughout
Implement data-centric evals: 'Judge Agents' that verify AI output against ground truth SQL
Build synthetic data generation pipelines that produce high-fidelity, relationally consistent ERP data for agent training and testing
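The 'Judge Agent' idea above reduces, at its core, to verifying an agent's claimed figure against ground-truth SQL. A hedged sketch, assuming an invoices table and a numeric tolerance (both illustrative, not the production eval harness):

```python
# Hedged sketch of a data-centric eval: a "Judge" that checks an agent's
# claimed number against ground-truth SQL before the answer is trusted.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE invoices (vendor TEXT, amount REAL)")
db.executemany("INSERT INTO invoices VALUES (?, ?)",
               [("acme", 100.0), ("acme", 250.0), ("globex", 75.0)])

def judge(agent_answer, ground_truth_sql, params=(), tolerance=0.01):
    """Return (verdict, truth): does the agent's figure match the database?"""
    (truth,) = db.execute(ground_truth_sql, params).fetchone()
    verdict = abs(agent_answer - truth) <= tolerance
    return verdict, truth

# The agent claims acme's total payable is 350.0; verify against SQL.
ok, truth = judge(350.0, "SELECT SUM(amount) FROM invoices WHERE vendor = ?", ("acme",))
print(ok, truth)  # → True 350.0
```

A hallucinated figure (say 999.0) fails the same check, which is the point: the eval is anchored in the database, not in another model's opinion.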
Builder Data Track
Own the Data Builder squad: hire, develop, and hold the team to Builder-level output standards
Partner with the Dev and QA Builder leads to ensure data systems are the right interface for agentic tool-calling
Run the Data track of the Builder Bootcamp: define the curriculum, set the graduation bar, make the calls
Partner with product and engineering on AI feature data requirements: you are the upstream dependency for almost everything
Governance & Compliance
Define data governance policies for AI-consumed data: lineage, access control, PII handling, audit trails
Own compliance requirements relevant to financial data in an ERP context: SOC 2, data residency, retention policies
Build the observability layer (OpenTelemetry, Weights & Biases, or equivalent) for embedding quality and retrieval performance
Who You Are
Requirements:
You have built and led a data engineering team before: you know how to hire, structure, and technically lead a team that ships production data systems
Knowledge graph or MDM at scale: you have designed entity resolution, survivorship rules, and ontologies for complex relational domains, not just prototyped them
AI/ML platform or LLMOps experience: you have operated embedding pipelines, vector stores, and LLM-integrated data systems in production, and you understand latency, cost, and quality trade-offs
You think in systems: schema design, retrieval architecture, and data contracts are your native language
You are comfortable in ambiguity: greenfield means no existing patterns to follow and no team to hand things off to on day one
Highly Desirable:
Production RAG pipelines over structured or financial data: you have gone beyond demos and operated retrieval systems with real precision/recall requirements
ERP, financial, or supply chain data domain expertise: you understand what makes a General Ledger different from a web analytics event stream
Modern data stack depth (dbt, Airflow, Postgres, SQL Server): you have opinions about transformation-layer design and know when to break the rules
Experience working across time zones with an offshore engineering team (India context is a plus)
The Stack:
Languages
Python, SQL (Postgres / SQL Server), TypeScript
AI / Retrieval
OpenAI / Anthropic APIs, pgvector, Qdrant, LangChain / LangGraph
Data Platform
dbt (AI-augmented), Apache Airflow, Docker
Graph / MDM
Neo4j (primary), with open evaluation of alternatives
Observability
Weights & Biases (embedding evals), OpenTelemetry, custom Judge Agents
Infra
AWS / GCP, Kubernetes, GitHub Actions
The Archetypes We're Looking For:
The Data Alchemist: you believe data is only valuable when an AI can reason over it, and you spend time experimenting with embedding models and retrieval techniques to make that true
The Manual Mapping Hater: if you have to map two schemas twice, you've already built an agent to do it for you
Rigor over Hype: you know the difference between a vector search demo and a production-grade financial data engine; you care about precision and recall
The Founding Mindset: you're energized by building from scratch, not managing existing systems, and you make decisions confidently without a playbook
Why this role:
ERP data is the hardest data problem in enterprise software: 20 years of relational financial history, undocumented schemas, and zero tolerance for hallucination. If you can solve RAG for an ERP, you have solved the hardest version of the problem.
Greenfield with real stakes: you are not inheriting someone else's technical debt or org structure. You build what you believe will win.
Direct line to the CTO: no data governance committee, no analytics manager layer, no 6-month roadmap approval process
Unlimited context budget: access to frontier models and the compute to run serious embedding and indexing experiments
The work matters: every AI feature in the product runs on the infrastructure you build