TrueHire Staffing LLC
Job Title: Data Engineer / AI Engineer (Agentic AI Platform – Financial Data)
Location: Philadelphia, PA (Hybrid)
Duration: 12+ month contract
About the Role
We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality, structured datasets used by financial institutions.
This is not a typical “LLM wrapper” role.
You will work on systems that:
Extract data from noisy, inconsistent sources
Validate and reconcile outputs across multiple inputs
Ensure correctness, traceability, and auditability
The challenge is not just applying LLMs—it’s making them reliable in production for financial workflows.
What You’ll Work On
Designing pipelines that process high-volume financial documents (batch + near real-time)
Building LLM-powered extraction workflows (classification, parsing, summarization)
Implementing validation layers (rule-based + model-based) to reduce hallucinations
Developing retrieval systems using embeddings and vector search
Architecting end-to-end systems: ingestion → processing → storage → serving
Ensuring data quality, observability, and fault tolerance
Collaborating with product to turn messy data into usable financial intelligence
Core Requirements
Strong Python and backend/data engineering experience
Experience building production data pipelines (ETL, streaming, or async systems)
Solid understanding of distributed systems and failure modes
Experience working with LLM-based systems in production:
Prompt design
Output validation
Retry/fallback strategies
Evaluation and monitoring
Experience with data storage systems (SQL + NoSQL)
Familiarity with cloud infrastructure (AWS or similar)
Preferred Experience
Experience with RAG / vector search systems
Background in financial data or capital markets
Experience with streaming systems (Kafka, etc.)
Experience building multi-step or agent-style workflows
What Makes This Role Interesting
Work on high-accuracy AI systems where correctness matters
Solve real problems around:
LLM reliability and hallucination mitigation
Data consistency across conflicting sources
Real-time vs correctness tradeoffs
Build systems used in financial decision-making workflows
High ownership over core architecture in an early-stage environment
Nice to Have (not required)
Experience with orchestration tools (Airflow, etc.)
Exposure to evaluation frameworks for LLMs
Experience working with large-scale document processing
Tech Stack (Representative, not exhaustive)
Python, APIs, async processing
LLM APIs + embeddings
SQL / NoSQL databases
Cloud infrastructure (AWS)
Data pipelines and streaming systems
Vector Databases
Best Regards,
Ashish Singh
TrueHire Staffing
5900 Balcones Drive, Suite 100, Austin, TX 78731
Email ID:
Web: