About River
River is the human layer of the internet: the consent and ownership infrastructure that puts people back in the loop. Users see their data, control who accesses it, earn 70% of the value it creates, and get AI that finally serves them, not the platform.
We are pre-revenue, raising a $3M seed round at a $30M pre-money valuation, and preparing for market launch. Our founding CEO brought MTV and NetJets to Europe, and we are backed by technology luminaries from Google, Sun Microsystems, and beyond.
We are looking for a Lead Data Scientist to take full ownership of everything data science at River – from our personal data graph to our next-generation agentic AI systems. This is a foundational hire that will underpin our expanding team. You will be the person who defines how River understands, connects, and reasons over user data at scale, and who builds the data science function from the ground up.
What We're Building
River is three products, one platform
- River Social – live and growing. A social platform where you own your data, control your identity, and get paid when it's used. 70% of value goes back to you.
- River Source – enterprise consent infrastructure. The bridge that gives AI platforms and brands access to high-fidelity, consented user data at scale.
- RiverAI – launching 2026. A desktop AI client built on your personal data graph, with real agentic capabilities that work for you, not just talk to you.
RiverAI is powered by Rivera, River's AI engine, which is unique:
- It knows you. Rivera reasons over your personal data graph – your preferences, relationships, history, and context – to deliver intelligence that's genuinely yours.
- It's agentic. Rivera doesn't just answer questions – it takes action on your behalf, orchestrating multi-step workflows across your data and services with full transparency and user consent at every step.
- It's proactive and predictive. Rivera anticipates your needs and offers to help before you ask.
- It's portable. Your data identity moves with you. Rivera works across River's platform and beyond, acting with your authority wherever AI meets the individual.
What You'll Be Doing
- Designing and evolving River's personal data graph – defining entity schemas, relationship models, and enrichment pipelines that transform raw user data from diverse platforms into a rich, interconnected graph of individual intelligence.
- Building and maintaining data taxonomies – creating structured classification systems that organize the vast diversity of user data into coherent, queryable categories, enabling consistent data interpretation across River Social, River Source, and RiverAI.
- Cleaning, normalizing, and structuring noisy data – building robust pipelines that ingest messy, heterogeneous real-world data from multiple platforms and sources, resolving inconsistencies, deduplicating entities, and transforming it into high-fidelity graph-ready data at scale.
- Advancing natural language processing capabilities – developing NLP systems that extract meaning, intent, and relationships from unstructured text data, powering Rivera's understanding of user context and enabling semantic search across the personal data graph.
- Building agentic AI systems for RiverAI – architecting multi-agent frameworks where Rivera can reason over the personal data graph, plan multi-step actions, and execute autonomous workflows on behalf of users.
- Staying at the cutting edge of AI development – continuously evaluating and integrating emerging techniques in machine learning, graph ML, embedding models, and agentic architectures to ensure River's AI capabilities remain state-of-the-art.
- Developing AI-powered data ingestion for River Source – building intelligent pipelines that automatically analyze, classify, and enrich enterprise and user data as it enters the graph, powering River's consent infrastructure and maximizing network effects.
- Creating novel data valuation metrics – quantifying the value of user data contributions within the graph to power River Social's compensation model, where 70% of value flows back to users.
- Designing reinforcement learning systems for Rivera's personalized recommendation engine, leveraging graph structure to deliver context-aware, relationship-informed suggestions.
- Implementing hybrid search architectures (vector + graph traversal + traditional) with real-time trend analysis across the personal data graph.
- Building multi-agent preference aggregation models for group commerce features, enabling Rivera to reason over the preferences of multiple connected users simultaneously.
- Creating explainable AI components that reveal decision-making processes, ensuring users understand why Rivera takes actions and how their data informs recommendations.
- Defining and measuring success – establishing evaluation frameworks, A/B testing infrastructure, and metrics for graph quality, agent reliability, and user satisfaction at scale.
What We're Looking For
Advanced degree (MS/PhD) in Data Science, Computer Science, Machine Learning, Statistics, or equivalent practical experience.
5+ years of experience in applied data science or ML engineering, with demonstrated expertise in several of the following:
- Strong foundations in classical and modern machine learning – supervised/unsupervised learning, feature engineering, model selection, evaluation, and optimization.
- Knowledge graph construction, entity resolution, and graph-based ML (e.g., GNNs, graph embeddings, link prediction).
- Data taxonomy design, ontology development, and large-scale data normalization across heterogeneous sources.
- Natural language processing (NLP) – text extraction, named entity recognition, semantic similarity, intent classification, and embedding generation.
- Agentic AI architectures, including multi-agent systems, tool use, planning, and autonomous workflow orchestration.
- Vector databases and semantic search (pgvector, Pinecone, or equivalent) for embedding storage and similarity retrieval at scale.
- Designing, deploying, and measuring end-to-end ML systems in production at scale (>1M daily interactions).
- Real-time data pipelines (Apache Kafka, Spark, or equivalent).
- Graph databases (Neo4j, Amazon Neptune, or equivalent).
- Python and the modern ML stack (PyTorch, scikit-learn, Hugging Face, etc.).
- Balancing model performance with computational efficiency.
Above all, you must have
- A startup mentality – you thrive in fast-paced environments with ambiguous requirements, can context-switch between strategic thinking and hands-on execution, and are energized (not daunted) by building something from scratch.
- Full ownership mindset – you will own everything data science at River. If it touches data, models, or intelligence, it's yours. You don't wait for specifications; you identify what needs to be done and drive it forward.
- Strong communication skills and the ability to translate complex technical concepts for non-technical stakeholders.
- A strong interest in, and awareness of, the rapidly changing fields of agentic AI, knowledge graphs, and machine learning.
- A passion for ethical AI design and data sovereignty principles, sharing our vision that AI should empower users, not exploit them.
Nice to have
- LLM fine-tuning, prompt engineering, and retrieval-augmented generation (RAG).
- Experience with collaborative filtering and recommendation systems.
- Familiarity with API integration patterns (REST/gRPC/GraphQL).
- Background in data marketplaces, data valuation, or privacy-preserving ML.
- Experience with AWS infrastructure (Neptune, SageMaker, ECS).
Why You Should Join River
Most engineering jobs are maintenance work on someone else's idea. This isn't that.
The core technology is built, the patents filed, and the partnerships in place. The work ahead is the work that matters: taking what we've proven into the hands of real users, at scale, and building the company around it. You will ship things that millions of people will use.
The team blends operator experience, engineering depth, and startup hustle. You get significant equity, real ownership of what you build, a direct line to the founders, and the rare chance to work on problems no one has solved yet.
The market is ready. Join us.
How to Apply
Email recruitment@rivergrp.com
This role is remote. Applicants must be currently authorized to work in the United States and will be required to provide evidence of their right to work as part of the hiring process. Standard business hours are based on the Pacific Time Zone, and availability during these hours is essential. References will be sought for shortlisted candidates.