Agentic Data Engineer
DataStaff, Inc. is in need of a Lead Agentic Data Engineer for a long-term contract opportunity with one of our direct clients located in Richmond, VA.
*This position is hybrid; the selected candidate will need to come to Richmond quarterly.
Job Description:
Our client is seeking a highly skilled Agentic Data Engineer to design, develop, and deploy data pipelines that leverage agentic AI to solve real-world problems. The ideal candidate will have experience designing data processes to support agentic systems, ensuring data quality, and facilitating interaction between agents and data.
Responsibilities:
- Design and develop data pipelines for agentic systems, including robust data flows to handle complex interactions between AI agents and data sources.
- Train and fine-tune large language models.
- Design and build the data architecture, including databases and data lakes, to support various data engineering tasks.
- Develop and manage Extract, Load, Transform (ELT) processes to ensure data is accurately and efficiently moved from source systems to analytical platforms used in data science.
- Implement data pipelines that facilitate feedback loops, allowing human input to improve system performance in human-in-the-loop systems.
- Work with vector databases to store and retrieve embeddings efficiently.
- Collaborate with data scientists and engineers to preprocess data, train models, and integrate AI into applications.
- Optimize data storage and retrieval for high performance.
- Perform statistical analysis of trends and patterns to create data formats from multiple sources.
Knowledge, Skills and Abilities:
- Strong data engineering fundamentals
- Ability to utilize big data frameworks such as Spark/Databricks
- Experience training LLMs with structured and unstructured data sets
- Understanding of Graph DB
- Experience with Azure Blob Storage, Azure Data Lakes, Azure Databricks
- Experience implementing Azure Machine Learning, Azure Computer Vision, Azure Video Indexer, Azure OpenAI models, Azure Media Services, Azure AI Search
- Determine effective data partitioning criteria
- Ability to use Spark to implement partitioning schemes in the data storage system
- Understanding of core machine learning concepts and algorithms
- Familiarity with cloud computing
- Proficiency in vector databases and embedding models for retrieval tasks.
- Expertise in integrating with AI agent frameworks.
- Experience with cloud AI services (Azure AI).
- Experience with GIS spatial data to create markers on maps (lat/long, nearest road topology, geolocation between datasets, correlation, etc.).
- Experience with Department of Transportation data domains, developing an AI Composite Agentic Solution designed to identify and analyze data models, connect and correlate information to validate hypotheses, forecast, predict, and recommend potential strategies, and conduct what-if analysis.
- Bachelor's or master's degree in computer science, AI, Data Science, or a related field.
Required Skills:
- 1 Year – Understanding of big data technologies
- 1 Year – Experience developing ETL and ELT pipelines
- 1 Year – Experience with Spark, GraphDB, Azure Databricks
- 1 Year – Expertise in data partitioning
- 3 Years – Experience with data conflation
- 3 Years – Experience developing Python scripts
- 2 Years – Experience training LLMs with structured and unstructured data sets
- 3 Years – Experience with GIS spatial data
This opportunity is available on a corp-to-corp basis or as a W2 position with a competitive benefits package. DataStaff, Inc. offers medical, dental, and vision coverage options as well as paid vacation, sick, and holiday leave. As many of our opportunities are long-term, we also have a 401k program available for employees after 6 months.