LLM/RAG Systems Engineer Needed to Finalise Enterprise Jira Bot

Vollna Client
Remote

Job / Advertisement Description

Job Title: LLM/RAG Systems Engineer Needed to Finalise an Enterprise Jira AI Bot (Strict Retrieval, Zero Hallucinations)

⸻

Project Overview

We are building an internal AI assistant for a finance client that integrates with Jira. The goal is to enable natural-language querying of Jira issues without hallucinations, using a combination of:
• Strict JQL-based retrieval (deterministic)
• Semantic search (conceptual retrieval when JQL can’t apply)
• Grounding + validation (answers derived solely from retrieved facts)
• Per-user Jira permissions (no data leakage)

A functional prototype already exists, but it fails several of our test cases. We now need a highly competent LLM/RAG systems engineer to help us finish the last 20% (the difficult part) and make the system production-grade.

This is not a prompt engineering gig. This is LLM systems architecture, retrieval design, and data engineering.

⸻

What’s Working Today

• Basic conversational layer
• Jira API integration
• Semantic search indexing
• JQL execution (partial)
• High-level RAG pipeline
• Some correct issue-level responses

⸻

What’s Not Working / Needs to be Completed

The system currently struggles with:
• Partial JQL results
• Hallucinated issue lists
• Missing dataset fields (e.g., labels)
• Incorrect or inconsistent status values
• Silent failures (no response)
• Improper mixing of JQL + semantic retrieval
• Weak grounding (model sometimes invents values)
• No validation layer
• Incomplete permissions handling

We have a 10-test functional protocol + 5-test privacy protocol that the system must pass. You will receive this upon hiring.

⸻

Your Role

You will collaborate with our lead developer to:

1. Complete and fix the Jira data ingestion layer
• Pull all necessary fields:
   • labels
   • components
   • comments
   • status mappings
   • timestamps
   • reporter/assignee data
• Normalise + validate schema
• Rebuild or improve the embeddings index

2. Architect deterministic JQL-based retrieval
• For queries that can be expressed in JQL
• Must match Jira’s output exactly
• Absolutely no hallucinated tickets or counts
• Clear handling of “no results” scenarios

3. Architect a clean semantic retrieval mode
• Used exclusively for conceptual / fuzzy questions
• With strict boundaries and validation
• No fabricated elements
• Hybrid filtering where appropriate

4. Implement a grounding + validation layer
• All answers must come from retrieved data
• No invented issue keys, summaries, or statuses
• Add schema-aware response enforcement

5. Add robust user-facing error handling
• The bot must never return nothing
• It must always explain ambiguity
• It must never “guess”

6. Ensure all tests pass
We will provide:
• 10 functional test cases
• 5 privacy/permissions test cases
Your job is to ensure the bot behaves reliably under all of them.

⸻

Required Skills & Experience

Must have (strict requirement):
• Demonstrable experience with RAG (Retrieval Augmented Generation)
• Experience building enterprise AI assistants (internal tools, not chatbots)
• Strong understanding of LLM hallucinations and how to prevent them
• Expertise with vector search (Chroma, FAISS, Weaviate, Pinecone, etc.)
• Strong backend experience in Python or Node.js
• Experience with API-driven data retrieval pipelines
• Experience with schema-constrained outputs (function calling / JSON mode)
• Ability to implement guardrails, validation, and hybrid retrieval logic
• Past work integrating with complex APIs (Jira, Salesforce, Confluence, etc.)
Nice to have:
• Jira/JQL knowledge
• Experience in finance / regulated environments
• LlamaIndex or LangChain experience
• Knowledge of permissioning systems

⸻

What This Project Is NOT

• Not prompt engineering
• Not LLM fine-tuning
• Not a chatbot copywriting project
• Not a front-end design project
• Not a generic API integration
• Not an experiment

We need reliable, deterministic behaviour suitable for an enterprise client.

⸻

Scope & Timeline

Estimated timeline for an experienced LLM systems engineer: 3–6 days of focused work. We want to start immediately.

⸻

How to Apply (required questions)

Please answer all four questions clearly. Applicants who skip these will not be considered.

1. How would you architect strict JQL retrieval vs semantic retrieval so they never contaminate each other?
2. How would you prevent hallucinated issue IDs, statuses, and lists?
3. How would you implement grounding + validation in a Jira-based RAG system?
4. What stack / tools would you use to upgrade or rebuild this assistant?

Please include links to:
• relevant past work
• code samples (if possible)
• any enterprise AI tools you’ve built

⸻

About Us

We develop private, on-premise, and enterprise-grade AI systems. This assistant is part of a real client workflow for a global finance organisation.

We value:
• clean engineering
• deterministic behaviour
• maintainability
• long-term reliability

We’re looking for someone who thinks ...