Mohamed HANNANI, AI Engineer
PROFILE
AI Engineer with 4+ years building production AI and ML systems, the last two focused on LLMs and agents. I design and ship end-to-end: integration patterns, retrieval, model selection, cost and latency, observability, and the safeguards that keep systems reliable in production. Recently solo-built Empfio, a production AI agent platform across voice, WhatsApp, web chat, Telegram, and SMS, with multi-LLM routing, RAG, MCP-shaped tools, and full observability. Comfortable owning a system from idea to rollout and working across product, engineering, and non-technical teams.
EXPERIENCE
AI/ML Consultant
Nov 2024 - Feb 2026, Siegen-Wittgenstein, Germany
Healthcare Manufaktur GmbH
- Built and operationalized LangChain agent pipelines, backed by a Qdrant vector store, for document retrieval, Q&A, and structured extraction across 200+ healthcare facilities, cutting manual processing effort by 40%.
- Defined reusable patterns for LLM integration (prompt design, RAG, structured extraction, model routing) and made principled model-selection calls based on cost-performance trade-offs and inference latency.
- Built interactive KPI dashboards (D3.js, React) enabling non-technical stakeholders to self-serve operational insights, reducing ad-hoc reporting requests by 60%.
- Provisioned AWS infrastructure (EC2, Lambda, S3) with Terraform and CI/CD, orchestrating 50+ Docker containers with zero-downtime deployments and the safeguards required in a regulated industry.
- Identified and prioritized AI opportunities end-to-end — assessed business impact, proposed solutions to stakeholders, and shipped them into daily use.
Data Scientist & AI Researcher
Nov 2023 - Oct 2024, Siegen, Germany
University of Siegen
- Built LLM reasoning pipelines (GPT-4, Claude, LLaMA 2) for automated in-context machine translation, combining prompt engineering, chain-of-thought, and retrieval-based context to achieve a 25% BLEU improvement over baseline.
- Designed evaluation frameworks to compare prompts, contexts, and models systematically — turning subjective LLM choices into measurable decisions.
ML Engineer & Data Scientist
Mar 2022 - Jul 2023, Casablanca, Morocco
Indatacore
- Led a 3-person team building modular, reusable ML inference services across 156+ document types; component reuse cut integration effort by 30% per new format.
- Architected RESTful API services (Flask) linking ML inference systems to downstream business workflows, handling 10,000+ monthly transactions at 99.2% uptime with structured logging. Deployed on AWS with Docker and Kubernetes.
PROJECTS
Empfio — Production AI Agent Platform (Solo-built)
Built in ~3 months: a production AI agent platform that runs the front desk for service businesses. Autonomous agents operate across voice, WhatsApp, web chat, Telegram, and SMS, handling intake, intent classification, appointment booking, customer payments (Stripe Connect), follow-ups, and escalation to a human when needed. Multi-LLM routing (GPT-4o, Claude, Groq, Ollama) with per-org circuit breakers, an MCP-shaped tool registry, RAG with vector retrieval, and topic-driven behavior configuration, so a new vertical is configuration rather than new code. Full observability via Langfuse, Prometheus, and OpenTelemetry GenAI semantic conventions. Built on FastAPI async services, PostgreSQL, Redis, Celery, LiveKit, and Next.js.
ECO Analyzer — Healthcare Intelligence Platform
Built a healthcare analytics platform over German outpatient specialist care (ASV) data covering 18,977+ doctors, 247 specialist teams, and 688 cities. Interactive geospatial map (Leaflet, PostGIS) with network analysis, plus an AI Copilot (GPT-4) that lets users query the dataset in natural language, a shared retrieval pattern reusable across teams and use cases. Runs as 8 independently deployable microservices (FastAPI, Node.js) over PostgreSQL + PostGIS and Redis.
EDUCATION
Master of Data Science
Cadi Ayyad University
2020 – 2022 • Marrakech, Morocco
Bachelor of Computer Science
Cadi Ayyad University
2017 – 2020 • Marrakech, Morocco
SKILLS
- AI Agents & LLMs: Multi-agent orchestration, tool calling, RAG, retrieval-based knowledge access, prompt design, context engineering, structured extraction, evals
- Production AI: Multi-LLM routing, per-provider circuit breakers, cost-performance model selection, token usage optimization, fail-closed safeguards
- Backend: Python (expert), FastAPI, async patterns, RESTful & event-driven APIs, distributed systems, SQLAlchemy
- Cloud & Infra: AWS (EC2, S3, Lambda), Docker, Kubernetes, Terraform, CI/CD (GitHub Actions)
- Data & Retrieval: PostgreSQL, Qdrant, ChromaDB, Redis, Elasticsearch, Pandas
- Observability: Langfuse, Prometheus, OpenTelemetry GenAI semantic conventions, evaluation harnesses
- Frontend: Next.js, React, TypeScript, D3.js — operator dashboards & internal tooling
- Cross-functional: Working with product, engineering, security, and non-technical stakeholders end-to-end