Mathieu Astruc
AI & Data Science Engineer
Driven by curiosity and a competitor at heart, I love building AI and data systems that are genuinely useful in the real world, and I'm looking for challenges with real responsibility.
Seeking a full-time role in Data Science / AI from Oct. 2026

Experience
- 2026 · Airbus · Applied AI Engineer · Current

Built a hybrid RAG system over 10k+ export licenses and regulatory documents that routes each query to either a LangChain/FAISS retrieval pipeline or an NL-to-SQL path for structured data, so field-level queries are answered with zero generative approximation.
- 2025 · NTNU · Research Engineer Intern

Lead author of a paper accepted at HCI International 2026 (Montréal) on a real-time computer-vision architecture for gesture recognition in human-robot interaction.
- 2024 · Banque de France · Data Scientist Intern

Automated scraping workflows aggregating unstructured public and financial data from multiple sources.
- 2025 · Comat Specific · Machine Learning Engineer

Built a deep-learning OCR pipeline converting legacy hand-drawn 2D engineering sketches into structured machine-readable data, modernizing industrial workflows.
Selected projects
- 2025 · Humanoid robot interaction stack · NTNU
Embedded AI system combining real-time gesture recognition, computer vision and a fine-tuned LLM for domain-specific dialogue, with GPU latency tuning and Human-in-the-Loop robustness.
- 2025 · HCI International 2026 publication
Lead author of an accepted HRI paper on an optimized real-time computer-vision architecture for gesture recognition, integrating MediaPipe landmarks, lightweight ML classifiers and low-latency robot actuation.
- 2026 · minigpt · Transformer (GPT) from scratch · Personal project
Decoder-only Transformer built from scratch in PyTorch with hand-written multi-head causal self-attention (no nn.Transformer / nn.MultiheadAttention). Character-level GPT trained on a public-domain corpus to generate text; validation loss dropped 4.55 → 1.56 in ~1 min on Apple MPS. Repo ships the training loop, autoregressive sampling, and a README deriving the attention math.
- 2026 · tiny-diffusion · DDPM from scratch · Personal project
Denoising diffusion model (DDPM) implemented from scratch in PyTorch: noise schedule, U-Net denoiser and the reverse sampling loop all written by hand, no diffusers library. Trained on MNIST, it generates digit images from pure noise. README visualises the forward noising process, the loss curve and a grid of generated samples.
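
To illustrate the forward noising process mentioned for tiny-diffusion, here is a minimal sketch of a DDPM noise schedule and the closed-form sampling of a noised input. It is not the repository's code. It assumes a linear schedule with the commonly used values from the original DDPM paper (1e-4 to 0.02 over 1000 steps), uses plain Python lists instead of PyTorch tensors so it runs without dependencies, and the function names (`linear_beta_schedule`, `cumulative_alphas`, `q_sample`) are hypothetical.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t ~ q(x_t | x_0) in closed form and return (x_t, noise).

    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * noise
    """
    signal = math.sqrt(alpha_bars[t])
    sigma = math.sqrt(1.0 - alpha_bars[t])
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [signal * x + sigma * e for x, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)

x0 = [1.0, -1.0, 0.5, 0.0]  # stand-in for a flattened, normalised image
x_early, _ = q_sample(x0, 0, alpha_bars, rng)       # almost the clean input
x_late, noise_late = q_sample(x0, 999, alpha_bars, rng)  # almost pure noise
```

During training, a denoiser would be asked to predict `noise` from `x_t` and `t`; at the last step almost no signal remains, which is why sampling can start from pure Gaussian noise.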