Akshay Kumar

Computer Vision & Data Engineer

Engineering production-ready AI solutions with expertise in Computer Vision, Natural Language Processing, and scalable MLOps infrastructure. Passionate about transforming cutting-edge research into real-world applications.

About Me

Bridging the gap between advanced AI research and practical business solutions

Computer Vision Engineer & ML Researcher with 1+ years of experience building production-ready AI solutions. I specialize in developing computer vision systems that reach 95%+ accuracy, implementing NLP solutions with LLMs, and architecting scalable MLOps pipelines. My expertise spans model development through deployment, with a focus on robust performance in real-world applications.

95%+ Model Accuracy
10+ Production Models
15+ AI Projects
3+ Companies Impacted

Professional Experience

Building impactful AI solutions across diverse industries

Jan 2025 - Aug 2025

Computer Vision Engineer

Loksun.ai, Bengaluru, Karnataka
  • PPE Detection System: Engineered real-time pipeline with intelligent frame-dropping & ROI optimization; achieved YOLOv11n mAP50 of 0.87 with 30+ FPS performance
  • Facial Recognition Platform: Developed multi-source tracking system with >90% TPR, implementing cross-profile matching algorithms and SQL-based management interface
  • Octopi AI Agent: Built a multi-modal AI system using Claude on AWS Bedrock for content generation, a real-time voice- and text-based mental health support chatbot, and voice-based person recognition
  • Enterprise RAG System: Implemented Gemini-powered document Q&A platform supporting multiple formats with intelligent data extraction capabilities
  • AI Automation Suite: Created intelligent agents for automated email generation and LinkedIn content optimization, improving productivity by 40%
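
The frame-dropping and ROI ideas from the PPE Detection System bullet can be illustrated with a minimal sketch. It is not the production code: the `ROI` class, the function names, and the stride rule (skip enough frames that one inference call fits between processed frames) are assumptions made for illustration, and frames are modelled as plain lists of rows rather than image arrays.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ROI:
    """Axis-aligned region of interest in pixel coordinates (hypothetical helper)."""
    x: int
    y: int
    width: int
    height: int


def frame_stride(source_fps: float, inference_ms: float) -> int:
    """Return N such that running inference on every N-th frame keeps up with the stream."""
    frame_interval_ms = 1000.0 / source_fps
    # Number of frame intervals that one inference call spans, rounded up.
    return max(1, math.ceil(inference_ms / frame_interval_ms))


def select_frames(total_frames: int, stride: int) -> list[int]:
    """Indices of the frames that are kept; every other frame is dropped."""
    return list(range(0, total_frames, stride))


def crop_to_roi(frame: list[list[int]], roi: ROI) -> list[list[int]]:
    """Crop a row-major frame to the ROI so the detector sees fewer pixels."""
    rows = frame[roi.y: roi.y + roi.height]
    return [row[roi.x: roi.x + roi.width] for row in rows]
```

For example, with a 30 FPS source and a 50 ms inference call, `frame_stride(30, 50)` returns 2, so every second frame is sent to the detector after cropping.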

Oct 2023 - Feb 2024

AI/ML Intern

Techdome Solutions, Indore, Madhya Pradesh
  • Catalyst Platform: Architected Azure Edge-based video & sensor processing system; achieved YOLO mAP of 0.72, integrated Kafka/MQTT messaging, and built FastAPI microservices
  • ChatWithData: Developed conversational AI system for natural language data queries, generating dynamic visualizations and analytical summaries
  • Resume Intelligence: Created RAG-based information extraction system for automated resume parsing and candidate matching

Aug 2022 - Jan 2023

Associate Solution Developer

Quadwave Consulting, Pune, Maharashtra
  • Developed Java backend solutions using Liferay, JSP, Servlets, and JBPM for enterprise applications
  • Implemented comprehensive unit testing strategies and managed module rollouts for production systems
  • Collaborated with cross-functional teams to deliver scalable business process management solutions

PupsN Vision System

AI-Powered Pet Analytics & Edge Monitoring

The Objective

Traditional pet monitoring systems are often limited by high latency and static analysis.

My goal was to build a high-performance, edge-optimized system capable of real-time detection, persistent tracking, and behavior classification, ensuring a smooth 30 FPS visual experience even on resource-constrained hardware.

Technical Architecture & Implementation

Multi-Threaded Decoupling

Architected a dual-thread system to prevent frame drops in the display. The stream_worker manages the 30 FPS visual layer, while the ai_worker runs the inference pipeline asynchronously, as fast as the CPU allows, without blocking the stream.
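
A minimal sketch of this decoupling, assuming a single-slot "latest frame" buffer shared by the two threads. Only the names stream_worker and ai_worker come from the description above; `LatestFrame`, `run_demo`, the timings, and the use of integers as stand-in frames are illustrative assumptions, not the project's actual code.

```python
import threading
import time


class LatestFrame:
    """Single-slot buffer: the writer overwrites, the reader takes the newest frame."""

    def __init__(self):
        self._lock = threading.Lock()
        self._frame = None

    def put(self, frame):
        with self._lock:
            self._frame = frame  # any older, unprocessed frame is discarded

    def take(self):
        with self._lock:
            frame, self._frame = self._frame, None
        return frame


def stream_worker(slot, frames, done, interval_s):
    """Publish frames at a fixed pace, never waiting for inference."""
    for frame in frames:
        slot.put(frame)
        time.sleep(interval_s)
    done.set()


def ai_worker(slot, done, results, infer):
    """Run inference on the most recent frame whenever one is available."""
    while True:
        finished = done.is_set()  # read before take() so the final frame is not missed
        frame = slot.take()
        if frame is not None:
            results.append(infer(frame))
        elif finished:
            break
        else:
            time.sleep(0.001)


def run_demo(num_frames=20, frame_interval_s=0.001, infer_s=0.005):
    slot, done, results = LatestFrame(), threading.Event(), []

    def slow_infer(frame):
        time.sleep(infer_s)  # stand-in for a model call slower than the stream
        return frame

    workers = [
        threading.Thread(target=stream_worker,
                         args=(slot, range(num_frames), done, frame_interval_s)),
        threading.Thread(target=ai_worker, args=(slot, done, results, slow_infer)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

Because inference is slower than the stream in this demo, `run_demo()` typically returns only a subset of the frame indices, always in order and always ending with the last frame. The stream side never waits on the model.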

4-Stage ONNX Pipeline

Implemented an optimized pipeline using onnxruntime featuring YOLO26 Nano (Detection), IoU Tracker (Tracking), OSNet (Re-ID), and MobileNetV2 for real-time behavior classification.
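
The tracking stage can be sketched as a greedy IoU tracker. This is an assumption about how such a stage might work: the description above does not say how detections are associated with tracks, and the class name, threshold, and the choice to drop unmatched tracks immediately are all illustrative. Boxes are `(x1, y1, x2, y2)` tuples.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class IoUTracker:
    """Greedy tracker: each detection inherits the ID of the best-overlapping track."""

    def __init__(self, iou_threshold=0.3):
        self.iou_threshold = iou_threshold
        self.tracks = {}  # track_id -> last seen box
        self._next_id = 1

    def update(self, detections):
        """Return one track ID per detection, in the same order as the input."""
        # Score every (track, detection) pair and visit the best overlaps first.
        pairs = sorted(
            ((iou(box, det), tid, di)
             for tid, box in self.tracks.items()
             for di, det in enumerate(detections)),
            reverse=True,
        )
        assigned, used_tracks = {}, set()
        for score, tid, di in pairs:
            if score < self.iou_threshold:
                break
            if tid in used_tracks or di in assigned:
                continue
            assigned[di] = tid
            used_tracks.add(tid)

        new_tracks, ids = {}, []
        for di, det in enumerate(detections):
            tid = assigned.get(di)
            if tid is None:  # no sufficiently overlapping track: start a new one
                tid = self._next_id
                self._next_id += 1
            new_tracks[tid] = det
            ids.append(tid)
        self.tracks = new_tracks  # unmatched tracks are dropped in this sketch
        return ids
```

A tracker like this loses an identity as soon as an object is missed for one frame, which is one reason a pipeline would pair it with a Re-ID model such as OSNet to recover identities from appearance.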

Edge-First Optimizations

Integrated manual memory management, RAM-cached SQLite lookups for near-zero-latency weight serialization, and HTML5 Canvas rendering to avoid memory leaks caused by DOM-based rendering.
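
One way to get RAM-cached SQLite lookups is to copy the database into memory once and memoize repeated reads. The sketch below assumes exactly that; the `kv` table, its columns, and the class and function names are hypothetical and not taken from the project.

```python
import sqlite3
from functools import lru_cache


def load_into_ram(source: sqlite3.Connection) -> sqlite3.Connection:
    """Copy an existing database into a fresh in-memory database."""
    ram = sqlite3.connect(":memory:")
    source.backup(ram)  # standard-library online backup API
    return ram


class CachedLookup:
    """Key/value reads from a hypothetical `kv` table, memoized per key."""

    def __init__(self, conn: sqlite3.Connection, maxsize: int = 1024):
        self._conn = conn
        # Wrap the bound method so each instance owns its own cache.
        self.get = lru_cache(maxsize=maxsize)(self._get_uncached)

    def _get_uncached(self, key: str):
        row = self._conn.execute(
            "SELECT value FROM kv WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else row[0]


def build_demo_db() -> sqlite3.Connection:
    """Create a small source database for demonstration purposes."""
    src = sqlite3.connect(":memory:")
    src.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value REAL)")
    src.execute("INSERT INTO kv VALUES ('pet_1', 12.5)")
    src.commit()
    return src
```

The trade-off is staleness: a memoized value does not reflect later writes, so the cache needs to be cleared (`lookup.get.cache_clear()`) whenever the underlying rows change.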

Featured Projects

Innovative AI solutions demonstrating technical excellence and real-world impact

Building a Generative Diffusion Engine from Scratch

Built, debugged, and trained a Denoising Diffusion Probabilistic Model (DDPM) entirely from scratch using raw PyTorch. Features a custom YOLO-style U-Net with Sinusoidal Position Embeddings and a mathematically rigorous linear beta schedule. Scaled to train on the full 60,000-image MNIST dataset.
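
The linear beta schedule and sinusoidal timestep embedding can be sketched in plain Python. The endpoints `1e-4` and `0.02` are the defaults from the original DDPM paper and are assumed here, since the values used in this project are not stated. The function names are illustrative, and lists stand in for tensors.

```python
import math


def linear_beta_schedule(timesteps, beta_start=1e-4, beta_end=0.02):
    """Noise variances beta_1..beta_T, evenly spaced between the two endpoints."""
    if timesteps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, product = [], 1.0
    for beta in betas:
        product *= 1.0 - beta
        alpha_bars.append(product)
    return alpha_bars


def q_sample(x0, noise, alpha_bar_t):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    signal, sigma = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    return [signal * x + sigma * n for x, n in zip(x0, noise)]


def sinusoidal_embedding(t, dim):
    """Timestep embedding: sines then cosines at geometrically spaced frequencies."""
    half = dim // 2  # dim is assumed to be even
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]
```

With 1000 steps, `cumulative_alphas(linear_beta_schedule(1000))[-1]` is close to zero, which means the final noised sample is nearly pure Gaussian noise. That property is what lets sampling start from random noise.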

[Image gallery: generated MNIST samples 1-10]

Python | PyTorch | OpenCV | Generative AI | Deep Learning

Traffic Monitoring System

Real-time vehicle tracking system that combines YOLOv8 object detection with the SORT algorithm for multi-object tracking. Integrates EasyOCR for automatic license plate extraction and recognition, with 90%+ accuracy in varied lighting conditions.

YOLOv8 | SORT | EasyOCR | OpenCV | Python

RAG Model Framework

Comprehensive CLI and Flask-based application for intelligent document Q&A. Supports multiple formats including PDF, DOCX, and XLSX. Leverages state-of-the-art Transformer models for accurate information retrieval and contextual understanding.
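
The retrieval step of a document Q&A system can be sketched as follows. As a loud assumption, a bag-of-words `embed` function stands in for the Transformer sentence embeddings the project uses, so the example runs with the standard library alone. All function names and the prompt layout are illustrative.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: token counts. A real system would use a sentence encoder."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return up to top_k chunks most similar to the query, best first."""
    query_vec = embed(query)
    scored = sorted(
        ((cosine(query_vec, embed(chunk)), chunk) for chunk in chunks),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [chunk for score, chunk in scored[:top_k] if score > 0]


def build_prompt(query, chunks, top_k=2):
    """Assemble retrieved context and the question into one prompt string."""
    context = "\n".join(retrieve(query, chunks, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In a full pipeline the chunks would come from parsed PDF, DOCX, or XLSX content, and the assembled prompt would be passed to a language model to produce the answer.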

Transformers | Flask | Sentence Transformers | Python

Test Cheating Detection

AI-powered exam integrity monitoring system utilizing YOLO for real-time object detection and MediaPipe for precise head-pose estimation. Detects unauthorized devices and suspicious behaviors with automated alert generation for proctors.
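
Alert generation from head-pose estimates could be debounced as in the sketch below, so that a brief glance does not trigger an alert. This is an assumption about the logic, not the system's actual rule: the class name, the 30-degree yaw limit, and the 15-frame window are hypothetical, and the yaw angle is assumed to come from an upstream head-pose estimator.

```python
from dataclasses import dataclass, field


@dataclass
class LookAwayDetector:
    """Raise one alert when the head stays turned away for several frames in a row."""
    yaw_limit_deg: float = 30.0  # hypothetical threshold
    min_frames: int = 15         # hypothetical debounce window
    _streak: int = field(default=0, init=False, repr=False)

    def update(self, yaw_deg: float) -> bool:
        """Feed one frame's yaw angle; return True only on the frame that triggers the alert."""
        if abs(yaw_deg) > self.yaw_limit_deg:
            self._streak += 1
        else:
            self._streak = 0  # looking back at the screen resets the count
        return self._streak == self.min_frames
```

Returning True only when the streak first reaches `min_frames` means a proctor receives a single alert per look-away episode rather than one per frame.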

YOLO | MediaPipe | Computer Vision | Python

YOLO Object Detection Suite

A collection of object detection workflows built on YOLOv11 and other YOLO models. It automates object identification in images, with applications in surveillance, quality control, and smart monitoring.

YOLOv11 | Object Detection | Computer Vision

Multi-Class Object & Text Recognition

A computer vision pipeline that detects and classifies humans and animals in images and videos while performing Optical Character Recognition (OCR) on the same media.

Object Detection | OCR | Computer Vision

Real-Time Drone Tracking System

Advanced computer vision system to accurately detect and track drones in varied environments using YOLO and SSD (Single Shot MultiBox Detector) architectures.

[Image: drone tracking result]

YOLO | SSD | Object Tracking

Robust QR Code Detection

A robust QR code detection and decoding pipeline using OpenCV. Designed to handle challenging real-world conditions including noisy, rotated, or low-contrast inputs.

[Images: QR output 1, QR output 2]

OpenCV | QR Detection | Computer Vision

Drone Image Classification Research

A research-oriented comparative analysis of two prominent deep learning approaches (CNN and ResNet50) for highly accurate drone image classification tasks.

[Images: CNN history, ResNet history]

CNN | ResNet50 | Deep Learning

Technical Expertise

Comprehensive skill set for end-to-end AI/ML development

Data Engineering

PySpark | dbt (data build tool) | Airflow | Terraform | ETL/ELT | Kafka | PostgreSQL | Lakehouse

AI & Machine Learning

Computer Vision | NLP | Deep Learning | PyTorch | TensorFlow | OpenCV | YOLO | MLOps

Languages

Python | SQL

Tools & Infrastructure

AWS (S3, Glue, Redshift) | Docker | Kubernetes | FastAPI | Git & GitHub | Linux | CI/CD

Education

Bachelor of Computer Applications & Cyber Security

Jharkhand Raksha Shakti University, Ranchi, Jharkhand

CGPA: 8.41

Duration: 2020 - 2023

Let's Connect

Open to discussing new opportunities and collaborations in AI/ML

I'm passionate about solving complex problems with AI and always interested in challenging projects. Whether you need help with computer vision, NLP, or building scalable ML systems, let's talk!