Shivam Sharma

MS in Robotics & Autonomous Systems (AI) — Arizona State University (Expected May 2027). B.Tech in Artificial Intelligence — Amity University, Noida (Graduated May 2025).

Focus

Robotics (ROS2), CV, NLP

Tooling

PyTorch, TensorFlow, OpenCV, ONNX, ROS2, MLflow

Now

Agentic robotics: prompt→action, tele‑op, safety

About Me

I’m an AI/ML Engineer working across robotics (ROS2), computer vision, and NLP. I build real-time perception and decision systems, and deploy models for low-latency inference (e.g., ONNX at 30 FPS) with robust evaluation and logging.

At ASU, I’m focused on agentic control and human‑in‑the‑loop autonomy—connecting LLM/VLM planning to robot kinematics, safety supervisors, and tele‑operation.

Previously, I shipped multimodal analytics, drowsiness detection, and churn modeling pipelines with scikit‑learn, XGBoost/LightGBM, SHAP, and dashboards in Streamlit.

Goal: enable next‑gen autonomous systems—from prompt‑to‑action agents to resilient tele‑op—with measurable reliability and safety.

Skills

🤖 AI & ML

Supervised/unsupervised ML, AutoML, DL; XGBoost, LightGBM, CatBoost; Optuna tuning, MLflow tracking.

👁 Computer Vision

OpenCV, MediaPipe, CNN fine‑tuning (MobileNetV2, VGG16), ONNX export, tracking & detection.

🤝 Robotics

ROS2 (rclcpp lifecycle, tf2), control & kinematics, safety supervisors, Gazebo sim, AprilTag, camera calibration.

💬 NLP

Transformers (HF), Sentence‑BERT, spaCy, NLTK; retrieval, classification, prompting for planning.

💻 Programming & Data

Python, C/C++, SQL, JavaScript; NumPy, Pandas, scikit‑learn, TensorFlow, PyTorch, Keras, Matplotlib.

📜 Certifications

Applied AI (IBM/Coursera), Aerial Robotics (UPenn), Python for DS (NPTEL), Microsoft AI.

Projects

🤖 llm-linkedin-autoapply-bot — Intelligent Apply Pipeline

A safe, LLM-assisted automation that extracts job descriptions, generates role-matched application materials & cover letters (Gemini → LaTeX → PDF), and fills application forms using structured LLM guidance. Preferred flow: LinkedIn Easy Apply; fallback: external portal with recovery steps. Built for traceability, honesty, and safe human-in-the-loop handling of CAPTCHAs.

• LLM-powered application tailoring and JSON-structured form answers
• Safe UI actions only: click (pixel coords), type, wait; logs every step
• Slide deck (PPTX) and demo video included for quick review

GitHub Watch Demo PPTX

Aug 2025 – Dec 2025
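
The "safe UI actions only" rule above can be illustrated with a small allowlist check. This is a minimal sketch, not the project's actual code: the JSON field names (`action`, `x`, `y`, `text`, `seconds`), the 30-second wait cap, and the logger name `apply_bot` are all assumptions made for illustration.

```python
import json
import logging

# The three action kinds named in the project description.
ALLOWED_ACTIONS = {"click", "type", "wait"}

# Hypothetical logger name; the real project may log differently.
log = logging.getLogger("apply_bot")


def validate_action(raw: str) -> dict:
    """Parse one JSON-encoded action and reject anything outside the allowlist."""
    action = json.loads(raw)
    kind = action.get("action")
    if kind not in ALLOWED_ACTIONS:
        raise ValueError(f"action not allowed: {kind!r}")

    if kind == "click":
        x, y = action.get("x"), action.get("y")
        # Pixel coordinates must be non-negative integers.
        if not (isinstance(x, int) and isinstance(y, int) and x >= 0 and y >= 0):
            raise ValueError("click needs non-negative integer pixel coords")
    elif kind == "type":
        if not isinstance(action.get("text"), str):
            raise ValueError("type needs a 'text' string")
    else:  # kind == "wait"
        seconds = action.get("seconds")
        # The 30 s cap is an illustrative assumption.
        if not isinstance(seconds, (int, float)) or not 0 <= seconds <= 30:
            raise ValueError("wait needs 'seconds' between 0 and 30")

    # Every validated step is logged before it is executed.
    log.info("validated action: %s", action)
    return action
```

Validating before executing keeps the LLM's output from triggering anything other than the three permitted primitives, and the log line gives a step-by-step trace.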

🤖 Agentic Robot Control via LLM/VLM (Prompt‑to‑Action)

LLM/VLM‑driven robot control that translates natural‑language prompts to pick/place/rotate actions with monocular depth, IK, and gripper control. Expanded prompt templates and kinematic awareness for precise execution.

GitHub Watch Demo

Sep 2025 – Dec 2025

🦾 Dobot Magician: Agentic Tic‑Tac‑Toe (Vision + LLM Planning)

OpenCV board‑state detection (calibration, AprilTags), ROS2 control, and LLM‑orchestrated planning with function calls (perceive_board → choose_move → execute_move). ~1.4 s p50 turn latency, ≤2 mm placement error over 200 games.

Aug 2025 – Sep 2025
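
The perceive_board → choose_move → execute_move chain in the Tic‑Tac‑Toe project can be sketched as a small tool registry that runs one turn. This is a hedged illustration only: the three function bodies are stand-ins (no camera, LLM, or ROS2 call is made), and the shared `state` dictionary and `play_turn` helper are hypothetical names.

```python
from typing import Callable, Dict, List, Optional

Board = List[Optional[str]]  # 9 cells, row-major: "X", "O", or None


def perceive_board(state: dict) -> dict:
    # Stand-in for the OpenCV detector: returns the board already held in state.
    return {"board": list(state["board"])}


def choose_move(state: dict) -> dict:
    # Stand-in for the LLM planner: picks the first empty cell.
    empty = [i for i, cell in enumerate(state["board"]) if cell is None]
    if not empty:
        raise ValueError("board is full")
    return {"move": empty[0]}


def execute_move(state: dict) -> dict:
    # Stand-in for the ROS2 arm command: writes the robot's mark on the board.
    board = list(state["board"])
    board[state["move"]] = state["mark"]
    return {"board": board}


# Registry mapping function-call names to callables.
TOOLS: Dict[str, Callable[[dict], dict]] = {
    "perceive_board": perceive_board,
    "choose_move": choose_move,
    "execute_move": execute_move,
}
PIPELINE = ("perceive_board", "choose_move", "execute_move")


def play_turn(board: Board, mark: str = "O") -> Board:
    """Run one turn by calling each tool in order and merging its result."""
    state = {"board": list(board), "mark": mark}
    for name in PIPELINE:
        state.update(TOOLS[name](state))
    return state["board"]
```

Keeping each stage behind a named entry in the registry means the planner can be swapped for a real LLM call without touching perception or execution.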

🧩 Dobot Magician Lite Maze — Agentic AI Scene + Local Planner

Dobot Magician Lite maze navigation with an agentic AI scene and local planner on Ubuntu, supporting both camera and file-based inputs. Integrates perception, maze reasoning, and Dobot control for reliable path execution.

GitHub Watch Demo

Aug 2025 – Sep 2025

✋ ROS2 Gesture‑to‑Robot: Vision‑based Tele‑operation

Real‑time MediaPipe/OpenCV gestures mapped to ROS2 actions for TurtleBot navigation and gripper control. ≥95% gesture F1 and ~55 ms end‑to‑end latency; a safety supervisor (debouncing, Kalman filtering, dead‑man switch) gives ≥97% reliability and a ≤120 ms safe‑stop.

Jan 2025 – Apr 2025
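
Two of the safety-supervisor ideas named in the gesture tele-operation project, debouncing and a dead‑man timeout, can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the project's implementation: the class name `GestureSupervisor`, the 3-frame debounce window, and the use of 0.12 s as the timeout are hypothetical choices, and the Kalman filtering step is omitted.

```python
from typing import Optional


class GestureSupervisor:
    """Debounce per-frame gestures and fall back to "stop" when input goes quiet."""

    def __init__(self, debounce_frames: int = 3, deadman_timeout_s: float = 0.12):
        self.debounce_frames = debounce_frames      # assumed window size
        self.deadman_timeout_s = deadman_timeout_s  # assumed timeout
        self._candidate: Optional[str] = None
        self._count = 0
        self._active = "stop"
        self._last_seen: Optional[float] = None

    def update(self, gesture: Optional[str], now: float) -> str:
        """Feed one detection (None means no hand seen) and return the command."""
        if gesture is None:
            # Reset the debounce streak; stop once the dead-man timeout expires.
            self._candidate, self._count = None, 0
            if self._last_seen is None or now - self._last_seen > self.deadman_timeout_s:
                self._active = "stop"
            return self._active

        self._last_seen = now
        if gesture == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = gesture, 1

        # Only switch commands after enough consecutive identical detections.
        if self._count >= self.debounce_frames:
            self._active = gesture
        return self._active
```

A caller would invoke `update` once per camera frame with a monotonic timestamp and publish the returned command; a single spurious detection never changes the command, and losing the hand for longer than the timeout forces a stop.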

🤖 AutoML Framework

End‑to‑end AutoML pipeline with automated model selection, hyperparameter tuning (Optuna, GridSearchCV), and evaluation.

GitHub

🖐 Gesture‑Controlled IoT

Real‑time gesture recognition using MediaPipe + OpenCV with Arduino for IoT device control.

GitHub

🎙 Female English Voice Emotion Detection

Real‑time emotion recognition on female English speech, using mel‑spectrogram inputs to a VGG16 classifier.

GitHub

🐾 Animal Emotion Classifier

MobileNetV2 image classifier that assigns images to one of 9 animal emotion classes and shows results in real‑time Tkinter popups.

GitHub

🚗 Vehicle Sleep Detection & Age Prediction

Deep‑learning and computer‑vision system that detects driver drowsiness and predicts age, using VGG16 and OpenCV with a Tkinter GUI.

GitHub

🎬 CineReco: Hybrid Movie Recommendation

An AI‑powered recommender combining SBERT embeddings, collaborative filtering, and hybrid fusion.

GitHub
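
One common way to combine content-based and collaborative-filtering scores, as in the CineReco description, is a weighted sum after normalization. The sketch below assumes that approach; the project's actual fusion rule is not stated, and the function names `min_max` and `fuse` and the weight `alpha` are hypothetical.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; a constant score set maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {item: 0.0 for item in scores}
    return {item: (value - lo) / (hi - lo) for item, value in scores.items()}


def fuse(content: Dict[str, float], collab: Dict[str, float], alpha: float = 0.5) -> List[str]:
    """Rank items by alpha * content score + (1 - alpha) * collaborative score."""
    c, f = min_max(content), min_max(collab)
    # Items missing from one source contribute 0 from that source.
    fused = {
        item: alpha * c.get(item, 0.0) + (1 - alpha) * f.get(item, 0.0)
        for item in set(c) | set(f)
    }
    # Highest fused score first; ties broken by item name for determinism.
    return sorted(fused, key=lambda item: (-fused[item], item))
```

Here `content` would hold, for example, SBERT cosine similarities and `collab` predicted ratings; normalizing first keeps the two scales comparable before weighting.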

📊 LLM‑Powered SQL Assistant

LangChain + Google PaLM tool converting natural language → SQL for real‑time MySQL insights.

GitHub

Research & Publications

Handcrafted AI: Designing Virtual Hardware for Hand Gesture-Based Interaction

Proposed a virtual hardware simulation layer for gesture-based AI systems, enabling IoT integration without physical devices. Achieved 93% accuracy with MediaPipe + OpenCV pipelines across 5 gestures.

View Paper

Smart Interaction: Combining Special Gestures, Virtual Calculations, and IoT

Developed a hybrid gesture + IoT framework where gestures trigger virtual calculations and control smart devices in real time (~75 ms latency). Demonstrated in smart-home control applications.

View Paper

Reach out for collaborations, research, or internships

📞 +1 (480) 208‑5286

📩 sshar443@asu.edu

🔗 LinkedIn — ss1511
