About
Analytical and detail-oriented Data Scientist with over 4 years of experience delivering end-to-end data solutions, from robust data preprocessing to scalable model deployment. Skilled in applying machine learning, deep learning, and statistical modeling, alongside big data and cloud platforms (AWS, Azure, GCP), to derive actionable insights and solve complex business problems across diverse industries. Proven ability to build predictive models, optimize data pipelines, and implement advanced analytics that drive measurable business impact and improve decision-making.
Work
Wichita, Kansas, US
Summary
As a Data Scientist at Textron Aviation, I develop and deploy advanced data solutions to optimize aircraft manufacturing, maintenance, and after-sales service, leveraging structured and unstructured data for operational efficiency and strategic insights.
Highlights
Optimized manufacturing timelines and after-sales service delivery by analyzing aircraft production, maintenance, and fleet utilization data using Python, R, and SQL, identifying key operational inefficiencies.
Built and deployed predictive models with Random Forest, Gradient Boosting, and XGBoost, accurately forecasting aircraft delivery schedules and maintenance requirements, enhancing supply chain responsiveness.
Developed real-time data pipelines (Apache Kafka, AWS Lambda, Apache Airflow) and processed large-scale aviation telemetry using Apache Spark, Hadoop, and Hive, enabling proactive issue resolution and comprehensive flight performance analysis.
Designed interactive executive dashboards in Tableau and Power BI, visualizing critical KPIs and manufacturing trends to support data-driven decision-making and aircraft safety monitoring.
Applied advanced deep learning (TensorFlow, Keras, PyTorch) and NLP (spaCy, NLTK) models to predict component failure and extract actionable insights from engineering reports, improving preventive maintenance and quality.
Deployed scalable machine learning models on AWS and GCP, providing real-time recommendations for production planning and fleet readiness, and secured data access with OAuth 2.0/JWT for compliance.
Leveraged a Neo4j graph database to optimize procurement cycles by identifying bottlenecks in supplier-manufacturer relationships, and implemented Agile methodologies via JIRA for rapid project delivery.
Tuned model performance with GridSearchCV, Optuna, and RandomizedSearchCV, significantly improving accuracy in predicting manufacturing lead times and aircraft readiness metrics.
Olathe, Kansas, US
Summary
At Garmin International, I developed and implemented machine learning models for production forecasting, anomaly detection, and risk assessment, utilizing advanced analytics to enhance navigation accuracy and wearable performance.
Highlights
Enhanced navigation accuracy and real-time location tracking across aviation, automotive, marine, and fitness devices by analyzing GPS telemetry and sensor streams using Python, R, and SQL.
Engineered and orchestrated robust ETL pipelines with Apache Airflow and AWS Glue, integrating diverse multi-source GPS, biometric, and environmental data for comprehensive insights and improved system performance.
Built and deployed machine learning models (Scikit-learn, XGBoost, LightGBM) to predict navigation errors, detect GPS signal anomalies, and personalize fitness coaching, enhancing product reliability.
Processed high-volume geospatial and time-series data using Apache Spark, Hadoop, and Hive, enabling large-scale route analytics and wearable sensor calibration optimization.
Applied NLP techniques (spaCy, NLTK) to user feedback and support transcripts, identifying key patterns that led to improvements in device features and user experience.
Deployed scalable containerized analytics applications using Docker and Kubernetes, ensuring reliable data services across global product ecosystems, and secured API communications with OAuth 2.0/JWT.
Developed deep learning models (TensorFlow, PyTorch) for image recognition in marine navigation and aerial mapping, and built Flask/FastAPI microservices to integrate predictive analytics into consumer applications.
Utilized Snowflake and Google BigQuery for cloud-based analytics on GPS performance and user engagement, establishing a unified data warehouse for reporting and compliance.
Mumbai, Maharashtra, India
Summary
At Westpac Banking Corporation, I analyzed and interpreted large financial datasets to inform strategic decisions regarding customer needs, market trends, and performance improvements, supporting institutional banking divisions.
Highlights
Analyzed corporate lending, trade finance, and transactional datasets using SQL, Python, and R, enabling client segmentation, risk profiling, and profitability forecasting for institutional banking divisions.
Designed and deployed predictive models (Scikit-learn, XGBoost, TensorFlow) to assess counterparty credit risk and forecast potential credit events across global portfolios.
Built and maintained real-time and historical analytics dashboards (Tableau, Power BI, Looker) to monitor FX exposures and investment portfolio performance, enhancing treasury and capital market insights.
Processed large-scale trade, treasury, and settlement data using Hadoop, Apache Spark, and Hive, improving compliance reporting timelines and decision-making for market operations.
Consolidated multi-source financial data (Snowflake, Google BigQuery, Amazon Redshift) and indexed semi-structured documents (MongoDB, Cassandra, Elasticsearch) for unified reporting and rapid retrieval of KYC records.
Developed financial time series forecasting models (Prophet, ARIMA, LSTM) to predict liquidity positions, market volatility, and cash flow trends, supporting strategic asset-liability management.
Automated ETL workflows (Apache Airflow, NiFi, AWS Glue) and deployed fraud detection models (Docker, Kubernetes) to streamline data ingestion and enhance real-time oversight of high-value transactions.
Streamed market orders and transactional data using Apache Kafka and AWS Kinesis, enabling real-time reconciliation and operational risk mitigation in institutional banking services.
Mumbai, Maharashtra, India
Summary
At Hero MotoCorp, I collected, organized, and analyzed large datasets of customer records and market data, providing insights to improve products and services while developing and optimizing databases for effective data management.
Highlights
Designed dynamic manufacturing performance dashboards using Tableau, Power BI, and SQL to track key metrics such as production throughput, quality defect rates, and supply chain lead times across multiple manufacturing plants.
Automated shop-floor data pipelines with Apache Airflow and Informatica, ensuring real-time updates from IoT sensors, assembly lines, and quality inspection systems for operational analytics and process optimization.
Processed large-scale production, warranty, and dealer network data using Apache Spark, Hadoop, and Hive, enabling advanced segmentation for targeted after-sales service and product reliability improvement initiatives.
Developed predictive models in Python using Scikit-learn, XGBoost, and Pandas to forecast component failure rates, maintenance requirements, and demand trends for motorcycles and scooters.
Built secure RESTful APIs using Flask and FastAPI to integrate dealer servicing platforms with manufacturing, logistics, and warranty claim systems, enabling seamless data flow across operations.
Applied Natural Language Processing (NLP) using spaCy and NLTK to extract insights from dealer feedback, customer complaints, and service records, enhancing product quality and customer satisfaction.
Containerized model deployment workflows with Docker and orchestrated them via Kubernetes, improving scalability and reducing downtime for AI-driven production and quality control systems.
Utilized Elasticsearch and Kibana to enable anomaly detection and full-text search across maintenance logs, assembly line records, and dealer service histories for rapid issue resolution.
Skills
Programming Languages
Python, R, SQL, Java, C++.
Data Analysis & Manipulation
Pandas, NumPy, SciPy, dplyr, data.table.
Machine Learning Algorithms
Linear Regression, Logistic Regression, SVM, Random Forest, Gradient Boosting, KNN, XGBoost, LightGBM, Scikit-learn.
Deep Learning
TensorFlow, Keras, PyTorch, Theano, Caffe.
Natural Language Processing (NLP)
spaCy, NLTK, TextBlob, Gensim, Transformers, BERT.
Data Visualization
Matplotlib, Seaborn, Plotly, Tableau, Power BI, ggplot2, Looker.
Big Data Technologies
Apache Hadoop, Apache Spark, Hive, Pig, Flink.
Cloud Computing Platforms
AWS, Google Cloud Platform (GCP), Azure, IBM Cloud.
Data Warehousing
Amazon Redshift, Google BigQuery, Snowflake, Teradata, dbt.
Databases
MySQL, PostgreSQL, MongoDB, Cassandra, SQL Server, Oracle, Neo4j.
ETL Tools
Apache NiFi, Talend, Informatica, Pentaho, AWS Glue.
Serverless Computing
AWS Lambda, Google Cloud Functions.
CI/CD & Version Control
Jenkins, GitLab CI, CircleCI, Git, GitHub, JIRA.
Containerization & Virtualization
Docker, Kubernetes.
Model Tuning
GridSearchCV, Optuna, RandomizedSearchCV.
Microservices
Flask, Django, FastAPI.
Security & Compliance
OAuth 2.0, JWT, GDPR, HIPAA.
Time-Series Forecasting
ARIMA, SARIMA, Prophet, LSTM.
Anomaly Detection
Isolation Forest, Autoencoders, Z-Score, Elasticsearch, Kibana.
Conversational AI
Dialogflow, Rasa, Microsoft Bot Framework.
Data Lakes
AWS S3, Azure Blob Storage, Google Cloud Storage.
Edge Computing
TensorFlow Lite, ONNX.