Pakshal Jain

Data Scientist in Mumbai

About

As a data scientist, I use data to solve complex problems and drive business growth for clients across domains such as finance, banking, retail, transportation and defence. I have more than 2.5 years of combined hands-on experience in statistical data analysis, inventory management, machine learning, NLP, quantum computing, time series forecasting and data-driven decision-making, using tools such as Python, SQL, R, Tableau and Snowflake Cloud.

I am passionate about exploring new and challenging opportunities to further enhance my skills and expertise in data science, and to contribute to the advancement of data-driven solutions in various industries.

Work Experience

2021 — Now
Mumbai

Roles and Responsibilities -

  • Understand client and stakeholder requirements and translate business problems into analytical structures that can be solved using statistical, ML, NLP and DL techniques.

  • Focus primarily on building models in areas such as Demand & Sales Forecasting, Supply Chain Optimization, Bankruptcy Risk and Portfolio Optimization.

  • Perform data extraction, cleaning, preprocessing and feature engineering to improve model performance, interpretability and decision-making.

  • Work with cross-functional teams to solve problems efficiently and effectively.

  • Spearhead internal research projects and proof-of-concept initiatives, leading small teams to address Supply Chain and Forecasting challenges.

  • Revamp existing data processes through task automation and web scraping, making workflows smoother and more time-efficient for multiple stakeholders.

  • Conduct R&D on problem statements in forecasting domains.

  • Address ad hoc requests from clients and stakeholders.

Worked on the following projects -

  • Inventory Transfer Model

    Spearheaded development of a Python and SQL-based statistical model for optimal inventory transfers across multiple stores based on SKU trends and seasonality. Achieved a 10-12% increase in net sales revenue of about $15M+ across 7+ deals in the retail industry (apparel, FMCG, hardware, etc.) through precise demand and sales forecasting with confidence intervals.

  • Tiger Inventory Analysis

    Conducted advanced inventory analysis using Python, delivering 20+ predictive, descriptive and prescriptive insights. Converted 5+ proof-of-concepts (POCs) into standard client deliverables for future deals. Addressed ad hoc requests from clients by providing custom models with specialized calculations and visualizations tailored to specific deals, covering missed sales, price elasticity, price optimization, stockout analysis, etc.

  • Appraisal Process Automation

    Migrated ETL processes to Python, using parallel processing for data pre-processing and compute-heavy calculations. Implemented a SQL database on Snowflake cloud servers, optimizing query efficiency for handling 2TB+ of data. Reduced data processing time from 24 hours to 10 minutes.

  • Risk Factor Classification

    Built a classification model using neural networks and cutting-edge NLP techniques that identifies and classifies different risks in data extracted from 10-K and 10-Q SEC EDGAR filings with 80% accuracy. Used the NLTK library for text preprocessing and GloVe embeddings for word representations.

  • CreditPulse ↗️

    Optimized and automated a credit risk classification model that identifies credit default by producing daily bankruptcy scores from 10-K and 10-Q SEC filings. Trained using NLP on data from 7,000+ companies and 54,000+ quarterly and annual filings.

  • EcoPulse ↗️

    Led an internal research project in EcoPulse to forecast high-priority economic indicators using classical, ML and neural-network-based time series forecasting models. Conducted comprehensive exploratory data analysis and applied feature selection and dimensionality reduction techniques to obtain a better feature representation from an original feature space of 6,000+ features.

Projects

2021

Trained a Siamese-network-based feature extractor on over 130,000 videos, optimised with triplet loss, to classify deepfake videos. Used ensemble learning to improve the robustness of classification. Achieved an AUC score of 0.96 on benchmark datasets.

2021

Created a hybrid model for stock price and performance prediction, combining numerical analysis of historical stock prices with sentiment analysis of 1.4M news headlines.

Side Projects

2024
YouTube Video Recommendation at Abit

Contact

GitHub