Experience

SG Analytics

Data Scientist | Jan 2026 - Present

  • Engineered and own a production-grade LLM pipeline on AWS Bedrock (Claude) that monitors 200+ global companies across 91 locations, ingesting 800–1,000 articles/day via the EventRegistry API with full audit logging.
  • Built structured classification and QC tagging workflows achieving 96% decision accuracy (manually validated), using prompt engineering and automated quality checks.
  • Implemented semantic deduplication using Amazon Titan Embeddings and cosine similarity, eliminating cross-company and cross-day duplicate articles at scale.
  • Delivered concurrent LLM processing, ad-free semantic HTML content generation, and a PostgreSQL batch ingestion pipeline supporting reliable daily production delivery.

SG Analytics

Data Science Intern | Jan 2025 - Jul 2025

  • Developed a smart web crawler and data pipeline using Python, Scrapy, FastAPI, and AWS S3, automating the extraction of 500+ URLs/minute with 95% accuracy and visualizing results via an interactive Streamlit dashboard.
  • Automated an SFDR-compliant CIM system to extract ESG data and KPIs from unstructured corporate documents, generating structured Excel reports to streamline financial analysis and reporting.
  • Designed an intelligent SWOT Analysis System using Streamlit, RAG, and Amazon Bedrock, integrating SEC 10-K and annual reports with web data to produce factually accurate, auto-generated business reports.
  • Implemented a secure document Q&A platform using OpenWebUI and Amazon Bedrock for enterprise-grade information retrieval from uploaded corporate files.

Chegg

Subject Matter Expert | Nov 2023 - Sep 2025

  • Delivered 400+ optimized solutions in C++, Python, Data Structures, Algorithms, and Optimization.
  • Provided comprehensive explanations of a wide range of computer science topics, with an average rating of 4.5+.

It's easy to lie with statistics. It's hard to tell the truth without statistics.

— Andrejs Dunkels