Project Learnings
_________________________________________________________________________________________________________
Project Summary: ServiceM8 Data Centralization & Reporting Automation
🎯 Objective
To move operational data out of the ServiceM8 application and into a client-owned SQL environment, enabling reliable reporting, historical analysis, and KPI tracking without depending on manual exports or platform UI limitations.
⚠️ Key Challenges Faced
1️⃣ Data Locked Inside the Platform
ServiceM8 works well operationally, but reporting is limited to what the UI allows.
No unified historical view across jobs
Limited flexibility in defining custom KPIs
Manual exports were required for deeper analysis
This created delays, inconsistencies, and reporting fatigue.
2️⃣ API Structure vs Business Reality
The ServiceM8 API exposes data in highly normalized, nested JSON objects.
Challenges included:
Multiple endpoints needed to describe a single “job”
Important business context split across objects
Fields changing meaning depending on job state (open, scheduled, completed)
Mapping this technical structure to business-friendly metrics required careful design.
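To make the flattening challenge concrete, here is a minimal Python sketch; the nested payload and field names are illustrative assumptions, not the actual ServiceM8 schema:

```python
import pandas as pd

# Hypothetical job payload stitched together from several endpoints; the real
# ServiceM8 schema differs, this only illustrates the flattening step.
jobs = [{"uuid": "j-1", "status": "Completed",
         "company": {"name": "Acme Ltd"},
         "dates": {"created": "2024-01-05", "completed": "2024-01-12"}}]

# json_normalize collapses nested objects into one business-friendly row per job
flat = pd.json_normalize(jobs, sep="_")
print(flat[["uuid", "status", "company_name", "dates_completed"]])
```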
3️⃣ Job State Volatility
Jobs in ServiceM8 are not static:
Schedules change
Completion timestamps update
Status can move forward and backward
This meant:
Simple daily snapshots were insufficient
Incremental logic had to handle updates safely
Re-runs needed to be idempotent to avoid duplicates or data loss
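A minimal sketch of that update-safe idea, using SQLite as a stand-in for the client's SQL environment (table and column names are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (uuid TEXT PRIMARY KEY, status TEXT, edit_date TEXT)")

def upsert_job(job: dict) -> None:
    # ON CONFLICT keeps re-runs idempotent: a batch can be replayed without
    # creating duplicates, and older payloads never overwrite newer ones.
    conn.execute(
        """INSERT INTO jobs (uuid, status, edit_date)
           VALUES (:uuid, :status, :edit_date)
           ON CONFLICT(uuid) DO UPDATE SET
               status = excluded.status,
               edit_date = excluded.edit_date
           WHERE excluded.edit_date >= jobs.edit_date""",
        job,
    )

upsert_job({"uuid": "j-1", "status": "Scheduled", "edit_date": "2024-01-10"})
upsert_job({"uuid": "j-1", "status": "Completed", "edit_date": "2024-01-12"})
print(conn.execute("SELECT * FROM jobs").fetchall())  # one row, latest state
```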
4️⃣ Grain Definition Complexity
Different KPIs required different grains:
One row per job (counts, backlog)
Time-based views (due soon, overdue)
Completion-based metrics
Getting the correct grain upfront was critical; mistakes here would have broken KPI accuracy later.
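As a toy illustration, the same job data can serve two KPIs only when each view is built at its own grain (columns here are made up):

```python
import pandas as pd

jobs = pd.DataFrame({
    "uuid": ["j-1", "j-2", "j-3"],
    "status": ["Open", "Completed", "Completed"],
    "completed_at": [None, "2024-01-12", "2024-01-19"],
})
jobs["completed_at"] = pd.to_datetime(jobs["completed_at"])

# Grain 1: one row per job -> backlog counts fall out of a simple filter
backlog = jobs[jobs["status"] != "Completed"]

# Grain 2: one row per completion week -> throughput over time
weekly = (jobs.dropna(subset=["completed_at"])
              .set_index("completed_at")
              .resample("W")["uuid"]
              .count())
print(len(backlog), weekly.to_dict())
```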
5️⃣ BI Refresh Stability
Directly connecting BI tools to ServiceM8 was unreliable due to:
API rate limits
Token expiry
Refresh failures outside business hours
This made scheduled reporting unpredictable and unsuitable for executive dashboards.
🧩 How These Challenges Were Addressed
Introduced a RAW JSON layer to store full API responses safely (sketched after this list)
Designed flattened SQL views aligned to business grain
Implemented incremental, update-safe ingestion logic
Decoupled BI tools from the API entirely by using SQL as the source
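A compact sketch of the RAW-layer-plus-view pattern (SQLite with its JSON1 functions stands in for the client's SQL environment, and field names are assumptions):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_jobs (fetched_at TEXT, payload TEXT)")

# Persist the full API response untouched so nothing is lost at ingest time
response = {"uuid": "j-1", "status": "Completed"}  # stand-in for an API response
conn.execute("INSERT INTO raw_jobs VALUES (datetime('now'), ?)",
             (json.dumps(response),))

# A view reshapes the raw JSON to the reporting grain; BI reads the view only
conn.execute("""CREATE VIEW jobs_flat AS
    SELECT json_extract(payload, '$.uuid')   AS uuid,
           json_extract(payload, '$.status') AS status,
           fetched_at
    FROM raw_jobs""")
print(conn.execute("SELECT * FROM jobs_flat").fetchall())
```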
🧠 Key Learning
ServiceM8 is excellent for running operations, but serious analytics requires owning the data outside the platform.
By solving for data ownership, grain clarity, and update safety, the client gained reliable insights without disrupting day-to-day operations.
Project Summary: Automated Reporting System using Rithum & Amazon SP APIs
🎯 Objective
To eliminate manual data pulls and streamline reporting for a client managing multiple e-commerce platforms by automating data collection, transformation, and visualization using APIs, SQL, and cloud-based tools.
🛠️ Approach
We began by understanding the client’s existing reporting workflow — logging into platforms like Rithum, manually exporting reports, and consolidating them in Excel. The goal was to eliminate this repetitive work while improving speed, accuracy, and data visibility.
🤝 Collaboration
A big win in this project was co-creating the data blueprint with the client. They provided a shared Google Sheet listing all key columns, expected values, and report types. This gave us clarity and helped us validate API output against actual reporting needs — ensuring the data extraction hit the mark from day one.
🧩 Solution
Integrated the Rithum & Amazon SP APIs to pull order, product, and inventory data automatically
Ingested and structured the data in Google BigQuery, using custom SQL views to clean and organize it for analysis (see the load sketch after this list)
Connected BigQuery to Looker Studio to power real-time, interactive dashboards — accessible to teams across the org
Developed a custom data dictionary that mapped each data field with business meaning, calculation rules, and usage notes
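The BigQuery load step referenced above can be as small as the following sketch using the official client library; the project, dataset, and row fields are placeholders, and credentials are assumed to be configured:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.ecommerce.orders_raw"  # hypothetical destination table

# Stand-in rows; the real pipeline shaped Rithum / Amazon SP API responses
rows = [
    {"order_id": "A-1001", "platform": "rithum", "total": 129.50},
    {"order_id": "B-2002", "platform": "amazon", "total": 42.00},
]

# Append the batch; downstream SQL views handle cleaning and organization
job = client.load_table_from_json(rows, table_id)
job.result()  # block until the load job finishes
```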
📈 Key Outcomes
⏱️ 90%+ Reduction in Reporting Time
All reports are now generated with zero manual effort, improving team productivity and consistency.
📊 Real-Time Dashboards
Metrics are always up to date and accessible through Looker Studio or any other BI tool the client prefers.
📘 Enhanced Data Literacy
The data dictionary helped teams understand their data better, leading to more meaningful insights and confident decision-making.
⚙️ Scalable Infrastructure
Adding new fields, metrics, or platforms is now quick and efficient, thanks to modular SQL views and config-based API pipelines.
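To show what "config-based" means in practice, here is a minimal sketch where adding a platform or field is an edit to one mapping rather than to pipeline code; every name and URL below is hypothetical:

```python
# Each entry declares where data comes from and where it lands
PIPELINES = {
    "rithum_orders": {
        "endpoint": "https://api.example.com/orders",      # placeholder URL
        "table": "ecommerce.orders_raw",
        "fields": ["order_id", "status", "total"],
    },
    "amazon_inventory": {
        "endpoint": "https://api.example.com/inventory",   # placeholder URL
        "table": "ecommerce.inventory_raw",
        "fields": ["sku", "quantity"],
    },
}

def run_pipeline(name: str) -> None:
    cfg = PIPELINES[name]
    # In the real system: fetch(endpoint) -> keep fields -> load into table
    print(f"{name}: {cfg['endpoint']} -> {cfg['table']}")

for name in PIPELINES:
    run_pipeline(name)
```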
Project Summary: Survey Data Transformation and Demographic Reporting Automation
📝 Project Title: Automating Survey Analysis and Demographic Data Cuts with Reverse Coding Intelligence
💼 Project Overview: The client provided raw survey data collected via a 5-point Likert scale along with a metadata file describing survey items, including whether some items required reverse coding.
The objective was to transform this raw survey dataset into structured, insight-ready output tables showing:
Overall % Positive scores
Detailed Demographic Cuts (Age, Gender, Race, Tenure, and 20+ demographics)
Automated handling of Reverse Coded items
Sample size (n) breakdowns alongside % Positive
Excel-ready, business presentation-quality output
The final solution involved building a fully automated Python pipeline that produces demographic-level reporting ready for immediate client use.
🔍 Key Deliverables:
Data Cleaning and Preparation
Converted survey text responses (Strongly Agree → 5, Disagree → 2, etc.) into numeric Likert scores, as sketched below.
Standardized N/A responses across the dataset.
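A minimal sketch of that conversion (the exact response labels are an assumption about the survey's wording):

```python
import pandas as pd

# Standard 5-point Likert mapping; N/A is deliberately left unmapped
LIKERT = {"Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
          "Agree": 4, "Strongly Agree": 5}

responses = pd.Series(["Strongly Agree", "Disagree", "N/A", "Agree"])
scores = responses.map(LIKERT)  # unmapped values such as "N/A" become NaN
print(scores.tolist())          # [5.0, 2.0, nan, 4.0]
```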
Metadata Integration
Mapped survey questions to their proper metadata descriptions.
Identified and flagged reverse-coded items based on both metadata and business logic.
Reverse Coding Intelligence
Applied dynamic reverse coding on selected survey items where lower scores indicate positive sentiment (e.g., disagreement with negative statements).
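On a 5-point scale this flip is simply score -> 6 - score for flagged items; a small sketch with made-up item IDs:

```python
import pandas as pd

REVERSE_ITEMS = {"Q7", "Q12"}  # flagged via the metadata file in practice

def apply_reverse_coding(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for item in REVERSE_ITEMS & set(out.columns):
        out[item] = 6 - out[item]  # 1<->5, 2<->4, 3 stays 3
    return out

df = pd.DataFrame({"Q1": [5, 4], "Q7": [1, 2]})
print(apply_reverse_coding(df))  # Q7 becomes 5 and 4
```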
Demographic Data Cuts
For each demographic field (e.g., Age, Race, Gender, etc.), calculated:
% Positive (Strongly Agree + Agree; for reverse-coded items, Strongly Disagree + Disagree)
Sample size (n) for each subgroup
Included Total Sample and Demographic Sample Size Columns.
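A compact pandas sketch of one such cut; column names are illustrative, and the N/A handling mirrors the quality-control rules described below:

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["F", "F", "M", "M", "M"],
    "Q1": [5, 4, 2, None, 5],   # None models an N/A response
})

def pct_positive(scores: pd.Series) -> float:
    valid = scores.dropna()                     # N/A stays out of the denominator
    return round(100 * (valid >= 4).mean(), 1)  # one-decimal formatting

# % Positive and subgroup n in one pass; "count" also ignores N/A
cut = df.groupby("Gender")["Q1"].agg(pct_positive=pct_positive, n="count")
print(cut)
```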
Output Structuring
Structured final Excel tables in business-readable format:
First Columns: Question, Overall Respondents (n), Total Demographic n, Overall % Positive
Next: All Subgroup % Positive columns
Then: All Subgroup Sample Size (n) columns
Ensured one-decimal formatting for percentages.
Automated Reporting
Separate Excel file generated per demographic field.
Naming and file saving logic aligned to business requirements.
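A minimal sketch of the per-demographic export loop (the naming pattern is a placeholder, not the client's actual convention, and writing .xlsx assumes openpyxl is installed):

```python
import pandas as pd

demographics = ["Age", "Gender", "Race", "Tenure"]
results = {d: pd.DataFrame({"Question": ["Q1"], "% Positive": [78.3]})
           for d in demographics}  # stand-in for the real cut tables

for demo, table in results.items():
    filename = f"survey_results_{demo.lower()}.xlsx"  # hypothetical pattern
    table.to_excel(filename, index=False, sheet_name=demo)
```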
Quality Control
Ensured that missing demographic data did not skew % Positive calculations.
Ensured that N/A responses were excluded from denominator calculations.
⚙️ Tools Used:
Python 3.11
Pandas for data manipulation
NumPy for calculations
📈 Value Delivered:
🔥 Automated a previously manual and error-prone process.
⚡ Reduced turnaround time for survey analysis from days to minutes.
✅ Ensured 100% consistency in reverse coding and demographic cuts.
🎯 Business-ready reporting files, saving 10+ hours per survey cycle.
💬 Created reusable code templates for scaling to other survey projects.
📊 Ready for direct client presentation without additional editing.
✨ Why This Matters for Organizations:
If you're a business, consulting firm, or HR insights team handling digital transformation, employee engagement surveys, DEI surveys, or organizational pulse surveys, this system allows you to automate, scale, and validate your entire data reporting pipeline.
Ready-to-publish outputs = faster insights = smarter business decisions.