Project Learnings
_________________________________________________________________________________________________________
Project Summary: ServiceM8 Data Centralization & Reporting Automation
🎯 Objective
To move operational data out of the ServiceM8 application and into a client-owned SQL environment, enabling reliable reporting, historical analysis, and KPI tracking without depending on manual exports or platform UI limitations.
⚠️ Key Challenges Faced
1️⃣ Data Locked Inside the Platform
ServiceM8 works well operationally, but reporting is limited to what the UI allows.
No unified historical view across jobs
Limited flexibility in defining custom KPIs
Manual exports were required for deeper analysis
This created delays, inconsistencies, and reporting fatigue.
2️⃣ API Structure vs Business Reality
The ServiceM8 API exposes data in highly normalized, nested JSON objects.
Challenges included:
Multiple endpoints needed to describe a single “job”
Important business context split across objects
Fields changing meaning depending on job state (open, scheduled, completed)
Mapping this technical structure to business-friendly metrics required careful design.
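To make the flattening challenge concrete, here is a minimal Python sketch; the nested payload and field names are illustrative assumptions, not the actual ServiceM8 schema:

```python
import pandas as pd

# Hypothetical job payload stitched together from several endpoints; the real
# ServiceM8 schema differs, this only illustrates the flattening step.
jobs = [{"uuid": "j-1", "status": "Completed",
         "company": {"name": "Acme Ltd"},
         "dates": {"created": "2024-01-05", "completed": "2024-01-12"}}]

# json_normalize collapses nested objects into one business-friendly row per job
flat = pd.json_normalize(jobs, sep="_")
print(flat[["uuid", "status", "company_name", "dates_completed"]])
```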
3️⃣ Job State Volatility
Jobs in ServiceM8 are not static:
Schedules change
Completion timestamps update
Status can move forward and backward
This meant:
Simple daily snapshots were insufficient
Incremental logic had to handle updates safely
Re-runs needed to be idempotent to avoid duplicates or data loss
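A minimal sketch of that update-safe idea, using SQLite as a stand-in for the client's SQL environment (table and column names are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (uuid TEXT PRIMARY KEY, status TEXT, edit_date TEXT)")

def upsert_job(job: dict) -> None:
    # ON CONFLICT keeps re-runs idempotent: a batch can be replayed without
    # creating duplicates, and older payloads never overwrite newer ones.
    conn.execute(
        """INSERT INTO jobs (uuid, status, edit_date)
           VALUES (:uuid, :status, :edit_date)
           ON CONFLICT(uuid) DO UPDATE SET
               status = excluded.status,
               edit_date = excluded.edit_date
           WHERE excluded.edit_date >= jobs.edit_date""",
        job,
    )

upsert_job({"uuid": "j-1", "status": "Scheduled", "edit_date": "2024-01-10"})
upsert_job({"uuid": "j-1", "status": "Completed", "edit_date": "2024-01-12"})
print(conn.execute("SELECT * FROM jobs").fetchall())  # one row, latest state
```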
4️⃣ Grain Definition Complexity
Different KPIs required different grains:
One row per job (counts, backlog)
Time-based views (due soon, overdue)
Completion-based metrics
Getting the correct grain upfront was critical; mistakes here would have broken KPI accuracy later.
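As a toy illustration, the same job data can serve two KPIs only when each view is built at its own grain (columns here are made up):

```python
import pandas as pd

jobs = pd.DataFrame({
    "uuid": ["j-1", "j-2", "j-3"],
    "status": ["Open", "Completed", "Completed"],
    "completed_at": [None, "2024-01-12", "2024-01-19"],
})
jobs["completed_at"] = pd.to_datetime(jobs["completed_at"])

# Grain 1: one row per job -> backlog counts fall out of a simple filter
backlog = jobs[jobs["status"] != "Completed"]

# Grain 2: one row per completion week -> throughput over time
weekly = (jobs.dropna(subset=["completed_at"])
              .set_index("completed_at")
              .resample("W")["uuid"]
              .count())
print(len(backlog), weekly.to_dict())
```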
5️⃣ BI Refresh Stability
Directly connecting BI tools to ServiceM8 was unreliable due to:
API rate limits
Token expiry
Refresh failures outside business hours
This made scheduled reporting unpredictable and unsuitable for executive dashboards.
🧩 How These Challenges Were Addressed
Introduced a RAW JSON layer to store full API responses safely (sketched after this list)
Designed flattened SQL views aligned to business grain
Implemented incremental, update-safe ingestion logic
Decoupled BI tools from the API entirely by using SQL as the source
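A compact sketch of the RAW-layer-plus-view pattern (SQLite with its JSON1 functions stands in for the client's SQL environment, and field names are assumptions):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_jobs (fetched_at TEXT, payload TEXT)")

# Persist the full API response untouched so nothing is lost at ingest time
response = {"uuid": "j-1", "status": "Completed"}  # stand-in for an API response
conn.execute("INSERT INTO raw_jobs VALUES (datetime('now'), ?)",
             (json.dumps(response),))

# A view reshapes the raw JSON to the reporting grain; BI reads the view only
conn.execute("""CREATE VIEW jobs_flat AS
    SELECT json_extract(payload, '$.uuid')   AS uuid,
           json_extract(payload, '$.status') AS status,
           fetched_at
    FROM raw_jobs""")
print(conn.execute("SELECT * FROM jobs_flat").fetchall())
```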
🧠 Key Learning
ServiceM8 is excellent for running operations, but serious analytics requires owning the data outside the platform.
By solving for data ownership, grain clarity, and update safety, the client gained reliable insights without disrupting day-to-day operations.
Project Summary: Automated Reporting System using Rithum & Amazon SP APIs
🎯 Objective
To eliminate manual data pulls and streamline reporting for a client managing multiple e-commerce platforms by automating data collection, transformation, and visualization using APIs, SQL, and cloud-based tools.
🛠️ Approach
We began by understanding the client’s existing reporting workflow — logging into platforms like Rithum, manually exporting reports, and consolidating them in Excel. The goal was to eliminate this repetitive work while improving speed, accuracy, and data visibility.
🤝 Collaboration
A big win in this project was co-creating the data blueprint with the client. They provided a shared Google Sheet listing all key columns, expected values, and report types. This gave us clarity and helped us validate API output against actual reporting needs — ensuring the data extraction hit the mark from day one.
🧩 Solution
Integrated the Rithum & Amazon SP APIs to pull order, product, and inventory data automatically
Ingested and structured the data in Google BigQuery, using custom SQL views to clean and organize it for analysis (see the load sketch after this list)
Connected BigQuery to Looker Studio to power real-time, interactive dashboards — accessible to teams across the org
Developed a custom data dictionary that mapped each data field with business meaning, calculation rules, and usage notes
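The BigQuery load step referenced above can be as small as the following sketch using the official client library; the project, dataset, and row fields are placeholders, and credentials are assumed to be configured:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.ecommerce.orders_raw"  # hypothetical destination table

# Stand-in rows; the real pipeline shaped Rithum / Amazon SP API responses
rows = [
    {"order_id": "A-1001", "platform": "rithum", "total": 129.50},
    {"order_id": "B-2002", "platform": "amazon", "total": 42.00},
]

# Append the batch; downstream SQL views handle cleaning and organization
job = client.load_table_from_json(rows, table_id)
job.result()  # block until the load job finishes
```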
📈 Key Outcomes
⏱️ 90%+ Reduction in Reporting Time
All reports are now generated with zero manual effort, improving team productivity and consistency.
📊 Real-Time Dashboards
Metrics are always up to date and accessible through Looker Studio or any other BI tool the client prefers.
📘 Enhanced Data Literacy
The data dictionary helped teams understand their data better, leading to more meaningful insights and confident decision-making.
⚙️ Scalable Infrastructure
Adding new fields, metrics, or platforms is now quick and efficient, thanks to modular SQL views and config-based API pipelines.
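To show what "config-based" means in practice, here is a minimal sketch where adding a platform or field is an edit to one mapping rather than to pipeline code; every name and URL below is hypothetical:

```python
# Each entry declares where data comes from and where it lands
PIPELINES = {
    "rithum_orders": {
        "endpoint": "https://api.example.com/orders",      # placeholder URL
        "table": "ecommerce.orders_raw",
        "fields": ["order_id", "status", "total"],
    },
    "amazon_inventory": {
        "endpoint": "https://api.example.com/inventory",   # placeholder URL
        "table": "ecommerce.inventory_raw",
        "fields": ["sku", "quantity"],
    },
}

def run_pipeline(name: str) -> None:
    cfg = PIPELINES[name]
    # In the real system: fetch(endpoint) -> keep fields -> load into table
    print(f"{name}: {cfg['endpoint']} -> {cfg['table']}")

for name in PIPELINES:
    run_pipeline(name)
```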
Project Summary: Survey Data Transformation and Demographic Reporting Automation
📝 Project Title: Automating Survey Analysis and Demographic Data Cuts with Reverse Coding Intelligence
💼 Project Overview: The client provided raw survey data collected via a 5-point Likert scale along with a metadata file describing survey items, including whether some items required reverse coding.
The objective was to transform this raw survey dataset into structured, insight-ready output tables showing:
Overall % Positive scores
Detailed Demographic Cuts (Age, Gender, Race, Tenure, and 20+ demographics)
Automated handling of Reverse Coded items
Sample size (n) breakdowns alongside % Positive
Excel-ready, business presentation-quality output
The final solution involved building a fully automated Python pipeline that produces demographic-level reporting ready for immediate client use.
🔍 Key Deliverables:
Data Cleaning and Preparation
Converted survey text responses (Strongly Agree → 5, Disagree → 2, etc.) into numeric Likert scores, as sketched below.
Standardized N/A responses across the dataset.
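A minimal sketch of that conversion (the exact response labels are an assumption about the survey's wording):

```python
import pandas as pd

# Standard 5-point Likert mapping; N/A is deliberately left unmapped
LIKERT = {"Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
          "Agree": 4, "Strongly Agree": 5}

responses = pd.Series(["Strongly Agree", "Disagree", "N/A", "Agree"])
scores = responses.map(LIKERT)  # unmapped values such as "N/A" become NaN
print(scores.tolist())          # [5.0, 2.0, nan, 4.0]
```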
Metadata Integration
Mapped survey questions to their proper metadata descriptions.
Identified and flagged reverse-coded items based on both metadata and business logic.
Reverse Coding Intelligence
Applied dynamic reverse coding on selected survey items where lower scores indicate positive sentiment (e.g., disagreement with negative statements).
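On a 5-point scale this flip is simply score -> 6 - score for flagged items; a small sketch with made-up item IDs:

```python
import pandas as pd

REVERSE_ITEMS = {"Q7", "Q12"}  # flagged via the metadata file in practice

def apply_reverse_coding(df: pd.DataFrame) -> pd.DataFrame:
    out = df.copy()
    for item in REVERSE_ITEMS & set(out.columns):
        out[item] = 6 - out[item]  # 1<->5, 2<->4, 3 stays 3
    return out

df = pd.DataFrame({"Q1": [5, 4], "Q7": [1, 2]})
print(apply_reverse_coding(df))  # Q7 becomes 5 and 4
```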
Demographic Data Cuts
For each demographic field (e.g., Age, Race, Gender, etc.), calculated:
% Positive (Strongly Agree + Agree; for reverse-coded items, Strongly Disagree + Disagree)
Sample size (n) for each subgroup
Included Total Sample and Demographic Sample Size Columns.
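A compact pandas sketch of one such cut; column names are illustrative, and the N/A handling mirrors the quality-control rules described below:

```python
import pandas as pd

df = pd.DataFrame({
    "Gender": ["F", "F", "M", "M", "M"],
    "Q1": [5, 4, 2, None, 5],   # None models an N/A response
})

def pct_positive(scores: pd.Series) -> float:
    valid = scores.dropna()                     # N/A stays out of the denominator
    return round(100 * (valid >= 4).mean(), 1)  # one-decimal formatting

# % Positive and subgroup n in one pass; "count" also ignores N/A
cut = df.groupby("Gender")["Q1"].agg(pct_positive=pct_positive, n="count")
print(cut)
```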
Output Structuring
Structured final Excel tables in business-readable format:
First Columns: Question, Overall Respondents (n), Total Demographic n, Overall % Positive
Next: All Subgroup % Positive columns
Then: All Subgroup Sample Size (n) columns
Ensured one-decimal formatting for percentages.
Automated Reporting
Separate Excel file generated per demographic field.
Naming and file saving logic aligned to business requirements.
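A minimal sketch of the per-demographic export loop (the naming pattern is a placeholder, not the client's actual convention, and writing .xlsx assumes openpyxl is installed):

```python
import pandas as pd

demographics = ["Age", "Gender", "Race", "Tenure"]
results = {d: pd.DataFrame({"Question": ["Q1"], "% Positive": [78.3]})
           for d in demographics}  # stand-in for the real cut tables

for demo, table in results.items():
    filename = f"survey_results_{demo.lower()}.xlsx"  # hypothetical pattern
    table.to_excel(filename, index=False, sheet_name=demo)
```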
Quality Control
Ensured that missing demographic data did not skew % Positive calculations.
Ensured that N/A responses were excluded from denominator calculations.
⚙️ Tools Used:
Python 3.11
Pandas for data manipulation
NumPy for calculations
📈 Value Delivered:
🔥 Automated a previously manual and error-prone process.
⚡ Reduced turnaround time for survey analysis from days to minutes.
✅ Ensured 100% consistency in reverse coding and demographic cuts.
🎯 Business-ready reporting files, saving 10+ hours per survey cycle.
💬 Created reusable code templates for scaling to other survey projects.
📊 Ready for direct client presentation without additional editing.
✨ Why This Matters for Organizations:
If you're a business, consulting firm, or HR insights team handling digital transformation, employee engagement surveys, DEI surveys, or organizational pulse surveys, this system allows you to automate, scale, and validate your entire data reporting pipeline.
Ready-to-publish outputs = faster insights = smarter business decisions.