Week 4 Status Updates
Monday: Refactoring Adventures
Python Package Development Code Refactoring
Morning Session
Started the day refactoring the code base of the sensitive_data_detector package based on feedback from Amit Sir, making it more efficient and readable. Along the way, I learnt how to introduce OOP concepts into the code.

Technical Improvements
- Worked with crucial package functions:
  - Introduced the SensitiveChecker class
  - Introduced further classes to apply OOP concepts: FileReader, ContentAnalyzer, PatternLoader
  - Inside each class, introduced methods to handle file operations, content analysis, pattern matching, etc.
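As a rough illustration of how the four classes named above could fit together, here is a minimal sketch. The method names (read, load, find_matches, check_text, check_file), the constructor signatures, and the two regex patterns are assumptions made for this example, not the actual sensitive_data_detector API.

```python
import re
from pathlib import Path


class FileReader:
    """Handles file operations (hypothetical interface)."""

    def read(self, path):
        return Path(path).read_text(encoding="utf-8")


class PatternLoader:
    """Supplies compiled regex patterns; these two are illustrative only."""

    def load(self):
        return {
            "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
            "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
        }


class ContentAnalyzer:
    """Matches text against the loaded patterns."""

    def __init__(self, patterns):
        self.patterns = patterns

    def find_matches(self, text):
        # Map each pattern name to the matches found in the text
        found = {}
        for name, pattern in self.patterns.items():
            matches = pattern.findall(text)
            if matches:
                found[name] = matches
        return found


class SensitiveChecker:
    """Facade that wires the reader, loader, and analyzer together."""

    def __init__(self):
        self.reader = FileReader()
        self.analyzer = ContentAnalyzer(PatternLoader().load())

    def check_text(self, text):
        return self.analyzer.find_matches(text)

    def check_file(self, path):
        return self.check_text(self.reader.read(path))
```

Splitting the responsibilities this way means each class can be tested on its own, and new patterns only touch PatternLoader.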

Pull Request & GitHub Issues
- Detailed Documentation of Feature Enhancements: Proposed feature enhancements for the project by opening a GitHub issue that describes the feature and the implementation plan in detail.
  Github Feature Request Documentation Link

- Pull Request: After refactoring the code base, opened a PR to merge the changes.
  GitHub Pull Request Link

- CI/CD Success: After raising the PR, checked the CI/CD pipeline to confirm the refactored code works as expected; the run passed.


Hovering over Licenses
Learnings
Today I learnt about:
- Object-oriented approach to code refactoring
- How to introduce OOP concepts in the code
- Implementing code logic in the classes
A day of overcoming challenges and refactoring!
Tuesday: Open Source & Learning Journey
Morning Inspiration
Mentorship Session
Had an insightful session with Amit sir and Div sir where:
- Discussed our current progress and activities
- Amit sir shared valuable lessons from his tech journey
- Key takeaway: Despite obstacles, learning to enjoy the process and believing in yourself is crucial
Technical Deep Dive
Pandas Local Setup
- Successfully set up Pandas locally for open source contributions
- Learned about build tools:
- Meson and Ninja build systems
- CPython internals
- Why we're moving from setuptools to Meson
- Migration from setup.py to pyproject.toml

Open Source Contribution Prep
- Explored the Pandas repository's issues section
- Studied:
- Contribution guidelines
- Merged PR commits
- Best practices for commit messages
- Writing effective PR descriptions

Blog Writing
Wrote a detailed guide on setting up Pandas locally:
Git Essentials
Learned crucial Git concepts:
A day full of learning and preparation for open source contributions!
Wednesday: Grind & Shine
Networking & API Development Journey
Computer Networking Fundamentals

Studied networking basics from GeeksForGeeks, covering:
API Deep Dive
Explored different types of APIs and their use cases:

1. REST API
- Representational State Transfer
- Uses HTTP methods
- Stateless architecture
- Most common API type today
2. GraphQL API
- Query language for APIs
- Single endpoint
- Client specifies exact data needs
- Flexible data fetching
3. SOAP API
- Simple Object Access Protocol
- XML-based messaging protocol
- Strict standards
- Used in enterprise systems
4. WebSocket API
HTTP Methods Study
Learned about core HTTP methods:
- GET: Retrieve data
- POST: Create new data
- PUT: Update existing data
- DELETE: Remove data
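To make the mapping concrete, here is a minimal sketch of how the four methods correspond to create, read, update, and delete operations on an in-memory store. The handle function, the store layout, and the status codes chosen for each branch are illustrative assumptions, not part of any framework.

```python
from http import HTTPStatus

# In-memory "database" keyed by item id (illustrative only)
store = {}


def handle(method, item_id, body=None):
    """Dispatch an HTTP method to the matching CRUD operation."""
    if method == "GET":
        if item_id not in store:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.OK, store[item_id]
    if method == "POST":
        store[item_id] = body
        return HTTPStatus.CREATED, body
    if method == "PUT":
        if item_id not in store:
            return HTTPStatus.NOT_FOUND, None
        store[item_id] = body
        return HTTPStatus.OK, body
    if method == "DELETE":
        if item_id not in store:
            return HTTPStatus.NOT_FOUND, None
        del store[item_id]
        return HTTPStatus.NO_CONTENT, None
    return HTTPStatus.METHOD_NOT_ALLOWED, None
```

A quick walk-through: handle("POST", 1, {"name": "pen"}) creates the item, handle("GET", 1) reads it back, and handle("DELETE", 1) removes it so that a later GET reports it as not found.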
AWS Infrastructure Components

Studied AWS networking basics:
- Virtual Private Cloud (VPC)
  - Private network in the cloud
  - Custom IP range
- Subnets
  - Network segmentation
  - Public vs Private subnets
- Internet Gateway
  - Connection to internet
  - Enable public access
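The relationship between a VPC's IP range and its subnets can be sketched with Python's standard ipaddress module. The 10.0.0.0/16 range and the public/private labels below are example values chosen for illustration, not an actual AWS configuration.

```python
import ipaddress

# Example VPC CIDR block (assumed value)
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC range into /24 subnets and take the first two
subnets = list(vpc.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]

print(public_subnet)                   # 10.0.0.0/24
print(private_subnet)                  # 10.0.1.0/24
print(public_subnet.num_addresses)     # 256
print(private_subnet.subnet_of(vpc))   # True
```

What makes a subnet "public" in AWS is its route to an Internet Gateway, not its address range, so the labels here are only names.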
FastAPI Implementation

Learned about FastAPI and built my own test server. FastAPI is a modern, high-performance web framework for building APIs; it uses Starlette for the web parts and Pydantic for data handling.

Built my first FastAPI test server:
Example of my test server code:
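from fastapi import FastAPI

app = FastAPI()

@app.get("/")
def read_root():
    # Root endpoint: returns a static greeting as JSON
    return {"Hello": "World"}

@app.get("/items/{item_id}")
def read_item(item_id: int):
    # FastAPI parses item_id from the path and validates it as an int
    return {"item_id": item_id}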
Best Practices Learned from Stack Overflow

- API Naming Conventions
  - Use nouns instead of verbs in endpoints
  - Use plural nouns for consistency
  - Examples:
    - Good: /users, /items, /orders
    - Bad: /getUser, /createItem, /deleteOrder
- Resource Hierarchy
  - Keep URLs clean and logical
  - Use proper nesting for related resources
  - Example: /users/{id}/orders
- HTTP Methods Usage
  - GET for reading
  - POST for creating
  - PUT/PATCH for updating
  - DELETE for removing
- Status Codes
  - 200: Success
  - 201: Created
  - 400: Bad Request
  - 404: Not Found
  - 500: Server Error
- Query Parameters
  - Use for filtering, sorting, pagination
  - Keep names clear and consistent
  - Example: /items?sort=desc&limit=10
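The query parameter example can be sketched with the standard library alone. In this minimal sketch the ITEMS data, the list_items function, the choice to sort by price, and the default limit of 10 are all assumptions made for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Sample data standing in for a database table (illustrative)
ITEMS = [{"id": i, "price": p} for i, p in enumerate([30, 10, 50, 20, 40], start=1)]


def list_items(url):
    """Apply the sort and limit query parameters from a URL like /items?sort=desc&limit=10."""
    params = parse_qs(urlparse(url).query)
    # parse_qs returns a list per key, so take the first value or a default
    sort = params.get("sort", ["asc"])[0]
    limit = int(params.get("limit", ["10"])[0])
    ordered = sorted(ITEMS, key=lambda item: item["price"], reverse=(sort == "desc"))
    return ordered[:limit]


print(list_items("/items?sort=desc&limit=2"))
```

Keeping sorting and pagination in query parameters leaves the resource path itself (/items) unchanged, which is the point of the naming conventions above.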
A productive day of learning modern web technologies and implementing them!
Thursday: Cloud Deployment & Real-time Communication
AWS EC2 Deployment
- Created and Set Up EC2 Instance
  - Launched new EC2 instance
  - Connected locally using SSH
  - Installed required packages:

- Server Deployment
  - Created test FastAPI server
  - Deployed on EC2 instance
  - Accessed using public IPv4 address
  - Tested on port 8000
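A small reachability check for a server deployed this way might look like the sketch below. The helper names server_url and is_reachable are hypothetical, and 203.0.113.10 is a documentation-only placeholder address rather than a real instance. For the check to succeed against a real instance, its security group must allow inbound traffic on the chosen port.

```python
import urllib.error
import urllib.request


def server_url(public_ip, port=8000, path="/"):
    """Build the URL for a server exposed on an instance's public IPv4 address."""
    return f"http://{public_ip}:{port}{path}"


def is_reachable(url, timeout=5):
    """Return True if the server answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers refused connections, timeouts, and HTTP error responses
        return False


# Placeholder address only; substitute the instance's public IPv4 address
print(server_url("203.0.113.10"))  # http://203.0.113.10:8000/
```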

FastAPI & Pydantic Implementation

Real-time Communication Deep Dive

Protocol Study
Voice Agents & Real-time APIs

Communication Types
- Real-time APIs
- Custom-built solutions
- WebSocket implementations

LiveKit Exploration

- Open-source SDK for voice agents
- Real-time communication features
- Integration capabilities
WebRTC Framework

- Open-source technology
- Primarily uses UDP for lower-latency communication
- Well suited to real-time audio/video
Voice Agent Pipeline
User Audio → Speech-to-Text → LLM Token → Text-to-Speech → User Audio
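The pipeline above can be sketched as three chained stages. Every function here is a stub with a made-up name and trivial behavior; a real voice agent would call speech recognition, LLM, and speech synthesis services at these points.

```python
def speech_to_text(audio: bytes) -> str:
    # Stub: pretend the audio bytes are already UTF-8 text
    return audio.decode("utf-8")


def generate_reply(text: str) -> str:
    # Stub standing in for the LLM step
    return f"You said: {text}"


def text_to_speech(text: str) -> bytes:
    # Stub: pretend the encoded text is synthesized audio
    return text.encode("utf-8")


def voice_agent_turn(user_audio: bytes) -> bytes:
    """Run one conversational turn through the three pipeline stages."""
    transcript = speech_to_text(user_audio)
    reply = generate_reply(transcript)
    return text_to_speech(reply)


print(voice_agent_turn(b"hello"))  # b'You said: hello'
```

Because each stage only depends on the previous stage's output, any one of them can be swapped for a different provider without touching the others.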

Key Learnings
- EC2 instance deployment
- FastAPI with Pydantic
- Real-time communication protocols
- Voice agent architecture
- WebRTC fundamentals
A day full of practical implementations and learning about real-time communication!
Friday: Voice Agent Development with VAPI.AI
Project Overview
Created a voice agent system using VAPI.AI after discussing requirements with Anant. The project focuses on making automated calls and handling call data efficiently.

Technical Implementation
1. VAPI Integration Setup
- Explored VAPI.AI documentation
- Learned about making POST requests to VAPI API endpoints
- Key configurations required:
  - VAPI API key
  - Assistant ID (via dashboard/API)
  - Phone number ID
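A minimal sketch of how the three configuration values could go into a POST request, using only the standard library. The endpoint URL, the JSON field names (assistantId, phoneNumberId, customer.number), and the Bearer header format are assumptions based on my reading of the VAPI.AI API and should be confirmed against its reference before use. The request is built but not sent.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the VAPI.AI API reference
VAPI_CALL_URL = "https://api.vapi.ai/call"


def build_call_request(api_key, assistant_id, phone_number_id, customer_number):
    """Build (but do not send) a POST request asking VAPI to place an outbound call."""
    # Field names below are assumptions, not verified against the live API
    payload = {
        "assistantId": assistant_id,
        "phoneNumberId": phone_number_id,
        "customer": {"number": customer_number},
    }
    return urllib.request.Request(
        VAPI_CALL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; real IDs come from the VAPI dashboard
request = build_call_request("YOUR_API_KEY", "assistant-id", "phone-number-id", "+10000000000")
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; keeping the build step separate makes the payload easy to inspect first.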

2. Phone System Configuration
- Initially tested with VAPI's free number
- Discovered international calling limitations
- Implemented Twilio integration:
  - Created Twilio account
  - Added international calling support
  - Integrated Twilio number with VAPI dashboard
  - Generated phone number ID for API calls

3. Server Implementation
4. Public Access Setup
Data Flow Architecture
POST Request → VAPI API → Twilio Call → Server Events → Local CSV
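The last hop of this flow, writing server events to a local CSV, could look like the sketch below. The column names (call_id, status, duration_seconds) and the log_event helper are hypothetical; the real webhook payload may carry different fields.

```python
import csv
from pathlib import Path

# Hypothetical column names; the real webhook payload may differ
FIELDS = ["call_id", "status", "duration_seconds"]


def log_event(event: dict, csv_path: str) -> None:
    """Append one call event to a local CSV file, writing the header on first use."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        # Keep only the known fields; missing ones become empty cells
        writer.writerow({field: event.get(field, "") for field in FIELDS})
```

Opening the file in append mode keeps earlier events intact, so the function can be called once per incoming event.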
Key Learnings
- VAPI.AI API integration
- Webhook handling for voice agents
- Twilio international calling setup
- Server event logging
- Public URL tunneling with ngrok
Successfully implemented a voice agent system with real-time call handling and data logging!
Check out the complete project here: Voice-Agents Repository
Saturday: PySpark & Databricks Deep Dive
PySpark Architecture Study
- Explored distributed computing architecture:
  - Driver node (main control)
  - Worker nodes (distributed processing)
  - Executors (task execution)
  - Cluster Manager (resource allocation)

PySpark Optimization Techniques
- Data Optimization
  - Partition tuning
  - Caching strategies
  - Memory management
- Query Optimization
  - Using filter and where clause
  - Join optimizations
  - Broadcast joins
  - Minimizing usage of collect and shuffle operations
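The idea behind a broadcast join can be illustrated without a cluster. The sketch below is plain Python, not PySpark code: the small table is made available to every partition as a dictionary, so each partition joins locally and the large table never has to be shuffled. The data, the partition layout, and the join_partition function are made up for illustration.

```python
# Large table split into partitions, as it would be across worker nodes
order_partitions = [
    [(1, "IN", 250), (2, "US", 120)],
    [(3, "IN", 90), (4, "DE", 300)],
]

# Small lookup table that fits in memory on every worker
country_names = {"IN": "India", "US": "United States", "DE": "Germany"}


def join_partition(partition, broadcast_table):
    """Join one partition against the broadcast copy of the small table."""
    return [
        (order_id, broadcast_table[code], amount)
        for order_id, code, amount in partition
        if code in broadcast_table
    ]


# Each partition is processed independently; rows never move between partitions
joined = [join_partition(part, country_names) for part in order_partitions]
print(joined[0])  # [(1, 'India', 250), (2, 'United States', 120)]
```

In PySpark itself, the corresponding hint is pyspark.sql.functions.broadcast applied to the smaller DataFrame in a join.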
Directed Acyclic Graphs (DAGs)
- Studied PySpark's DAG execution
- Understanding:
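As a small illustration of DAG-based execution, the sketch below orders the stages of a hypothetical job so that every stage runs only after the stages it depends on. It uses Python's standard graphlib module (Python 3.9+) rather than Spark, and the stage names are invented for the example.

```python
from graphlib import TopologicalSorter

# Each stage maps to the set of stages it depends on (hypothetical job)
dag = {
    "read_orders": set(),
    "read_customers": set(),
    "filter_orders": {"read_orders"},
    "join": {"filter_orders", "read_customers"},
    "aggregate": {"join"},
}

# static_order() yields every stage after all of its dependencies
execution_order = list(TopologicalSorter(dag).static_order())
print(execution_order)
```

The relative order of independent stages (the two reads here) is not fixed, which is exactly what lets a scheduler run them in parallel.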

Databricks ML Pipeline Implementation
Created a mock ML pipeline in Databricks and ran it.
Pipeline Components
- Data Ingestion
  - Raw data loading
  - Initial preprocessing
  - Schema validation
- Data Processing
  - Feature engineering
  - Data transformation
  - Cleaning operations
- Model Training
  - Algorithm selection
  - Hyperparameter tuning
  - Cross-validation
- Model Evaluation
  - Performance metrics
  - Validation checks
  - Error analysis
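The four components can be sketched end to end in plain Python. This is a toy stand-in, not the Databricks pipeline itself: the data is hard-coded, the model is a one-variable least-squares line, and hyperparameter tuning and cross-validation are left out. All function names are assumptions for the example.

```python
from statistics import mean


def ingest():
    """Load raw rows and validate the schema (hard-coded mock data)."""
    rows = [{"x": 1.0, "y": 2.1}, {"x": 2.0, "y": 3.9}, {"x": 3.0, "y": None}, {"x": 4.0, "y": 8.2}]
    assert all({"x", "y"} <= row.keys() for row in rows), "schema mismatch"
    return rows


def process(rows):
    """Cleaning step: drop rows with a missing target value."""
    return [row for row in rows if row["y"] is not None]


def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar = mean(r["x"] for r in rows)
    y_bar = mean(r["y"] for r in rows)
    covariance = sum((r["x"] - x_bar) * (r["y"] - y_bar) for r in rows)
    variance = sum((r["x"] - x_bar) ** 2 for r in rows)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


def evaluate(model, rows):
    """Mean absolute error of the fitted line on the given rows."""
    slope, intercept = model
    return mean(abs(r["y"] - (slope * r["x"] + intercept)) for r in rows)


clean = process(ingest())
model = train(clean)
print(f"MAE on training rows: {evaluate(model, clean):.3f}")
```

Evaluating on the training rows keeps the sketch short; a real pipeline would hold out separate validation data.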


Exploring MLOps


Key Learnings
- PySpark distributed architecture
- Performance optimization techniques
- Databricks workflow management
- End-to-end ML pipeline creation
- Understanding DAG-based execution flow
A productive day of learning advanced PySpark concepts and implementing ML pipelines!