...and this is my portfolio and/or website I throw cool things onto.
I am an undergrad at NJIT, taking the B.S. in Data Science program, and I'm aiming for a career in AI/ML.
I worked on an IP reallocation algorithm to extract the largest possible IP blocks from AT&T's IPv4 address space by "moving" customer IPs from one location to another. By arranging blocks so that as many unused IPs as possible sit in the same spot, we can allocate them more easily when we need them, or sell them in bulk if extra funds are needed. The wrinkle here is that every movement of IP addresses has to be done by another team, so we need to minimize the number of "moves" we make to save man-hours spent on reallocation efforts, and more practically to save them a headache or three.
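To make the trade-off concrete, here is a toy sketch of the idea. It is not the algorithm described above. It assumes a simplified model where each /16 block only tracks how many addresses are in use, and one "move" relocates one address. The function name, the block labels, and the usage counts are all made up for illustration.

```python
def plan_block_evacuation(used_per_block, capacity, blocks_to_free):
    """Greedy heuristic: empty the least-occupied blocks first.

    used_per_block: dict mapping a block label to its count of in-use addresses
    capacity:       number of addresses each block can hold
    blocks_to_free: how many blocks we want to end up completely empty
    Returns (blocks chosen for emptying, number of addresses to relocate).
    """
    # Rank blocks from least used to most used.
    ranked = sorted(used_per_block, key=used_per_block.get)
    to_empty = ranked[:blocks_to_free]
    to_keep = ranked[blocks_to_free:]

    # Every in-use address in a chosen block has to be relocated.
    moves = sum(used_per_block[b] for b in to_empty)

    # The remaining blocks must have enough free room to absorb them.
    spare = sum(capacity - used_per_block[b] for b in to_keep)
    if moves > spare:
        raise ValueError("not enough free space in the remaining blocks")
    return to_empty, moves


# Hypothetical usage counts for four /16 blocks (private-range labels).
usage = {
    "10.0.0.0/16": 120,
    "10.1.0.0/16": 60000,
    "10.2.0.0/16": 40,
    "10.3.0.0/16": 30000,
}
chosen, moves = plan_block_evacuation(usage, capacity=2 ** 16, blocks_to_free=2)
print(chosen, moves)
```

In this toy model the two nearly empty blocks get picked, so only a small number of addresses need to move. The real problem is harder because customer allocations move as units and each move costs another team's time.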
All things considered, I was able to generate moves to empty out 6 /16 blocks, which is 6 sets of 65,536 IP addresses. Even though we have not taken a single IP address from a customer or allocated any new ones, this reorganization of IPs generated ~$9M in business value. This is because the per-IP value of an address increases when it is part of a larger contiguous block: 2 free IPs together are inherently worth less per IP than 65K IPs together. We can calculate this from the difference in the value of all affected IPs before and after the moves, based on some internal estimates of per-IP value.
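For anyone who wants to check the block arithmetic, a quick sketch with Python's standard `ipaddress` module is below. The `10.0.0.0/16` network is just a placeholder from the private range, not one of the actual blocks.

```python
import ipaddress

# Placeholder /16 network; any /16 has the same size.
block = ipaddress.ip_network("10.0.0.0/16")

# A /16 leaves 32 - 16 = 16 host bits, so 2 ** 16 addresses.
addresses_per_block = block.num_addresses
total_freed = 6 * addresses_per_block

print(addresses_per_block)  # 65536
print(total_freed)          # 393216
```

So six emptied /16 blocks come to 393,216 addresses in total.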
I worked as a grader for two classes at NJIT, both of which covered Machine Learning concepts in depth, with rigorous course material & real-world projects. I've graded for a total of about 228 students over 3 semesters, from Spring '23 to Spring '24.
In my time as a grader I've pushed for a variety of improvements that made the class more useful to students and helped it scale better with the number of students it has to host.
One benefit of working as a grader is that you have to learn how to tell someone they're wrong without making them angry (and how to deal with it when they get angry anyway). I like to think I've done a decent job of that.
At CDx, I worked on an Image Classification pipeline that could differentiate cancerous cells from debris and other such things on a microscope slide. I ended up working on every part of the process except the final integration, from extracting images from the company's DB to training the model.
The model itself was built on the ResNet-50 architecture using Keras, starting from ImageNet weights and fine-tuned from there. The model was decently performant (just under 90% accuracy) considering that the data wasn't the cleanest, and I had yet to learn about class balancing.
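For context on what class balancing means here, below is a minimal sketch of one common approach: weighting each class inversely to how often it appears. This is not something the original pipeline did, and the label names and counts are invented for illustration.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and common classes get weights
    below 1, so each class contributes about equally to the loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical, heavily imbalanced label set.
labels = ["debris"] * 90 + ["cell"] * 10
weights = balanced_class_weights(labels)
print(weights)
```

Weights like these can be passed to Keras through the `class_weight` argument of `model.fit`, keyed by integer class index rather than by name.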
I was trying to filter the documents that LangChain feeds to my LLM, and eventually figured it out. I realized this wasn't written in the documentation, so I made a PR to explain the filtering functionality in more detail, along with an (in hindsight, not amazing) suggestion for a change to the actual function.
I think it got rewritten at some point, because I can't find it now. LangChain is constantly changing, so I'm not too surprised by that. But I'm still on the contributor list!
https://github.com/langchain-ai/langchain/pull/8803
I'm the reason there's a working guide to install CVAT on Windows! This was the first PR I made to a public repo and I didn't get yelled at, so I take that as a win!
The issue was basically that CVAT was designed for Linux, but the Windows installation guide told users to install it using Git Bash as a stand-in for Linux. That wasn't ideal, but it would work if you were only using CVAT's basic functionality. For my project I needed more than that, and due to an issue in one of CVAT's dependencies, it would actually break on Windows. So I submitted a PR to instruct users to install CVAT on WSL2 instead, which was probably the better approach regardless.
https://github.com/opencv/cvat/pull/5558