GitMe

A GitHub repository recommendation system based on content-based filtering. It recommends repositories that have at least one open issue, so you can start contributing right away.

The problem GitMe solves

Finding GitHub repositories where we can contribute and apply our knowledge to open-source projects is often tedious and time-consuming. GitMe is our solution: a website that recommends repositories for open-source contribution, filtered by the language you choose. You can also sort the repositories by parameters such as star count, open issues, contributors, and open pull requests. Every user gets a dashboard where they can bookmark the repositories they want to contribute to and visualise their GitHub stats.
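
As a rough illustration of the filtering and sorting described above, here is a minimal sketch in plain Python. The field names (`language`, `stars`, `open_issues`), the sample rows, and the `recommend` function are assumptions made for this example, not GitMe's actual schema or code.

```python
# Hypothetical sample rows; the field names are assumptions, not GitMe's schema.
REPOS = [
    {"name": "alpha", "language": "Python", "stars": 120, "open_issues": 4},
    {"name": "beta", "language": "Python", "stars": 900, "open_issues": 0},
    {"name": "gamma", "language": "Go", "stars": 300, "open_issues": 7},
    {"name": "delta", "language": "python", "stars": 450, "open_issues": 2},
]


def recommend(repos, language, sort_by="stars"):
    """Keep repos in `language` with at least one open issue, highest `sort_by` first."""
    wanted = language.strip().lower()
    matches = [
        repo
        for repo in repos
        if repo["language"].lower() == wanted and repo["open_issues"] >= 1
    ]
    return sorted(matches, key=lambda repo: repo[sort_by], reverse=True)


if __name__ == "__main__":
    for repo in recommend(REPOS, "Python"):
        print(repo["name"], repo["stars"])
```

Passing `sort_by="open_issues"` would order the same matches by issue count instead; repositories with no open issues are dropped regardless of how many stars they have.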

Challenges we ran into

While building this product, we ran into challenges on both the front end and the back end. The first challenge was collecting the data for the recommendation system. We used the GitHub API to fetch the data, but it enforces rate limits even when access tokens are provided. As a workaround, we collected around 40 usernames from different domains such as ML, Web3, and DevOps, and used the GitHub API to fetch their followers and followings. This let us extract more than 1,800 repositories from each username; as a result, we collected more than 18,000 repositories, a sizeable dataset in itself. We used Python libraries such as Pandas to build data frames, filter them by the language the user enters, and sort them accordingly.

The next challenge was deploying the Python script on AWS Lambda, because packages such as Pandas had to be added as a layer. For that, we used CloudWatch and Cloud9 to zip the packages and add them as a layer. We used AWS API Gateway to trigger the Lambda function, added query parameters to the invoke URL, and allowed origins so that we could integrate it with the frontend.

On the frontend, the main challenge was fetching the data from the invoke URL, which we solved by adding the frontend's origin, including its port, to Access-Control-Allow-Origin in the AWS console. The next challenge was configuring auth so that users can access our website with just one click. For the add-to-bookmark feature, we used Firebase to store user data. Unexpected errors appeared in production, which we handled by creating new collections.
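
To make the Lambda and API Gateway integration described above more concrete, here is a minimal sketch of a handler that reads a query parameter from the invoke URL and returns the CORS header. It assumes a Lambda proxy integration, where the function itself returns the headers; a setup that configures CORS only in the AWS console would look different. The `language` parameter name, the `ALLOWED_ORIGIN` value, and the sample data are hypothetical.

```python
import json

# Assumption: the origin (scheme, host, and port) that serves the frontend.
ALLOWED_ORIGIN = "http://localhost:3000"

# Hypothetical stand-in for the collected repository dataset.
REPOS = [
    {"name": "alpha", "language": "Python", "stars": 120},
    {"name": "gamma", "language": "Go", "stars": 300},
]


def lambda_handler(event, context):
    """Return repositories matching the `language` query parameter."""
    # API Gateway sends None here when the URL carries no query string.
    params = event.get("queryStringParameters") or {}
    language = params.get("language", "").strip().lower()

    headers = {
        "Content-Type": "application/json",
        # Without this header the browser blocks the cross-origin response.
        "Access-Control-Allow-Origin": ALLOWED_ORIGIN,
    }

    if not language:
        body = {"error": "missing 'language' query parameter"}
        return {"statusCode": 400, "headers": headers, "body": json.dumps(body)}

    matches = [repo for repo in REPOS if repo["language"].lower() == language]
    return {"statusCode": 200, "headers": headers, "body": json.dumps(matches)}
```

The handler can be exercised locally by calling `lambda_handler({"queryStringParameters": {"language": "Go"}}, None)` before deploying it behind API Gateway.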
