Algorithmia was on hand last month at the second annual DubHacks, the largest collegiate hackathon in the Pacific Northwest. Over 600 student developers and designers flocked to the University of Washington campus in Seattle to form teams, build projects, and create solutions to real-world problems.
intuiti0n wanted to make the literature review process easier by building a service that finds important research papers across all fields of study. The team consisted of Nirawit Jittipairoj, Alex Thompson, and Bryant Wong.
We spoke to Bryant Wong from the team, a senior at the University of Washington with a triple major (!) in mathematics, statistics, and computer science, about their intuiti0n hack.
What was the problem you were trying to solve?
“Two of the members of our team have been involved with academic research, which has the goal of trying to push the limits of human knowledge. However, in order to push the limits of human knowledge, you need to know exactly what is in that field, which you do with a literature review. However, literature reviews are kind of a Catch-22 – you need to read the most important papers in a field, but because you don’t know what’s in the field, you don’t know what papers to read. As a result, literature reviews are often spent just hunting for papers that appear relevant, and then discarding most of them as they are often only tangentially related to your field. This makes the whole process tedious and extremely inefficient.”
How did you solve this problem?
“We devised an app that centered around extracting data from papers and using it to generate topics for targeted searches to find (other) papers. We took the abstract and title from a paper, ran an NLP algorithm called Latent Dirichlet Allocation (LDA) on them to generate topics, then ran those topics through Google Scholar, parsing the results with Beautiful Soup. The user could set a threshold for the number of papers they would like returned so that the algorithm does not run indefinitely. Our heuristic for judging the importance of a paper was not so good, as we used the number of papers that had cited this paper. Obviously this is not a good metric, as many irrelevant papers are cited, but we did not have a better concrete heuristic to judge by.”
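To make the loop Bryant describes concrete, here is a minimal sketch of the crawl-and-rank step. It is not the team's code: the `Paper` class, the `crawl_papers` function, and the two callables it takes (`extract_topics`, standing in for the LDA step, and `search`, standing in for the Google Scholar query and Beautiful Soup parsing) are hypothetical names introduced for illustration. The sketch also assumes that each newly found paper is fed back in as a source of topics, which is one plausible reading of why a threshold was needed to stop the algorithm.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Paper:
    title: str
    abstract: str
    cited_by: int  # citation count: the importance heuristic from the interview


def crawl_papers(
    seed: Paper,
    extract_topics: Callable[[Paper], Iterable[str]],  # stand-in for the LDA step
    search: Callable[[str], Iterable[Paper]],          # stand-in for the Scholar search
    max_papers: int,                                   # user-set threshold
) -> List[Paper]:
    """Expand outward from a seed paper, stopping once max_papers are found."""
    seen = {seed.title}
    found: List[Paper] = []
    queue = deque([seed])

    def ranked() -> List[Paper]:
        # Most-cited first, mirroring the (admittedly rough) heuristic.
        return sorted(found, key=lambda p: p.cited_by, reverse=True)

    while queue and len(found) < max_papers:
        paper = queue.popleft()
        for topic in extract_topics(paper):
            for hit in search(topic):
                if hit.title in seen:
                    continue  # skip papers we have already collected
                seen.add(hit.title)
                found.append(hit)
                queue.append(hit)  # assumption: hits seed further searches
                if len(found) >= max_papers:
                    return ranked()  # threshold reached, so the crawl terminates
    return ranked()
```

Passing the topic extractor and the search function in as arguments keeps the sketch independent of any particular LDA implementation or scraping code, so either piece can be swapped out or stubbed for testing.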
How did you utilize Algorithmia in your project?
“We used Algorithmia as the backbone for our machine learning and topic generation, as we ran our data through one of the LDA algorithms available on Algorithmia to generate topics. This provided several advantages for us over implementing the algorithm ourselves:
1) not having to implement a complicated algorithm
2) not needing a powerful server of our own to run the algorithm (our local machines were not particularly powerful)
3) simple integration in our Python scripts.
This was a no-brainer decision and allowed us to have a half-functioning product by the end of DubHacks.”
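For readers curious what that "simple integration" might look like, here is a rough sketch using the Algorithmia Python client's `client.algo(...).pipe(...).result` call pattern. The interview does not say which LDA algorithm or input format the team used, so the algorithm path `nlp/LDA/1.0.0`, the `docsList` payload key, and the `lda_topics` helper name are all assumptions made for illustration.

```python
from typing import Any, List


def lda_topics(client: Any, documents: List[str],
               algo_path: str = "nlp/LDA/1.0.0") -> Any:
    """Send documents to a hosted LDA algorithm and return its raw result.

    algo_path and the payload shape are assumptions, not confirmed by the
    interview; adjust both to match the algorithm actually being called.
    """
    payload = {"docsList": documents}  # assumed input format
    return client.algo(algo_path).pipe(payload).result


# Typical wiring (needs the Algorithmia package and an API key):
#   import Algorithmia
#   client = Algorithmia.client("YOUR_API_KEY")
#   topics = lda_topics(client, [title + " " + abstract])
```

Because the client is passed in rather than created inside the function, the same helper can be exercised with a stand-in object when no API key is available.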