Summary

I assist ePubs, Purdue University Press, in maintaining its open access repository through the following activities:

  1. Automate the processing of academic texts to extract the necessary metadata.
  2. Vet the metadata using web scraping techniques to ascertain whether a particular academic text meets the requirements for inclusion in the open access domain.
  3. Batch upload academic publications (~2000) to the Electronic Theses and Dissertations repositories.
  4. Collaborate with other open access publishers to post Purdue authors’ content from their repositories to our repository at Purdue.

This position requires a significant amount of web scraping and text analysis to work efficiently with scores of documents at once. Time is of the essence: processing over 2000 academic publications within 6 months with minimal errors is crucial.