Derek Harmon • 4 days ago
Are we authorized to web scrape teamusa.com?
The Resources tab states that we must use teamusa.com as a data source.
teamusa.com doesn't appear to expose Olympic/Paralympic athlete data to us as a CSV download or through a REST endpoint.
The Terms of Use on teamusa.com prohibit "unauthorized" web scraping, per https://www.teamusa.com/terms-of-use#:~:text=Engage%20in%20unauthorized%20spidering,means%20to%20compile%20information
When I asked Gemini how I can access this data from teamusa.com in CSV or JSON format, it told me:
"For the Team USA x Google Cloud Hackathon, the primary source for athlete data is the "US-only data" dataset found directly under the "Resources" or "Data" tabs on the Devpost Hackathon Page. Supplementary CSV datasets and APIs are available via Kaggle, GitHub, and SportsDataIO, while data from the public Team USA directory requires web scraping using tools like BeautifulSoup."
Has Google made arrangements with the Team USA website so that hackathon participants are authorized to web scrape it for this data, for use in our hackathon project submissions? If not, can somebody please clarify what this FAQ passage means? https://vibecodeforgoldwithgoogle.devpost.com/details/faqs#:~:text=Participants%20are%20explicitly%20instructed,and%20blog%20content%29%2E
Thank you.