The successful candidate will be responsible for collecting and extracting data from various websites to support our data-driven decision-making process. The ideal candidate will have a strong background in web scraping, crawling, and data extraction, with experience in Python, R, or other programming languages.
Responsibilities:
- Identify relevant websites and other data sources for data collection
- Develop and maintain web scraping and crawling scripts to collect data on a regular basis
- Optimize web scraping and crawling scripts to ensure high-quality data collection
- Develop data cleaning and processing scripts to ensure data accuracy and completeness
- Apply machine learning algorithms to analyze and model the collected data
- Unit test, debug, and troubleshoot web scraping, crawling, and machine learning scripts as needed
- Work collaboratively with other team members to support data-driven decision-making
- Stay up to date with the latest web scraping and crawling techniques and tools
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field
- 1+ years of experience in web scraping, crawling, and data extraction
- Strong knowledge of Python, R, or other programming languages used for web scraping and data extraction
- Experience with web scraping and crawling tools such as Beautiful Soup, Scrapy, or Apify
- Familiarity with data cleaning and processing techniques
- Experience with machine learning algorithms and libraries such as scikit-learn, TensorFlow, or PyTorch
Note: This is a remote position within the US or an on-site position in Silicon Valley. Visa sponsorship is not available.