Role Overview:
Innodata Lanka is looking for a Software Engineer – Python with strong expertise in web scraping and data handling. The ideal candidate will have hands-on experience with Python scraping libraries and tools, and will be responsible for developing and maintaining high-quality scripts that extract data from a wide range of structured and unstructured web sources.
Key Responsibilities:
- Develop, maintain, and optimize web scraping scripts using Python.
- Extract structured data from static and dynamic websites using libraries such as requests, BeautifulSoup, and lxml.
- Locate and extract relevant content using Regex, XPath, and CSS selectors.
- Send HTTP requests (GET and POST) with appropriate headers and payloads to ensure efficient and secure data retrieval.
- Clean, transform, and process scraped data using pandas.
- Debug, troubleshoot, and ensure the reliability of scraping pipelines.
- Collaborate with team members using Git for version control and code reviews.
Requirements:
- 1–2 years of hands-on experience in Python programming with a focus on web scraping.
- Proficiency with libraries such as requests, BeautifulSoup, and lxml.
- Strong skills in Regex, XPath, and CSS selectors.
- Experience working with HTTP methods and crafting custom request headers and payloads.
- Proficiency in pandas for data manipulation and processing.
- Ability to debug and optimize complex scraping logic.
- Familiarity with Git for version control and collaborative development.
Desired Skills (Nice to Have):
- Experience with browser automation tools such as Selenium or Playwright.
- Exposure to cloud deployment environments or job scheduling tools.
- Awareness of data privacy regulations and ethical scraping practices.