Web Scraping kununu.com

A small project that scrapes company reviews from kununu.com using the Python-based Scrapy framework.

"Scrapy is an application framework for crawling web sites and extracting structured data
which can be used for a wide range of useful applications, like data mining, information processing or historical archival."

Prerequisites

Scrapy must be installed on your machine.
→ Follow this installation guide: https://docs.scrapy.org/en/latest/intro/install.html
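
To confirm the prerequisite before running anything, a quick check along these lines may help. This is only a sketch: the helper name is_installed is made up for illustration, and it only tests whether Python can find the module, not whether the installation is complete.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if Python's import system can locate the top-level module."""
    return importlib.util.find_spec(module_name) is not None


# Prints True when Scrapy is importable in the current environment.
print(is_installed("scrapy"))
```

If this prints False, follow the installation guide above inside the same environment you intend to run the spider from.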

How to run this project

Clone this repo into your scrapy folder (where the default tutorial folder should exist after installation).

Your folder structure should look something like this:

  scrapy/kununu/
     README.md
     scrapy.cfg
     __init__.py
     kununu_project/
             items.py
             middlewares.py
             pipelines.py
             settings.py
             spiders/
                __init__.py
                kununu.py

  scrapy/tutorial/
     scrapy.cfg
     tutorial/
             items.py
             ...
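
If you want to double-check the layout after cloning, the following sketch lists which of the files from the listing above are missing. The function name missing_files and the choice of which files to check are assumptions made for this example; the paths are relative to scrapy/kununu/.

```python
from pathlib import Path
from typing import List

# A subset of the files shown in the folder listing, relative to scrapy/kununu/.
EXPECTED_FILES = [
    "scrapy.cfg",
    "kununu_project/items.py",
    "kununu_project/settings.py",
    "kununu_project/spiders/kununu.py",
]


def missing_files(project_root: str) -> List[str]:
    """Return the expected files that do not exist under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]
```

Calling missing_files("scrapy/kununu") from the parent of your scrapy folder should return an empty list when the clone landed in the right place.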

Open your Python CLI (I used Anaconda Prompt):
3.1 Navigate into the spiders folder within the scrapy folder (scrapy/kununu/kununu_project/spiders).
3.2 Execute the following command: scrapy runspider kununu.py
By default, it scrapes reviews from ec4u expert consulting ag.
You can change this by adapting the links in the kununu.py spider.
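
The same command can also be launched from a Python script, which is handy if you want to save the scraped reviews to a file. The sketch below is not part of this project: the function names and the output file name are hypothetical, and writing to a file relies on Scrapy's -o option, which the steps above do not use.

```python
import subprocess
from typing import List, Optional


def build_runspider_command(spider_file: str = "kununu.py",
                            output_file: Optional[str] = None) -> List[str]:
    """Build the 'scrapy runspider' command line as a list of arguments."""
    cmd = ["scrapy", "runspider", spider_file]
    if output_file is not None:
        # -o appends the scraped items to the given file (format inferred from the extension).
        cmd += ["-o", output_file]
    return cmd


def run_spider(spiders_dir: str, output_file: Optional[str] = None) -> int:
    """Run the spider from inside spiders_dir and return the process exit code."""
    completed = subprocess.run(
        build_runspider_command(output_file=output_file),
        cwd=spiders_dir,
    )
    return completed.returncode
```

For example, run_spider("scrapy/kununu/kununu_project/spiders", "reviews.json") would run the spider from the spiders folder and collect the results in reviews.json, assuming Scrapy is on your PATH.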
