WEB SCRAPER & DATA PIPELINE

A complete data ingestion and automation pipeline that scrapes JavaScript-rendered websites (Playwright) and static HTML sources (BeautifulSoup), cleans and normalizes the data, stores it in PostgreSQL, and visualizes insights through a Streamlit dashboard. Designed to simulate real-world industry use cases for automation and data engineering tasks.
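The split between the two fetch paths might look roughly like the sketch below; the function names (`fetch_rendered`, `fetch_static`, `extract_titles`), the use of `requests` for static pages, and the CSS selector are illustrative assumptions, not the repository's actual API.

```python
# Sketch of the two scraping paths (assumed function names, not the repo's API).
import requests
from bs4 import BeautifulSoup
from playwright.sync_api import sync_playwright


def fetch_rendered(url: str) -> str:
    """Fetch a JavaScript-rendered page with Playwright (headless Chromium)."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
    return html


def fetch_static(url: str) -> str:
    """Fetch a plain HTML page over HTTP; no JS execution needed."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.text


def extract_titles(html: str) -> list[str]:
    """Parse the HTML with BeautifulSoup and pull out item titles (example selector)."""
    soup = BeautifulSoup(html, "html.parser")
    return [node.get_text(strip=True) for node in soup.select("h2.title")]
```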


Features

- Multi-source scraping engine
- Support for JS-rendered pages (Playwright)
- Support for lightweight HTML pages (BeautifulSoup)
- Data normalization service
- Automated scheduler for recurring scrapes
- PostgreSQL storage (JSON fields and structured data; see the storage sketch after this list)
- Streamlit reporting dashboard
- Fully dockerized infrastructure
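A rough sketch of the storage pattern under an assumed schema: structured columns for queryable fields plus a JSONB column holding the raw payload. The `scraped_items` table, the psycopg2 driver, and the connection string are placeholders, not the project's actual schema.

```python
# Sketch of mixed structured/JSON storage (assumed table name, columns, and DSN).
import psycopg2
from psycopg2.extras import Json

DDL = """
CREATE TABLE IF NOT EXISTS scraped_items (
    id         SERIAL PRIMARY KEY,
    source     TEXT NOT NULL,
    title      TEXT,
    price      NUMERIC,
    raw        JSONB,               -- full scraped payload kept for reprocessing
    scraped_at TIMESTAMPTZ DEFAULT now()
);
"""


def store_item(conn, source: str, title: str, price: float, raw: dict) -> None:
    """Insert one normalized record; the raw payload goes into the JSONB column."""
    with conn.cursor() as cur:
        cur.execute(
            "INSERT INTO scraped_items (source, title, price, raw) VALUES (%s, %s, %s, %s)",
            (source, title, price, Json(raw)),
        )
    conn.commit()


if __name__ == "__main__":
    conn = psycopg2.connect("postgresql://scraper:scraper@localhost:5432/scraper")  # assumed DSN
    with conn.cursor() as cur:
        cur.execute(DDL)
    conn.commit()
    store_item(conn, "example.com", "Sample product", 19.99, {"title": "Sample product"})
```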

Data Pipeline Flow

Scraper → Normalizer → Database → Dashboard → Insights
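The Normalizer step in that flow might look something like the sketch below; the field names and cleaning rules are illustrative, not the project's actual normalization service.

```python
# Sketch of the normalization step (illustrative field names and rules).
import re
from datetime import datetime, timezone


def normalize(raw: dict, source: str) -> dict:
    """Clean a raw scraped record into the shape the database layer expects."""
    title = (raw.get("title") or "").strip()

    # Prices often arrive as strings like "$1,299.00"; keep only digits and the dot.
    price_text = re.sub(r"[^\d.]", "", str(raw.get("price", "")))
    price = float(price_text) if price_text else None

    return {
        "source": source,
        "title": title,
        "price": price,
        "raw": raw,
        "scraped_at": datetime.now(timezone.utc).isoformat(),
    }
```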

Technologies Used

Playwright, BeautifulSoup, FastAPI, PostgreSQL, Streamlit, Docker
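For the dashboard end of the pipeline, a minimal Streamlit page could read the same assumed `scraped_items` table and chart it; the DSN, query, and chart choices below are placeholders rather than the project's actual report.

```python
# Sketch of a Streamlit reporting page (assumed table, columns, and DSN).
import pandas as pd
import sqlalchemy
import streamlit as st

# SQLAlchemy engine over the psycopg2 driver (connection string assumed).
engine = sqlalchemy.create_engine("postgresql://scraper:scraper@localhost:5432/scraper")

st.title("Scraper Insights")

# Load recent rows into a DataFrame for display and charting.
df = pd.read_sql(
    "SELECT source, title, price, scraped_at FROM scraped_items "
    "ORDER BY scraped_at DESC LIMIT 500",
    engine,
)

st.metric("Items scraped (last 500)", len(df))
st.bar_chart(df.groupby("source")["price"].mean())
st.dataframe(df)
```

Such a page would be launched with `streamlit run dashboard.py` (filename assumed).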