...

3. Install Python and check to see if it installed correctly

Code Block
apt-get install python3.8
python3.8 --version
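
The same check can be made from inside Python, which is handy later when working in a notebook. This is a minimal sketch; the helper name `python_is_at_least` is hypothetical, and the 3.8 threshold is taken from the install command above.

```python
import sys


def python_is_at_least(major: int, minor: int) -> bool:
    """Return True when the running interpreter is at least major.minor."""
    return sys.version_info >= (major, minor)


# Print the interpreter version and whether it meets the 3.8 requirement.
print(sys.version.split()[0], "OK" if python_is_at_least(3, 8) else "too old")
```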

4. Install pip and check to see if it installed correctly

...

5. Install psycopg2, Pandas and requests

Code Block
apt-get update
pip3 install psycopg2-binary pandas requests
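
To confirm that the three packages landed in the interpreter you intend to use, a short check along these lines may help. It is a sketch, not part of the original steps: the helper name `missing_packages` is hypothetical, and it only tests whether each package can be located, not whether it works.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level package names in `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# psycopg2-binary is installed under the import name "psycopg2".
required = ["psycopg2", "pandas", "requests"]
missing = missing_packages(required)
print("All packages found" if not missing else f"Missing: {missing}")
```

If anything is reported missing, rerunning the `pip3 install` command with the same Python version is usually the first thing to try.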

...

Jupyter vs Excel for Data Analysis

| Jupyter | Excel |
| --- | --- |
| Jupyter is command-based and takes some time to get used to. | The graphical user interface (GUI) makes Excel easy to use, and formulas are fairly easy to learn. |
| Data analysis in Jupyter requires Python, hence a steeper learning curve. | Negligible prior knowledge is required. |
| Handles large volumes of data quickly, with the added bonus of easy access to databases such as Postgres and MySQL, where the actual data is stored. | Excel can only handle so much data. Scaling becomes difficult and messy: more data means slower results. |
| Summary: Python is harder to learn because you have to download many packages and set up the correct development environment on your computer. However, it provides a big leg up when working with big data and when creating repeatable, automatable analyses and in-depth visualizations. | Summary: Excel is best for small, one-time analyses or for creating basic visualizations quickly. Thanks to its GUI, it is easy to become an intermediate user relatively quickly, without much prior experience. |
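
To make the "repeatable, automatable" point concrete, here is a minimal sketch of an analysis that can be rerun unchanged on any export. It uses only the standard library so that it runs anywhere; with pandas installed, `pandas.read_csv(path)["status"].value_counts()` would give the same counts in one line. The column names and sample rows are hypothetical and do not come from a real datamart.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Hypothetical sample standing in for a real datamart export.
sample = io.StringIO(
    "application_id,status\n"
    "A-1,APPROVED\n"
    "A-2,PENDING\n"
    "A-3,APPROVED\n"
)
counts = count_by_column(sample, "status")
print(counts.most_common())
```

In Excel the equivalent pivot has to be rebuilt by hand for each new file, whereas here only the input changes.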

How to install and configure Jupyter to analyze the datamart

...

Table 1.1 - File Names for each Module

| Module Name | Script File Name (With Links) | Datamart CSV File Name |
| --- | --- | --- |
| PT | pt.py | ptDatamart.csv |
| W&S | ws.py | waterDatamart.csv, sewerageDatamart.csv |
| PGR | pgr.py | pgrDatamart.csv |
| mCollect | mcollect.py | mcollectDatamart.csv |
| TL | tl.py | tlDatamart.csv, tlrenewDatamart.csv |
| Fire Noc | fn.py | FNDatamart.csv |
| OBPS (Bpa) | bpa.py | bpaDatamart.csv |
| FSM | fsm.py | fsmDatamart.csv |
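
Because some modules produce more than one CSV, it can be useful to check that every expected file exists before starting an analysis. The sketch below transcribes a few rows of Table 1.1 into a dictionary; the helper `missing_csvs` is hypothetical, and it assumes the scripts write their CSV files into a directory you pass in, which this page does not state.

```python
from pathlib import Path

# A few rows transcribed from Table 1.1: module -> (script, expected CSV files).
DATAMART_FILES = {
    "PT": ("pt.py", ["ptDatamart.csv"]),
    "W&S": ("ws.py", ["waterDatamart.csv", "sewerageDatamart.csv"]),
    "PGR": ("pgr.py", ["pgrDatamart.csv"]),
    "TL": ("tl.py", ["tlDatamart.csv", "tlrenewDatamart.csv"]),
}


def missing_csvs(module, directory="."):
    """Return the expected CSV names for `module` that are absent from `directory`."""
    _script, csv_names = DATAMART_FILES[module]
    return [name for name in csv_names if not (Path(directory) / name).is_file()]


# Assumption: the output directory is the current working directory.
print(missing_csvs("W&S"))
```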

Table 1.2 - Pod Names for each Module

| Module Name | Pod Name | Description |
| --- | --- | --- |
| PT | playground-865db67c64-tfdrk | Punjab Prod Data in UAT Environment |
| W&S | playground-584d866dcc-cr5zf | QA Data |
| PGR | Local Data | Data Dump |
| mCollect | playground-584d866dcc-cr5zf | QA Data |
| TL | playground-584d866dcc-cr5zf | QA Data |
| Fire Noc | playground-584d866dcc-cr5zf | QA Data |
| OBPS (Bpa) | playground-584d866dcc-cr5zf | QA Data |
| FSM | playground-584d866dcc-cr5zf | QA Data |
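
The pod lookup in Table 1.2 can be kept next to the scripts as a small mapping. The sketch below assumes the pods are reached with `kubectl exec`, which this page does not confirm; the helper `exec_command` and its optional `namespace` parameter are hypothetical. PGR maps to `None` because the table lists a local data dump rather than a pod.

```python
# Transcribed from Table 1.2: module -> pod name (None when no pod is listed).
PODS = {
    "PT": "playground-865db67c64-tfdrk",
    "W&S": "playground-584d866dcc-cr5zf",
    "PGR": None,  # the table lists "Local Data" (a data dump) instead of a pod
    "mCollect": "playground-584d866dcc-cr5zf",
    "TL": "playground-584d866dcc-cr5zf",
    "Fire Noc": "playground-584d866dcc-cr5zf",
    "OBPS (Bpa)": "playground-584d866dcc-cr5zf",
    "FSM": "playground-584d866dcc-cr5zf",
}


def exec_command(module, namespace=None):
    """Build a `kubectl exec` argument list for the module's pod, or None."""
    pod = PODS[module]
    if pod is None:
        return None
    command = ["kubectl"]
    if namespace:
        # Assumption: the namespace, if one is needed, is supplied by the caller.
        command += ["-n", namespace]
    return command + ["exec", "-it", pod, "--", "bash"]


print(exec_command("PT"))
```

The function only builds the command and does not run it, so the result can be inspected or passed to `subprocess.run` once the cluster details are known.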