In Airflow, a DAG (Directed Acyclic Graph) is a collection of all the tasks you want to run, organized in a way that reflects their relationships and dependencies.
A DAG is defined in a Python script, which represents the DAG's structure (tasks and their dependencies) as code.
How to run a DAG in Airflow
Manual Trigger
1. Log in to the Punjab Prod server using the credentials:
   username: admin
   password: admin
2. Trigger the DAG by clicking on the "Trigger DAG with Config" option.
3. Enter the date and click on the Trigger button.
   Format: {"date": "dd-MM-yyyy"}
4. View the logs by expanding the DAG, choosing a stage for any module, and clicking on the Log option.
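The conf payload entered in step 3 can also be built and validated before triggering. A minimal sketch, assuming only the {"date": "dd-MM-yyyy"} format described above (the helper name is illustrative, not part of the DAG code):

```python
from datetime import datetime

def build_trigger_conf(date_str: str) -> dict:
    """Validate dd-MM-yyyy and wrap it in the conf payload the DAG expects."""
    # Raises ValueError if the date is not in dd-MM-yyyy format.
    datetime.strptime(date_str, "%d-%m-%Y")
    return {"date": date_str}

# Example: conf for 1 March 2022
print(build_trigger_conf("01-03-2022"))  # {'date': '01-03-2022'}
```

Validating the format up front avoids a DAG run that fails later because the date string was entered as YYYY-mm-dd by mistake.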
Logs can also be viewed in the Elasticsearch index adaptor_logs:
GET adaptor_logs/_search
A timestamp filter can be added for the day whose logs are being searched.
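A range filter on the log timestamp narrows the search to a single day. A sketch of such a query, assuming the index stores the timestamp in a `@timestamp` field (the field name is an assumption; check the index mapping for the actual name):

```
GET adaptor_logs/_search
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2022-03-01T00:00:00",
        "lt":  "2022-03-02T00:00:00"
      }
    }
  }
}
```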
Scheduled DAG
This DAG triggers at midnight every day and processes data for the previous day.
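The "previous day" semantics can be sketched as a small date calculation (the function name is illustrative, not part of the DAG code; it reuses the dd-MM-yyyy format from the manual trigger):

```python
from datetime import datetime, timedelta

def processing_date(run_time: datetime) -> str:
    """A run at midnight processes the previous day's data,
    formatted as dd-MM-yyyy."""
    return (run_time - timedelta(days=1)).strftime("%d-%m-%Y")

# A run at midnight on 2 March 2022 processes 1 March 2022.
print(processing_date(datetime(2022, 3, 2)))  # 01-03-2022
```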
Bulk Insert for a date range
Execute this script to run the DAG over a date range on the staging NDB:
sh iterate_over_date.sh <start-date> <end-date>
e.g. sh iterate_over_date.sh 2022-03-01 2022-03-05
Dates must be in the format YYYY-mm-dd.
The range excludes the last date, i.e. [start-date, end-date): in the example above, the script triggers the DAG for March 1, 2, 3, and 4, but not March 5.
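The half-open [start-date, end-date) behaviour can be sketched in Python (a hypothetical equivalent of the loop inside iterate_over_date.sh, not its actual contents):

```python
from datetime import datetime, timedelta

def dates_to_trigger(start: str, end: str) -> list:
    """Return every date in [start, end) as YYYY-mm-dd, excluding end."""
    day = datetime.strptime(start, "%Y-%m-%d")
    stop = datetime.strptime(end, "%Y-%m-%d")
    out = []
    while day < stop:
        out.append(day.strftime("%Y-%m-%d"))  # one DAG run per date
        day += timedelta(days=1)
    return out

print(dates_to_trigger("2022-03-01", "2022-03-05"))
# ['2022-03-01', '2022-03-02', '2022-03-03', '2022-03-04']
```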