Egov-pdf service (interservice call to PDF service) technical document

Introduction:

Egov-pdf is a new service that sits between the existing pdf-service and the client requesting PDFs. Earlier, the client called pdf-service directly with the complete data as JSON. With this new service, the client provides just a few parameters, e.g. applicationNumber and tenantId, to get a PDF. The egov-pdf service takes responsibility for fetching the application data from the concerned service, enriching it if required, and then calling pdf-service with that data to get the PDF. The service returns the PDF binary as the response, which the client can download directly. With this service in place, the existing pdf-service endpoints need not be exposed to the frontend.
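The flow described above (fetch the data, enrich it, call pdf-service, return the binary) can be sketched roughly as follows. This is only an illustration and not the actual egov-pdf code: the function downloadPdf and the injected helpers searchApplication, enrich and createPdf are hypothetical names, and the sample values are made up.

```javascript
// Sketch only: searchApplication, enrich and createPdf are hypothetical
// stand-ins for the calls to the concerned service and to pdf-service.
async function downloadPdf(query, deps) {
  const { applicationNumber, tenantId } = query;

  // 1. Fetch the application data from the concerned service.
  const application = await deps.searchApplication(applicationNumber, tenantId);
  if (!application) {
    return { status: 404, body: "Application not found" };
  }

  // 2. Enrich the data only if an enrichment step was supplied.
  const enriched = deps.enrich ? await deps.enrich(application) : application;

  // 3. Ask pdf-service for the PDF and hand the binary back to the client.
  const pdfBuffer = await deps.createPdf(enriched, tenantId);
  return {
    status: 200,
    headers: { "Content-Type": "application/pdf" },
    body: pdfBuffer,
  };
}

// Usage with stubbed dependencies (all values are made up).
const stubDeps = {
  searchApplication: async (applicationNumber, tenantId) => ({ applicationNumber, tenantId }),
  createPdf: async () => Buffer.from("%PDF-1.4 stub"),
};

downloadPdf({ applicationNumber: "TL-APP-0001", tenantId: "pb.amritsar" }, stubDeps)
  .then((res) => console.log(res.status, res.headers["Content-Type"]));
```

Passing the service calls in as dependencies keeps the sketch runnable without a network; in a real endpoint they would be HTTP calls to the respective services.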

For any new PDF requirement, one new endpoint with validations and the logic for getting the PDF data has to be added in the code. With a separate endpoint for each PDF, access rules can be defined on a per-PDF basis. Currently the egov-pdf service has endpoints for the following PDFs used in our system:

  • PT mutation certificate

  • PT bill

  • PT receipt

  • TL receipt

  • TL certificate

  • TL renewal certificate

  • Consolidated receipt
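As an illustration of the per-endpoint validation mentioned above, a minimal sketch of a query parameter check could look like this. The map REQUIRED_PARAMS and the function validateQuery are hypothetical names introduced for this example, not part of the actual service code; only two PDF keys are shown.

```javascript
// Hypothetical map of PDF key -> query parameters that endpoint requires.
const REQUIRED_PARAMS = new Map([
  ["tlcertificate", ["applicationNumber", "tenantId"]],
  ["ptreceipt", ["uuid", "tenantId"]],
]);

// Returns { valid, errors } so the endpoint can reject bad requests early.
function validateQuery(pdfKey, query) {
  const required = REQUIRED_PARAMS.get(pdfKey);
  if (!required) {
    return { valid: false, errors: [`Unknown pdf key: ${pdfKey}`] };
  }
  // A parameter counts as missing when it is absent or blank.
  const errors = required
    .filter((name) => query[name] === undefined || String(query[name]).trim() === "")
    .map((name) => `Missing query parameter: ${name}`);
  return { valid: errors.length === 0, errors };
}

// Usage: tenantId is present but applicationNumber is not.
console.log(validateQuery("tlcertificate", { tenantId: "pb.amritsar" }));
```

Adding support for a new PDF would then mean adding one entry to the map alongside the new endpoint and its data-fetching logic.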

Requirements:

  • Prior knowledge of JavaScript.

  • Prior knowledge of the Node.js platform.

  • Prior knowledge of REST APIs and related concepts.

Current endpoints for PDFs

The endpoints below are currently in use for the 'CITIZEN' and 'EMPLOYEE' roles:

| Endpoint | Module | Query parameters | Restrict citizen to own records |
|---|---|---|---|
| /egov-pdf/download/PT/ptreceipt | property-tax | uuid, tenantId | yes |
| /egov-pdf/download/PT/ptbill | property-tax | uuid, tenantId | no |
| /egov-pdf/download/PT/ptmutationcertificate | property-tax | uuid, tenantId | yes |
| /egov-pdf/download/TL/tlrenewalcertificate | Tradelicense | applicationNumber, tenantId | yes |
| /egov-pdf/download/TL/tlcertificate | Tradelicense | applicationNumber, tenantId | yes |
| /egov-pdf/download/TL/tlreceipt | Tradelicense | applicationNumber, tenantId | yes |
| /egov-pdf/download/TL/tlbill | Tradelicense | applicationNumber, tenantId, bussinessService | no |
| /egov-pdf/download/PAYMENT/consolidatedreceipt | Collection | consumerCode, tenantId, bussinessService | yes |
| /egov-pdf/download/BILL/consolidatedbill | Billing | consumerCode, tenantId, bussinessService | no |
| /egov-pdf/download/BILL/billamendmentcertificate | Billing | amendmentId, tenantId, bussinessService | no |
| /egov-pdf/download/WNS/wnsbill | Water and Sewerage | applicationNumber, tenantId, bussinessService | no |
| /egov-pdf/download/WNS/wnsreceipt | Water and Sewerage | applicationNumber, tenantId, bussinessService | yes |
| /egov-pdf/download/WNS/wnsgroupbill | Water and Sewerage | tenantId, bussinessService, locality, consumerCode, isConsolidated | no |
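To show how the query parameters listed above attach to an endpoint, here is a small sketch that builds a download URL with Node's built-in URL class. The host https://example.org, the helper name buildDownloadUrl and the parameter values are placeholders assumed for illustration; the HTTP method and any request body the service expects are not covered here.

```javascript
// Builds a download URL from a host, an endpoint path and its query parameters.
// URLSearchParams takes care of encoding the parameter values.
function buildDownloadUrl(host, path, params) {
  const url = new URL(path, host);
  for (const [name, value] of Object.entries(params)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// Usage: host and parameter values are placeholders, not real data.
const url = buildDownloadUrl(
  "https://example.org",
  "/egov-pdf/download/TL/tlcertificate",
  { applicationNumber: "TL-APP-0001", tenantId: "pb.amritsar" }
);
console.log(url);
```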

 

Steps/guidelines for adding support for new pdf:

Interaction Diagram:

 

Bulk Bill PDF Generation

PDF service generates a PDF from the data and template key sent in the request. It can also generate PDFs in bulk, but with a limitation. We use the third-party library pdfmake for PDF generation, and for bulk generation requests we send arrays of data in the request payload. The server cannot hold and save the request until the generation process has finished for every PDF in it. This bottleneck affects the performance of PDF service.
Below is the flow of PDF generation.


To overcome this issue, we made PDF generation asynchronous and handle bulk PDF generation in batches. When an employee / client sends a request to the egov-pdf service, they receive a job id with which to access the bulk bill PDF once it is ready. Meanwhile, in the backend, the service gathers all the data for the PDFs from other services. Once all the data is collected, it is sent to pdf-service in batches of 50 or 100 (the batch size is configured in an environment variable) via a Kafka topic instead of an API call, which makes the process asynchronous. Each batch, once pdf-service has finished processing it, is stored on the mounted EBS volume, and the batch details (the number of PDF records completed so far and the total number of PDFs to be generated) are stored in the DB. Once the completed count matches the total count, the process stops. The files on the EBS volume are then merged into a single PDF, which is uploaded to the filestore S3 bucket, and its filestore id is stored in the DB so that the PDF can be accessed whenever needed. After getting the filestore id, we delete all the PDFs on the EBS volume to free up space. The employee / client who triggered the bulk PDF API can check on the download page whether the requested job is completed.
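The batching and completion check described above can be sketched as follows. This is an assumption-laden illustration, not the service's code: the environment variable name BULK_PDF_BATCH_SIZE, the default of 50, and the helpers splitIntoBatches, createJobProgress and recordBatchDone are all made up for the example, and the Kafka publish and DB writes are left as comments.

```javascript
// Assumption: the env variable name and the default of 50 are illustrative.
const BATCH_SIZE = Number.parseInt(process.env.BULK_PDF_BATCH_SIZE || "50", 10);

// Splits the collected records into batches of at most batchSize items.
function splitIntoBatches(records, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < records.length; i += batchSize) {
    batches.push(records.slice(i, i + batchSize));
  }
  return batches;
}

// In-memory stand-in for the DB row holding completed vs total records.
function createJobProgress(jobId, totalRecords) {
  return { jobId, totalRecords, completedRecords: 0, done: totalRecords === 0 };
}

// Returns updated progress; the job is done when the two counts match.
function recordBatchDone(progress, batchLength) {
  const completedRecords = progress.completedRecords + batchLength;
  return { ...progress, completedRecords, done: completedRecords === progress.totalRecords };
}

// Usage with made-up bill records.
const bills = Array.from({ length: 120 }, (_, i) => ({ billId: i + 1 }));
let progress = createJobProgress("job-1", bills.length);
for (const batch of splitIntoBatches(bills, BATCH_SIZE)) {
  // In the real flow the batch would be published to a Kafka topic here,
  // and progress would be updated once pdf-service finishes the batch.
  progress = recordBatchDone(progress, batch.length);
}
console.log(progress);
```

Once done is true, the merge, upload to filestore and cleanup steps would follow.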

Below is the flow of bulk bill generation.

 

 

Below is the report for the performance of bulk pdf generation testing: