Introduction:
The egov-pdf service is a new service that sits between the existing pdf-service and the client requesting PDFs. Previously, the client called pdf-service directly with the complete data as JSON. With this new service, the client provides only a few parameters (for example, applicationnumber and tenantId) to get a PDF. The egov-pdf service is responsible for fetching the application data from the relevant service, enriching it if required, and then calling pdf-service with that data to generate the PDF. It returns the PDF binary as the response, which the client can download directly. With this service in place, the existing pdf-service endpoints need not be exposed to the frontend.
For each new PDF requirement, a new endpoint with validations and the logic for fetching the PDF data has to be added to the code. Having a separate endpoint for each PDF lets us define access rules on a per-PDF basis. Currently the egov-pdf service has endpoints for the following PDFs used in our system:
PT mutation certificate
PT bill
PT receipt
TL receipt
TL certificate
TL renewal certificate
Consolidated receipt
Requirements:
Prior knowledge of JavaScript.
Prior knowledge of the Node.js platform.
Prior knowledge of REST APIs and related concepts.
Current endpoints for PDFs
Currently, the endpoints below are in use for the 'CITIZEN' and 'EMPLOYEE' roles:
Endpoint | module | query parameter | Restrict Citizen to own records |
---|---|---|---|
Steps/guidelines for adding support for a new PDF:
Make sure the config for the PDF is added in the PDF-Service (https://digit-discuss.atlassian.net/l/c/f3APeZPF).
Follow the code of the existing supported PDFs (https://github.com/egovernments/utilities/tree/pdf-new/egov-pdf/src/routes) and create a new endpoint with suitable search parameters for each PDF.
Add parameter validations, module-level validations (for example, application status and applicationtype), and API error responses with proper error messages and error codes.
Make sure that, for whichever service is used to prepare the data for the PDF, a search call made by a citizen returns only that citizen's own records. If it does not, adjust the search criteria by including the citizen's mobilenumber or uuid, so that a citizen can create PDFs only for their own records. If the requirement itself states that a citizen can get PDFs for other people's records (for example, billgenie bill PDFs), this check is not needed.
Prepare the data for the PDF by calling the required services.
Use the correct PDF key with the data to call the PDF service and return the PDF (use the “/_createnosave” endpoint of the PDF service).
Add access to the endpoint in MDMS for suitable roles.
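The validation and citizen-restriction steps above could be sketched as two plain functions. This is only a minimal sketch under stated assumptions, not the service's actual code: the function names `validateQuery` and `buildSearchCriteria`, the error code `MISSING_PARAMETER`, the `ownerUuid` criteria field, and the shape of `userInfo` (a `type` and a `uuid`) are all hypothetical and chosen for illustration.

```javascript
// Hypothetical helpers: names, error codes, the ownerUuid field and the
// userInfo shape are assumptions for illustration, not actual egov-pdf code.

// Check that every required query parameter is a non-empty string.
// Returns a list of { code, message } errors (empty when the query is valid).
function validateQuery(query, requiredParams) {
  const errors = [];
  for (const name of requiredParams) {
    const value = query[name];
    if (typeof value !== "string" || value.trim() === "") {
      errors.push({
        code: "MISSING_PARAMETER",
        message: `${name} is mandatory to generate the PDF`,
      });
    }
  }
  return errors;
}

// Build the search criteria passed to the module's search API.
// For a citizen, the caller's own uuid is added so the search can only
// return that citizen's records, unless the PDF is explicitly open to all
// (options.restrictCitizenToOwnRecords === false).
function buildSearchCriteria(query, userInfo, options = {}) {
  const criteria = { ...query };
  const restrictToOwner = options.restrictCitizenToOwnRecords !== false;
  if (userInfo.type === "CITIZEN" && restrictToOwner) {
    criteria.ownerUuid = userInfo.uuid;
  }
  return criteria;
}
```

An endpoint handler would typically call `validateQuery` first and return the errors with a 400 response if the list is non-empty, then pass the result of `buildSearchCriteria` to the module's search call.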
Interaction Diagram:
Bulk Bill PDF Generation
The PDF service generates a PDF from the data and template key sent in the request. It also supports generating PDFs in bulk, but with a limitation. PDF generation uses the third-party library pdfmake, and for bulk requests the data is sent as arrays in the request payload. The server cannot hold the request until the generation process has finished for every PDF in the payload. This bottleneck affects the performance of the PDF service.
Below is the flow of PDF generation.
To overcome this, we made PDF generation asynchronous and handle bulk PDF generation in batches. When an employee or client sends a request to the egov-pdf service, they receive a job id that can be used to access the bulk bill PDF once it is ready. Meanwhile, in the backend, the service gathers all the data needed for the PDFs from the other services. Once all the data is collected, it is sent to the PDF service in batches of 50 or 100 (the batch size is configured in an environment variable) via a Kafka topic rather than through an API call, which keeps the process asynchronous. Each batch, once the PDF service has finished processing it, is stored on the mounted EBS volume, and the number of PDF records completed so far and the total number of PDF records to be generated are stored in the DB. When the completed count matches the total, processing stops. The files on the EBS volume are then merged into a single PDF, which is uploaded to the filestore S3 bucket, and its filestore id is stored in the DB so that the PDF can be accessed at any time. After the filestore id is obtained, all the PDFs on the EBS volume are deleted to free up space. The employee or client who triggered the bulk PDF API can check the download page to see whether the requested job has completed.
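The batching and completion check described above can be illustrated with a small sketch. It assumes nothing about the real implementation: `splitIntoBatches` and `createJobTracker` are hypothetical names, the Kafka publishing, EBS storage and DB writes are left out, and the tracker simply keeps the completed count in memory where the service would store it in the DB.

```javascript
// Hypothetical sketch of the batching step; all names are assumptions and
// the Kafka, EBS and DB interactions are intentionally omitted.

// Split the collected records into batches of at most batchSize items.
function splitIntoBatches(records, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < records.length; start += batchSize) {
    batches.push(records.slice(start, start + batchSize));
  }
  return batches;
}

// Track how many PDF records of a job have been generated so far.
function createJobTracker(jobId, totalRecords) {
  let completedRecords = 0;
  return {
    jobId,
    // Called when a batch has finished processing; the count is capped at
    // totalRecords so a duplicate report cannot overshoot the total.
    recordBatchDone(batchLength) {
      completedRecords = Math.min(totalRecords, completedRecords + batchLength);
      return completedRecords;
    },
    // The merge-and-upload step may start once this returns true.
    isComplete() {
      return completedRecords === totalRecords;
    },
  };
}
```

For example, 120 bill records with a batch size of 50 yield three batches of 50, 50 and 20 records, and the tracker reports completion only after all three batches have been recorded.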
Below is the flow of bulk bill generation.
Below is the report for the performance of bulk pdf generation testing: