Persister-Service

Objective:

Provides a framework to persist data transactionally and with low latency, driven by a config file. It removes repetitive and time-consuming persistence code from other services.


Requirements:

  • Prior knowledge of Java/J2EE.
  • Prior knowledge of Spring Boot.
  • Prior knowledge of PostgreSQL.
  • Prior knowledge of JSON queries in PostgreSQL (similar to regular PostgreSQL queries, with a few additional aggregate functions).


Setup, Definitions & Functionality:


Setup:

  1. Write the configuration as per your requirement. The structure of the config file is explained later in this document.
  2. Check in the config file to a remote location, preferably GitHub. Currently the files are checked into https://github.com/egovernments/egov-services/tree/master/core/egov-persister/src/main/resources for dev and QA, and into https://github.com/egovernments/punjab-rainmaker-customization/tree/master/configs/egov-persister for UAT.
  3. Provide the absolute path of the checked-in file to DevOps so that it can be added to the file-read path of egov-persister. The file is added to egov-persister's environment manifest file so that it is read at application start-up.
  4. Run the egov-persister app and push data to the Kafka topic specified in the config to persist it in the DB.
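As a rough illustration of step 4, the sketch below serializes a payload as JSON and hands it to a transport for the topic named in the config. The `send` callable, the topic name and the payload are all hypothetical stand-ins. In a real deployment `send` would wrap whichever Kafka producer client the calling service uses, and UTF-8 encoding is an assumption.

```python
import json


def publish_for_persistence(send, config, payload):
    """Serialize payload as JSON and pass it to `send` for the config's fromTopic."""
    topic = config["fromTopic"]
    send(topic, json.dumps(payload).encode("utf-8"))
    return topic


# Stand-in transport that only records what would be sent (no Kafka involved).
sent = []
publish_for_persistence(
    lambda topic, value: sent.append((topic, value)),
    {"fromTopic": "save-example-record"},  # hypothetical topic name
    {"Records": [{"id": "r1"}]},           # hypothetical payload
)
```

The point of the sketch is only that the topic used by the producing service has to match the fromTopic in the persister config.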



Definitions:

  • Config file - A YAML file which contains configuration for persisting data.

Functionality:

  • Persists data asynchronously using Kafka, providing very low latency.
  • Data is persisted in batches.
  • All operations are transactional.
  • Values for the prepared statement placeholders are fetched using JsonPath.
  • Easy reference to the parent object using '{x}' in the jsonPath: the variable x in the jsonPath is substituted with the value of x from the child object being persisted (explained in detail later in this document).
  • Supported data types: ARRAY("ARRAY"), STRING("STRING"), INT("INT"), DOUBLE("DOUBLE"), FLOAT("FLOAT"), DATE("DATE"), LONG("LONG"), BOOLEAN("BOOLEAN"), JSONB("JSONB")



How to Use:

  1. Configuration: Persister uses a configuration file to persist data. The key variables are described below:
  • serviceName: Name of the service to which this configuration belongs.
  • description: Description of the service.
  • version: The version of the configuration.
  • fromTopic: The Kafka topic from which data is fetched.
  • queryMaps: Contains the list of queries to be executed for the given data. Each entry has:

      1. query: The query to be executed, in the form of a prepared statement.
      2. basePath:
      3. jsonMaps: Contains the list of jsonPaths for the values in the placeholders.
        1. jsonPath: The jsonPath to fetch the variable value.

Reference - https://raw.githubusercontent.com/egovernments/egov-services/master/rainmaker/tl-services/src/main/resources/tradelicense.yml
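To make the relationship between query, basePath and jsonMaps more concrete, here is a minimal sketch that mirrors the keys above as a Python dict and binds one tuple of values per object to the prepared statement placeholders. The service, topic, table and field names are hypothetical. The reading of basePath as "the list of objects to persist" is an assumption, and the path resolver handles only simple dotted paths with '*', not full JsonPath. The tradelicense.yml reference above shows the real layout.

```python
# Hypothetical configuration mirroring the keys described above.
config = {
    "serviceName": "example-service",
    "description": "Persists example records",
    "version": "1.0.0",
    "fromTopic": "save-example-record",
    "queryMaps": [
        {
            "query": "INSERT INTO example_record (id, tenantid, name) VALUES (?, ?, ?);",
            # Assumption: basePath points at the list of objects to persist.
            "basePath": "$.Records.*",
            "jsonMaps": [
                {"jsonPath": "$.Records.*.id"},
                {"jsonPath": "$.Records.*.tenantId"},
                {"jsonPath": "$.Records.*.name"},
            ],
        }
    ],
}


def resolve(path, document, index=None):
    """Resolve a simplified dotted path; '*' stands for the current row index."""
    node = document
    keys = path[2:] if path.startswith("$.") else path
    for key in keys.split("."):
        node = node[index] if key == "*" else node[key]
    return node


def bind_rows(query_map, message):
    """Return one tuple of placeholder values per object under basePath.

    Assumes basePath ends in '.*', i.e. it points at a list.
    """
    base = query_map["basePath"]
    container = base[:-2] if base.endswith(".*") else base
    rows = resolve(container, message)
    return [
        tuple(resolve(m["jsonPath"], message, i) for m in query_map["jsonMaps"])
        for i in range(len(rows))
    ]


message = {
    "Records": [
        {"id": "r1", "tenantId": "pb.amritsar", "name": "First"},
        {"id": "r2", "tenantId": "pb.nawanshahr", "name": "Second"},
    ]
}
rows = bind_rows(config["queryMaps"][0], message)
```

Each tuple in rows lines up, in order, with the '?' placeholders of the query, which is why the order of entries in jsonMaps matters.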



Use of {} to link a child object to its parent:

{
 "Properties": [
   {
     "tenantId": "pb.amritsar",
     "propertyId": "PB-PT-107-001290",
     "acknowldgementNumber": "PB-AC-2019-02-19-001290",
     "address": {
       "address1": "ATAR SINGH COLONY - Area2"
     },
     "propertyDetails": [
       {
         "financialYear": "2019-20",
         "propertyType": "BUILTUP",
         "propertySubType": "SHAREDPROPERTY",
         "assessmentNumber": "AS-2019-05-23-001732",
         "assessmentDate": 1558615079295,
         "usageCategoryMajor": "RESIDENTIAL",
         "usageCategoryMinor": null,
         "ownershipCategory": "INDIVIDUAL",
         "subOwnershipCategory": "SINGLEOWNER"
       }
     ]
   },
   {
     "tenantId": "pb.nawanshahr",
     "propertyId": "PB-PT-110-002348",
     "acknowldgementNumber": "PB-NW-2019-02-19-051547",
     "address": {
       "address1": "GOLDEN PALM PREMIUM - Ward_1"
     },
     "propertyDetails": [
       {
         "financialYear": "2019-20",
         "propertyType": "BUILTUP",
         "propertySubType": "SHAREDPROPERTY",
         "assessmentNumber": "AS-2019-05-27-002679",
         "assessmentDate": 1558615357743,
         "usageCategoryMajor": "COMMERCIAL",
         "usageCategoryMinor": null,
         "ownershipCategory": "INDIVIDUAL",
         "subOwnershipCategory": "MULTIPLEOWNER"
       }
     ]
   }
 ]
}


In the above JSON, to map the child object propertyDetail to its parent object property, we insert the propertyId as a foreign key in the propertyDetail table. To do so we use the following jsonPath:


    - jsonPath: $.Properties[*][?({assessmentNumber} in @.propertyDetails[*].assessmentNumber)].propertyId


Here assessmentNumber is in curly brackets, so when the jsonPath is executed, assessmentNumber is substituted with the assessmentNumber of the propertyDetail object currently being persisted.


E.g., when it tries to persist the propertyDetail with assessmentNumber=AS-2019-05-23-001732, the jsonPath becomes:


$.Properties[*][?({AS-2019-05-23-001732} in @.propertyDetails[*].assessmentNumber)].propertyId


This fetches the propertyId of the parent property object, which is PB-PT-107-001290.
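The substitution and the resulting parent lookup can be sketched in a few lines of Python. This is an illustration, not the persister's actual code. The function names are hypothetical, the braces are kept around the substituted value only to match the example above, and the filter is written as a plain loop instead of being evaluated by a JsonPath engine.

```python
import re


def substitute_placeholders(json_path, child):
    """Replace each {name} in the jsonPath with the child's value for that name."""
    return re.sub(
        r"\{(\w+)\}",
        lambda m: "{" + str(child[m.group(1)]) + "}",
        json_path,
    )


def find_parent_property_id(message, assessment_number):
    """Plain-Python equivalent of the filter: the property whose details contain the number."""
    for prop in message["Properties"]:
        numbers = [d["assessmentNumber"] for d in prop["propertyDetails"]]
        if assessment_number in numbers:
            return prop["propertyId"]
    return None


# Trimmed copy of the JSON shown earlier in this section.
message = {
    "Properties": [
        {
            "propertyId": "PB-PT-107-001290",
            "propertyDetails": [{"assessmentNumber": "AS-2019-05-23-001732"}],
        },
        {
            "propertyId": "PB-PT-110-002348",
            "propertyDetails": [{"assessmentNumber": "AS-2019-05-27-002679"}],
        },
    ]
}

template = "$.Properties[*][?({assessmentNumber} in @.propertyDetails[*].assessmentNumber)].propertyId"
child = message["Properties"][0]["propertyDetails"][0]
resolved_path = substitute_placeholders(template, child)
parent_id = find_parent_property_id(message, child["assessmentNumber"])
```

Running the same lookup for the second propertyDetail (AS-2019-05-27-002679) yields the second property, so each child row gets the propertyId of its own parent.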



Bulk Persister:

To persist a large quantity of data, the bulk setting in persister can be used. It is mainly used when migrating data from one system to another. The bulk persister has the following two settings:

Variable Name            Default Value   Description
persister.bulk.enabled   false           Switch to turn the bulk Kafka consumer on or off
persister.batch.size     100             The batch size for bulk update
Any Kafka topic containing data that has to be bulk persisted should have '-batch' appended to the end of the topic name, for example: save-pt-assessment-batch
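The naming rule and the effect of the batch size can be illustrated with a short sketch. The helper names are hypothetical, and how the service actually groups records internally is not described here. The code only shows a '-batch' suffix check and a split into chunks of at most persister.batch.size records.

```python
def is_bulk_topic(topic):
    """A topic is treated as a bulk topic when its name ends with '-batch'."""
    return topic.endswith("-batch")


def chunk(records, batch_size=100):
    """Yield consecutive slices of at most batch_size records (default mirrors persister.batch.size)."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]


records = list(range(250))  # stand-in for 250 messages read from a bulk topic
batch_lengths = [len(batch) for batch in chunk(records, batch_size=100)]
```

With 250 records and a batch size of 100, the records are split into three batches, the last one partially filled.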