Searcher-Service

Objective:

The objectives of the egov-searcher service are listed below.

  1. To provide a one-stop framework for searching data from multiple data sources (Postgres, Elasticsearch, etc.) based on configuration.
  2. To make provision for implementing features, based on ad-hoc requirements, that directly or indirectly require search functionality.

Requirements:

  1. Prior Knowledge of Java/J2EE.
  2. Prior Knowledge of SpringBoot.
  3. Prior Knowledge of PostgreSQL.
  4. Prior Knowledge of REST APIs and related concepts like path parameters, headers, JSON etc.
  5. Prior Knowledge of JSONQuery in Postgres. (Similar to PostgreSQL SQL queries, with a few aggregate functions.)

Setup, Definitions & Functionality:

Setup:

  1. Step 1: Write the configuration as per your requirement. The structure of the config file is explained later in this document.
  2. Step 2: Check in the config file to a remote location, preferably GitHub. Currently the files are checked into https://github.com/egovernments/egov-services/tree/master/core/egov-searcher/src/main/resources for dev and QA, and into https://github.com/egovernments/punjab-rainmaker-customization/tree/master/configs/egov-searcher for UAT.
  3. Step 3: Provide the absolute path of the checked-in file to DevOps, to add it to the file-read path of egov-searcher. The file will be added to egov-searcher's environment manifest file for it to be read at start-up of the application.
  4. Step 4: Run the egov-searcher app. Use the module name and definition name from the configuration as path parameters in the URL of the search API to fetch the required data.

Definitions:

  1. Config file - A YAML (xyz.yml) file which contains configuration for search requirements.
  2. API - A REST endpoint to fetch data based on the configuration.

Functionality:

  1. Uses Postgres JSONQuery instead of SQL Queries to fetch data from the Postgres DB.

    JSONQuery:
    JSONQuery is one of the exclusive features of Postgres. It provides a way of fetching data from the DB as JSON instead of in ResultSet format. This saves the time otherwise spent in mapping the ResultSet into the required JSON format on the functionality side.

    JSONQueries are similar to SQL queries, with certain functions that internally map the ResultSet to JSON. SQL queries (SELECT queries, to be precise) are passed as parameters to these functions: the SQL query returns the ResultSet, which these functions transform into JSON.
    Some of the extensively used functions are:
    1) row_to_json: This function converts a single row of a query result into a JSON object. Used on its own, the query must therefore return only one row. Note that JSONQuery functions operate on aliases, so the query must be mapped to an alias and the alias passed to the function as the parameter.
    Eg: 
    {"name": "egov", "age": "20"}
    2) array_agg: This function takes the output of row_to_json and aggregates it into an array of JSON objects. This is required when the query returns multiple rows. The query is passed to row_to_json through an alias, and this is further wrapped within array_agg to ensure that all the rows returned by the query are converted to a JSONArray.
    Eg: 
    [{"name": "egov", "age": "20"},{"name": "egov", "age": "20"},{"name": "egov", "age": "20"}]
    3) array_to_json: This transforms the result of array_agg into a single JSON and returns it. This way, the response of a JSONQuery will always be a single JSON with the JSONArray of results attached to a key. This function serves as the final transformation of the result. The result so obtained can easily be cast to any other format or operated on using the PGObject instance exposed by Postgres.

    Eg: 
    {"alias": [{"name": "egov", "age": "20"},{"name": "egov", "age": "20"},{"name": "egov", "age": "20"}]}

    For more details about JSONQuery, please check: https://www.postgresql.org/docs/9.4/functions-json.html

  2. Provides an easy way to set up search APIs on the fly just by adding configuration, without any coding effort.
  3. Provides flexibility to build the WHERE clause as per requirement, with config keys for operators, conditional blocks and other query clauses.
  4. Designed to use a specific URI for every search request, thereby making role-based access control easy.
  5. Fetches data as JSON, the format of which can be configured. This saves considerable effort in writing row mappers for every search result.
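
The nesting of the three JSONQuery functions described under item 1 above can be sketched as a small Python helper that only builds the query text. This is an illustration under assumptions: the helper name wrap_as_json_query and the table person are hypothetical and not part of egov-searcher, and the resulting string would still have to be executed against a Postgres database.

```python
def wrap_as_json_query(select_sql: str, alias: str) -> str:
    """Wrap a plain SELECT so that Postgres returns all of its rows as JSON.

    row_to_json turns each row of the aliased sub-query into a JSON object,
    array_agg collects those objects into an array, and array_to_json
    converts that array into a single JSON value.
    """
    return (
        f"SELECT array_to_json(array_agg(row_to_json({alias}))) "
        f"FROM ({select_sql}) {alias}"
    )


# 'person' is a hypothetical table used only for illustration.
query = wrap_as_json_query("SELECT name, age FROM person", "people")
print(query)
```

The inner SELECT stays an ordinary SQL query; only the outer wrapper changes the shape of what comes back from the database.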

Feature List:

V1:

  1. Search from Postgres using Postgres JSONQuery.
  2. Support multiple operators in the WHERE clause.
  3. Fetch result as per the configured JSON format.

V2: (Yet to be implemented)

  1. Enable search from Postgres using SQL Queries.
  2. Enable searching from Elasticsearch.
  3. Enable control over fetching a subset of the configured columns in the search result.

Impact:

  1. Used by PGR for its entire search functionality.
  2. Used by PT for some ad-hoc client requirements.
  3. Used by TL to provide an open search that is accessible only to egov-indexer for the legacyindex job.

Impacted By:

  1. Changes in the version of PostgreSQL.
  2. Deprecation/Enhancement in the JSONQuery syntax and other features.

How to Use:

  1. Configuration: As mentioned above, searcher uses one config file per module to store all the configuration pertaining to that module. Searcher reads multiple such files at start-up to support search for all the configured modules. The file contains the following keys:
      1. moduleName: Name of the module to which this configuration belongs.
      2. summary: Summary of the module.
      3. version: Version of the configuration.
      4. definitions: List of definitions within the module. Every definition corresponds to one search requirement. The keys listed below together form one definition, and multiple such definitions are part of this definitions key.
      5. name: Name of the definition.
      6. query: Query configuration key. All the keys required to define a query fall under this key and are defined as follows:
          1. baseQuery: Query to be executed when this definition is in use. The query contains the placeholder '$where', which is replaced by the WHERE clause built on the fly. So, the query here should not contain any WHERE clause except for hard-coded conditions in nested queries.
          2. groupBy: Property to be used to group the result set.
          3. orderBy: Property to be used to order the result set.
      7. searchParams: Key to configure the search params of this query. It uses the following keys:
          i. condition: Logical operator to be used across the conditions of the WHERE clause (mixed logical operators within one query are currently not supported).
          ii. params: List of configurations, one per search param. Every search param is to be provided with the following keys:
              1. name: Name of the parameter. Note that this has to be the same as the column name in the table.
              2. isMandatory: Boolean key to mark whether this is a mandatory parameter in the search.
              3. jsonPath: JSONPath of the value to be compared with the parameter mentioned in 'name' in the query. The JSONPath is applied on the incoming request.
              4. operator: Operator to be used between the parameter mentioned in 'name' and the value fetched from 'jsonPath'. This key can take the values LIKE, GE (greater than) and LE (less than). The default value for this key is '='.
      8. output: Key to configure the output format of the result to be fetched from the searcher.
          i. jsonFormat: Skeleton of the JSON expected.
          ii. outJsonPath: JSONPath of where the result of the query is expected to be in the response.
          iii. responseInfoPath: JSONPath of where the 'ResponseInfo' is to be appended in the response.
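
To make the interplay of baseQuery, '$where' and searchParams more concrete, here is a minimal Python sketch of how a WHERE clause could be assembled from such a definition. It is not the searcher's actual implementation: the definition values (serviceSearch, the service table, the column names), the helper names resolve and build_query, the limited dotted-JSONPath handling and the use of '%s' bind placeholders are all assumptions made for illustration. GE and LE are left out because their exact SQL mapping is not spelled out here.

```python
# Hypothetical definition, written as a dict that mirrors the YAML keys.
definition = {
    "name": "serviceSearch",
    "query": {
        "baseQuery": "SELECT array_to_json(array_agg(row_to_json(s))) "
                     "FROM (SELECT * FROM service $where) s",
    },
    "searchParams": {
        "condition": "AND",
        "params": [
            {"name": "tenantid", "isMandatory": True,
             "jsonPath": "$.searchCriteria.tenantId"},
            {"name": "servicerequestid", "isMandatory": False,
             "jsonPath": "$.searchCriteria.serviceRequestId",
             "operator": "LIKE"},
        ],
    },
}

# Only '=' (the default) and LIKE are handled in this sketch.
SQL_OPERATORS = {"=": "=", "LIKE": "LIKE"}


def resolve(path, request):
    """Resolve a simple dotted JSONPath such as '$.searchCriteria.tenantId'."""
    if not path.startswith("$."):
        raise ValueError(f"Unsupported JSONPath: {path}")
    node = request
    for key in path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def build_query(definition, request):
    """Return (sql, values) with '$where' replaced by the built WHERE clause."""
    search = definition["searchParams"]
    clauses, values = [], []
    for param in search["params"]:
        value = resolve(param["jsonPath"], request)
        if value is None:
            if param.get("isMandatory"):
                raise ValueError(f"Missing mandatory search param: {param['name']}")
            continue  # optional params that are absent add no condition
        operator = SQL_OPERATORS[param.get("operator", "=")]
        clauses.append(f"{param['name']} {operator} %s")
        values.append(value)
    where = ""
    if clauses:
        where = "WHERE " + f" {search['condition']} ".join(clauses)
    return definition["query"]["baseQuery"].replace("$where", where), values


request = {"searchCriteria": {"tenantId": "pb.amritsar",
                              "serviceRequestId": "ABC1234"}}
sql, values = build_query(definition, request)
print(sql)
print(values)
```

Keeping the values separate from the SQL text is a design choice of this sketch; it lets a database driver bind them safely instead of concatenating request data into the query.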


Reference - https://raw.githubusercontent.com/egovernments/punjab-rainmaker-customization/master/configs/egov-searcher/rainmaker-pgr-v2-searcher.yml
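
The role of the three output keys (jsonFormat, outJsonPath and responseInfoPath) can be illustrated with a short Python sketch. It assumes that jsonFormat is held as a JSON string and that both paths are simple dotted JSONPaths; the helper names place and build_response and the sample values are hypothetical, and the real service may handle these keys differently.

```python
import json


def place(document, path, value):
    """Set 'value' at a simple dotted JSONPath such as '$.services'."""
    keys = path[2:].split(".")
    node = document
    for key in keys[:-1]:
        node = node[key]
    node[keys[-1]] = value


def build_response(output, rows, response_info):
    """Fill the configured JSON skeleton with the query result and ResponseInfo."""
    response = json.loads(output["jsonFormat"])  # start from the skeleton
    place(response, output["outJsonPath"], rows)
    place(response, output["responseInfoPath"], response_info)
    return response


# Hypothetical output configuration and query result.
output = {
    "jsonFormat": '{"ResponseInfo": {}, "services": []}',
    "outJsonPath": "$.services",
    "responseInfoPath": "$.ResponseInfo",
}
rows = [{"serviceRequestId": "ABC1234", "tenantId": "pb.amritsar"}]
response = build_response(output, rows, {"status": "successful"})
print(json.dumps(response))
```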




  2. API call:

URI: The format of the search API used to fetch data from egov-searcher is as follows: /egov-searcher/{moduleName}/{searchName}/_get

Every search call is identified by a combination of moduleName and searchName. Here, 'moduleName' is the name of the module as mentioned in the configuration file and 'searchName' is the name of the definition within the same module that needs to be used for our search requirement.

For instance, to search all complaints of PGR, the URI is:
/egov-searcher/rainmaker-pgr-V2/serviceSearchWithDetails/_get

Body: The body consists of 2 parts: RequestInfo and searchCriteria. searchCriteria is where the search params are provided as key-value pairs. The keys given here are the ones referenced in the 'jsonPath' configuration within the 'searchParams' key of the config file.

For instance, to search complaints of PGR where serviceRequestId is 'ABC1234' and tenantId is 'pb.amritsar', the API body will be:

{"RequestInfo":{"apiId":"emp","ver":"1.0","ts":1234,"action":"create","did":"1","key":"abcdkey","msgId":"20170310130900","authToken":"57e2c455-934b-45f6-b85d-413fe0950870","correlationId":"fdc1523d-9d9c-4b89-b1c0-6a58345ab26d"},"searchCriteria":{"serviceRequestId":"ABC1234","tenantId":"pb.amritsar"}}
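
As a rough sketch, the URI and body described above could be put together from Python as shown here. The base URL http://localhost:8080, the helper name build_search_request, the trimmed RequestInfo and the use of the POST method are assumptions for illustration; the request is only constructed, not sent.

```python
import json
import urllib.request


def build_search_request(base_url, module_name, search_name,
                         search_criteria, request_info):
    """Build (without sending) a request for the egov-searcher search API."""
    url = f"{base_url}/egov-searcher/{module_name}/{search_name}/_get"
    body = {"RequestInfo": request_info, "searchCriteria": search_criteria}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the body is sent with POST
    )


request = build_search_request(
    "http://localhost:8080",  # hypothetical host
    "rainmaker-pgr-V2",
    "serviceSearchWithDetails",
    {"serviceRequestId": "ABC1234", "tenantId": "pb.amritsar"},
    {"apiId": "emp", "ver": "1.0", "ts": 1234, "authToken": "<auth-token>"},
)
print(request.full_url)
# Sending it would need a running egov-searcher instance:
# urllib.request.urlopen(request)
```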

Interaction Diagram: