DevOps Board

Target release
Ongoing
Epic
DevOps Milestones
Document status
ONGOING
Document owner

Objective

Ongoing DevOps epics and stories covering enhancements to tools, infrastructure, and process.

Requirements

# | Requirement | User Story | Importance | Notes
1 Azure-as-an-additional
Azure playground setup with all the capabilities needed to choose seamlessly between AWS and Azure (OPS-1)
SEVERE
- Deployment manifest changes for resources (S3, EBS, etc.)
- Eng support: application-level changes from S3 to Azure Blob for FileStore, Telemetry, Logos
  • Playground Ready (POC Env)
  • Data migration from AWS to Azure Tested
  • All services are intact with PBprod
2 GIT
Git branching strategy (OPS-30)
SEVERE
  • Git review process is implemented
  • Branching and multi-repo strategy is in progress

3 Spinnaker
POC: Multi-cloud orchestration and deployment pipeline tool (OPS-31)
MEDIUM
  • Be able to orchestrate and deploy across multiple clusters and cloud platforms.
  • Be able to set up RBAC and assign deployment pipelines.

4 Backup
Encrypted logs in S3 (OPS-2)
HIGH


5 Backup
Log backups retained for at least 6 months (OPS-3)
HIGH
- Need to run a POC with CloudFront and Log Analytics before deciding on the new log lifecycle solution
6 Capacity Planning
Cluster and app sizing determination (OPS-4)
MEDIUM
- Need to have sizing templates with BaseMin, GoodToHave & OptToHave
7 Infra
Node resizing/restructuring across all environments, including Punjab prod, upon customer approval (OPS-5)
BACKLOG
- Need to change the instance type from m4.large to m5.xlarge
8 Infra
Dashboard, monitoring, and deployments for all environments (OPS-6)
MEDIUM
- Need to evaluate a cloud-agnostic, all-in-one tool
9 Infra
Multi-cloud (OPS-7)
BACKLOG


10 Kafka Improvement
Deploy HA Kafka and ZooKeeper cluster (OPS-8)
SEVERE
- Need to configure a headless service (Kafka Connect); requests currently go to hardcoded individual nodes such as Kafka0, Kafka1, Kafka2
11 Kafka Improvement
Use Kafka Connect for indexing instead of the indexer (OPS-9)
SEVERE


12 Kafka Improvement
Kafka partitioning and multi-consumer implementation (OPS-10)
SEVERE

- Move from 1-to-1 to 1-to-many
13 Kube Upgrade
Upgrade Kubernetes to 1.11.6 for all environments (Dev, QA, PUAT, PPROD) (OPS-11)
HIGH
- Kops upgrade
- Manifest changes
POC Done
14 Kube Upgrade
Pod auto-scaling strategy (OPS-12)
BACKLOG


15 Logging
Move logging from direct ELK to ELK via Kafka (OPS-13)
SEVERE

- Nithin is working on this
16 Logging
Request/response event logging from Zuul (OPS-14)
SEVERE


17 Logging
Log masking (OPS-15)
SEVERE
- Need to get the list of fields from Dev before working on the POC
18 Monitoring
Kafka monitoring and alerting (OPS-16)
HIGH


19 Monitoring
Move telemetry to internal Kafka and ELK (OPS-17)
SEVERE

- Nithin is working on this
20 Monitoring
Prometheus, or a better monitoring tool, that reports issues proactively (OPS-18)
HIGH


21 Monitoring
Zuul/NGINX status code monitoring: new dashboard (OPS-19)
HIGH


22 Monitoring
Error monitoring and configuration (OPS-20)
HIGH


23 Monitoring
Health and readiness checks on all services (OPS-21)
SEVERE


24 Monitoring
Intra-service traffic management gateway (OPS-22)
HIGH


25 Monitoring
Monitoring dashboards at multiple levels: infra, IT, and business (OPS-23)
HIGH


26 Process
IAM user policy for the whole infra (OPS-24)
HIGH

- Apart from admins, team IAM users have access only to their respective S3 buckets
27 RBAC
ACL on kubectl access, after the 1.11.6 upgrade (OPS-25)
SEVERE


28 RBAC
ACL in Jenkins (OPS-26)
SEVERE


29 Release Mgmt
Helm-like deployment strategy to roll out and roll back releases with one chart or a single config (OPS-27)
MEDIUM


30 TBD
DB masking and PII removal (OPS-28)
MEDIUM


31 Kube Encrypt
Modify Kubernetes deployment encryption (OPS-39)
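
Row 17 (log masking) is blocked on a field list that Dev has not yet supplied. As a starting point for the POC, here is a minimal sketch that assumes log events are JSON lines. The field names in `SENSITIVE_FIELDS` and the helper names `mask_event` and `mask_log_line` are hypothetical placeholders, not taken from any existing service.

```python
import json

# Hypothetical field names; the real list must come from Dev (row 17).
SENSITIVE_FIELDS = {"password", "email", "mobileNumber"}
MASK = "****"


def mask_event(value):
    """Return a copy of a parsed JSON value with sensitive fields masked."""
    if isinstance(value, dict):
        return {
            key: MASK if key in SENSITIVE_FIELDS else mask_event(inner)
            for key, inner in value.items()
        }
    if isinstance(value, list):
        return [mask_event(item) for item in value]
    return value


def mask_log_line(line):
    """Mask one JSON log line; non-JSON lines are returned unchanged."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return line
    return json.dumps(mask_event(event))
```

Masking by key name handles nested objects and arrays, but it does not catch sensitive values embedded in free-text messages or in non-JSON lines; those would need a separate pattern-based pass.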

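Row 12 (Kafka partitioning, 1-to-1 to 1-to-many) can be read as moving from a single consumer per topic to several consumers sharing a topic's partitions. The sketch below only simulates that idea in plain Python, with no Kafka client involved. CRC32 is a stand-in for the broker-side hashing (Kafka's default Java partitioner uses murmur2), and the function names are illustrative assumptions.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a message key to a partition with a stable hash (CRC32 stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions, consumers):
    """Spread partitions over consumers round-robin, one owner per partition."""
    assignment = {consumer: [] for consumer in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment
```

With 6 partitions and consumers `["c0", "c1", "c2"]`, `assign_partitions` returns `{"c0": [0, 3], "c1": [1, 4], "c2": [2, 5]}`. Messages with the same key always land on the same partition, so per-key ordering is preserved while the load is split across consumers.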

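For row 21 (Zuul/NGINX status code monitoring), the core of a dashboard panel is a count of responses per status class plus an error rate. A rough sketch follows, assuming the status codes have already been parsed out of the access logs as integers; `status_class` and `summarize` are hypothetical names.

```python
from collections import Counter


def status_class(code):
    """Return the class label for an HTTP status code, e.g. 404 -> '4xx'."""
    if not 100 <= code <= 599:
        return "other"
    return f"{code // 100}xx"


def summarize(codes):
    """Count responses per status class and compute the 5xx error rate."""
    counts = Counter(status_class(code) for code in codes)
    total = sum(counts.values())
    error_rate = counts["5xx"] / total if total else 0.0
    return counts, error_rate
```

For example, `summarize([200, 200, 404, 500])` counts two `2xx`, one `4xx`, and one `5xx` response, giving a 5xx error rate of 0.25. An alert threshold on that rate is one possible way to cover the proactive reporting asked for in row 20.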
User interaction and design

Open Questions

Question | Answer | Date Answered

Out of Scope