Observability Senior Architect
Job Overview:
We are looking for professionals with a minimum of 10 years of relevant work experience in setting up monitoring solutions using products like Dynatrace, Datadog, ELK stack, Splunk, Grafana/Prometheus, etc., especially in critical production environments. Additionally, we value a minimum of 5-6 years of experience in end-to-end observability, covering technical, user experience, and business outcome metrics. Experience with AIOps would be a significant advantage.
Qualification/Experience:
· Your experience with private cloud and cloud-native public-cloud (particularly AWS) hosted applications will be highly beneficial. We are particularly interested in individuals who have worked with multi-tenancy setups and data segregation on the observability and AIOps stack.
· Furthermore, we are looking for expertise in designing and building an Observability & Maintenance (O&M) module for multi-tenant solutions, as well as defining SLIs and setting up SLOs for these solutions.
· We are also looking for experience in implementing container and network monitoring, APM, RUM, log analytics, end-to-end tracing, and custom alerts using tools like Grafana, Prometheus, and Grafana Loki (alternatively Logstash or Fluent Bit). Experience with other third-party products like Dynatrace will also be considered valuable.
· Proficiency with containers and multi-tenancy setup for the observability solution is essential. The ability to configure custom alerts, monitors, and build AIOps workflows based on telemetry is another critical aspect we are focusing on. A solid understanding of setting up integration capabilities with other systems via APIs, consuming external APIs for IAM, and ingesting metric-based telemetry via collectors is also required.
· Furthermore, we need someone capable of building custom observability dashboards tailored to different portfolios and personas. Setting up Synthetic Monitoring and Test Automation and integrating the resulting telemetry into the observability stack is also a key requirement. The role also involves tenant and data segregation and the ability to obfuscate sensitive information on the common observability schema.
· Lastly, proficiency in coding, particularly in Python and Java, and in Ansible scripting, is highly preferable. Experience with other public clouds (GCP/Azure) is also a plus.
· Any certifications in Observability Foundation from the DevOps Institute or any product-level accreditation would be highly valuable for this role. Additionally, having recognized System Architecture qualifications such as TOGAF would be a great bonus.
Responsibilities and Duties:
· The primary responsibilities for this role include architecting, designing, and ensuring the implementation of the entire observability solution to be packaged as a module within our multi-tenant private cloud solution. This also involves implementing the observability solution to monitor and apply the same feature-set across all tenants, effectively serving as a hypervisor.
· Furthermore, the candidate will be responsible for designing and implementing integrations, as well as externalizing APIs. They will also need to set up authentication and authorization controls by integrating with an IAM layer.
· Collaboration with the UI/UX teams is essential to design dashboards for the Observability & Maintenance platform for both the tenants and the host. Additionally, the role entails designing and setting up an AIOps module responsible for automated remediation workflows, such as capacity scaling, container restarts, and anomaly detection.
· The candidate will also work on building Proof-of-Concept solutions to view end-to-end tube-maps or service flows for the respective tenant's services. Defining and setting up a CMDB to serve as a source for infrastructure and application telemetry is another crucial responsibility.
· Moreover, they will work with other teams to ensure the system is well-tested and scalable, meeting tenant demands. Finally, defining business-aligned SLIs and setting SLOs for core services and journeys will be part of their responsibilities.