Best Practices for Performance Testing
Performance testing verifies that a system can handle a specified workload with a given set of resources within a defined period, and it shows how those resources can scale out (or up) to accommodate additional load, up to a limit.
For instance, a system might be expected to process hundreds of thousands of events per minute for an hour.
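To make such a target concrete, here is a minimal load-generator sketch, assuming a hypothetical `send_event` ingress call and a target of 100,000 events per minute for one hour; dedicated tools such as JMeter, Gatling, or Locust handle pacing far more accurately.

```python
# Minimal sketch: emit events at a fixed target rate for a fixed duration.
import time

TARGET_EVENTS_PER_MIN = 100_000  # assumed workload target
DURATION_SECS = 60 * 60          # one hour

def send_event(i: int) -> None:
    """Stand-in for the real ingress call (hypothetical)."""

def run() -> None:
    interval = 60.0 / TARGET_EVENTS_PER_MIN  # seconds between events
    deadline = time.monotonic() + DURATION_SECS
    sent = 0
    while time.monotonic() < deadline:
        send_event(sent)
        sent += 1
        time.sleep(interval)  # crude pacing; real tools correct for drift
    print(f"sent {sent} events")

if __name__ == "__main__":
    run()
```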
Identifying the Performance Engineer Early and Establishing Expectations
- Identify a performance engineer early.
- Have this person collaborate closely with the developers.
- Dedicate time to acquiring business domain knowledge so that the necessary performance-testing tools or scripts can be crafted.
Mastering the APM (Application Performance Monitoring) Tool and Workspace
- Performance testing yields a substantial volume of logs.
- Learn how to use the Application Performance Monitoring (APM) tool, such as Kibana, effectively (see the query sketch after this list):
  - Filter logs by criteria such as environment, log level, and application source.
  - Group logs into helpful views with specific fields and time constraints.
  - Apply filters carefully; mistakes here can waste valuable time.
- Hire a Subject Matter Expert (SME) who is proficient with the APM tool.
  - The SME can help debug issues and save time on tasks such as log-level filtering.
  - Their expertise is also invaluable when creating APM-specific dashboards.
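As one illustration, here is a hedged sketch of filtering logs programmatically, assuming the logs live in Elasticsearch (Kibana's backing store) under a `logs-*` index pattern with `environment`, `log.level`, and `service.name` fields; the cluster URL, index pattern, and field names are all assumptions that depend on your logging setup.

```python
# Minimal sketch: pull recent error-level logs for one service in the
# perf environment, using the elasticsearch-py client (8.x-style API).
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumed cluster URL

response = es.search(
    index="logs-*",  # assumed index pattern
    query={
        "bool": {
            "filter": [
                {"term": {"environment": "perf"}},               # assumed field
                {"term": {"log.level": "error"}},
                {"term": {"service.name": "order-processor"}},   # hypothetical service
            ]
        }
    },
    sort=[{"@timestamp": {"order": "desc"}}],
    size=50,
)

for hit in response["hits"]["hits"]:
    print(hit["_source"].get("message", ""))
```

The same filter typed into Kibana's query bar is quicker for interactive work; the programmatic form is useful when the filtering feeds an automated report or dashboard.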
Maintaining a Dedicated Performance Environment
- Do not use a development environment for performance testing.
  - It causes confusion over which versions are being tested, and inaccuracies due to simultaneous development work on other features.
  - It can force development work to halt during performance testing cycles.
  - It may also produce misleading or irrelevant error messages unrelated to performance testing.
- Maintain full control over the creation and provisioning of replicas and their resources, such as CPU and memory.
  - This control is vital because experimenting with different configurations is key to the performance testing process.
  - Some organizations may find a standing dedicated environment cost-prohibitive.
  - In that case, automate standing up and tearing down test environments as needed (see the sketch after this list).
- Larger infrastructures might employ canary testing in production, letting a portion of users trial new features and effectively mimicking a performance testing environment without requiring a separate setup.
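A minimal sketch of that stand-up/tear-down automation, assuming a Kubernetes cluster, a configured `kubectl`, and a hypothetical `perf-env/` directory of manifests; substitute whatever provisioning tooling (Terraform, CloudFormation, Helm) your organization uses.

```python
# Minimal sketch: create a throwaway namespace for a perf run, apply the
# environment manifests, then tear everything down afterwards.
import subprocess

NAMESPACE = "perf-test"  # hypothetical namespace name

def kubectl(*args: str) -> None:
    """Run a kubectl command, raising if it fails."""
    subprocess.run(["kubectl", *args], check=True)

def stand_up() -> None:
    kubectl("create", "namespace", NAMESPACE)
    kubectl("apply", "-n", NAMESPACE, "-f", "perf-env/")  # assumed manifests

def tear_down() -> None:
    # Deleting the namespace removes every resource created inside it.
    kubectl("delete", "namespace", NAMESPACE, "--wait=true")

if __name__ == "__main__":
    stand_up()
    try:
        print("environment ready; trigger the performance run here")
    finally:
        tear_down()  # always clean up, even if the run fails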
The Importance of Early, Frequent, and Local Testing
- Developers should be able to execute performance tests locally with smaller loads (see the sketch after this list). Benefits:
  - It informs the team about which metrics and logs to monitor, which can expedite the documentation process.
  - It allows early identification of performance issues, enabling quicker resolution.
  - It avoids the time-consuming build-and-deploy process that a formal performance test environment often requires.
  - It helps determine the level of automation needed for setting up and tearing down resources, assisting with the creation of the initial scripts.
- Conduct lightweight performance tests frequently on a developer's local machine at the conclusion of each milestone.
  - This proactive approach leads to a more robust system and reduces the number of performance issues discovered once the system moves into the official performance testing environment.
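A hedged sketch of such a lightweight local test, assuming a hypothetical `process_event` handler under test; the event shape, batch size, and throughput floor are placeholders to tune for your system.

```python
# Minimal local performance check: push a batch of synthetic events
# through the handler and assert a throughput floor.
import time

def process_event(event: dict) -> None:
    """Stand-in for the real handler under test (hypothetical)."""
    sum(event["payload"])

def run_local_perf_test(num_events: int = 10_000,
                        min_events_per_sec: float = 1_000.0) -> None:
    events = [{"payload": list(range(50))} for _ in range(num_events)]
    start = time.perf_counter()
    for event in events:
        process_event(event)
    elapsed = time.perf_counter() - start
    throughput = num_events / elapsed
    print(f"{num_events} events in {elapsed:.2f}s "
          f"({throughput:,.0f} events/sec)")
    assert throughput >= min_events_per_sec, "throughput below local target"

if __name__ == "__main__":
    run_local_perf_test()
```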
Additional Best Practices
- Generate performance metrics for both ingress and egress.
  - Log the number of processed, errored, and dropped records at set intervals.
  - This enables the construction of a time-filtered dashboard, aiding in load-pattern analysis (a counter sketch follows this item).
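A minimal sketch of that interval logging, assuming a single-threaded consumer loop and a hypothetical ten-second reporting interval; a metrics library such as the Prometheus client would replace this in production.

```python
# Minimal sketch: count processed/errored/dropped records and log the
# totals at a fixed interval so a dashboard can chart the load pattern.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf.counters")

REPORT_INTERVAL_SECS = 10  # hypothetical reporting interval
counters = {"processed": 0, "errored": 0, "dropped": 0}
last_report = time.monotonic()

def record(outcome: str) -> None:
    """Call with 'processed', 'errored', or 'dropped' for each record."""
    global last_report
    counters[outcome] += 1
    now = time.monotonic()
    if now - last_report >= REPORT_INTERVAL_SECS:
        log.info("interval counts: %s", counters)
        for key in counters:
            counters[key] = 0
        last_report = now
```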
- Capture timestamps from messages so that message processing delays can be monitored.
  - This allows lag to be identified on the dashboard (a lag sketch follows this item).
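A hedged sketch of that lag measurement, assuming each message carries an epoch-seconds `produced_at` field set by the producer; the field name is hypothetical.

```python
# Minimal sketch: compute processing lag from a timestamp carried in the
# message, then log it so the dashboard can chart it over time.
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf.lag")

def handle_message(message: dict) -> None:
    # 'produced_at' is a hypothetical epoch-seconds field set by the producer.
    lag_secs = time.time() - message["produced_at"]
    log.info("processing lag: %.3f seconds", lag_secs)
    # ... actual message handling would follow here ...
```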
- Log errors properly; do not discard them or bury them within complex exception stacks, where they are difficult to locate (a short sketch follows this item).
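A small sketch of the point, contrasting a silently swallowed exception with one logged together with its traceback:

```python
# Minimal sketch: keep failures visible. logging.exception records the
# full traceback, so errors are easy to find during a performance run.
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf.errors")

def process(record: dict) -> bool:
    try:
        record["value"] / record["divisor"]  # hypothetical work that may fail
        return True
    except Exception:
        # Bad: `except Exception: pass` would silently drop the error.
        # Good: log the traceback along with the offending record.
        log.exception("failed to process record: %r", record)
        return False
```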
- Automate the setup and teardown of resources for a performance run as much as possible.
  - Manual execution can lead to misconfigurations and wasted time resolving subsequent issues.
- Maintain end-to-end visibility for logging and metrics, whether the concern is the gateway or the database.
  - This could involve consulting Subject Matter Experts (SMEs), such as a Database Administrator (DBA) to review backend metrics or a DevOps engineer to assist with gateway-related metrics such as request counts and throttling errors.
Conclusion
- Performance testing is an iterative process.
- It demands a continuous feedback loop, continuous learning, and improvements.