Currently, when dynamic rescaling is done, all requests (and their
response times) are treated equally. That is, whether a response is an
HTTP 200 or a 500 error, its response time is used for dynamic
rescaling later.
In my opinion, this should be amended to include only responses with
good return codes (200 only). At the very least, I believe, 500-level
responses should be excluded. Basically, if the backend's application
has crashed, the backend HTTP server's response time should not be used
for dynamic rescaling (it could be unusually slow or unusually fast, and
so is not indicative of the server's normal response speed).
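To make the idea concrete, here is a rough sketch of the filtering I have in mind. It is not taken from the actual code base: the names `should_count` and `BackendStats`, and the `strict` switch, are all made up for illustration, and I am assuming rescaling works from a per-backend average response time.

```python
def should_count(status: int, strict: bool = False) -> bool:
    """Decide whether a response time should feed dynamic rescaling.

    Hypothetical helper: strict=True keeps only HTTP 200,
    strict=False only drops 5xx responses.
    """
    if strict:
        return status == 200
    return not (500 <= status <= 599)


class BackendStats:
    """Hypothetical per-backend accumulator of response times."""

    def __init__(self) -> None:
        self.total_time = 0.0
        self.samples = 0

    def record(self, status: int, elapsed: float, strict: bool = False) -> None:
        # Skip samples from failed responses so a crashed application
        # does not skew the backend's measured speed.
        if should_count(status, strict):
            self.total_time += elapsed
            self.samples += 1

    def average(self) -> float:
        # Average over counted samples only; 0.0 if nothing was counted.
        return self.total_time / self.samples if self.samples else 0.0
```

With `strict=False`, a very fast 500 from a crashed application is simply ignored, while 200 and 404 responses still count. With `strict=True`, only 200 responses count.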
What do you think?