I was struggling with a legacy system. To explain the issue, let's look at the diagram below.
The crew on the ship use a tablet to send data to the AWS Cloud over a 4G Internet connection. The problem is that sometimes the data is sent successfully and sometimes it is not. The Lambda function in the cloud also calls an external API service, and it sometimes gets a 200 status and sometimes a 400. A 400 status here does not necessarily mean a client error, because when I extracted the payload from AWS CloudWatch and re-sent it, it succeeded with a 200 status code.
So I concluded that the cause was the Internet connection. We really should not rely on a single Lambda function like this: when it fails, for whatever reason, the message is lost. (1)
As for the Android application, it makes a single API call without retrying. I really had no clue what was happening, because the tablet is on the vessel and there is no way to collect its logs; we have not set up AWS CloudWatch logging on it. I tried to reproduce the issue by entering the same data as the crew, but it was still sent to the cloud successfully. Note that I am on land, trying to reproduce an issue that occurs on the vessel. So I again concluded that the cause was the Internet connection. (2)
This is my solution. Ideally, we would use the following architecture:
But I only had a short amount of time, so I implemented a retry mechanism in both the Lambda function and the Android tablet application.
MAX_RETRIES = 10
for attempt in range(1, MAX_RETRIES + 1):
    # Post to the API here, storing the response in post_res
    if post_res.status_code == 200:
        # Success: stop retrying
        break
    else:
        print(f"====================RETRY API. Number of retry: {attempt}")
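The loop above retries immediately after each failure, which may not help much if the 4G link is down for several seconds. A possible refinement is to wait between attempts with exponential backoff. The following is only a sketch under assumptions: `post_with_retry`, its parameters, and the `send` callable (which performs the POST, returns the HTTP status code, and is assumed to raise `OSError` on a network failure) are hypothetical names, not part of the original Lambda function.

```python
import random
import time
from typing import Callable, Optional


def post_with_retry(
    send: Callable[[], int],
    max_retries: int = 10,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call send() until it returns 200 or the attempts run out.

    send is a hypothetical callable that performs the POST and returns
    the HTTP status code. Returns True on success, False otherwise.
    """
    for attempt in range(1, max_retries + 1):
        status: Optional[int] = None
        try:
            status = send()
        except OSError as exc:  # assumed to cover connection errors and timeouts
            print(f"Attempt {attempt} raised {exc!r}")
        if status == 200:
            return True
        if attempt < max_retries:
            # Exponential backoff: 1 s, 2 s, 4 s, ... capped at max_delay,
            # plus up to 10% random jitter so clients do not retry in lockstep.
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay + random.uniform(0, delay * 0.1))
    return False
```

The `sleep` parameter exists only so the waiting can be replaced in tests. If the function returns False, the caller still has to decide what to do with the payload, for example store it somewhere for a later re-send, so that the message is not lost.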
Here is the Android version in Java.
int numRetries = 720;  // 720 attempts x 5 s wait = about one hour of waiting in total
int timeout = 5;       // seconds to wait between attempts
while (numRetries > 0) {
    try {
        // Post api here
        break;  // success: stop retrying
    } catch (java.net.SocketTimeoutException e) {
        Log.d(TAG, "====================== SocketTimeoutException: " + e);
        numRetries--;
        TimeUnit.SECONDS.sleep(timeout);
    } catch (java.io.IOException e) {
        Log.d(TAG, "====================== IOException: " + e);
        numRetries--;
        TimeUnit.SECONDS.sleep(timeout);
    } finally {
        // Always release the connection, whether the attempt succeeded or failed
        connection.disconnect();
        Log.d(TAG, "====================== numRetries: " + numRetries);
    }
}
So if you develop a one-time submission, be sure to add a retry function. Think of the many things that can go wrong: application issues, Internet issues, typos. We also need a way to collect logs on the device, whether online or offline. I'm not an Android developer, and in this case the previous developer had not configured any logging solution on the device, which made it really tough. It was even tougher because our client did not tell us that they had changed the connection system (connector, antenna, 4G, etc.).
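To illustrate the offline option, here is a minimal sketch of a size-limited local log file that could be collected later, when someone has access to the device. It is written in Python only to show the idea; on the tablet the equivalent would have to be done in the Android application. The function name `make_offline_logger`, the logger name, and the size limits are my assumptions.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_offline_logger(path: str) -> logging.Logger:
    """Return a logger that appends to a size-limited local file."""
    logger = logging.getLogger("submission")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        # Keep at most about 1 MB per file and 3 old files, so the log
        # cannot fill up the device storage.
        handler = RotatingFileHandler(path, maxBytes=1_000_000, backupCount=3)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

For example, `make_offline_logger("submission.log").warning("POST failed: attempt=%d error=%s", 3, "timeout")` would append a timestamped line to the file. With even this much, I could have seen whether the failures on the vessel were timeouts or something else.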