So what is polling?
Webhooks
When an event is triggered, a function or API is invoked. Let’s say we have service A, and we write a function so that every time A gets a message, it invokes an API to ask service B to do a specific thing. So if A is the server and B is the client, any time the server gets a message, it invokes the API to recalculate something on B. A will say:
Hey, I get potatoes weekly. Do you want me to call you when I get them in the future, so you have potatoes to cook?
A leaves some slots for a few functions. We can call them “callback” functions. By default, A runs through these functions like a checklist whenever somebody has registered one. They do nothing unless we say:
Sure, let me know once you get potatoes.
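The callback idea above can be sketched in a few lines of Ruby. This is only an illustration: `PotatoShop`, `subscribe`, and `receive` are hypothetical names, and a real webhook would send an HTTP request to a URL that B registered instead of calling a block in the same process.

```ruby
# Hypothetical sketch: service A keeps a list of callbacks and runs
# each one whenever a new message arrives.
class PotatoShop
  def initialize
    @callbacks = []
  end

  # B registers interest once ("let me know once you get potatoes").
  def subscribe(&callback)
    @callbacks << callback
    self
  end

  # A receives a message and notifies every subscriber.
  # With no subscribers, nothing happens.
  def receive(message)
    @callbacks.each { |callback| callback.call(message) }
  end
end

shop = PotatoShop.new
cooked = []
shop.subscribe { |message| cooked << "cooking #{message}" }

shop.receive("potatoes")
```

After the last line, `cooked` holds one entry, because B's block ran as soon as A received the message. B never had to ask.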
However, when working with RESTful APIs we need to pay attention to the internet connection. I ran into this issue when using a tablet on vessels. We need to set up a retry function that invokes the API again when a call fails because of the connection.
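A retry function like that might look roughly like the sketch below. `with_retries` and its parameters are hypothetical names of mine, not part of any library, and the flaky call is simulated with a block instead of a real HTTP request.

```ruby
# Hypothetical helper: run the block, and run it again when it raises
# one of the listed errors, waiting a bit longer after each failure.
def with_retries(max_attempts: 3, base_delay: 0.5, retry_on: [StandardError])
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue *retry_on
    raise if attempts >= max_attempts        # give up and re-raise the error
    sleep(base_delay * (2**(attempts - 1)))  # exponential backoff
    retry
  end
end

# Simulated flaky call: fails twice, then succeeds on the third attempt.
result = with_retries(max_attempts: 5, base_delay: 0.01, retry_on: [IOError]) do |attempt|
  raise IOError, "connection lost" if attempt < 3
  "delivered on attempt #{attempt}"
end
```

In a real app the block would wrap the HTTP call, and `retry_on` would list the connection errors you expect, for example `SocketError` or `Errno::ECONNREFUSED`. The right list depends on the HTTP client you use.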
Long Polling
If A does not provide webhooks, it might provide long polling. Imagine that B has to call A's API every 5 seconds. If A has not received any messages yet, it returns an empty response. That is kind of a waste. In Amazon SQS, long polling also eliminates false empty responses by querying all of the servers instead of a sample of them.
A has not set up webhooks, but B can ask:
Hey, do you have any potatoes?
A says:
No, I don’t. But I will call you whenever I get them.
So B does not have to ask A day after day. B just asks once and waits until A responds.
But B needs to tell A:
I will wait for 3 days.
So on the fourth day, A won’t respond even if A has got potatoes.
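The conversation above can be sketched as a tiny in-process simulation. `long_poll` is a hypothetical helper, and a plain Ruby `Queue` stands in for A's message store. A real long poll would be an HTTP request that the server holds open.

```ruby
# Hypothetical sketch: wait for a message until the deadline passes.
# Returns the message, or nil if nothing arrived in time.
def long_poll(queue, wait_seconds:, tick: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + wait_seconds
  loop do
    begin
      return queue.pop(true) # non-blocking pop raises ThreadError when empty
    rescue ThreadError
      return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      sleep tick
    end
  end
end

# A gets potatoes shortly after B starts waiting.
shop = Queue.new
Thread.new do
  sleep 0.05
  shop << "potatoes"
end

answer  = long_poll(shop, wait_seconds: 1)          # B asks once and waits
too_late = long_poll(Queue.new, wait_seconds: 0.05) # nothing arrives in time
```

Here `answer` is the message A delivered while B was waiting, and `too_late` is `nil` because the wait time ran out first. That is the "I will wait for 3 days" part of the story.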
But we need to use it correctly. Long polling asks and then waits for a response. If we need fresh data every minute, we might create a background job that runs every minute. But suppose we only get 5 messages in 5 minutes:
Minute | Messages received
1 | none
2 | none
3 | none
4 | 2 messages
5 | 3 messages
B hits A 5 times and wastes the first 3 calls on empty responses. Instead of doing that, we poll every 5 minutes.
max_number_of_messages: 5 # the maximum number of messages to yield from each polling attempt
idle_timeout: 60 # stop polling after 60 seconds with no received messages
wait_time_seconds: 20 # the maximum duration of each polling attempt (Amazon SQS allows at most 20 seconds)
Imagine a timeline where red is the 5-minute (300-second) gap between polling runs and blue is the 1-minute idle timeout. The background job starts a polling run every 5 minutes. Within a run, each request waits up to wait_time_seconds for messages. If 60 seconds pass without any messages, the run stops at the 60th second, and the next run starts when the 5 minutes are up. With this setting, each polling attempt yields at most 5 messages. Tune that number for performance.
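To make the earlier comparison concrete, the sketch below counts requests and empty responses for the five-minute example with 2 messages in minute 4 and 3 messages in minute 5. `poll_stats` is a hypothetical helper. It models simple interval polling only, not the SQS poller itself.

```ruby
# Messages that arrive during each of the five minutes.
arrivals = [0, 0, 0, 2, 3]

# Count requests, empty responses, and messages received when we poll
# once per `every` minutes and collect whatever arrived in that window.
def poll_stats(arrivals, every:)
  requests = 0
  empty = 0
  received = 0
  arrivals.each_slice(every) do |window|
    requests += 1
    batch = window.sum
    empty += 1 if batch.zero?
    received += batch
  end
  { requests: requests, empty: empty, received: received }
end

every_minute = poll_stats(arrivals, every: 1)
# => { requests: 5, empty: 3, received: 5 }
every_five = poll_stats(arrivals, every: 5)
# => { requests: 1, empty: 0, received: 5 }
```

Both schedules end up with the same 5 messages, but the 5-minute schedule gets them with one request and no empty responses. The price is that a message can wait up to 5 minutes before it is picked up.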
Webhooks are clearly the better option. But long polling is more than enough when the alternative is a scheduler that hits the server every 5 seconds for data.
Below is sample Ruby code that implements long polling.
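require 'sidekiq'
require 'sidekiq/api' # provides Sidekiq::Queue

# Sidekiq worker that starts the SQS poller unless too many copies are already queued.
class SqsReceiveMessage
  include Sidekiq::Worker

  def perform
    # Skip this run if more than one SqsReceiveMessage job is waiting in the queue.
    return if count_jobs > 1

    Amazon::ReceiveMessage.call
  end

  private

  # Number of SqsReceiveMessage jobs currently enqueued in the default queue.
  def count_jobs
    Sidekiq::Queue.new(:default).count do |job|
      job.klass == 'SqsReceiveMessage'
    end
  end
end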
require 'aws-sdk-sqs'

module Amazon
  # Long-polls an SQS queue and hands each message body to the processor.
  class ReceiveMessage < BaseService
    attr_reader :region, :queue_name, :sqs, :processor

    def initialize(processor = BroadcastSeat)
      Dotenv::Railtie.load
      @region = ENV['AWS_REGION']
      @queue_name = Rails.configuration.common.dig(:sqs_queue)
      @sqs = Aws::SQS::Client.new(region: region)
      @processor = processor
    end

    def call
      queue_url = sqs.get_queue_url(queue_name: queue_name).queue_url

      Aws::SQS::QueuePoller.new(queue_url).poll(
        max_number_of_messages: 10
        # idle_timeout: 60 # stop polling after 60 seconds with no messages (polls indefinitely by default)
      ) do |messages|
        # With max_number_of_messages greater than 1, the poller yields an array of messages.
        messages.each { |message| processor.call(message.body) }
      end
    rescue Aws::SQS::Errors::NonExistentQueue
      puts "Cannot receive messages using Aws::SQS::QueuePoller for a queue named '#{queue_name}', as it does not exist."
    end
  end
end