Down Boy: How to easily throttle requests to an API using Redis
Last week Eric and I needed to gather historical weather data from the Weather Underground API. We needed to retrieve a large amount of data, and because we were pre-loading it, we did not mind spreading the collection over a few days to minimize costs for our client.
Weather Underground's pricing is reasonable until you need the historical weather data add-on, which bumps it to over $500 a month. With this in mind, we challenged ourselves to provide strategic value to our client by throttling our API usage.
The free plan allows up to 500 requests per day and only 10 requests per minute. If you exceed the 10 requests per minute limit, your API key gets suspended for the day. So we need to keep track of both daily_count and threshold.
Thankfully, Redis makes this rather easy with the INCR command.
redis> SET mykey "10"
OK
redis> INCR mykey
(integer) 11
redis> GET mykey
"11"
The example above comes from the Redis commands page. What the example does not show, however, is that when you INCR a key that does not exist, Redis treats the nil as 0 and sets the key to 1.
redis> GET foo
(nil)
redis> INCR foo
(integer) 1
redis> INCR foo
(integer) 2
With this knowledge we can easily keep track of the number of calls we make to the API. The next step is to make sure the counters are reset appropriately, which is another place where Redis makes things easy.
Threshold
The Redis command EXPIRE tells Redis to DEL the key after a given number of seconds. So when we increment our counter, we set it to expire after 60 seconds if this is the first increment.
def increment_threshold
  redis.expire(:threshold, 60) if redis.incr(:threshold) == 1
end
We can add a check in our worker to idle until under our limits.
def perform
  sleep 1 while above_threshold?
  increment_threshold
  # Do work
end

def above_threshold?
  # GET returns a String, or nil once the key has expired; to_i maps nil to 0
  redis.get(:threshold).to_i >= 10
end
Redis will automatically delete the key after 60 seconds and the next time we increment our threshold counter it will be set to 1 and expire in another 60 seconds.
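To watch that reset happen without waiting a full minute, here is a small sketch. `FakeRedis` is a hypothetical in-memory stand-in, not a real client: it mimics only INCR, EXPIRE, and GET, and takes an injectable clock. `increment_threshold_on` is likewise an illustrative name for the same logic with the client passed in.

```ruby
# Hypothetical in-memory stand-in for the three Redis commands used above.
# It is NOT a Redis client; it only exists so the window can be observed
# with a fake clock instead of a real 60-second wait.
class FakeRedis
  def initialize(clock)
    @clock = clock       # callable returning the current time in seconds
    @values = {}
    @deadlines = {}
  end

  # Like INCR: a missing key is treated as 0, so the first call returns 1.
  def incr(key)
    purge(key)
    @values[key] = @values.fetch(key, 0) + 1
  end

  # Like EXPIRE: the key disappears `seconds` from now.
  def expire(key, seconds)
    @deadlines[key] = @clock.call + seconds
    true
  end

  # Like GET: returns a String, or nil when the key does not exist.
  def get(key)
    purge(key)
    @values[key]&.to_s
  end

  private

  # Drop the key once its deadline has passed.
  def purge(key)
    deadline = @deadlines[key]
    return unless deadline && @clock.call >= deadline

    @values.delete(key)
    @deadlines.delete(key)
  end
end

# Same logic as increment_threshold above, with the client passed in.
def increment_threshold_on(redis)
  count = redis.incr(:threshold)
  redis.expire(:threshold, 60) if count == 1
  count
end

now = 0
redis = FakeRedis.new(-> { now })
10.times { increment_threshold_on(redis) }
puts redis.get(:threshold).inspect   # "10" while the window is open
now = 60
puts redis.get(:threshold).inspect   # nil once the key has expired
puts increment_threshold_on(redis)   # 1, and a fresh 60-second window starts
```

Note that GET hands back a String (or nil), so any comparison against a numeric limit needs a conversion such as `to_i` first.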
Daily Limit
We also needed to make sure that we did not exceed the allotted number of daily requests. This limit is reset by Weather Underground each day at midnight EST. We can achieve the same effect as the threshold, but with a fixed reset time, using Redis' EXPIREAT command. EXPIREAT is the same as EXPIRE except that it takes a specific Unix timestamp.
def increment_daily_count
  if redis.incr(:daily_count) == 1
    # Expire at the next midnight in the server's local time zone
    redis.expireat(:daily_count, (Date.today + 1).to_time.to_i)
  end
end
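One thing to watch in the snippet above: `(Date.today + 1).to_time` is midnight in the time zone of the machine running the worker, which may not be Eastern time. A minimal sketch of computing the timestamp at a fixed offset instead follows. The helper name `next_midnight_timestamp` and the `EST_OFFSET` constant are made up for illustration, and the fixed UTC-05:00 offset is an assumption that ignores daylight saving time.

```ruby
require 'date'

# Assumption: the reset happens at midnight at a fixed UTC-05:00 offset (EST).
# Daylight saving time is deliberately ignored in this sketch.
EST_OFFSET = '-05:00'

# Returns the Unix timestamp of the next midnight at the given UTC offset,
# regardless of the time zone the worker itself runs in.
def next_midnight_timestamp(now = Time.now, offset = EST_OFFSET)
  local = now.getlocal(offset) # same instant, viewed at the target offset
  tomorrow = Date.new(local.year, local.month, local.day) + 1
  Time.new(tomorrow.year, tomorrow.month, tomorrow.day, 0, 0, 0, offset).to_i
end

# Possible use with the counter above:
#   redis.expireat(:daily_count, next_midnight_timestamp)
```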
Now all we need to do is update the sleep conditional and call increment_daily_count before we make the API call.
def perform
  sleep 1 while at_limit?
  increment_counters
  # Do API Calls
end

def at_limit?
  above_daily_limit? || above_threshold?
end

# GET returns a String, or nil once the key has expired; to_i maps nil to 0
def above_daily_limit?
  redis.get(:daily_count).to_i >= 500
end

def above_threshold?
  redis.get(:threshold).to_i >= 10
end

def increment_counters
  increment_threshold
  increment_daily_count
end

def increment_daily_count
  if redis.incr(:daily_count) == 1
    # Expire at the next midnight in the server's local time zone
    redis.expireat(:daily_count, (Date.today + 1).to_time.to_i)
  end
end

def increment_threshold
  redis.expire(:threshold, 60) if redis.incr(:threshold) == 1
end
With our API calls throttled, we can safely spin up workers without having to worry about going over our threshold or our daily limit. Of course, it would be best to replace the numbers in this code with constants that can be set through a settings file or an initializer.
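As a rough sketch of what pulling the numbers out could look like, the limits below are read once into constants. The environment variable names are assumptions invented for this example, and `redis` is passed in as any client whose `get` behaves like redis-rb's (a String, or nil for a missing key).

```ruby
# Hypothetical settings; the environment variable names are assumptions.
# The defaults match the free-plan limits described in this post.
PER_MINUTE_LIMIT = Integer(ENV.fetch('API_PER_MINUTE_LIMIT', '10'))
DAILY_LIMIT      = Integer(ENV.fetch('API_DAILY_LIMIT', '500'))

# True when either counter has reached its configured limit.
def at_limit?(redis, per_minute: PER_MINUTE_LIMIT, daily: DAILY_LIMIT)
  redis.get(:daily_count).to_i >= daily ||
    redis.get(:threshold).to_i >= per_minute
end
```

Keeping the limits as keyword arguments with constant defaults also makes it easy to exercise the check with smaller numbers in a test.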
Comments
I think it might be a little easier (but less creative) to use the SlowWeb gem:
https://github.com/benbjohnson/slowweb
@Almog, this seems good; however, it has not been updated in the last 4 years, so I am hesitant to use it.
Watch out for race conditions with this approach – if we increment the counter but fail to expire that key, then rate limiting is effectively disabled.
More on the Redis docs:
http://redis.io/commands/INCR
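One way to soften the failure described in this comment is to repair a missing expiry on every increment rather than only on the first one. The sketch below is not from the original post: `increment_with_expiry` is a made-up helper, and `redis` is assumed to be a redis-rb style client responding to `incr`, `ttl`, and `expire`. It is still two separate commands rather than one atomic step, so a server-side script would be the stricter option.

```ruby
# Increments `key` and makes sure it always carries an expiry.
def increment_with_expiry(redis, key, seconds)
  count = redis.incr(key)
  # TTL returns -1 for a key that exists but has no expiry, which is the
  # state left behind when an earlier EXPIRE call never went through.
  redis.expire(key, seconds) if redis.ttl(key) == -1
  count
end
```

The trade-off is one extra round trip per request; in exchange, a lost EXPIRE leaves the counter without a deadline only until the next increment.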
This may be practical to use, but it is not a complete solution. Below is the scenario where this fails:
This will work, but effectively between 00:00:40 and 00:01:01 I made 19 API calls, which should not be allowed. So this approach only solves half the problem.
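The burst described here comes from counting in fixed windows: calls late in one window and early in the next can land close together. A sliding window is one common alternative, and a minimal in-memory sketch follows. It is a different technique from the one in the post, `SlidingWindowLimiter` is a hypothetical class, and it only works within a single process; sharing it between workers would need a shared store (a Redis sorted set is one possibility).

```ruby
# In-memory sliding-window limiter: allows at most `limit` calls in any
# span of `window` seconds, so a burst cannot straddle a window boundary.
class SlidingWindowLimiter
  def initialize(limit:, window:)
    @limit = limit
    @window = window
    @timestamps = [] # times of accepted calls, oldest first
  end

  # Records the call and returns true if it fits, otherwise returns false.
  def allow?(now = Time.now.to_f)
    # Forget calls that have aged out of the window.
    @timestamps.shift while @timestamps.any? && @timestamps.first <= now - @window
    return false if @timestamps.size >= @limit

    @timestamps << now
    true
  end
end

limiter = SlidingWindowLimiter.new(limit: 10, window: 60)
# In a worker loop: sleep 1 until limiter.allow?
```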