Validating URL/URI in Ruby on Rails
Wouldn't it be nice if you could validate the format and the existence of a URI/URL provided by the users in your Rails application? Well, I thought so. So after a little research I put together a validates_uri_existence_of.rb which does exactly that. Take a look below:
require 'net/http'
# Original credits: http://blog.inquirylabs.com/2006/04/13/simple-uri-validation/
# HTTP Codes: http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTPResponse.html
class ActiveRecord::Base
def self.validates_uri_existence_of(*attr_names)
configuration = { :message => "is not valid or not responding", :on => :save, :with => nil }
configuration.update(attr_names.pop) if attr_names.last.is_a?(Hash)
raise(ArgumentError, "A regular expression must be supplied as the :with option of the configuration hash") unless configuration[:with].is_a?(Regexp)
validates_each(attr_names, configuration) do |r, a, v|
if v.to_s =~ configuration[:with] # check RegExp
begin # check header response
case Net::HTTP.get_response(URI.parse(v))
when Net::HTTPSuccess then true
else r.errors.add(a, configuration[:message]) and false
end
rescue # Recover on DNS failures..
r.errors.add(a, configuration[:message]) and false
end
else
r.errors.add(a, configuration[:message]) and false
end
end
end
end
Code above is a barebones check which first validates the URL format against a provided regular expression (http://www.etc...) and if it passes, it sends a HEAD (header) request and checks if we get a 200 Code (HTTPSuccess) back. You could easily extend it to handle redirects, etc.
How do you use it in your application? Copy the code below into a validates_uri_existence_of.rb and drop it into your lib directory in your Rails application. Next, open up your environment.rb (under /config) and at the bottom of the file add require 'validates_uri_existence_of'. Almost there, now we can use our new custom validation in any model by calling:
validates_uri_existence_of :url, :with =>
/(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
:url is the field you want to validate, and with specifies the Regular expression to check the URL against. One in the example should work for almost all valid http / https links. I haven't found a problem with it yet. Theoretically, you could change the expression to validate only URL's within a specific domain or a set of domains.
So, now we can validate the format of the URL and check that the URL is alive and breathing (returns Code 200).
Ilya Grigorik is a web performance engineer and developer advocate on the Make The Web Fast team at Google, where he spends his days and nights on making the web fast and driving adoption of performance best practices.
Follow @igrigorik