Justin Collins' Blugh

SQL injection (SQLi) is any situation in which a user can manipulate a database query in an unintended manner. Consequences of SQL injection vulnerabilites range from data leaks, to authentication bypass, to root access on a database server. In short, it is a very big deal.

Most Rails applications interact with a database through ActiveRecord, the default and convenient Object Relational Mapping (ORM) layer which comes with Rails. Generally, use of ORMs is safer than not. They can provide abstraction and safety and allow developers to avoid manually building SQL queries. They can embody best practices and prevent careless handling of external input.

Instead of unsafe code like

query = "SELECT * FROM users WHERE name = '#{name}' AND password = '#{password'} LIMIT 1"
results = DB.execute(query)

You can have safer, simpler code like

User.where(:name => name, :password => :password).first

My impression is many people assume the Rails framework will protect them as long as they avoid the “obviously dangerous” methods, like find_by_sql.

Unfortunately, ActiveRecord is unsafe more often than it is safe. It does provide parameterization of queries (the API documentation for which can be found here) for some methods, there are many methods for which it does not. While these methods are not intended to be used with user input, the truth is that has never stopped anyone.

To make it clear how dangerous it can be to use ActiveRecord, consider ActiveRecord::FinderMethods#exists? which queries the database and returns true if a matching record exists. The argument can be a primary key (either integer or string, if a string it will be sanitized), an array consisting of a template string and values to safely interpolate, or a hash of column-value pairs (which will be sanitized).

Here is an example of using exists? to determine if a given user exists:

User.exists? params[:user_id]

This looks harmless, since params[:user_id] is a string, and strings will be sanitized. In fact, the documentation clearly points out not to pass in conditions as strings, because they will be escaped.

However, there is no gaurantee params[:user_id] is a string. An attacker could send a request with ?user_id[]=some_attack_string, which Rails will turn into an array ["some_attack_string"]. Now the argument is an array, the first element of which is not escaped.

To avoid this problem, the user input should be converted to the expected type:

User.exists? params[:user_id].to_i

Or use a hash:

User.exists? :id => params[:user_id]

This should be the approach for all uses of user input. Do not assume anything about values from external sources or what safety mechanisms a method might have.

While working on Brakeman, I thought it would be useful to put together a list of all the unsafe ways one can use ActiveRecord.

To make it easier on myself, I built the list into a Rails application so I could easily test, verify, and record any findings. The source is available here for those who would like try out the examples. The application is a single page of all the queries and example injections. From there one can submit queries and see the results:

Query Example

The resulting information is available at rails-sqli.org, including examples of how SQL injection can occur and the resulting queries. This is basically a big list of what not to do when using ActiveRecord. Again, please feel free to contribute so that the list can be as authoritative as possible and help everyone avoid SQL injection in Rails.

Background

About a month ago, an issue was reported where Brakeman taking a ridiculously long time on the “call indexing” step. At the time, I was pessimistic about opportunities to improve the performance of call indexing, since it is a pretty simple operation.

Call Indexing

The majority of the checks performed by Brakeman involve finding and examining method calls (e.g., SQL queries). In order to make these checks faster, Brakeman scans an app once and then saves information about each method call in a data structure called the “call index”. This makes searching for specific method calls very fast.

The call index allows search for methods by name and also by the class of the target. For example, it is possible to search for all calls to open on File. It also allows searching via regular expressions for methods and targets.

Investigation

I happened to notice a Rails application in my collection which also seemed to take a long time indexing calls. So I ran it through perftools.rb to see if there was anything interesting going on.

This is the result:

The large amount of time spent in the garbage collector (60%) was high even for Brakeman. But then something else caught my eye:

Call indexing in 1.8.2

This scan spent 4.7% of its time converting Sexps to strings while indexing calls. This seemed excessive.

This is the entirety of the call indexing code:

call_index.rb

  def index_calls calls
    calls.each do |call|
      @methods << call[:method].to_s
      @targets << call[:target].to_s
      @calls_by_method[call[:method]] << call
      @calls_by_target[call[:target]] << call
    end
  end

@methods and @targets are sets which contain the string versions of all the methods and method targets. This is exclusively used to search for methods and targets via regular expressions.

The call method will always be a symbol…but what about the target? It turns out that while much of the time it is a symbol, if a sane value like :File or :@something cannot be found, then it will be the entire Sexp! This is where Brakeman was wasting time calling Sexp#to_s.

The quick fix was to only store symbol targets in the @targets set, ignoring any other target values.

Results

Scanning the same application with Brakeman 1.8.3 has this result:

Garbage collection time dropped from 60% to 42%. While still very high, this is a good sign. Time spent indexing calls has dropped from 5.4% to 1.8% and the calls to Sexp#to_s have vanished.

The total scan time dropped from 3.5 minutes to about 2 minutes. For the original reporter, scan times went from 78 minutes to 40 seconds.

More Improvements

Looking through Brakeman, it does not currently use the “search via regex” feature for the call index. So the method and target name sets can be removed entirely.

Going even further, nowhere does Brakeman search for targets by any values other than symbols. Note in the graph below that Array#eql? was sampled 1,330 times during call indexing:

Call indexing hashing

Since Sexps are subclassed from Array, it is clear that these calls are generated when using the call[:target] as a hash key (line 6 above). Again, the current Brakeman code only searches for call targets by symbol, never by a full Sexp. There is no reason to the call targets that are Sexps.

This is the modified call indexing code:

Modified call_index.rb

  def index_calls calls
    calls.each do |call|
      @calls_by_method[call[:method]] << call

      unless call[:target].is_a? Sexp
        @calls_by_target[call[:target]] << call
      end
    end
  end

With this code in place, call indexing does not even show up under perftools. Speed improvements vary by project, but this should at least shave off a few seconds. YMMV.

Wrapping Up

Some quick profiling led me to performance improvements where I really did not expect to find them. Sadly, this is one of the cleanest, simplest parts of Brakeman, so I know there are many other instances where Brakeman can be improved. Prior to the introduction of the call index in Brakeman 1.0, I was trying to keep Brakeman scans under 20 minutes (on large applications). Now I worry when scans take longer than a few minutes.

97% of the open source Rails applications I use as test cases can be scanned in less than 30 seconds. Unfortunately, this probably does not reflect scan times for large, commercial applications. Please report any long-running scans! It may lead to more speed improvements like the ones above.

*.blog

Avoiding SQL Injection in Rails

Faster Call Indexing in Brakeman