Six months ago, our criterion for choosing between client-side MVC (Single Page Apps, or SPAs) and server-side MVC was simple: client-side MVC if SEO (search engine optimization) was of no concern, server-side MVC if it was...

This strategy, however, has proven less applicable as SPAs grow in popularity and server-side MVC continues its decline. We often run into projects that need both client-side MVC and SEO. Since most search engines do not execute JavaScript on HTML pages, and content on SPA pages is mostly generated by JavaScript, indexing Single Page App pages yields poor SEO results.

The SEO Solution

Using (our favorite) AngularJS, here is one approach to solving the SEO problem for Single Page Apps. Note: The same approach can be used for other frameworks (e.g. Backbone and Ember) as well. See links under Resources for more details.

Most search engines support the hashbang URL convention: when they see #! in a URL, they request the page with ?_escaped_fragment_= substituted for it.

For example:

http://example.com/#!/products 

will be indexed at the expanded url:

http://example.com/?_escaped_fragment_=/products  

So the idea is to store and serve the pre-rendered pages (snapshots) at the expanded urls. The snapshot of a page contains the DOM content already rendered out by JavaScript.

It takes several steps to make this work:

1. Enable hashbang on the client side

Since Angular routes (URLs) use the /#/ prefix by default, we need the following to make routes use the /#!/ hashbang format:

$locationProvider.hashPrefix('!'); 

For apps that use the HTML5 pushState, we will need to add

<meta name="fragment" content="!"> 

to the <head> section of the to-be-indexed pages. In Angular apps, push state is enabled with:

$locationProvider.html5Mode(true);   

2. Route URLs with _escaped_fragment_ to cached pages on the server side

  • Rails/Sinatra - use Rack middleware to redirect to the pre-rendered URLs
  • Node/Express - use middleware to redirect to the pre-rendered URLs
  • Apache - use mod_rewrite to rewrite to the pre-rendered URLs
  • Nginx - use a proxy to the pre-rendered URLs

For Rails apps, you can determine in the Rack middleware whether a request is coming from the crawler by checking if '_escaped_fragment_' is present on the request, and redirect to the pre-rendered URLs:

query_hash = Rack::Utils.parse_query(request.query_string)
if query_hash.has_key?('_escaped_fragment_')
  # redirect to the pre-rendered urls
  ...
end
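
Fleshed out, such a middleware might look something like the following sketch, which serves snapshots straight from disk (the class name, the public/snapshots directory, and the index.html file layout are assumptions for illustration, not a convention Rails imposes):

class EscapedFragmentSnapshots
  def initialize(app, snapshot_dir = 'public/snapshots')
    @app = app
    @snapshot_dir = snapshot_dir
  end

  def call(env)
    request = Rack::Request.new(env)
    query_hash = Rack::Utils.parse_query(request.query_string)
    if query_hash.has_key?('_escaped_fragment_')
      # Map the escaped fragment back to the snapshot captured for that route
      path = query_hash['_escaped_fragment_'].to_s.sub(%r{^/}, '')
      file = File.join(@snapshot_dir, path, 'index.html')
      return [200, {'Content-Type' => 'text/html'}, [File.read(file)]] if File.exist?(file)
    end
    @app.call(env)
  end
end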

3. Capture, store, and cache the pre-rendered snapshots

Do this on a regular basis, preferably as soon as content changes are made to the pages that need to be indexed. Note: only capture SEO-worthy pages.

There are a number of tools that can be used to take snapshots of web pages. PhantomJS, CasperJS, and Zombie are great options.

Here's an example of JavaScript that uses CasperJS (which uses PhantomJS and provides some nice higher level browser functions) to capture the page:

var casper = require('casper').create();
var url = 'http://localhost:3000/#!/';

casper.start(url, function() {
  // Grab the fully rendered markup from the page context
  var html = this.evaluate(function() {
    return document.documentElement.outerHTML;
  });
  this.echo(html);
});

casper.run();

You can parameterize something like this, and run it against all the pages that need to be indexed by the search engines.
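
Here is a rough Ruby sketch of that idea, assuming a snapshot.js variant of the CasperJS script above that takes the target URL as its first command-line argument, a casperjs binary on the PATH, and the public/snapshots layout used in the middleware sketch earlier:

require 'fileutils'

routes = ['/', '/products', '/about']            # the SEO-worthy routes

routes.each do |route|
  url = "http://localhost:3000/#!#{route}"
  dir = File.join('public/snapshots', route.sub(%r{^/}, ''))
  FileUtils.mkdir_p(dir)
  html = `casperjs snapshot.js '#{url}'`         # capture the rendered DOM
  File.write(File.join(dir, 'index.html'), html)
end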

Here are the screen shots of the Angular TodoMVC app (with Rails backend):

Screen Shot #1 - Original AngularJS rendered content

Screen Shot #2 - As crawler sees it, no content was indexed (simulated with JavaScript turned off in the browser)

Screen Shot #3 - Snapshot taken with CasperJS, served at the expanded url with all the content

As you can see, the snapshot (Screen Shot #3 above) has all the content from the original page (Screen Shot #1) sans styling, ready to be indexed by the crawlers.

SPA SEO Services

Not interested in the DIY method? Here are a few services that provide SEO for Single Page Apps:

If you use Divshot (you should take a look if you haven't), SPA SEO support is built in; they partnered with Prerender.io to bring you this nice feature. See the details here:

  • Divshot SEO for Single Page Apps

Resources

I recently worked on a Rails project with Peter (@sporkd). The project is intended to be used as a sub-site and should be served under a sub-URI. After some googling, we ended up setting config.assets.prefix and wrapping all routes in a scope. The solution is simple and worked well. But soon some weird bugs were found, and Peter successfully isolated the problem to the session (see the demo sporkd/asset_prefix_test).

After several hours of debugging, I finally found the cause. To make a long story short, the routes configured in routes.rb should not start with config.assets.prefix; otherwise the session save is skipped. The demo sporkd/asset_prefix_test can be fixed by simply setting config.assets.prefix to /one/assets. Setting a unique prefix for assets also gives you a bonus, since it is easy to add an expiration header for assets in the web server.
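
For reference, the fix is a one-line configuration change (a sketch; '/one' is the sub-URI used in the demo):

# config/application.rb
config.assets.prefix = '/one/assets'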

If you are curious why config.assets.prefix can affect the session and want to know some internals about the X-Cascade header in Rails, please read on.

X-Cascade Header in Rails

I had never heard of the X-Cascade header in Rails before. @soulcutter has a post that describes its usage.

The basic idea is this: if you return a response from a controller with the X-Cascade header set to "pass", it indicates that your controller thinks something else should handle the request. So Rails (or is it Rack? in any case...) will continue down your routes looking for the next rule that matches the request.
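
For example (a minimal sketch; the controller and action names are purely illustrative), an action that has matched a request can decline it and let the router keep trying later routes:

class LegacyPagesController < ApplicationController
  def show
    # Tell the router "not me" and let it keep matching subsequent routes
    response.headers['X-Cascade'] = 'pass'
    head :not_found
  end
end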

Indeed, X-Cascade is not restricted to controllers: if a mounted engine sets this header, Rails also continues searching down the routes.

It is a feature of Rails. Since 3.2, Rails has moved the routing logic to Journey, and the X-Cascade trick can be found in journey/router.rb#L69.

Note that the Rack env object is shared when the request is passed on, so if env is changed by an earlier route, later ones are affected. This is the root cause of the weird session issue, because the session is controlled by env['rack.session.options'].

Sprockets, who skips the session

Sprockets, the gem behind the Rails asset pipeline, mounts itself at config.assets.prefix and prepends its route to the Rails routes. So if a user accesses a page whose path starts with config.assets.prefix, Sprockets always processes the request first.

Perhaps for performance reasons, Sprockets disables session saving by changing env['rack.session.options']:

env['rack.session.options'] ||= {}
env['rack.session.options'][:defer] = true
env['rack.session.options'][:skip] = true

The options are changed even when the asset is not found. In that case, Sprockets returns a 404 and sets the X-Cascade header, Rails passes the request on to the controller, and the correct page is rendered as expected. However, since the session has already been disabled by Sprockets, any session changes made in the controller are never saved.

Because env is a shared resource between routes when X-Cascade is set, it should not be changed unless it has to be. When an asset is not found, Sprockets should just pass through without touching env, so I submitted a PR for it.

How We Debugged

Peter and I worked in different time zones. He first found the session issue because several features related to the session did not work. At the end of the day in his time zone, he made the demo sporkd/asset_prefix_test to isolate the issue with minimal code and left me a message.

When my day started, I picked up the message and began debugging the session in doitian/asset_prefix_test, based on Peter's demo.

Because the session store class is customizable, I inherited one from the default cookie store and added breakpoints using pry. Soon I found out that options[:skip] was true, but I had no idea where it was being set. Then I grepped (using ag) through all the gems, and fortunately only Sprockets sets this option to true. The remaining work was just figuring out why Sprockets was invoked before the controller action.

The web application landscape has changed drastically in the past year or two. Where once every site was a silo unto itself and could reasonably expect users to create a unique login and password for each site, it is now a different story. I sigh every time I have to fill out yet another registration form, wishing instead for a simple "Connect with Facebook", "Sign in with Twitter", or "Log in with OpenID". At the same time, services are more interconnected than ever. One of the best ways to increase the popularity and viability of a new service is by piggybacking it onto the existing user bases of apps such as Twitter, Facebook, and Foursquare.

There are lots of authentication solutions out there for Rails. Many of them even have ways to connect to services such as Facebook or Twitter. But as I used these in project after project I noticed an emerging pattern: they all make too many assumptions about how I want to handle authentication in my application. Sure that makes it a quick start for the vanilla use case, but I honestly can't think of a time when I've dropped in an authentication solution and I was good to go. It's time for a change in perspective.

OmniAuth: The Unassuming Authentication Library

Today is the public release of OmniAuth. OmniAuth is a Rack-based authentication system that abstracts away the gritty, difficult parts of external authentication without assuming anything about what you actually want to do with that authentication information.

What does this mean for you? This means that you can make your app authenticate through Twitter, Facebook, LinkedIn, Google Apps, GitHub, Foursquare, and more and then have complete control from there.

Installation

OmniAuth is available as a gem:

gem install omniauth 

Diving In

Using OmniAuth is as simple as using any other Rack middleware. Of course, that's because OmniAuth is simply a Rack middleware. No complicated framework-specific configuration, just a collection of middleware to take the pain out of external authentication. Let's say I have a Sinatra app that I want to be able to authenticate via Twitter or Facebook. Here's what's required:

require 'omniauth'

use Rack::Session::Cookie # OmniAuth requires sessions.
use OmniAuth::Builder do
  provider :twitter, "CONSUMER_KEY", "CONSUMER_SECRET"
  provider :facebook, "APP_ID", "APP_SECRET"
end

That's it! Now if I want to send my user to authenticate via Twitter, I send them to the URL /auth/twitter. For Facebook, /auth/facebook. The user will automatically be redirected to the appropriate site where they will be able to authenticate. Once authentication is complete, they will be redirected back to /auth/providername/callback and OmniAuth will automatically fill the omniauth.auth environment key with information about the user, so for my Sinatra client I just need to add:

get '/auth/:provider/callback' do
  auth = request.env['omniauth.auth']
  "Hello, #{auth['user_info']['name']}, you logged in via #{params['provider']}."
end

Of course, I could do a lot more than just print out the user's name. I could also:

  • Check for an existing user via the uid key of the auth hash and log them in if one exists.
  • Create a user based on the uid and information from the user_info hash.
  • If a user is already logged in, associate this new account with the user so that they can log in using either service or post to both services using respective APIs.

The point here is that OmniAuth doesn't assume that you simply want to log a user in. It lets you make that judgment call and gives you all the information you need to do just about anything you need to do.
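
For example, the callback from the Sinatra snippet above could grow into something like this (a rough sketch: the User model, its columns, and the dynamic finder are assumptions about your application, not part of OmniAuth):

get '/auth/:provider/callback' do
  auth = request.env['omniauth.auth']
  # Look up an existing account by uid, or create one from the user_info hash
  user = User.find_by_provider_and_uid(params['provider'], auth['uid']) ||
         User.create(:provider => params['provider'],
                     :uid      => auth['uid'],
                     :name     => auth['user_info']['name'])
  session[:user_id] = user.id
  redirect '/'
end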

OmniAuth works right now for a wide variety of providers, and this list will continue to grow. OmniAuth today supports:

  • Facebook
  • Twitter
  • 37signals ID
  • Foursquare
  • LinkedIn
  • GitHub
  • OpenID (meaning Yahoo, Aol, Google, and many more)
  • Google Apps (via OpenID)
  • CAS (Central Authentication Service)
  • LDAP

A Breath of Fresh Auth

OmniAuth has been my worst-kept secret library for some time now. I've been using it as the go-to authentication strategy for new projects big and small for the last few months, and it feels really refreshing to have so much control over authentication without losing the drop-in ease of use. If you need external authentication but have found existing solutions to lack flexibility, please take a look!

OmniAuth is on GitHub with a growing set of documentation on the GitHub wiki and I have also set up a Google Group to handle any questions that might arise.

If you're like Intridea, you're using Google Apps for the Domain to handle your e-mail, calendaring, etc. You may also have a good number of web applications out there that have some kind of administrative interface. It's always such a pain dealing with authentication to that interface, isn't it? You can do a few different things:

* Make your own users on the system superusers. This means you have to have some kind of console intervention usually.
* Throw up a simple HTTP Basic gateway with a shared user/password for the admin area. This often gets hardcoded, again not a fantastic way to do it.
* Do some kind of IP restriction or other server configuration magic. This seems tenuous at best.

So what's a developer to do? Well, as it turns out, by using OpenID with Google Apps for the Domain, you can have a secure administrative gateway tied directly into your company's Apps ecosystem!
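
Using OmniAuth's Google Apps strategy, the idea looks roughly like this (a sketch; the domain, the /tmp OpenID store path, and the 'admin' route name are examples, and the exact options may vary by OmniAuth version):

require 'omniauth'
require 'openid/store/filesystem'

use Rack::Session::Cookie
use OmniAuth::Builder do
  provider :google_apps, OpenID::Store::Filesystem.new('/tmp'),
           :domain => 'yourcompany.com', :name => 'admin'
end

# /auth/admin now bounces the user through your Google Apps domain login;
# the /auth/admin/callback handler can then mark the session as an admin.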

Rack is the common Ruby web infrastructure powering just about every framework known to Ruby-kind (Rails, Sinatra and many more). Rack Middleware are a way to implement a pipelined development process for web applications. They can do anything from managing user sessions to caching, authentication, or just about anything else. But one thing that confused me for a long time: what’s the difference between a Rack application and Rack middleware? Both are ostensibly the same, something that responds to #call(env), so how are they different? Unlike Rack applications, Rack middleware have knowledge of other Rack applications. Let’s take a look at what this means with a few simple examples.

The simplest Rack application (in class instead of lambda form) would be something like this:

class RackApp
  def call(env)
    [200, {'Content-Type' => 'text/plain'}, ["Hello world!"]]
  end
end

Note that because the method required by the Rack specification is call and (by no coincidence) this is how you execute Procs and lambdas in Ruby, the same thing can be written like so:

lambda{|env| [200, {'Content-Type' => 'text/plain'}, ["Hello world!"]]}

This hello world app would simply output “Hello world!” from any URL on the server that was running it. While this is obviously simple, you can build entire powerful frameworks around it so long as in the end a request boils down to a status, some headers, and a response body. But what if we want to filter the request? What if we want to add some headers before the main application gets it, or perhaps translate the response into pig latin after the fact? We have no way to say “before this” or “after that” in a Rack application. Enter middleware:

class RackMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    # Make the app always think the URL is /pwned
    env['PATH_INFO'] = '/pwned'
    @app.call(env)
  end
end

See the difference? It’s simple: a Rack middleware has an initializer that takes another Rack application. This way, it can perform actions before, after, or around the Rack application because it has access to it during the call-time. That’s why this works in a config.ru file:

run lambda{|env| [200, {'Content-Type' => 'text/plain'}, ["Hello world!"]]}

But this does not:

use lambda{|env| [200, {'Content-Type' => 'text/plain'}, ["Hello world!"]]}

That's because the ‘use’ keyword indicates a middleware class that will be instantiated (with the next application in the stack, plus any arguments provided) and then called, while ‘run’ simply calls an already existing instance of a Rack application.
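
To make that concrete, here is roughly what the config.ru above boils down to (a simplified sketch, not Rack::Builder's actual implementation):

app   = RackApp.new                       # `run` expects something that already responds to #call
stack = RackMiddleware.new(app)           # `use` wraps the next app in the middleware

env = {'REQUEST_METHOD' => 'GET', 'PATH_INFO' => '/anything'}
status, headers, body = stack.call(env)   # the inner app sees PATH_INFO rewritten to '/pwned'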

This is something that can be quite confusing, especially if you’re new to the Rack protocol, so hopefully this clears it up a bit!
