
Automatically Generate Keywords

11 Nov 2008 In: Ruby on Rails

One of the most popular Ruby on Rails plugins is acts_as_taggable, and more specifically its enhanced version, acts_as_taggable_on_steroids. It lets you easily add tags to any model and helps you generate tag clouds.

After reading Nate Koechley's article on Yahoo's term extractor API, I was inspired to connect it to acts_as_taggable_on_steroids. The goal is to parse any piece of content (article, blog post, review, etc.) and let Yahoo return a list of terms to use as tags.

Setting up

The first thing you need to do is install the acts_as_taggable_on_steroids plugin.

ruby script/plugin install http://svn.viney.net.nz/things/rails/plugins/acts_as_taggable_on_steroids

Prepare the database

Next we generate and apply the migration:

ruby script/generate acts_as_taggable_migration
rake db:migrate

Define which model you want to use

class Post < ActiveRecord::Base
  acts_as_taggable

  belongs_to :user
end

You can now use the tagging methods provided by acts_as_taggable, #tag_list and #tag_list=. Both these methods work like regular attribute accessors.

p = Post.find(:first)
p.tag_list # []
p.tag_list = "Funny, Silly"
p.save
p.tag_list # ["Funny", "Silly"]

You can also add or remove tags, one or several at a time.

p.tag_list.add("Great", "Awful")
p.tag_list.remove("Funny")
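To make the accessor behaviour concrete, here is a small plain-Ruby sketch of how a comma-separated string can end up as an array of tags. The helper name split_tags is made up for this illustration; it is not the plugin's code and only approximates what the plugin does.

```ruby
# Hypothetical helper, not part of acts_as_taggable_on_steroids:
# turn a comma-separated string into a clean array of tag names.
def split_tags(string)
  string.split(',').map(&:strip).reject(&:empty?).uniq
end

tags = split_tags('Funny, Silly')
tags.concat(%w[Great Awful]) # roughly what tag_list.add does
tags.delete('Funny')         # roughly what tag_list.remove does
puts tags.inspect
```

Run outside Rails, this prints ["Silly", "Great", "Awful"].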

Sign up at Yahoo

You need to sign up for an application ID at Yahoo! Web Services. The application ID identifies your application on every API call.
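Rather than hardcoding the ID, one option is to read it from the environment. The sketch below assumes an environment variable named YAHOO_APP_ID; both that name and the helper yahoo_app_id are invented for this example and are not something Yahoo prescribes.

```ruby
# Assumption: the application ID is stored in an environment variable
# called YAHOO_APP_ID (a hypothetical name chosen for this sketch).
def yahoo_app_id(env = ENV)
  env.fetch('YAHOO_APP_ID') do
    raise KeyError, 'Set YAHOO_APP_ID to your Yahoo! application ID'
  end
end

# A plain Hash stands in for ENV here so the example runs anywhere.
puts yahoo_app_id({ 'YAHOO_APP_ID' => 'demo-id' })
```

Failing loudly when the variable is missing makes a forgotten ID show up at startup instead of as a rejected API call later.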

Add parse_tags function

I have written a small function that you can place inside your Post model to make the API call for you.

require 'net/http'
require 'uri'
require 'rexml/document'

class Post < ActiveRecord::Base
  acts_as_taggable

  belongs_to :user

  # Sends the text to Yahoo's term extraction service and returns
  # the extracted terms as an array of strings.
  def self.parse_tags(context, query)
    tags = []

    # Yahoo term extraction endpoint
    url = URI.parse('http://search.yahooapis.com/ContentAnalysisService/V1/termExtraction')

    post_args = {
      'appid'   => 'your application ID',
      'context' => context,
      'query'   => query
    }

    # post_form returns the response; the XML is in its body
    resp = Net::HTTP.post_form(url, post_args)

    # collect the text of each Result element
    doc = REXML::Document.new(resp.body)
    doc.elements.each('ResultSet/Result') do |element|
      tags << element.text
    end

    tags
  end
end

context is the text you want to scan, and query is an optional string (usually a title or subject) that helps the extraction process.

Let's say you have a Post model with body and title columns. Now you can do the following:
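If you want to see the parsing step of parse_tags without calling the service, the sketch below runs the same REXML loop against a hand-written response. The sample XML is an assumption: it is a simplified body, and the real service may add namespaces and attributes that are left out here. The name extract_terms is also made up for this example.

```ruby
require 'rexml/document'

# Assumption: a simplified response body written by hand for this sketch.
SAMPLE_RESPONSE = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <ResultSet>
    <Result>italian sculptors</Result>
    <Result>virgin mary</Result>
    <Result>painters</Result>
  </ResultSet>
XML

# Collect the text of every Result element, the same way parse_tags does.
def extract_terms(xml)
  terms = []
  doc = REXML::Document.new(xml)
  doc.elements.each('ResultSet/Result') { |element| terms << element.text }
  terms
end

puts extract_terms(SAMPLE_RESPONSE).join(', ')
```

This prints italian sculptors, virgin mary, painters. Keeping the parsing in its own method also makes it easy to test without a network connection.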

p = Post.new
p.body = "Italian sculptors and painters of the renaissance favored the Virgin Mary for inspiration."
p.title = "madonna"
p.tag_list = Post.parse_tags(p.body, p.title)

p.tag_list # "italian sculptors, virgin mary, painters, renaissance, inspiration"

Note: Rate Limit
The Term Extraction service is limited to 5,000 queries per IP address per day.
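If you tag content in bulk, it may help to count your own calls so you stop before hitting the limit. Below is a minimal sketch of such a guard. The class DailyQuota is hypothetical, keeps its counter in memory only, and is not shared between processes or servers, so treat it as an approximation of the per-IP limit.

```ruby
require 'date'

# Hypothetical in-memory counter; it resets when the date changes.
class DailyQuota
  def initialize(limit = 5_000)
    @limit = limit
    @day = nil
    @count = 0
  end

  # Returns true and counts the call if another request is allowed today.
  def allow?(today = Date.today)
    if today != @day
      @day = today
      @count = 0
    end
    return false if @count >= @limit

    @count += 1
    true
  end
end

quota = DailyQuota.new(2) # tiny limit, just for the demonstration
day = Date.new(2008, 11, 11)
puts quota.allow?(day)     # true
puts quota.allow?(day)     # true
puts quota.allow?(day)     # false, the limit of 2 is used up
puts quota.allow?(day + 1) # true, a new day resets the counter
```

You would check quota.allow? before each call to parse_tags and skip or postpone the tagging when it returns false.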

Sources: Yahoo! Search Web Services, acts_as_taggable on steroids



3 Comments on “Automatically Generate Keywords”

  • antike bilderrahmen
    January 10th, 2010 18:25

    Hello! I just wanted to let you know that I somehow had problems viewing this page on my BlackBerry. Now, on a “normal” PC, it is no longer a problem.
    Strange : ) but anyway, many thanks for the interesting blog. I have bookmarked it ;)

  • admin
    January 10th, 2010 18:41

    Thanks for the heads up. I will look into it. We are using a different wordpress plugin for the mobile version, so that might be the problem.

  • mediestrategi
    January 18th, 2010 14:38

    Hey, thanks for the detail. You really give a complete, detailed post on the topic. Thanks for such a nice and informative post; I really benefited a lot.
