18 August, 2008

Feedity: Unethical RSS?

In my position as webmaster of Share Our Strength, I am constantly on the lookout for better and easier methods of generating content to drive visitors to our many websites. One exceedingly easy method is to grab content from outside RSS feeds, allowing a page on our site to have constantly updating content from another site. Working with RSS is so easy that when I'm in a rush, I sometimes grab our own content through RSS, just to save time.
Grabbing RSS content to post on your own page is perfectly permissible, both legally and ethically. First, when you grab a feed, you are linking to the source and driving visitors to their site. More importantly, by publishing an RSS feed, the owners are inviting people to use that content. (People can even monetize their feeds by using Google AdSense for Feeds.)
But what if you find regularly updated content that doesn't use a feed?
I recently found a site that had data I wanted to pull, but no RSS feed existed for the content I wanted. The webpage is American Towns, and it shows regularly updated local events content for a specific city. (The content I am particularly interested in is "Local Events", on the left.)
So, after a bit of thought, I did a quick Google search and found Feedity.com. In less than five minutes, I had created an RSS feed that takes JUST the info I want from the American Towns site.
Feedity allows you to define the opening and closing tags of anything you want turned into RSS; in this case, I chose <div class="event"> to begin each RSS link, and </a> to end it. This allows me to get just what I want in the feed I'm creating: the event name with a hyperlink to more info on the event.
From there, Feedity did the rest, and I had an RSS feed ready to go.
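For the curious, here is a rough sketch in Python of the kind of start-tag/end-tag extraction described above. This is not Feedity's actual code, which I have not seen. The function names, the sample HTML, and the example.com URL are all made up for illustration, and a regular expression is only good enough for very regular markup like this.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Hypothetical stand-in for the page being scraped (not real American Towns markup).
SAMPLE_PAGE = """
<div class="event"><a href="/events/1">Farmers Market</a></div>
<div class="event"><a href="/events/2">Library Book Sale</a></div>
<div class="other"><a href="/about">About</a></div>
"""

# Matches an opening <a ...href="..."> tag and captures the href and the text after it.
LINK_RE = re.compile(r'<a\s[^>]*href="([^"]*)"[^>]*>(.*)', re.DOTALL)


def extract_fragments(page_html, start_tag, end_tag):
    """Return every chunk of HTML between start_tag and the next end_tag."""
    pattern = re.escape(start_tag) + r"(.*?)" + re.escape(end_tag)
    return re.findall(pattern, page_html, flags=re.DOTALL)


def fragment_to_item(fragment, base_url):
    """Turn one fragment into a (title, absolute link) pair, or None if it has no link."""
    match = LINK_RE.search(fragment)
    if match is None:
        return None
    href, title = match.groups()
    title = re.sub(r"<[^>]+>", "", title).strip()  # drop any nested tags
    return title, urljoin(base_url, href)


def build_rss(page_html, base_url, feed_title):
    """Build a minimal RSS 2.0 document from the extracted fragments."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = "Generated from " + base_url
    for fragment in extract_fragments(page_html, '<div class="event">', "</a>"):
        item = fragment_to_item(fragment, base_url)
        if item is None:
            continue
        title, link = item
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = title
        ET.SubElement(node, "link").text = link
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_rss(SAMPLE_PAGE, "http://example.com/town", "Local Events"))
```

Run against the sample page, this produces a feed with one item per event, each carrying the event name as its title and an absolute link back to the source site. The "About" link is skipped because it does not sit inside a <div class="event">.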
Had I been making the feed for my private use, I would not feel so weird about it, but since I was creating this feed for the purpose of making dynamic content on one of my sites, I realized that perhaps this kind of feed was not quite as ethical as feeds that are put out by the content owner. After all, the content owner never intended this content to be pulled as a feed. Even though I am linking to their site while pulling the events list American Towns publishes, at no point did I get even an implicit nod concerning the usage of this data on my own site. I was, in a way, just framing their content without their permission.
Because of this moral quandary, I decided not to go through with using Feedity in this way. But now that I am aware that it is possible to take content straight from other sources like this, it occurs to me that one could mass-produce sites automatically generated from ANY site, just using completely customized RSS from Feedity. Each site would literally take less than thirty minutes to create, once the general design was chosen. Pop a few AdSense units on the page, tailor it to a specific audience who would find the info useful, and profit inevitably results. Hell, I've done testing on this site where I go for multiple months without posting a single blog entry, and I STILL take in a few dollars each month from AdSense. Yet what I'm describing through Feedity scales up to any number of sites and requires absolutely no upkeep.
In short, Feedity makes it easy to create completely unethical sites that, in the aggregate, consistently generate income without maintenance. This makes me almost want to mark my link to them as nofollow, but since most users of Feedity probably use the feeds for their personal use rather than for website creation, I decided to give them the benefit of the doubt. After all, I use Feedity to keep track of my thirteen-year-old sister's fan website, TwilifyMe.com, and that's a lifesaver all in itself.
Also, in case you're interested, I did NOT get paid to write this entry.

2 comments:

  1. Cool tip! Btw, I don't see a problem with 'web scraping' as long as the bot service follows the host's robots.txt definition. That's the standard way of dealing with crawling. Search engines, including Google and Yahoo, follow the same directions when they 'scrape' and index web sites for public search. Just my 2 cents :)

  2. Then if someone doesn't explicitly say not to crawl their site in the robots.txt, that then makes it okay to do so?
