Hyperlocal News Wire

POSTED IN my projects, social media work | 9 July 2009

Here’s a pipe I’ve created that attempts to marshal the content from hyperlocal blogging in Birmingham and allow people to subscribe only to the feeds that interest them. This is a piece of investigation and experimentation that I’ve been able to find the time to do thanks to Will Perrin and his hyperlocal blogging initiative Talk About Local. Will also helped define why it would be useful to do: it is for what he called “lazy journalists”.

Lazy here is used in the same way that it might be used — in praise — of a computer programmer; that is, lazy means you’ll work hard at setting yourself up right to make sure you get everything you need easily later on. Will got to the crux of the argument by saying that journalists interested in a subject — let’s say noise abatement issues — could easily find examples of those at a local level outside the areas they physically know.

So this is a run through of the decisions made in building it (and what other options could work). It’s no more than a prototype at this stage, so comments and improvements are very welcome. However, if you would rather just get stuck into the pipe itself, head on over to the links at the end of this post.

What are we collating?

This is actually one of the most important decisions. What we want are blogs about an area, rather than blogs that are merely based there (see this list of Birmingham based blogs for the size of the task without that distinction). This isn’t a simple case of aggregation: we wouldn’t want it clogged up with planning applications, flickr photos and BBC travel news. If any of those subjects are interesting then we trust the local bloggers to raise them. There are two levels of human filtering here: first that of the bloggers, and second that of the person selecting the blogs, who is filtering for interest and quality too. In this instance I used Pete Ashton’s local blog tumblr as a basic list to pick from (some were discarded, and I think you can see why from the descriptions there; that is not a reflection of quality, more of subject matter).

Why Birmingham? Simply because I’m there, and that helps when making value judgements on the contents and context. A little local knowledge, or a good deal of research, is needed to select which sites to get info from.

Why a pipe?

Yahoo Pipes make operations on RSS feeds easy to understand and alter, plus they are easily copied (cloned), so people are free to make improvements or use the method for their own areas. Another reason for using a pipe is that what it’s doing is passing the content through; it’s not storing or replicating the information, so no issues of permission arise.

Luckily in this instance all of the sites we’re interested in have RSS feeds, and almost all have full-text feeds. Tools like Dapper’s Dapp Factory will attempt to produce RSS content from sites that don’t offer it, and can work quite well.

You could attempt to do this via search. It’s possible to set up a custom Google search engine that is restricted to a list of hand-added sites (and here I attempted it). That is useful for one-off searches (although you’ve still to find the search engine) but not “lazy” enough. It would also be possible to use Google Reader, which can collect feed items that you later search within, and to offer a hand-curated OPML list of relevant sites. There are two issues with this. One is that you still need to “read” (or “mark as read”) the items as they come in, which is time consuming. The more important issue is that, with a pipe, blogs and sites can be added or removed by the curator without the people using the pipe having to do anything.

How is the pipe built?

The first task is to take the RSS feeds of the blogs and add them to a ‘Fetch Feed’ module:

[Image: Pipes, editing ‘Birmngham local blog aggregator’]

These feeds are all for blogs that are simply about Birmingham (or areas within it), so we can expect the content, titles and tags to be useful in filtering the collection of posts.
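For anyone who wants to try the same idea outside Pipes, here is a minimal Python sketch of what the ‘Fetch Feed’ step amounts to: read several RSS documents and pool their items into one list. It is only an illustration, not how Pipes works internally. The two sample feeds, the `example.com` URLs and the function names `parse_rss` and `fetch_feeds` are all made up for the example, and the feeds are given as strings so nothing is downloaded.

```python
import xml.etree.ElementTree as ET

# Two tiny made-up feeds standing in for real blog feeds (hypothetical data).
FEED_A = """<rss version="2.0"><channel><title>Blog A</title>
<item><title>Council meeting notes</title>
<description>Notes from the ward meeting.</description>
<guid>http://example.com/a/council-meeting</guid></item>
</channel></rss>"""

FEED_B = """<rss version="2.0"><channel><title>Blog B</title>
<item><title>New gallery opens</title>
<description>A first look at the show.</description>
<guid>http://example.com/b/category/art/new-gallery</guid></item>
</channel></rss>"""


def parse_rss(xml_text):
    """Return the items of one RSS 2.0 document as a list of dicts."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "guid": item.findtext("guid", default=""),
        })
    return items


def fetch_feeds(documents):
    """Pool the items from several feeds into a single list."""
    combined = []
    for xml_text in documents:
        combined.extend(parse_rss(xml_text))
    return combined


all_items = fetch_feeds([FEED_A, FEED_B])
print(len(all_items), "items collected")
```

Each item keeps only the title, description and guid here, since those are the fields the filter looks at later on.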

For blogs with a specific subject remit as well as an area, there is something else we can do to increase the quality of the filtering. By adding tags that apply to the whole site we can add information into the feed, the thought being that sites about a subject may not mention it in every post. For example, the blog Created in Birmingham is about art in Birmingham, although not every post will mention the word “art”. We can add this into the feed with a little jiggery with regex:

[Image: Pipes, editing ‘Birmngham local blog wire’]

What we’re doing here is creating another field in the RSS feed and adding the “tag” (or tags) that we wish to apply to the whole feed (I’ll happily admit that my knowledge of regex isn’t huge; this works, but a more elegant solution would be very welcome). These feeds can then be combined with the feeds we’re grabbing unaltered.

[Image: Pipes, editing ‘Birmngham local blog wire’, second view]
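The tagging step can be sketched in Python too. This is not the regex used in the pipe (which only appears in the screenshot); it simply sets a “tags” field on every item from a subject-specific blog and then combines those items with the untouched ones. The function name `add_site_tags` and the sample items are assumptions made for the illustration.

```python
def add_site_tags(items, tags):
    """Return copies of the items with a "tags" field holding site-wide tags."""
    tagged = []
    for item in items:
        new_item = dict(item)      # copy, so the original item is left untouched
        new_item["tags"] = tags
        tagged.append(new_item)
    return tagged


# Hypothetical items: one from an arts blog, one from a general local blog.
art_blog_items = [{"title": "Open studio this weekend",
                   "description": "Doors open at ten."}]
plain_items = [{"title": "Road closure update",
                "description": "Diversions are in place."}]

# Tag the arts blog as a whole, then combine it with the unaltered feed.
combined = add_site_tags(art_blog_items, "art arts") + plain_items
print(combined[0]["tags"])
```

The open studio post never uses the word “art”, but once tagged it can still be found by a filter that looks at the “tags” field.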

Then comes the filter. The text box (yellow and on the right in the illustration) is what makes the pipe work for any specific subject. It is what creates the input box on the pipe home page:

[Image: Pipes, ‘Birmngham local blog wire’]

The text input is used to filter by only letting an item through if there’s a text match in either the title, description, guid (the URL, which can contain useful category info) or the “tags” (which we added).
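In code terms the filter is a simple “permit if any field contains the text” rule. Here is a rough Python equivalent, assuming a plain case-insensitive substring match (whether Pipes ignores case in the same way is an assumption on my part). The names `FIELDS`, `matches` and `filter_items`, and the sample items, are invented for the sketch.

```python
# The four fields the filter looks at, as described above.
FIELDS = ("title", "description", "guid", "tags")


def matches(item, query):
    """True if the query text appears in any of the filtered fields."""
    needle = query.lower()
    return any(needle in item.get(field, "").lower() for field in FIELDS)


def filter_items(items, query):
    """Keep only the items that match the query in at least one field."""
    return [item for item in items if matches(item, query)]


# Hypothetical items for illustration only.
items = [
    {"title": "Council budget vote",
     "description": "The vote is on Tuesday.",
     "guid": "http://example.com/1"},
    {"title": "Open studio this weekend",
     "description": "Doors open at ten.",
     "guid": "http://example.com/2",
     "tags": "art arts"},
    {"title": "Match report",
     "description": "A late winner.",
     "guid": "http://example.com/category/football/3"},
]

for hit in filter_items(items, "football"):
    print(hit["title"])
```

The third item is found through its guid alone, which is why the URL is worth including in the match.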

That’s basically it — sorting the feed wasn’t that useful (the initial results aren’t as important as what’s fed through in future) and as we’re hand picking feeds (rather than using search) we shouldn’t need to run a “uniqueness” operation.

Results and further development

For a prototype the results are good. Trying searches like “council” or “football” works well, and even more detailed queries such as “noise abatement” produce what I’d expect.

Due to the “matching” algorithm used by Yahoo Pipes, a search for “art”, for example, will match “part” too. I couldn’t see a way to make sure it matched whole words only.
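Outside Pipes, the usual way to match whole words only is a regular expression with word boundaries (`\b`). The Python sketch below shows the idea; it says nothing about what the Pipes filter itself supports. The function name `whole_word_pattern` and the `allow_plural` option are my own additions for the example. Note that a strict whole-word match also stops “art” matching “arts”, so the sketch optionally allows a trailing “s”.

```python
import re


def whole_word_pattern(query, allow_plural=False):
    """Build a case-insensitive pattern that matches the query as a whole word."""
    suffix = "s?" if allow_plural else ""
    return re.compile(r"\b" + re.escape(query) + suffix + r"\b", re.IGNORECASE)


strict = whole_word_pattern("art")
print(bool(strict.search("A new art gallery opens")))      # whole word: match
print(bool(strict.search("Part of the canal is closed")))  # inside "Part": no match
print(bool(strict.search("Arts funding announced")))       # plural: no match

relaxed = whole_word_pattern("art", allow_plural=True)
print(bool(relaxed.search("Arts funding announced")))      # plural now allowed
```

Allowing a simple plural is a crude fix; words with other endings would still need listing by hand.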

The adding of the tags probably needs further testing, more synonyms and a few more feeds to test to make sure it works. You could in theory take the content of each blog post and use replace to add synonyms for all sorts of words (football/soccer for example) — this would be time consuming, but may add to the usefulness of the additional tagging.
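As a sketch of how that synonym idea might look in Python: scan the text of a post for known words and build extra tags from their synonyms, rather than rewriting the post itself. The `SYNONYMS` table and the function name `synonym_tags` are hypothetical, and the lookup is deliberately simple (exact whole words, one direction only).

```python
import re

# A hand-built synonym table (hypothetical; a real one would need curating).
SYNONYMS = {
    "football": ["soccer"],
    "council": ["local authority"],
}


def synonym_tags(text, synonyms):
    """Return extra tags for every known word that appears in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    extra = []
    for word, alternatives in synonyms.items():
        if word in words:
            extra.extend(alternatives)
    return " ".join(extra)


post = {"title": "Football returns to the park",
        "description": "Kick-off is at three."}
post["tags"] = synonym_tags(post["title"] + " " + post["description"], SYNONYMS)
print(post["tags"])
```

A search for “soccer” would then find this post through its tags, even though the word never appears in the text.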

Thoughts, improvements, versions for other locations (or subjects perhaps) would be very interesting to see — do please tell.

The filter-able pipe is here: http://pipes.yahoo.com/bounder/birminghamlocal (please clone and develop). For those who just fancy all of the content, a feed without the filter is here: http://pipes.yahoo.com/bounder/birminghamlocalblogs

15 COMMENTS

  1. [...] Hyperlocal News Wire – jon bounds – Jon talks through the process of building a Yahoo Pipe to intelligently aggregate local blogs. Amongst other things this helps answer the question "what are all these local blogs good for?" [...]

  2. [...] has created a customisable hyperlocal news aggregator for Birmingham’s local sites.  You can read about the pipes based aggregator here and Jon explains how you can do it for your [...]

  3. [...] How to build a Hyperlocal News Wire – jon bounds (tags: tools) [...]

  4. Wow that’s brilliant well done! Saves me adding a tonne of subscriptions to my Google Reader. It also means I think I *finally* understand pipes! :-)

  5. [...] for the work John Bounds who has done a great job in developing a Google pipes solution to deliver hyperlocal blogging from Birmingham as RSS feeds.  Thanks to John there is a really clear work through (with pictures) which I am [...]

  6. Blimey, that’s genuinely proper clever.

  7. [...] but they do tend to follow the 80/20 rule of participation and still act as a bottleneck. The Hyperlocal Newswire is another possible solution but it’s more search based than perhaps I’m [...]

  8. [...] So if you want a feed about arts in Birmingham you just enter arts as your keyword.  Jon describes it on his blog and talks through the process so you can do it yourself.   It is a prototype, but very [...]

  9. [...] has created a customisable hyperlocal news aggregator for Birmingham’s local sites.  You can read about the pipes based aggregator here and Jon explains how you can do it for your [...]

  10. [...] You can read all about the thinking behind the Birmingham Blog Wire here. [...]

  11. Paul Daniel says:

    So this pipe partially solves your “art” issue.
    http://pipes.yahoo.com/pipes/pipe.info?_id=d594a62ab4409ed82d525a95b0f38a61
    The problem with this pipe is that it rules out “arts”, for instance, but it is only a first stab.
    Your Filter module using Pipes version 1 (V1) removes all but one of any items that have no element called guid.content. Pipes version 2 (V2) does not do this it would seem (and my pipe is version 2).
    As of August 1 all pipes will be v2.
    In Pipes V2 there is a maximum of 10 feeds per Fetch Feed module. There is not, as far as I am aware, a limit on the number of Fetch Feed module in a pipe. This restriction applies only to new pipes.

    If you want any help please contact @hapdaniel (Paul Bradshaw knows me).

  12. [...] of our long term thinking at Talk About Local on futures for hyperlocal publishing and means of local content delivery. We are convinced that geotagging of the great public service stuff people write or photograph for [...]

  13. [...] of our long term thinking at Talk About Local on futures for hyperlocal publishing and means of local content delivery. We are convinced that geotagging of the great public service stuff people write or photograph for [...]
