The hinternet, the internet we’re missing

There’s a new digital divide, a fissure opening wider and wider as the social web encroaches on most forms of information. You may have heard of the ‘darknets’ — unseen networks of computers for filesharing — networks you’re only allowed onto if you’re trusted not to give the game away. What I think I’m seeing emerge is almost the exact opposite: sites open to anyone, but increasingly disconnected.

There are hundreds of websites, lovingly researched and maintained by enthusiastic and knowledgeable people, that are becoming almost impossible to find. The sites are built on old technology, and that contributes to their decreasing visibility, but it’s not the only reason. The lack of RSS feeds, pinging servers, dynamically generated sitemaps and up-to-date robots.txt files makes it harder for other sites to keep in touch with them. And because they are often built in HTML by hand, they are more difficult to update, while fresh content is prized by search engines.
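To show how low the barrier actually is, here’s a sketch of the kind of minimal RSS 2.0 feed even a hand-built HTML site could publish. The site name, URLs and item details are all made-up examples, and this uses only Python’s standard library:

```python
# Build a minimal RSS 2.0 feed for a hand-maintained site.
# All names and URLs (example.org, the item) are invented for illustration.
import xml.etree.ElementTree as ET

def build_feed(site_title, site_url, items):
    """items: list of (title, link, description) tuples, newest first."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = site_title
    for title, link, description in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "link").text = link
        ET.SubElement(item, "description").text = description
    return ET.tostring(rss, encoding="unicode")

xml = build_feed("Local History Notes", "http://example.org/", [
    ("The old mill", "http://example.org/mill.html", "Notes on the mill."),
])
print(xml)
```

Write that string to a `feed.xml` file alongside the pages, link to it from the homepage, and aggregators and search engines have something to poll.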

The lack of RSS, and of any way of knowing when updates occur, also decreases people’s awareness of the sites: you either have to remember they’re there and how you found them — or bookmark them — and check for updates on a regular schedule. Fewer reminders, fewer nudges, so fewer incoming links.

They often cover very niche areas, like local history or local news, and so generate a limited number of hyperlinks from other sites. They often only get links from each other, which is great in a community sense but means the incoming links carry little PageRank (which would push them higher up Google searches).
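That point about links from each other carrying little weight can be made concrete with a toy power-iteration sketch — my own illustration of the published PageRank idea, not Google’s actual system. The graph and damping factor are invented: two niche sites link only to each other, while a hub is linked from three blogs.

```python
# Toy PageRank by power iteration, to illustrate why a pair of sites
# linking only to each other ends up with low rank. Graph is made up.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            share = damping * rank[p] / len(outs)
            for q in outs:
                new[q] += share  # each page splits its rank among its links
        rank = new
    return rank

# 'hub' is linked by three blogs; the niche pair only link to each other.
graph = {
    "blog1": ["hub"], "blog2": ["hub"], "blog3": ["hub"],
    "hub": ["blog1"],
    "niche1": ["niche2"], "niche2": ["niche1"],
}
ranks = pagerank(graph)
print(ranks["hub"] > ranks["niche1"])  # the hub outranks the niche pair
```

The niche pair trade the same small amount of rank back and forth; without inbound links from better-connected pages, no amount of mutual linking lifts them.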

As more and more sites get better and better at search engine optimisation, as blogs and other social websites link and link again and expand into more areas, and as Google relies on the same sources more and more, these sites are getting less and less visible.

And that’s bad because they have a wealth of important content that we need to be able to find.

I’m calling it the ‘hinternet’.

(From hinterland, the German word for the part of a country where few people live and the infrastructure is underdeveloped.)

Solving the problem is tricky: Google’s mission to index the entire world’s information doesn’t always mean we can find what we’re searching for, and the semantic web will only work if the correct metadata is stored with the hinternet sites (and they’re already often “behind” technology-wise).

Search needs to get better, but those of us on the social web also need to help. We need not only to link to these sites, but — where we can — to nudge the guardians of the hinternet towards greater visibility by becoming “social”.

The moral web

At WordCampUK over the weekend I found myself increasingly thinking about the moral questions of how we behave on the internet. Not the questions about whether we’re nice to each other, or lie or steal (that’s just a virtual version of the real world), but simply how far you’re willing to push your idea/content/site on other people.

I’ve already decided that, to me, crossposting is a moral dilemma: it’s up to you to decide how much you’re willing to “shout”. If you’re willing to send links to your blog posts out via Twitter, Jaiku, Facebook status updates and so on, then they’d better be really good, or people will start to get annoyed. I like to give people the choice about which of the stuff I produce they see, so as well as keeping different subject areas separate, I don’t push anything automatically at people. There are RSS feeds available, they’re clearly labelled, and you can choose to subscribe (or come to the sites) or not — it’s up to you.

During an interesting talk on WordPress and SEO, I found my mind wandering from how to optimise your WordPress installation so people can find your content by searching for it (a good thing, of course) to whether the “deliberate” search engine optimisation being explained is something I’m comfortable with at all.

While I’m all for tagging, indexing correctly, logical naming and an easily navigable structure (I’ll call that semantic SEO: making your meaning clear with formats and data), anything else seems to be gaming the system — cheating.

The talk covered subjects such as paying for content to be written and put on the web that linked to you with your selected ‘keywords’ as anchor text — “journalists are cheap”, we were told — and paying ‘link-builders’ to get links to your site around the web.

If “journalists are cheap” then they’ll write cheap words, useless words, words that will drive the quality of the web down. If links are bought then they aren’t the “peer review” links that Google based its original algorithm on (the idea being that people link to stuff that is good, so the best stuff gets the most links).

If you prize traffic that is coerced into arriving at your site, that has to be your decision. I’d much rather have an honest web where the best content and the best services rose to the top. This is why I am being driven from conventional search to more social search options: I’ll often only Google a company or product name after checking the advice of people I trust on Twitter. Search engines are being conned, and until they stamp out the link-buying and the splogging, they’re going to lose traffic — and ultimately, if your sites are doing the conning, so will you.

I know the practices are tempting, and I know the argument goes that “our competitors are doing it, we have to to keep up”. I don’t buy it. Do the best you can: create the best content, host the best discussions, link the best links, provide the best service, and people will recommend you — with links, and by social-web word of mouth.

Don’t put anything on the net that you don’t think increases its value.

New feature wishlist for Google Reader

I’ve been thinking some more about the whole information overload, autogenerated echo, crossposting thing.

I’ve come to the conclusion that I don’t want RSS feeds aggregated for me on yet another web service, and I don’t want to take every feed from every person and have to filter them (and their duplicates) out myself. In short, I want all my information in one place: custom search feeds and the like as well as people’s RSS, news as well as Flickr tag feeds.

I like the Google Reader experience, I like that it’s in sync across my laptop, my phone, other computers. Google Reader could blow FriendFeed and others away if it implemented a few new features.

Here’s my new feature wishlist for Google Reader:

  • The ability to filter feeds as they come in (by location would be great: I have a lot of searches for “Birmingham” and only want the UK versions).
  • The ability to remove duplicate items from different feeds (and to choose which “original” version remains). Two examples: blog/news results in my search feeds when I already subscribe to the originating feed; and auto-generated posts, such as tweets in friends’ Facebook statuses or “daily links” posts in blogs whose feed I already subscribe to.
  • Filters to “mark as read” posts (similar to Gmail). By tag would be fine — Google Reader’s search feature is brilliant (allowing you to search within everything that’s come through), and there are things I’d like to be able to search (obscure news feeds, heavy feeds like Digg content) without having them as outstanding posts to be read. You’d be building up your own subset of the web.
  • See other people’s notes (if shared, of course) — the new notes feature is great, and you could have a conversation in the notes if you could see other people’s. A little like the “comment on anything” stuff that people are hot for on FriendFeed.
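The duplicate-removal item on that list could work roughly like this — a sketch, not Google Reader’s actual behaviour. I’m assuming a simplified item format (dicts with a title and a link) and matching on a normalised link or title, keeping the first copy seen as the “original”:

```python
# Sketch of the duplicate-removal idea: keep the first copy of an item
# (the "original"), drop later items whose link or normalised title
# matches. The item format is a made-up simplification of a feed entry.
from urllib.parse import urlsplit

def normalise_link(url):
    """Ignore scheme, query string and trailing slash when comparing links."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path.rstrip("/")

def normalise_title(title):
    return " ".join(title.lower().split())

def deduplicate(items):
    """items: list of {'title', 'link'} dicts, in order of preference."""
    seen = set()
    kept = []
    for item in items:
        keys = {normalise_link(item["link"]), normalise_title(item["title"])}
        if keys & seen:
            continue  # duplicate of something already kept
        seen |= keys
        kept.append(item)
    return kept

items = [
    {"title": "My post", "link": "http://example.org/my-post"},       # original feed
    {"title": "My Post", "link": "http://example.org/my-post?utm=x"}, # search-feed copy
]
print(len(deduplicate(items)))  # 1
```

Ordering the input by preference (subscribed feeds before search feeds) is what makes “which original version remains” a choice rather than an accident.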

Bookmarks for 26th February through 28th February

These are my links for 26th February through 28th February: