Just mumblings and grumblings.

15.12.05

Experiment with T-Shirts

T-shirts are one of the more interesting forms of expression in the current age. Where once we were required to actually stop and talk to each other to exchange political ideologies, favorite bands, and the occasional joke, we now simply emblazon our chests with a picture or slogan and broadcast it to the world. Perhaps it's just a symptom of our being bombarded with advertising since birth, but I for one am a t-shirt addict.

Some of my favorites from my personal collection:


As you can see, my personal aesthetic leans toward the humorous, obtuse references to sub-cultures and the non-sequitur. I'm constantly on the hunt for new shirts for myself and my friends and sometimes walk a thin line between "artistic expression of ideas" and "just plain tacky".

I'd come across CafePress before when looking for t-shirts. The idea is quite clever. They allow anybody to upload images which you can then either buy for yourself or re-sell in a little boutique shop. They take care of all the details like charging credit cards, printing the shirt and shipping it.

Their catalog of shirts is MASSIVE. With some 18 million products in their marketplace (they don't just do shirts, they also do mugs, mousepads, buttons, books, CDs, etc.) you can be sure to find something in there that catches your fancy, and if you don't, your dream shirt is only a Photoshop session away.

I've often thought about creating a few shirts for myself and maybe even making a few bucks on the side, but I don't have any illusions about my artistic talent or my ability to come up with ideas that can stick out amongst 18 million, so I never bothered.

That is, until I decided that it'd also be a decent experiment in how difficult (or easy) it is for a boutique web-store with (hopefully) a few worthwhile products to succeed in the current state of the internet.

So I took the leap: Clash Culture T-Shirts.

I'll let you know how it goes in follow-up posts.

30.6.05

Internet Polls

I've always liked internet polls. There's no more succinct or simple way to package so much information about what people are thinking and how they feel about it.

Having written a lot of generic content-management code for Webvolver recently, I thought I'd take a stab at a public internet poll engine with some user-moderation features. I came up with this:

StickyIssue.com

It's sort of an internet poll wiki where people can anonymously submit a poll, vote on existing polls, comment on polls, and perform some self-management functions like marking polls or comments as spam.

27.6.05

7 Ways to Tell If You Are a Political Idiot

I love an intelligent debate, especially with someone whose viewpoint is completely at odds with mine, but I've come across far too many political idiots on all sides of every argument lately, and I have to let off a little steam. Here's a short list of how to tell if you (or someone else) are a political idiot:


  1. Think about a political argument you feel very strongly about. Abortion, gun control, gay marriage, whatever. Is there at least one minor point where you can agree with someone of the opposite viewpoint? If not, you are an idiot. You place either your emotions or your preconceptions ahead of all evidence and/or logic.
  2. Do you know who your city councilman/alderman/commissioner is? This person probably has more influence on the laws which govern your home, your schools, your roads, your job and your taxes than any other single person in government. This person is also probably one of your most accessible office-holders, since he/she probably personally reads his/her email and keeps an open office in your very own local city hall.
  3. If you are a Democrat and never voted for a Republican, or vice-versa, you are an idiot. Nothing wrong with picking a side but you need to educate yourself on what's going on on both sides of the aisle and the time will eventually come when a moderate from the other side is going to represent you better than the radical on your own.
  4. Have you ever compared anyone to Hitler, called them a Nazi or called something fascist? If so, you are an idiot. There's nothing wrong with using a few emotional words to emphasize your point, but you've slid right over the edge into demagoguery.
  5. If you don't know what the difference between a Democracy and a Republic is, you are an idiot. As a corollary, if you think there is absolutely no reason for the US electoral college, you are an idiot. That's not to say you can't disagree with the electoral system, but you should do so because you want the US to shift towards a purely Democratic system rather than its historically Republican one. (If you are confused about Democratic/Republican as political systems vs. political parties, you are also an idiot.)
  6. If you have an opinion about global warming, population control, free trade, tax reform, nationalized healthcare, or just about any other political issue and have not considered the economic as well as social and environmental impacts of what you believe, you are an idiot. In fact, if you have never voted in favor of a person or policy which you personally found distasteful you probably are in this class of idiots.
  7. If you think the US Constitution is the ultimate answer to all political questions, you are an idiot. It's a fine document which has stood up remarkably well for more than two centuries, but never forget that the Bill of Rights consists of amendments that very nearly didn't pass, and that it took more than 80 years for the document to be amended to give blacks the right to vote and more than 130 for it to give women the right.

A Better Slashdot

Anyone who regularly visits Slashdot knows that it is a unique combination of profound commentary, hilarious quips, scathing wit, and what can only be described as textual brutality. One of the first true group blogs (dating from 1997), it serves up technology related news for the geek crowd to peruse and comment on.

While it has improved over time, the formula is basically the same. Users submit articles to a group of editors. The editors revise and post appropriate articles to the site. Users can then comment on each article. A self-moderation system is used to weed the sometimes 1000+ comments under each article to a more manageable level.

Yet while Slashdot's system is very good, we've come a long way in the 8 years since its inception. So is there a way we can build a better Slashdot? With the personalization techniques seen in recommendation systems and from what we've learned of social networks, I think we can...

Goals

Let's start by listing Slashdot's strengths:
  1. Articles are usually relevant (to its audience) because of filtering by editors.
  2. Anyone can contribute by submitting articles or comments.
  3. The plethora of commentary is diverse and interesting.
  4. Each user's comments are moderated and contribute to an overall karma score which allows better contributors to float to the top.
And now its weaknesses:
  1. The sheer volume of content is almost impossible to read without a full-time commitment.
  2. Sometimes even the very best comments get buried because they are never moderated.
  3. The volume of "bad" commentary outnumbers "good" commentary by at least an order of magnitude. What is "good" and "bad" is, of course, subjective.
  4. The user moderation system is reserved to only a handful of users each day.
  5. The editorial culling of articles means editorial bias.
I think you can boil this all down into one key problem: content overload.

To be more precise, it's not that there is too much content per se, but that there is too much content you must view in order to get to the content you want to view. Even editorial bias is just a corollary of this problem, since if you had a way to put submissions in front of users without compromising the overall relevance of the site, you wouldn't need to use an editorial staff.

Ironically, Slashdot is probably successful for just these reasons. It just so happens to strike the current best balance between lots of content and relevant content.

That is not to say that it couldn't do even better...

Solution

Start with a user-contributed news list that allows users to discuss each article, just like Slashdot.

We first need to winnow all those articles down into something manageable. One interesting way to filter without discarding a lot of content out of hand is personalization. We're all familiar with recommendation systems as used in online stores like Amazon and iTunes to suggest items based on the activity of other users. This approach could be applied to articles on our site in much the same manner. Article feeds could be personalized to each user based on what other, similar users like or dislike.
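To make the idea concrete, here is a minimal sketch of user-based collaborative filtering in Python. Everything in it is an assumption for illustration: the user names, the article IDs, the +1/-1 vote encoding, and the choice of cosine similarity are not taken from any real site, and a production system would need far more data and care.

```python
from math import sqrt

# Hypothetical ratings: user -> {article_id: +1 (like) or -1 (dislike)}.
RATINGS = {
    "alice": {"a1": 1, "a2": 1, "a3": -1},
    "bob": {"a1": 1, "a2": 1, "a4": 1},
    "carol": {"a1": -1, "a3": 1, "a5": 1},
}


def similarity(u, v, ratings=RATINGS):
    """Cosine similarity over the articles both users have rated."""
    shared = set(ratings[u]) & set(ratings[v])
    if not shared:
        return 0.0
    dot = sum(ratings[u][a] * ratings[v][a] for a in shared)
    norm_u = sqrt(sum(ratings[u][a] ** 2 for a in shared))
    norm_v = sqrt(sum(ratings[v][a] ** 2 for a in shared))
    return dot / (norm_u * norm_v)


def recommend(user, ratings=RATINGS):
    """Rank articles the user has not rated by similarity-weighted votes."""
    scores = {}
    for other in ratings:
        if other == user:
            continue
        sim = similarity(user, other, ratings)
        for article, vote in ratings[other].items():
            if article not in ratings[user]:
                scores[article] = scores.get(article, 0.0) + sim * vote
    return sorted(scores, key=scores.get, reverse=True)
```

In this toy data, bob agrees with alice on everything they share, so his pick floats to the top of her feed, while carol disagrees with her, so carol's pick sinks to the bottom.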

Assuming such a system works well, this allows us to open the article submission floodgates and depend on the personalization system to weed out spam, uninteresting or irrelevant content on a user-by-user basis. This would remove any external editorial bias yet still allow for a highly varied mix of content to coexist.

The recommendation approach could work for comments also, but we can't realistically expect at least one user rating for every comment in the system, which is a minimum requirement for such a system to work. (And besides, is a good comment necessarily a well liked one?)

Instead, we could institute a 'reputation' system similar to Slashdot's 'karma' system. Users would be able to rate comments and this would contribute to an overall reputation score of the comment's author. Unlike Slashdot, we'd open moderation to all users all the time.

Rather than hide poorly rated comments, we'd emphasize the highly rated ones using colors, larger and bolder text. This provides a positive feedback mechanism for good users, without overly punishing new or recently reformed users.
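A rough Python sketch of such a reputation scheme might look like the following. The data model, the +1/-1 rating values, and the score cutoffs for each display style are all assumptions made for illustration, not a description of how Slashdot or any other site actually works.

```python
from collections import defaultdict

# Hypothetical in-memory data model.
comment_author = {}                  # comment_id -> author name
comment_ratings = defaultdict(list)  # comment_id -> list of +1 / -1 ratings


def post_comment(comment_id, author):
    comment_author[comment_id] = author


def rate_comment(comment_id, value):
    """Any user may rate any comment at any time; value is +1 or -1."""
    if value not in (1, -1):
        raise ValueError("rating must be +1 or -1")
    comment_ratings[comment_id].append(value)


def comment_score(comment_id):
    return sum(comment_ratings[comment_id])


def reputation(author):
    """An author's reputation is the total score of all their comments."""
    return sum(
        comment_score(cid)
        for cid, name in comment_author.items()
        if name == author
    )


def emphasis(comment_id):
    """Pick a display style; poorly rated comments stay visible as 'normal'."""
    score = comment_score(comment_id)
    if score >= 5:      # assumed cutoff
        return "large-bold"
    if score >= 2:      # assumed cutoff
        return "bold"
    return "normal"
```

Note that nothing is ever hidden here: the worst a comment can do is render in the default style, which is the "reward rather than punish" choice described above.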

In addition, users with good reputation could receive other perks, like preferred sorting of articles that they've authored or even access to premium site features.

Of course, there is going to be a certain amount of outright spam in our articles and comments, so we ought to provide a simple spam deletion mechanism. Allowing users to suggest items as spam and having the system automatically hide them after a certain threshold is probably sufficient.
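As a sketch, the threshold mechanism could be as small as this. The threshold of 3 and the one-flag-per-user rule are assumptions; the right values would have to be found by experiment.

```python
SPAM_THRESHOLD = 3  # assumed value, chosen only for illustration

spam_flags = {}  # item_id -> set of user ids who flagged the item


def flag_as_spam(item_id, user_id):
    """Record a flag; repeat flags from the same user count only once."""
    spam_flags.setdefault(item_id, set()).add(user_id)


def is_hidden(item_id):
    """An item is hidden once enough distinct users have flagged it."""
    return len(spam_flags.get(item_id, ())) >= SPAM_THRESHOLD
```

Storing the flagging users in a set keeps a single annoyed user from hiding an item by clicking repeatedly.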

My Money Where my Mouth Is

That's a lot of words, and a lot of would's and could's. So why not step up to the plate...

Shazaaam! Webvolver.com is born.

Webvolver adheres to the design I've laid out above along with a few user-interface tweaks to make browsing comments a bit less tedious. I could go on and on, but I think the site speaks for itself, especially its FAQ page.

Obviously, as a fledgling website it lacks the content to truly compete against Slashdot, but after cursory testing with friends, I feel it's certainly a great start.

So feel free to use it or abuse it and you can be the judge of whether or not it's really a better Slashdot.

16.6.05

The Record Industry is Doomed

There are any number of opinions on who's right/wrong or winning/losing the current battle between the record industry and file sharers. But I can't help thinking everyone is missing the point a little. The recording industry as we know it is doomed, whether we like it or not.

Why the Recording Industry Works Now

The recording industry works by signing artists, subsidizing the production, distribution and promotion of their music and taking a cut of media sales. We can argue the ethics (or lack thereof) of the industry, but in the main this system benefits both consumers and artists.

Most up-and-coming artists simply lack the resources to promote their work on any kind of wide scale. In the media rich environment we live in, gaining public consciousness is a tough (and expensive) nut to crack. This is doubly the case if the artist's work doesn't have mainstream appeal.

Consumers benefit because there simply isn't enough time in the world to listen to all the music out there, and even if you could it wouldn't be a very rewarding experience when perhaps 90% of even recorded and distributed music is crap.

Doom Looms

Unfortunately for record labels, this state of affairs all depends on a slippery premise: That media itself has intrinsic value*, apart from the content it contains.

When something has intrinsic value, it is a natural and simple process to assign a price to it:
Price = V * K, where K is some factor > 1.
Distributors and retailers can then pump up K to account for the subjective value of the product (how good the music is) with a healthy (some say too healthy) profit margin for themselves.

This makes sense to consumers who already buy commodities like food or computers that are priced in just this way. It also makes sense in the market since if a company puts K too high, a competitor is sure to come along that prices more aggressively.**

So what happens when V = 0? And when anyone with a home PC is able to create a perfect digital copy and distribute it via the web, V is very very close to zero.

The answer is the recording industry has received its death warrant.

Labels could shift to a fixed pricing scheme (à la iTunes, Napster) but such systems will always have to compete in the market against P2P sharing systems that charge nothing (really, they just subscribe to V*K, they're only free because V = 0).

DRM solutions, regardless of whether they ethically retain consumer rights, are inherently flawed, since no such system can prevent the consumer from eventually subverting it (if need be with a second computer and a microphone).

The RIAA's current tactic of suing individual file sharers can never be effective since the cost of identifying and suing every file sharer is prohibitive. And they would surely have to sue just about every single file sharer on the planet to stop P2P networks.

* - All forms of store bought media (CDs, records, etc.) have intrinsic value. Even if artists wanted nothing for their work, they'd still need to charge some nominal sum in order to pay for pressing, packaging and distribution of the media.

** - It's worth commenting here on the recording industry's consistent use of price fixing and other monopolistic tactics to force out competitors. The FTC's statement regarding the recently settled price fixing case against a few major distributors is available here.

Speculation

I think the only possible answer is that the price of music must drop dramatically. There still exists an equilibrium point where the intrinsic value of a music download service (large catalog, availability, download speed, recording quality) can pump V back up enough to make a pay-to-play service competitive against a free P2P network in the free market. But 99 cents per track ain't it, at least not in this consumer's view, especially when the recording industry is only passing on a few of my cents to the artist. Lower prices = lower margins = no room for the recording industry.

iTunes and Napster already offer independent labels the ability to directly sell their music, but artists are still forced to go through an intermediary service (CDBaby.com being a popular one), pay for their own production in a studio and have few options to promote or set the price for their tracks.

I think music services will begin to adopt a more direct, middleman-free approach to listing independent artists and perhaps directly consolidate some of the production and promotional services under one roof. This would allow an artist to be truly label-free, if they so choose, and why wouldn't an artist choose?

Independent labels will still exist, but these labels will compete much more fairly in the open market than current labels, and hopefully consume far less of an artist's hard earned profits.

10.6.05

Who are You Online?

The internet is a pretty amoral, often moronic, and sometimes scary place. Why should this be? Shouldn't the internet merely be a reflection of our society and culture onto a computer screen?

Call me an optimist, but I'm not quite ready to buy the argument that 'people are amoral, moronic and scary'. I think it has more to do with anonymity, and anonymity leads to lack of consequences, and no consequences means bad behavior.

In the real world there are many social restrictions on how a person behaves. Laws and religious codes are the more obvious examples, but there are also any number of unwritten cultural rules that range from how one should respect a person's privacy to what clothes are fashionable this season. People who break these rules are punished by being ostracized or ignored. This is by no means a perfect system (as any high school geek can tell you), but it's hardwired into our brains' emotional systems, so we're stuck with it.

On the internet, all these rules are relaxed or erased. While civil law still applies, social and cultural mores can have no consequences when all the participants are anonymous. People behave badly because there is no reason for them not to.*

So how do we make the internet a friendlier place?

Reputation.

Imagine if everyone had a backlog of all the posts they'd ever made to websites that everyone else could trace through. A user would think twice before posting something nasty or foolish. On the flip-side, people would be more likely to listen to someone whom they could see had made intelligent and well-behaved posts in the past.

This immediately sets off the privacy buzzer in some people's minds, but there is really no need to associate this online persona with a real person. Anyone who's played a massively multiplayer online game can attest to how a virtual persona can be anonymous yet still have a reputation to be accountable to.

In fact, most websites already require people to create semi-anonymous user profiles in order to post. The key is to tie all these internet identities together into one cohesive and searchable whole. There need be no limitations on the number of personas you could create nor which personas are used where. People who want to behave badly could just create fresh profiles for every site, but readers could have the option to filter out people without an online reputation.

Such a system could insert reputation into the internet without destroying the anarchic playground that we all know and love. From reputation comes respect and the ability to mute all the shouting, the whiners and the just plain psychopaths infesting the web today.

* - I suspect the biological reason for this is that we are creatures of emotion. When considering our actions, emotional factors get summed up and the strongest one wins. Shame is a very strong emotion, so it often wins in real-world circumstances, but on the internet it is muted so that more visceral emotions like anger, lust, ego win out.

Addendum

The best part about such a system is that the infrastructure is already in place via public-key cryptography. (I'll leave a write-up of how the technology could be used for a later post.) The only gap is a standardized way of associating keys with personas and personas with websites.

There are a few notable single-sign-in systems that I've seen which are kin to what I've described, though they aren't intended necessarily as reputation building systems:
  • TypeKey is a blog focused identity system that allows blog owners to restrict comments based on a centralized registration list.
  • OpenID is an open source effort that allows users to create their own individual profile servers that websites could authenticate and pull profile information from.

5.6.05

Searching, Sharing and Stumbling

Over the last couple of weeks I've been nosing around the internet looking for places to announce my website (KindaKarma.com) and did a lot of musing about how people find things on the internet.

When you really think about it, websites are just a collection of isolated islands in a vast ocean of information. The island inhabitants build bridges from time to time from one island to another, but such bridges are one-way, and even some of the coolest websites are doomed to anonymity and an early death.

Fortunately, there are a few good ways to get around the archipelago:
  1. Search Engines (Google, AskJeeves)
  2. Web Directories (Zeal, Yahoo Directory, DMOZ)
  3. Blogs (Slashdot, Joe Blow Blog)
  4. Social Bookmarks (StumbleUpon, del.icio.us)
Search Engines

Everyone knows what a good search engine does. It finds websites based on the keywords they contain. That's not quite enough though, since you'd just end up with a huge list of links of which perhaps only a handful are really interesting. So search engines also sort results by relevance. Google makes a good effort there by assuming that pages with lots of links to them have higher relevancy.
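The two steps described here, keyword matching followed by sorting on inbound links, can be sketched in a few lines of Python. This is a toy under stated assumptions: the index, the URLs and the link counts are invented, and a raw inbound-link count is only a crude stand-in for what a real engine such as Google computes.

```python
# Hypothetical mini-index: URL -> (page text, inbound link count).
PAGES = {
    "example.org/tshirts": ("funny t-shirts and mugs", 12),
    "example.org/shirts": ("plain shirts for sale", 40),
    "example.org/polls": ("internet polls about t-shirts", 3),
}


def search(query, pages=PAGES):
    """Return URLs whose text contains every keyword, most-linked first."""
    terms = query.lower().split()
    matches = [
        url
        for url, (text, _links) in pages.items()
        if all(term in text.lower().split() for term in terms)
    ]
    # Rank by inbound link count as a simple proxy for relevance.
    return sorted(matches, key=lambda url: pages[url][1], reverse=True)
```

Even in this toy, the weakness is visible: a brand-new page with no inbound links always sorts last, however good it is.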

Search engines aren't a panacea though. New sites, even very worthy ones, have a long road ahead before they can come anywhere near to the top of the pile. Compounding this problem, there are also a lot of junk sites out there that exploit how search engines work to garner higher placement even though they might not be particularly relevant to the user.

Web Directories

Web directories are basically just large lists of websites sorted into categories. This provides a sort of rudimentary index to the web. Since two-thirds of the web is spam, most directories go through a filtering process to ensure that their listings are relevant and accurate.

This is all well and good, but at last glance there were more than four and a half million websites listed in DMOZ and that's AFTER filtering. Such an overwhelming list is bad all around because it's a phenomenal amount of work for editors, a bitch to wade through for users and even the best websites are little more than a link among many similar links.

Blogs

Blogs come in many flavors, from the communal news sharing that is Slashdot, to the Joe Blow man on the street's personal diary. A perhaps unintended consequence of blogs is that they tend to be textual (read: keyword-rich) repositories of links to interesting or useful information. They make for a rich source of cross-linked data for search engines to chew through. This results in better search results, and if users happen to find a blog they particularly like, they get a human-edited list of interesting links.

For the webmaster though, this is a zero-sum game. While many blogs allow users to submit their own posts, blog communities are continually bombarded by promotional posts, making editors extremely wary of posting anything even remotely commercial. Users get stiffed with having to sometimes wade through banal, redundant or downright moronic blogs for information.

Social Bookmarking

Social bookmarking (wikipedia article) is a relatively new concept that is being approached in a variety of ways. In essence, users share lists of links with each other, usually in some kind of category or keyword framework. The categories and keywords keep these lists relevant, while users keep out spam by only submitting worthy links.

One site, StumbleUpon, takes a pretty novel approach to this by using collaborative filtering in combination with social bookmarking. Users install a toolbar in their browser that gives them 'stumble', 'I like it' and 'not-for-me' buttons. Users click 'stumble' to get a randomly selected page based on what similar users liked, then vote the page up or down. It works quite well, though it's a much more time-intensive approach for users, as many of the websites you visit will not be interesting.

Another interesting site is digg, which adds a rating and ranking system that provides robust relevance filtering. Each bookmark also gets a sizable write-up, allowing the user to skim lists before clicking. Sadly, the site appears to be a victim of its own success, as page load times are extremely slow as of now.

Unfortunately, all these systems are in their infancy. Many suffer from terrible UI design that makes browsing a frustrating affair. Others lack any kind of moderation, which leads to less and less relevant links (I've yet to see a significant amount of spam on these systems, but I feel it's only a matter of time). For the webmaster, they represent a great new way to share links, but they'll need to be active members of each community to give their links any kind of prominence.

Best of All Worlds

So is there a best of all worlds solution that would cure all these ills? Perhaps. Let's lay out what our goals would be in creating a SupraSearch website.
  1. Relevance. Any link would need to have high relevance to users' searches.
  2. Webmaster Friendly. The site would need to have tools for webmasters to promote their sites.
  3. User-Friendly. This means a lot of things, but generally we want the user to have to click on as few things as possible to get where they want to go.
  4. Fairness. We want all websites, new and old, to have a fair chance to vie for a user's attention, and any advantages they receive should be on merit.
  5. Abuse-proof. Spammers are everywhere, we don't want them to destroy any chance of achieving our other goals by exploiting the system.
So let's take a stab at a solution.

An obvious first step is to hybridize all the different approaches into a single mutant super-website. How about a classic search engine that only crawls websites that are in a moderated list of links (search engine and directory, check)? The list of links would be built and managed by a community of civic-minded users and webmasters who contribute links, filter out spam and just generally keep relevance high (blogs and social bookmarking, check).

Every user would get an anonymous profile through which they search, submit, and moderate. Each website would have a profile, associated with its authors' profiles, that users could track, comment on, and score.

At the heart of such a system would be a relevancy score, which would drive the ranking of results in searches and directory listings. But relevancy is a slippery and subjective concept, so let's break it down a little further:
  • Contextual Relevance (CR). How well a result matches the context of what a user is looking for. Keywords and categories can be used to calculate this component.
  • Popular Relevance (PR). A key measure of how relevant a link is. How often people click it, and how often other people embed it in their own pages.
  • Author Relevance (AR). How highly regarded the author of the website is. Known spammers score low, important scholars or companies score high.
  • Quality Relevance (QR). Not all links are created equal, some are better than others.
  • Freshness Relevance (FR). Information often degrades with time. A philosophical dissertation might be relevant always, but a technical paper might become obsolete in just a few years.
Our SupraSearch has to start somewhere, so let's try this:
Link Relevance Score = (A * CR) + (B * PR) + (C * AR) + (D * QR) + (E * FR);
A-E are just constants to allow for tweaking. All variables are scaled from 0 to 1 and the sum of A-E must equal 1.
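The formula translates almost directly into Python. In this sketch the weights are keyed by the component they multiply (so "CR" holds A, "PR" holds B, and so on); the function name, the dictionary layout and the example numbers are assumptions made for illustration.

```python
from math import isclose

COMPONENTS = ("CR", "PR", "AR", "QR", "FR")


def link_relevance(scores, weights):
    """Weighted sum of the five relevance components.

    scores:  component name -> value scaled to the range 0..1
    weights: component name -> constant (A-E in the formula), summing to 1
    """
    for name in COMPONENTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not isclose(sum(weights[name] for name in COMPONENTS), 1.0):
        raise ValueError("weights must sum to 1")
    return sum(weights[name] * scores[name] for name in COMPONENTS)


# Hypothetical example: a well-matched, popular page by an unknown author.
example_scores = {"CR": 0.9, "PR": 0.8, "AR": 0.2, "QR": 0.5, "FR": 0.6}
equal_weights = {name: 0.2 for name in COMPONENTS}
example_result = link_relevance(example_scores, equal_weights)
```

Because every component lies between 0 and 1 and the weights sum to 1, the final score also lies between 0 and 1, which keeps results comparable across searches.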

CR and PR are the classic keyword and backlink relevance scoring done by almost all current search engines.

AR, QR and FR would be managed by the community of moderators. For simplicity's sake, let's assume that every user gets a vote on each score, and the total score is just the average of all users' votes. An obvious next step would be to create a relevancy score for each USER and use a weighted average, but this is another discussion altogether.
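Both variants, the plain average and the weighted average hinted at as a next step, are easy to sketch in Python. The function names and the choice to return 0.0 when nobody has voted yet are assumptions for illustration.

```python
def average_score(votes):
    """Plain average of user votes, each scaled to 0..1.

    Returns 0.0 when there are no votes yet (an assumed default).
    """
    return sum(votes) / len(votes) if votes else 0.0


def weighted_score(votes):
    """Average of (vote, user_weight) pairs, weighted by each user's weight.

    user_weight stands in for a per-user relevancy score.
    """
    total_weight = sum(weight for _vote, weight in votes)
    if total_weight == 0:
        return 0.0
    return sum(vote * weight for vote, weight in votes) / total_weight
```

With the weighted version, a trusted moderator who scores a link at 1.0 can outvote a throwaway account that scores it at 0.0: `weighted_score([(1.0, 3.0), (0.0, 1.0)])` gives 0.75, where the plain average would give 0.5.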

A-E you either tweak to get the best results, or you might even allow each user to set their own factors search-by-search.

Summary & Conclusion

There are any number of other design elements we could discuss endlessly, but the important features here are that we've hopefully excluded spam sites by using an 'In List' and added a human element to the calculation of website relevancy.

Users get more relevant results and webmasters get something better than the one line submission form most search engines offer and the opportunity to influence their scores directly through the moderation community.

Like all things, SupraSearch still has some chinks in its armor:
  • Performance of this type of system might be horrendous since it takes into account so many factors for each search.
  • Abuse prevention still isn't addressed fully.
  • Calculation of the different relevance scores might be very difficult to balance.
  • Will the company that owns SupraSearch be scrupulous enough to not use paid listings or user accounts?
Still seems worth a shot to me...