Can I write a paper for academic publication?

I was recently presented with this possibility by Phil Fry of the Business and Economics Department at BSU.  He brought it up out of the blue, so it took me a bit by surprise.  I always looked at academic publication as something that would be a forced process (like my thesis), rather than something I would enjoy doing; but I think he has a point, so I'm seriously considering topics that might be worthy of that sort of research/work.

I have been meeting with Phil on statistical matters concerning Booklamp for over a year now.  He has helped me refine the S.I.F.T. Engine that I developed with Booklamp - the math and mechanics behind the current internal version of Booklamp's recommendations.  He has helped me with ideas regarding economic modeling of the data I have available, and other more specific statistical curiosities - e.g., quantile regression.

The thought of publishing something left my mind after I finished my master's thesis, which was a brutally stressful process.  When that was over I just said, "finally, I can go to Booklamp and do what I want to do."  Well, it's been a year now and I do feel I have something to add to the broad world of academic knowledge.

Our Director of Research, Matt Jockers, is in town for a regional conference on Irish literature to present a paper.  I went to the block of presentations that he was a part of, and it was interesting to hear academic presentations from people who are not economists... the talks were much more "real" than the heavy abstractions from the field of economics that can be difficult to listen to at times.  The presentations were pretty entertaining, to say the least.

The first presenter talked about literature as a "technology of the settled," making references to the knowledge that settled peoples develop language and this technology differs between different "settlements."  I'm paraphrasing, but that's what I got.  She also mentioned the "profit" of literature, but in an abstract sense - not necessarily in terms of money.  These two ideas, the technology and profit of literature, are inspiring some interesting thoughts from an econ education.  I can certainly come up with some unique substance concerning, "Literature: Technology and Profit."  A topic like this very much reflects the sort of abstract sense of value that I've developed in and after grad school.

If you are thinking, "literature... technology... what?" then I think it is important to realize that literature is very much a technology.  The Internet, for example, is produced entirely using a form of algorithmic literature - no words (think, code), no Internet, period (at least as we know it).  When we think about "literature" though we don't necessarily think code, we think 'books' and 'emotional stimulation', or academic literature brings to mind intellectual stimulation.

The "profit of literature" is something that is probably much more difficult to define.  It is easy to say that we really enjoyed a book, but... "how much did you enjoy the book?" is an entirely different question.  I would argue that, unless you read on a constant basis (a book a day, or so), most of us would have a difficult time answering this question.

Then, as an economist, I want to know how much the book cost, which is not to be confused with price alone.  The cost of a book includes price, but also the time commitment, and possibly other factors.

So... what is the profit of literature?  These Irish studies academics seem to have an understanding of it.  I figure I can find something to talk about here.  We will just have to see.

Pick o' the Post: WYLF (What You're Looking For) by Mofro on JJ Grey and Mofro

I really like me some Mofro... love the blues and they do it so well and make it all sound so diverse (to me).  I guess my favorite albums from them are the first two, Lochloosa and Blackwater.  I got a bit annoyed when they split the credits between JJ Grey and Mofro, but meh...

Tongue Switching... what is it?

Hello world.  We're still doing cool stuff at Booklamp, but I want to talk about my ideas for advancing harmonica technique today.

I've been on a mission for the past ?2 years? to explore a harmonica technique called "tongue switching," which is usually referred to as a "shimmer" in a traditional context.  For example, you can play the following riff, and this would be a tongue switching riff.





These are pretty basic.  The idea is that, with tongue switching, you can play non-adjacent notes.  The innovation, if you can call it that, is the idea that at any time your mouth can cover 3-4 (?5?) holes at once... why not have all of them at your immediate disposal?  The biggest difference with a tongue switching approach to the harmonica is that there is much less head movement.  Phrasing takes on a whole new perspective when you have 3 different ways (tongue-block left, tongue-block right, and lip-purse) to play a single note, or a series of notes.

I was exposed to a lot of metal in college (death, tech-death, prog-tech, etc.), and the harmonica seemed like it was stuck in some rut of tradition.  The only real outsider is John Popper, and his technique is so far beyond the rest of the field (in a different way, to be fair), that no one's really tried to replicate it.  Well, LD Miller is as close as it gets, I guess.  I don't follow the profession as much any more, so there could be some others I'm not giving credit to.

I feel a bit unique in my ideas, but I could be wrong there also.  Breathing aside, there are 4 basic tongue switching moves that I've discovered:

1) tongue-block left (blow/draw out of the left corner of the mouth) to tongue-block right.

2) tongue-block right to tongue-block left

3) tongue-block left to lip-purse

4) tongue-block right to lip-purse.

... of course, then you add the two breathing directions to make 8 individual moves, then the overblows and overdraws that are available at certain places on the harp, which makes for a lot more moves and a lot of necessary practice.  Once you get used to playing those first 4 pretty cleanly, the speed depends on your breathing.
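The count of 8 can be checked with a quick enumeration.  This is only a sketch: the labels are my own shorthand rather than standard harmonica notation, and overblows and overdraws are left out.

```python
from itertools import product

# The four basic switches listed above, as (starting embouchure, ending embouchure).
# These labels are illustrative shorthand, not standard notation.
SWITCHES = [
    ("tongue-block left", "tongue-block right"),
    ("tongue-block right", "tongue-block left"),
    ("tongue-block left", "lip-purse"),
    ("tongue-block right", "lip-purse"),
]

# The two breathing directions.
BREATHS = ["blow", "draw"]


def basic_moves():
    """Pair every breathing direction with every basic switch."""
    return [(breath, start, end) for breath, (start, end) in product(BREATHS, SWITCHES)]


if __name__ == "__main__":
    for breath, start, end in basic_moves():
        print(f"{breath}: {start} -> {end}")
```

Adding overblows or overdraws would mean extending BREATHS only for the holes where those notes exist, which is why the full list grows unevenly across the harp.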

I've experimented with incorporating overblows into my tongue switching, but it's really hard and I'll wait until I master some of the simpler concepts first.  Overdraws will probably be the last thing to add.  Bending is not terribly difficult, but hitting the notes with that kind of speed, power, and precision is not easy either.

I figure the best way for me to develop my ideas is to start a band, so that's what I'm trying to do up here in Boise.  I need something other than a metronome and my own head to play to.  Till then, I'll just keep filling out my technique.  I need to start getting more into scales and modes - all in time.

Pick o' the Post: "The Beacons" by Blues Traveler on North Hollywood Shootout

Check out Popper's solo at 1:39 ... not his most super-duper, fast, note-blurring, shreddy solo, but there's something that's just very deliberate and brilliant about it.  I like it.

PS  Here's a video from about 1.5 yrs ago that might help get the basic concept across.  My technique's developed quite a bit since then.  Right now I'm using a book with harp tabs for the Blues Traveler album Four, both to learn to think about the notes like John Popper does and to find and create exercises that cater to tongue switching.  All I'm really doing now is trying to play Run Around with tongue switching where it makes sense to economize the playing that way.  I'm about 2/3 thru the first solo, which I think is a bit more difficult than the outro solo, but I haven't gotten there yet.

Additional Note:

Just after this post, I discovered a way to tongue switch between 2 ADJACENT holes.  It's difficult because it amounts to a regular-ol-warble, but switching back and forth between right and left tongue-blocking on only two holes.  So, instead of covering 3 holes of the harp with your mouth, you have to cover only two and develop the muscle memory to tongue switch in a tighter space.

Let's say you wanted a 4D-5D warble.  Essentially, you would center your mouth over the separator between the two holes, blocking the 4D with your tongue (playing the 5D), then switching your tongue and blocking the 5D (playing the 4D), then repeat.  This can get very fast, and I feel it will provide more control than having to move the harp or your head to achieve the same effect - once the muscle memory has been developed.

I just started on this the other day, so I'll see where it goes.  Most of this technical development is an attempt to build muscle memory, since these are not obvious - and in many cases difficult - ways to play the harmonica.  Once the muscle memory is developed, then the music can flow.

Guess what world??? I'm an actor! ... i48 Boise

I'm not sure nodding counts.  I do a great job being a random dude.

Check it out... my spot starts at 1:29 and goes all the way to 1:30.5

I actually had a line, but I'm not a very good actor, and I'm glad they cut it... I think it's creative and lil funny they still put a random scene there.  I did suggest the "Dirty glass for a dirty job" line though.

Booklamp's neighbors, Yellow Box Studio, did the short film for a 48 hour film festival.  There were 50-some-odd entries ($50/ea), and participants receive a genre, character, prop, and one line of dialog - each of these must be used in the film... all music must be original.  And you have 48 hours to submit your short film.

Title: "Marshall Law"

Genre: Action/Thriller

Character: Casper Marshall, Attorney

Prop: white angel statuette

Dialog: "It's not like they're going to arrest you."

Marshall Law / 48hour film from Yellow Box Studio on Vimeo.

16 films made the showing last Sunday night and "Marshall Law" was one of them.  I'd've guessed there were 200-300 people (or more) in attendance.  Yellow Box got an award for best cinematography, and I walked down with them anyways... that was kinda cool.  I wonder if anybody was like, "hey, that's the random bartender with a coat and tie and no shirt." ... Actually that scene did draw a good laugh from the audience.

I'm happy the guys at Yellow Box let me hang around and help out where I could.  I held some lights and a diffuser and some other random stuff.  It was a good time.

World Cup Tracker and Increasingly Productive Water Cooler Talk

Have something tangible to talk about around the water cooler when the World Cup takes off.  Mint Digital says it will deliver its World Cup Tracker in time for this year's tournament.

Mint Digital hasn't revealed precisely what sort of data a user will be privy to, but I imagine a casual conversation about the US's match vs. England might call for a more informed dialog if you're using the World Cup Tracker.

Before: "The US seemed to do very well in the midfield, but Rooney was able to drop back quite a bit and help England stifle the US midfield - England did well adjusting to our gameplan."

After: "The US dominated midfield possession in the first 15 min, 70%/30%.  But, for the remainder of the half, Rooney spent 50% more time in the midfield than in the first 15 min - a calculated tactical change, of course."

I have no idea if England actually uses Rooney this way, but I think it makes a decent example.  My suspicion is that Rooney really spends most of his time on the last defenseman.  The example conversation might be more believable if it turned out that England was up 2-0 by the 15 min mark... however, if that's the case, it's very unlikely that the US midfield will have "dominated midfield possession ... 70%/30%" at that time.

Anything could happen... the US could win.

Pick o' the Post: "Ole"

Aerosmith Picks

This has very little to do with words, but the lyrics are great.  I can't believe I've never taken the time to listen to Aerosmith and Joe Perry.  Aside from the more technical side of things that define progressive genres today, I've tried to develop what "progressive blues" might be, and this is about as close as it gets to the path I've imagined... but it's from 1973.. ?  yeah.. I've got a lot to learn, I guess.  So far my first exposure to non-mainstream Aerosmith sounds very cool.

Grooveshark just grooved on over to "Write Me a Letter" and the progressive sound is composed so well around a very traditional blues sound.  Steven Tyler's voice is just plain cool.  Very briefly, this track felt a bit slow, but the whole band keeps making up for it in pristine blues rock 'n roll.

"Movin' Out" just started.  Reflecting on this track (and the ones up to here), Clutch sounds very similar; still not Aerosmith.  "Movin' Out" is okay.  It's very 'steady', but probably my least favorite track to this point - it has its moments, but they're brief.

Also, if I didn't know this was Aerosmith, I'd guess that the first track, "Make It", was a KISS song - Tyler's voice is a giveaway tho.

Pick o' the Post:  "One Way Street" by Aerosmith on Aerosmith (1973)

American Idol Non-sense

Pick o' the post: (see below)

Just tuned into my first American Idol today, so I'm not sure where they are in the season.  It looked like there were contestants left.  Each sang a song of their choosing, and a song picked by a judge.

Everything was carrying on as I'd recalled from previous encounters... song tracks or a couple instruments accompanying the singers; except for the final singer who got an orchestra and 5 or 6 backup singers.  The song was Hallelujah.

The singing was much more than I can do, but from the moment the singers flowed in from behind him, there was no way this guy was not getting a standing-O.  To be honest, I don't remember any of his singing, except for the end (because the end was replayed).  All I heard was the orchestra, choir, and the entire audience.

I'm calling bullshit on American Idol.

This is basically a conspiracy theory, but it's *actual* conspiracy theories that give a bad name to American Idol's mischievous goings on.

I'm speculating that the producers of American Idol have come to understand that they have the ability to influence the outcome of their contest through any number of factors... song choice, accompaniment, etc.  I don't know about song choice, but all things were not equal for these competitors.  One got an orchestra for God's sake.

Socialists are Backwards (I guess we already knew that)

Even the casual mention of the word 'socialism' sets off a flag immediately.  The word just has a negative sign next to it in my mind; and many more times than not the speaker doesn't really mean "socialism," or they're a little nutty.

Socialists have it all backwards.  Aside from the "why" in "why are we not a socialist society?", socialists imagine it is possible to strictly design society; have tried, and failed miserably.  The American economy certainly has aspects of socialism (social security, medicare, etc), but these were, at some point, last resort circumstances.

In Socialism's heyday (... I guess ...), information was very slow to travel.  From this perspective, a socialist government could have had no idea that it might someday be possible to distribute information to each and every one of its citizens.  Even so, history probably wouldn't look too much different... corruption always seems to creep its way into command economies.  Greed gets us all - and drives our economy (put generally).

Now that we can deliver and receive near-instantaneous information almost anywhere in the world, governance and society would benefit from information systems that inform us on some of our basic activity as citizens.  The first thing that comes to mind is privacy issues - I just recently came across this infographic about Facebook's privacy policy - but that's for another blog.

Imagine if you could observe your own water consumption against that of other similar households.  Skipping the R&D, legislation, and infrastructure needed to make that happen, we can imagine that everyone knows this information and can act on it as they wish.  There would be all sorts of behavior as a result of this.

I think I would work to consume less water than the typical household of my type (1000 ft.^2; just me) - more times than not.  There will be people who don't pay attention to it at all.  On aggregate though, this information would do us more good than bad with regard to conservation and sustainability.

Of course, R&D, legislation, and infrastructure can be prohibitively expensive, but who's to say that those things won't become less expensive - whether it's monetary costs, time costs, or simply the time cost of the diffusion of knowledge (I made that one up).  I'm to say... those requirements will become less costly in time.  It might seem that at some point in the next 30 years (because I can predict the future), informing the public on whatever information the public demands will be an important part of democracy.  We are seeing the beginning of this with the attempts at transparency that the Obama administration has made public.

There's no way to know what those systems might look like, but a goal of informing the public allows markets to function on the collective will of its participants.  Nothing a government can build - on its own - will be as clean as that last sentence conveys.  It's probably more accurate to imagine the government incentivizing the market to build these systems.

We already do this to some extent, but there are some very basic data - like water consumption - that could do a great deal of good for a market society.  The "ruling generation" can be afraid of the (sometimes irrational) vulnerabilities that this might generate and... the word, "socialism."  Even if it's not exactly the concept that comes to mind, it's the idea that someone, somewhere is gonna screw you.

On the topic of getting screwed, our market economy did a wonderful job making me feel safe when I got my first employment agreement... a contract... in 'legalese'.  I didn't even have to read it - the words just screamed, "FUCK U!"  We're never gonna stray too far from legal language in our society... governed.. by.. law.  However, the biggest hurdle for a company and a potential employee is jointly determining if the labor match is a good one.

I have one thing to say: Data

-In the long run, it's your theory that's dead.

You know what I really liked about Econ?  It wasn't macroeconomics.  I actually learned very little about macroeconomics - not entirely my fault.  It had to be the (visiting) professor's first time teaching (over from Iceland).  We couldn't understand him and he was very soft-spoken and I could barely pronounce his name.  A whisper and an accent.  I learned very little, and it all seemed as if economists spent 50 years crawling down an attractive rabbit hole.

I did take a course on the History of Economic Thought, which was very revealing.  I will probably never read another 500-page book in my life, but I read "The Origin of Wealth" - a bit about evolutionary economics, but mostly it just challenged and looked for solutions to the field's weaknesses.  The next time you're thinking, "I just want a good 500-page book on heterodox economics...", this book's for you.  Or, if a course syllabus lists it, I guess I'd recommend it.

"That won't be necessary.  I am the agenda."

Price Gouging and Public Opinion

My good friend Jake Russ relayed an article on his blog recently.  Basically, it's a real-world example of the public fighting (probably, unknowingly) the mechanics of a free market pricing system.

I don't know if it's the ignorance of the public in this case, or the mirror of our behaviors that this represents, but I really like the way the point is made... very clear.

Sorry Jake.  I liked your last two posts enough to post 'em here.

The Daily Show With Jon Stewart: "American Apparently"

... HAHA

Some Cool Processing Creations

I've begun my introduction to Processing, and have been searching the interwebs for cool little pieces of art produced in the environment.  I came across a couple that I think are worth sharing.  They're not exactly dataviz related, but they are a pretty stunning use of Processing.  This one appears to show the movie (?5th Element?) frame by frame across a wall.  I found this:

This one's a little weird, unless you're into (very) modern interpretive dance... interpretive augmented reality?

Here's an audio visualization of a song.  I can't tell if the entire thing is programmed, or if it's more.. custom to the song - where a lyric's appearance was crafted to appear when it did.  I think some iTunes (and WinMedia Player) visualizations have tried to incorporate song lyrics into their graphics, but I haven't seen one that's really awed me much.  This one seems ok tho...

I can't wait to work this into my analyst's toolkit.  Simply from the one tutorial I've completed, I can tell there's a LOT still to learn.  I don't need to know it all, which helps, but the things I do need to learn are going to take some time.

Pick o' the Post:  Cold Shot by Stevie Ray Vaughan on Couldn't Stand the Weather

Saturday, April 17, 2010

Processing and... processing.

Pick o' the Post: The Odyssey by Symphony X on The Odyssey

This is about as epic as a song can get.  Yes, more epic than Rush - gimme a break.. this is Symphony X.  This song is a 24min. musical rendition of Odysseus' adventures in The Odyssey.  I don't really know the story that well, to be honest, but I did get the privilege to experience this (entire) song performed live in Atlanta.  That is something I will never forget.  I was there to hear them perform their latest album Paradise Lost, which is a rendition of the classic epic poem.  The entire album is an amazing feat, but the title track is pretty cool for metal/non-metal audiences alike.

So, I got my hands dirty with Processing last night, and I'm glad I did.  I was following along with a post from a blog I follow, and I'm really excited to get some more experience here.  Processing feels like a bridge from working strictly within R or Stata or Open Office for plotting solutions.  Processing feels much more open to the imagination.  The difference is that there's quite a bit more programming involved in developing graphics in Processing than there is in R or Stata, and certainly Open Office.

I haven't figured out how to export images yet - actually, I ran across it once, but I'm too lazy to figure that out right now.  So, these are screenshots of the graphics I created with the tutorial.  First, I'll briefly explain the data and the analysis that's going on.  Jer sent a request out on twitter to have any interested followers tweet a random number (from their head):

He put the 225 human-generated random numbers into this (publicly available) google spreadsheet.  The tutorial works with data stored, more generally, in a remote location, like a google spreadsheet.  He cites not having to change filepath names when data moves on your own system as a good reason to try to keep data in a centralized location ("centralized" is relative).  I can relate to that sentiment... with much (if not all) of our data stored on our servers in the office, I know exactly where the data I want at any particular moment is, and don't have to fumble around trying to find the data I want to load into memory for analysis.

Anyway, from here Jer leads readers through a number of methods to analyze the data.  Below is row after row of machine-generated random numbers, and one of those rows is the human-generated data.  Each column is a number 1-99; each machine-generated row represents 225 random numbers.  The brightness of the ellipse indicates how many times that number is present in that set of 225 numbers.
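For anyone who wants to see the counting step outside of Processing, here is a minimal Python sketch of the idea.  It is not Jer's code: the function names, the brightness scaling, and the row size of 225 (chosen to match the 225 human-generated numbers) are my own assumptions.

```python
import random
from collections import Counter


def machine_row(size=225, low=1, high=99, rng=None):
    """Generate one row of machine-made random integers in [low, high]."""
    rng = rng or random.Random()
    return [rng.randint(low, high) for _ in range(size)]


def count_row(numbers, low=1, high=99):
    """Return one count per column: how often each value low..high appears."""
    tally = Counter(numbers)
    return [tally[n] for n in range(low, high + 1)]


def brightness(counts, max_brightness=255):
    """Scale counts so the most frequent number gets the brightest ellipse."""
    peak = max(counts) or 1  # avoid dividing by zero on an all-zero row
    return [round(max_brightness * c / peak) for c in counts]


if __name__ == "__main__":
    row = machine_row(rng=random.Random(0))
    print(brightness(count_row(row))[:10])
```

Swapping the machine row for the human-generated numbers gives the one row that is supposed to stand out.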

So, this is a crude, first round of analysis.  Obviously, it's damn near impossible to visually pick out the human-generated row.  It's the 37th row from the bottom of the image (36th row from the top).  I adjusted some parameters to highlight our dataset of interest - it's approximately twice as bright as the other rows in this image.  I also summarized some of the next plots into one graphic below this one, observing the increased visual definition that we can get from a bar graph, also adding color gradients to emphasize various bits of data.

Before I go any further with the tutorial, the idea of perspective was emphasized by Jer.  This is why we first observed the bright/dull points, then the bar graphs, which we then applied color to.  Comparing this bottom-most graph to 6 rows of machine-generated numbers, our human-generated data starts to look a little outlier-ish.  Our data is the top row of the next image.

It occurred to me that an extra step in manipulating this graphic might make this "outlier-ish" observation more clear.  Ordering the bars based on their height (color), we should be able to get a better idea of what the difference is here.  Without this additional adjustment, maybe it's that the observer is left to "calculate," in some sense, their own order to compare each row.
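The reordering step is simple enough to sketch in a few lines of Python.  This is my own illustration rather than anything from the tutorial, and the example counts are made up.

```python
def sort_bars(counts):
    """Return the bar heights ordered from tallest to shortest."""
    return sorted(counts, reverse=True)


def tallest(counts, k=5, low=1):
    """Return the k most frequent numbers as (number, count) pairs.

    Assumes counts[i] holds the count for the number low + i.
    """
    pairs = [(low + i, c) for i, c in enumerate(counts)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:k]


if __name__ == "__main__":
    example = [2, 0, 5, 1, 3]  # hypothetical counts for the numbers 1-5
    print(sort_bars(example))
    print(tallest(example, k=2))
```

With every row sorted the same way, the eye compares shapes instead of hunting for matching columns.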

Now, the dissimilarity is a bit clearer.  There are at least a few numbers that our human subjects tend to pick, seemingly, a bit more often than random.  Jer continues on to display two more visual representations of this same data to try to find some pattern.  First, using a grid with color gradients.  The first row is 1-10; second, 11-20; and so on.

Then displaying the same grid, but displaying the numbers (colored with a gradient) instead of the squares.

Aside from the Douglas Adams effect (#42), as Jer points out, the 225 followers who generated these random numbers seem to have chosen numbers ending in 7 quite a bit more often than we might expect.  He conjectures that there is something about the number 7 that seems "more random" to us (or, less generically, to his 225 participants).  Interesting though.
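One way to check the "ending in 7" hunch is to tally last digits and compare them against what a uniform pick from 1-99 would give.  The sketch below is my own, not part of the tutorial, and the sample list is invented for illustration.

```python
from collections import Counter


def last_digit_counts(numbers):
    """Return a list of 10 counts, one per final digit 0-9."""
    tally = Counter(n % 10 for n in numbers)
    return [tally[d] for d in range(10)]


def expected_share(digit, low=1, high=99):
    """Share of the integers low..high that end in the given digit."""
    matching = sum(1 for n in range(low, high + 1) if n % 10 == digit)
    return matching / (high - low + 1)


if __name__ == "__main__":
    sample = [7, 17, 27, 42, 37, 13]  # hypothetical picks, not the real data
    counts = last_digit_counts(sample)
    print("observed share ending in 7:", counts[7] / len(sample))
    print("expected share ending in 7:", expected_share(7))
```

Under a uniform pick, ten of the 99 numbers end in 7, so anything well above roughly 10% in the real data would back up the observation.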

I'm glad I made it through this (my first) Processing tutorial.  I'm looking forward to applying these tools and concepts to the unique data that Booklamp affords my imagination.

Have a wonderfully data-filled day.

I attended a close cousin's wedding this weekend (congratulations John and Callie), and on the trip out there I was pushing about how important collecting and understanding data will be to our society's future.  My father kept stressing that 'the ones who control whatever data it is will continue to have the ability to manipulate it'.  Yes, maybe, but with a diminishing effectiveness.
FlowingData posted an article today about TransparencyData, a new project aimed at making data more accessible to the public.  TransparencyData is one of a number of projects starting up intent on allowing the public to inform themselves.  These are mostly government-data-type sites for now, but these projects will inspire development into more specific sectors of our economy and our daily lives.

My argument was that, in time, we will reach a point where data has the ability to validate and invalidate itself.  Corruptions and fraudulent uses of data will be more visible to anyone interested in the information that some given dataset provides.

We're only now seeing what might be some sort of start to this notion.  Making data and information more observable than ever is the first step.  Included in this is the concept of linked data.

I went so far in my argument for what data will do for our society, to say that at some point we will be governed more and more by data.  On the surface, that sounds very USSR/planned-economy/scary type talk.  But, I don't think that's what I mean to symbolize.  The ultimate social decisions, I would think, should still be made via the democratic process.  Data simply provides an avenue for more intelligent decisions to be made, and leaves less room for fraud and other mishandling in government.

Pick o' the Post: "Chasin' the Trane" by John Coltrane ... gets a little "run-on-ish," but I think PBS puts it best: "it's not about every word being right... it's a novel [, not a poem]."  I just heard this for the first time tonight and there are some amazing moments in this 80+-chorus-long solo.  There are some spots that require some endurance by the listener as well - like one part, around 11:00 that just sounds like an elephant... and then he makes some other unintelligible noises.  Definitely worth the 15 minutes.  Listen as we contemplate how we might measure a cliché... ooohhhhhmmmmm,

My flight'd just gotten in to BOI @about 7:45p this evening, and I gave Aaron a call about some questions he had.  He told me, "don't work too hard"; and I felt the complete opposite since I'd missed Thursday and Friday of last week.  So, I responded with a twitter reply: I guess we would be more susceptible to overworking ourselves, when our work is a passionate one. I'll be careful.

It's true, but... it sounds preachy.  Oh well, it got me thinking about clichés.  What are clichés? ... computationally, what are they? .. is that possible?

I remember, as a kid, really wanting to be able to say the right thing at the right time... all the time, like a wise, old man.  That seemed very valuable to me - poignant little truths.  A very efficient form of communication maybe.  My work at Booklamp is a little ironic in that I don't really seek out time to read, but always (in retrospect) had a passion for how to use words.  I've come to learn that saying the right thing at the right time - when you pull it off - is much more than words, but the words are (most times) the most critical ingredient in the social concoction that elicits that electric feeling and sense of understanding.  This is very much a result of how my father raised me - Happy birthday Dad (April 12th)! ... one word: details.

Anyway, it occurred to me that clichés have similar properties; being efficient, sometimes comedic, poignant little sayings, or phrases.  What's a cliché? computationally, what's a cliché?

I'm not (directly) trained in how this question might be answered, but I think my definition is a thought provoking one.  A cliché is:

The redundancy of some given meaning, conditioned on a given context.

There's probably much more to it than that, but that (for me) does a pretty good job of explaining what I think I believe [<=redundancy, but no cliché..] a cliché is.  Come to think of it, maybe there should be a measure of irony in a general definition of what a cliché is.  Regardless, the italicized definition above sounds very much like a matter of probability.  The trick - which, I know is not an easy trick - is to create some sort of measurement to quantify a cliché's "meaning" and "context".  Then you gotta use that info to separate (somehow) ordinary redundancy from clichéd redundancy, and context.  That sounds like a lot of work.
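To make the "matter of probability" idea a little more concrete, here is a toy Python sketch.  It rests on big assumptions: it uses the literal phrase as a stand-in for "meaning", a hand-assigned label as a stand-in for "context", and invented example data.  It only measures how often a phrase repeats within a context; it does nothing to separate ordinary redundancy from clichéd redundancy.

```python
from collections import Counter


def phrase_given_context(observations):
    """Estimate the share of each context's phrases taken up by each phrase.

    observations is an iterable of (context, phrase) pairs.  The result maps
    (context, phrase) to count(context, phrase) / count(context).
    """
    observations = list(observations)
    pair_counts = Counter(observations)
    context_counts = Counter(context for context, _ in observations)
    return {
        (context, phrase): count / context_counts[context]
        for (context, phrase), count in pair_counts.items()
    }


if __name__ == "__main__":
    # Hypothetical data: the same words recur heavily in one context.
    toy = [
        ("goodbye", "take care"),
        ("goodbye", "take care"),
        ("goodbye", "see you"),
        ("weather", "take care"),
    ]
    for key, share in sorted(phrase_given_context(toy).items()):
        print(key, round(share, 2))
```

A phrase that soaks up a large share of one context across many independent texts would be a candidate; deciding when that share tips over into cliché is the hard part described above.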

I know for a fact, that is very hard.  But if you can do it, then maybe you too can measure the clichéiness in your life, or your books.  If Booklamp does it, then we'll possibly know something about how original a text might be... that we compare with millions of other texts.  Maybe later we'll license services to measure your own clichéiness... probably not though.

My job has taught me that statistical inference can be a bitch sometimes.  I have no idea what I'd do without my text books and the interwebs.  Thank you Algore (I made it all one name now.. Algore) for our great series of tubes... I don't know what I'd do without you... Well put Senator Ted Stevens, very well put.

IBM Many Bills

I stumbled across IBM's newest data app... Many Bills.  Haven't had a chance to thoroughly check it out, but it appears to make the perusal and consumption of the many thousands of pages that bills can consist of, a somewhat less daunting task.

I'm really starting to like IBM's perspective on our world, in the midst of a paradigm shift.  IBM's previous data creation for all of us to enjoy was Many Eyes, which allows users to analyze and visualize public and user-uploaded data.  There's only one other company I'd want to work for when it comes to data...

Pick o' the Post:  "Heir Apparent" by Opeth on Watershed (2008)

PS Here's a neat little real-time news app that's supposed to show where news is coming from and some other info... haven't gotten too far into it, because it doesn't have anything to do with my job (I'm at work).

I do it all for the pick! ... the pick!

Pick o' the Post:  "Foxy Satan" by Cunt Amputation

Just fo the pick!  The band is a black metal joke, but "Foxy Satan" is a really cool metal rendition of Jimi Hendrix's song.

If you can't deflect the taboo meaning of satan, and see the comedy in the song; or if you are not yet 17... you cannot listen to this song.. your head will explode.

The Ironic Departure of Data-overload

Pick o' the Post: "Time of the Season" by The Zombies on Odessey and Oracle 

"... actually - it's the opposite.  The more data you have the clearer you see."

I've been noticing the new IBM ads lately, and they really give me a sense of satisfaction about how we are starting to think, and where we are heading as a people.

15 PB => 15,000,000 GB => 15,000,000,000 megs... per day.  And that number is not getting smaller.  Also, they don't really make a distinction about what kind of data that is... I would imagine they mean the parts of society that we observably alter.  There's also a whole world of data that is seemingly useless.  For example, what if I chose a spot on my white kitchen wall, and measured the "whiteness" of that spot every day... valid data, but useless - until someone comes up with a meaningful reason, aside from pure data collection.
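For anyone who wants to check the arithmetic, here's a quick sketch of the conversion.  It assumes decimal units (a factor of 1,000 per step) and that the figure is in petabytes; the UNITS table and the convert helper are just names I made up for this.

```python
# Decimal (SI) byte units: each step up is a factor of 1,000.
UNITS = {"MB": 10**6, "GB": 10**9, "TB": 10**12, "PB": 10**15}

def convert(value, from_unit, to_unit):
    """Convert a data size between decimal units, e.g. PB to GB."""
    return value * UNITS[from_unit] / UNITS[to_unit]

petabytes_per_day = 15
gigabytes = convert(petabytes_per_day, "PB", "GB")
megabytes = convert(petabytes_per_day, "PB", "MB")
print(f"{petabytes_per_day} PB = {gigabytes:,.0f} GB = {megabytes:,.0f} MB per day")
```

With binary units (1,024 per step) the numbers would come out a little different, so treat this as the round-number version.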

I was just considering what data can do for us and how we get to that point.  My roommate used an interesting phrase: "life-stream."  This almost fits the breadth of what I think we are starting to understand.  Beyond the "paper-trail" of data we generate every day, we probably want to know many things that may not be directly observable from our life-streams, but do still affect them.  Earthquakes or volcanos, perhaps, or species migrations.  It also seems that, of the data that might first come across as useless, we may not understand the value in certain, seemingly negligible, datasets... similar to my most interesting kitchen wall spot.  It's still white.

The mission is a higher standard of living, which includes all conservatory topics, as well as industrial/infrastructural topics, and any other areas that would seem to deserve attention and resources to better our industry and creativity.

For about 1.5 years now, I've held the belief that when the first generation emerges from such a connected and informed upbringing, our world will really begin to see what our informed world has to offer.  The first generation to not know a world without the internet.

I wish there was another word for "the internet" ... sounds cliché.

Data Design for New IBM Commercial

IBM and James Frost release a commercial that does a pretty good job of visually representing how "surrounding" data is, and all the diversity of data and its possibilities.

IBM Data Anthem from Benjamin Wiederkehr on Vimeo.

I found this in a blog I follow, and James Frost mentions that the video he directed for Radiohead, "House of Cards," was "showing data in its purest form," so I had to check it out.

Radiohead "House of Cards" from Justin Glorieux on Vimeo.

Pick 'o the Post ##: "Don't Worry Be Happy" by Bobby McFerrin on Simple Pleasures

Measuring Life (cont.)

Not too long ago, I blogged about self-surveillance and I've made some informative observations since then.  Starting off, my biggest concern was what to measure.  There's a trade-off between the time it takes to log information and the value you receive from collecting it.  Ideally, it'd be nice to have another me to collect all the information about me; but that's just silly.

I started doing it for nutrition, so I knew that I was going to track everything I eat and drink.  But, I can log other information, so... what else?  Well, I began by logging when I go to sleep and wake up, when I start and finish showering, when I start and finish brushing my teeth, when I start and finish watching tv/movies/jtv, when I start and stop working, and when I start and stop driving.

So, these particular points are all durations.  The sleep duration makes sense.  I still log my showering duration, which doesn't make much sense, but I haven't come to dropping it like I have brushing, watching and driving.  Working is a bit special, but I still log that duration.

So what happened?  Why did I choose to drop driving, but not showering?  I tried to just let my inclinations (amount of nuisance) drive what data I collect.  Brushing takes about 2-3 min each time, so there's no need to track that.  Logging my "watching" habits was not convenient enough - I don't know (even generally) when I might watch tv or a movie... it doesn't seem like a "structured-enough" action.  I've concluded about the same for my driving habits - mostly I forget to log the arriving time.  I figure (when cars include standard wireless internet), I'll be able to write a script that logs this info automagically.

So far, I've realized that logging my life is and will be a journey.  I need to explore the aspects that are important and what I can and can't log about those aspects.  For example, all of my receipts that I save (along with Blippy, if needed) give me a financial record of my life.  Right now, the trick appears to be automation, which I don't do at all yet.  With time, I hope to figure out useful ways to collect the most valuable information.

Calculating Words


I can assure you that word is a calculated one.  I put it there so you could read it.  Now tell your children - jk.

I just finished watching an entire documentary about this word - it was called, "Fuck."  There was an obvious attempt at making the conversation two-sided, but it was just as apparent that the makers of the film were not... prudes.

Beyond that single word-of-curse, I really like using all of them.  I like changin' 'em up, and making my own words out of em.  One of the best examples of this is Bobby Bowden's identifying, "Dagnamit!" ... obviously a euphemism for "Goddamnit!"  But this way, he can save face and retain the Christian image that he portrays.  I don't mean to say that he's being cowardly or anything like that - I love Bobby with all my heart, and I'm really fuckin' annoyed with Jim Smith for coming out and saying that he needed to go - that bastard.

Anyway; I was saying.  I remember my father mentioning something that occasionally comes to mind when I use words like fuck and shit, and whatever else just pops into mind.  What he told me seems to ring true in most cases where I hear/use words like this.  He told me, "Curse words are used when our vocabulary fails us."  Well, I don't remember exactly what he said, but it was right along those lines - and that definitely sounds like him.

These words can be meaningful, but in most cases they seem to find their use in eliciting emotion, rather than delivering an accurate representation of what you mean to say.  Of course there's also validity in meaning to elicit emotion in such a way, but it would seem that these words find their use in arbitrary contexts more often than a "valid" context.  So, what is a "valid" context then?  I'm not really sure, outside of the word's literal meaning; but in retrospect, when someone's applauded for their poignant and calculated use of certain curse words, maybe that's a "valid" context.  Not really sure.

Regardless, they're only words and we will continue to fight about them.  It's sort of fun to be honest.  It's like there's a supply and a demand for these words, and "fuck" is just really scarce (considering the situation - i.e., the audience of a concert hall).  So I leave you with a possibly more valuable word in this market of curse:


Pick o' the Post #9: "Satan Will Rape You on Valentine's Day" by Cunt Amputation

I normally wouldn't recommend such a ridiculous song by such a ridiculous band, but it's valentine's day and this one's for all the single dudes out there... and all the single girls who might prefer Satan to the unbearable reality that you are alone.  hehe - jk.

Happy Valentine's Day

The Life & Times of Dan Bowen

So, I had a short office conversation with BookLamp Prez, Anthony Hauser, about Twitter.  Tony's point was a question about why anyone would care to log the various "minutiae" of their life.  I find it interesting that most people who don't use certain social networks tend to ask the same question: "Why would I want to do that?" - whatever it is... Twitter, Blippy, etc.  After hearing this over and over again from skeptic after skeptic, I found myself frustrated with the question.  I think my perspective promotes an alternate, but similar question:  "Why not?"  Maybe this is the difference between those personalities who are open and those who aren't.  Research and caution are always warranted, but these services (for the most part) are "customizable" enough to allow as little or as much information into your life as you care to.  It seems to me that ignorance is bliss, but at the cost of your own unrealized benefit.

So, I felt I had to find a solid, legitimate reason for social networks outside the obvious benefit of instant information.  To name one application, I can aggregate a personal history of life with more detail than an individual who's opted out of the various social services.  Maybe a good project for retirement.

On that note, I need a way to archive my social logs.  I haven't found a great way to do this yet, but I think I'll have a lot to write about when the time comes.  I want to make sure I have all of my life's data to reference.  I'm on the case.

What!? The BC Olympics are only accessible to arbitrarily fortunate few!? ... many really, but F#$*!!

If you know me, I love just about anything athletic.  Curling's sort of on the fringe of that preference, but I'll watch it if nothing else's on.  That said, I've been pretty pumped about the Winter Olympics... then March Madness (streamed by CBS)... then the World Cup in S. Africa (???).  This was certainly shaping up to be a great year for athletics.

Uhhh... no.  I just got a big boot in my ass today when I learned that NBC's taken up the ESPN360 business model, and Qwest is proving to be pretty gay about these affiliations - at least ESPN360 and now these Olympics.

Something doesn't feel right about this.  Of course, many people have televisions, and I'm part of a minority who will consequently end up on JTV watching crappy streams that NBC will continuously shut down.  What's wrong with this picture?  It's the Winter Olympics, not the NBC Olympics, right?

Oh wait, I can follow the live twitter/blog updates from the Games!  Bullshit.  I am so enraged right now.

Give me a Google-sponsored Olympics.  NBC can kiss my ass.

So my roommate first introduced me to mousepath (Mac, Windows), a Java applet that tracks your mouse's activity as you use your comp.  FlowingData mentioned this already, but when I first was told about this and saw some of the output I definitely thought, "Jackson Pollock."  It's a fun little novelty.

So I created one while I was working this afternoon, and another while I was playing a random multiplayer RPG later at night:


You can see I do a lot of scrolling up by the tabs, and all the corners are from using expose and spaces on my macbook.  There's also a bunch of lines straight across in the middle of the image... I tend to highlight text as I read - prolly a bad habit.


So this is me playing Neverwinter Nights with the BookLamp crew.  I can't make much out of it, except it's obvious that I was playing a windowed version of the game, because there's not much action outside that "box".  All the little dots (rather than the big ones from work) are from constantly moving and clicking my character around.  If I recall, the size of the dot represents the time that the mouse sat idle.

Interesting way to represent usage on a computer.

Pick o' the Day #8: "Wild Man" by Galactic on ya-ka-may (sorry, no link... yet)

A little short on lyrics, but I like the beat.

This album was released today, and I got to hear a bit of it on NPR, so I decided to see if the vinyl was available because I enjoyed what I had heard.  I enjoyed about half the album.  I liked the more groovy, bluesey-esque songs.  The turn-off was the dragging "rap" tracks scattered throughout.

My "Big Bad Idea Monster" - ad-hoc IP economic analysis

I just watched a movie about the interaction between intellectual property law and human ingenuity.  With the huge sums of money that were being thrown around, it's no question that this is certainly a job for economics.  I'm not sure we have the tools to analyze this problem - and revolution may be the only answer.  The movie was about a musician whose instrument is the sample.... "Girl Talk".  If you don't know who this is, you should look him up.

The question:  Which provides the greatest benefit to society... copyright, or "copyleft" as the movie mentions (putting some weird political seasoning on the argument)? ... I'm going to do my best to analyze my understanding of what copyright is and how it might fit in today's cultural world.

I'll admit, I am by no means a master of copyright policy, or even the economics of IP.  Again, I want to do my best to balance this argument.  Being part of a younger generation I grew up with napster, limewire, and now the proliferation of torrent files, I can't ignore the impact that these instruments have had/are having/and will have on our society.  I also recognize the importance of copyright law; my education demands that I do. ... I guess I'm a little freer to think now (since I've graduated), but it would be naive to ignore my education, so I don't.

Copyright is in place to protect innovation and give innovators the ability to develop their original ideas for a profit.  A head-start, if you will, to keep others from saying - "hey, that's a great poem.  I'm going to write it too, and sell it to buyers for a profit."  Of course, this extends well beyond text, into music, medicine, technology, and (questionably) ?life?, among a myriad of other mediums.

The opposing argument is that, if the original idea is a public one, then there are any number of interested people who can develop the idea, exploring all of its various iterations.  This argument proposes that the social welfare gains from this "public development" are greater than those of confining the development to a single firm/individual.

The economic concern with not having copyright is that a society's willingness to innovate will diminish.  Is this the case?  Was this the case?  Will this be the case?

I recently stumbled upon Robert Nozick's "Utility Monster," which I don't totally understand, but I thought I'd create a similar monster and conduct my very own thought experiment.  Say hello to my "Big Bad Idea Monster" (BBIM), who is loosely defined as the only entity in an entirely copyright-free society with an "idea radar."  So it chooses to build the teleporter I just finished planning in the dream I had last night, but decides against other ideas that have little or no merit - like, square pegs for round holes.  This creature has a budget constraint, and the development and marketing of these "good" ideas are just as constraining as time and money allow.  It simply has all the ideas.

There are two worlds that I want to observe this greedy creature in: 1) a very local world - i.e., pre-telecommunications (Pre-t) - which is some time in the 1800's I guess; and 2) a world some time in the future, where individuals are not limited by the words they conceive for a google search to find the information they require.  Additionally, these two worlds are not to be depicted chronologically.  They are entirely independent of each other.

At first glance, it seems that the BBIM would have a great advantage over the innovators of the Pre-t society.  It can see all the ideas that everyone is having, and will capitalize on the best ones within its ability to.  The best innovators are SOL.  They subsequently see the futility of their endeavors and distribute their labor to more profitable uses - physical labor.  Eventually, the BBIM is left to its own devices in creating and developing new ideas for the marketplace.  The benefit of copyright law is blatantly obvious.  And innovation drives a flourishing society - all is right... ahhh.

In my second world, we are all much closer to being our own BBIMs.  The OG (Original Gangsta, for the uninitiated) BBIM still has an advantage, but the playing field appears much more level.  Given the budget constraint of the BBIM, it may even be that the BBIM is at a disadvantage.

To me, this sounds very much like the debate that is currently going on with regards to IP in the digital age.  Though the two worlds I described are independent, they are certainly analogies of our past and future.

Does the public availability and development of ideas provide for greater social value than traditional copyright law?  Will it diminish innovation as much as I've been taught?  Is there a middle-ground? ... are all ideas the same?

Markets for ideas (mostly artistic markets, from my perspective) appear to be in a state of flux.  Can this on-going transition  be a graceful one through policy, or does it have to be a fight to the death to arrive at whatever the future holds?  Unfortunately, I think the latter is true.  Walras' auction will continue to play itself out, and we will eventually experience the result.  We'll then acquire entirely new problems that have yet to be conceived.  I can't wait for the future, but I'm forced to wait forever.

Pick o' the Post #7: "Blackwater Park" by Opeth on Blackwater Park (2001)

... I just bought this album on vinyl, so I decided that I have to make another Opeth pick.

I calculated some words and made a cool heatmap....

I don't think I can say what it represents, but I wanted to share it.  I will say that the bottom-most row (there's 1001 rows), represents reference values, and is entirely white.  As your eyes follow the rows up, each row represents a distance from that first row.  I was expecting this gradient (though small) to appear, and it did.  So, there you go.  My first word calculation... as an image... a random image... with no meaning.  Yay?

I'm becoming very interested in data visualization, and I hope to have some cool concepts to display in the future - as well as the means to say what they represent.  For now, you get this - a meaningless image.  Oh well... I did calculate words to create it though.  You have to trust me on that one.

Pick o' the Post #6: "The Lotus Eater" by Opeth on Watershed

My BB Storm has its own (secret) agenda... and why I despise it.

BEH BEH BEH BEH BEH - my alarm's going off, and my phone just (auto) turned on.  Time to check any emails/msgs I received for whatever reason between the late hour I go to sleep and 8a.  I need to log some data through twitter.  And... wha?  you're busy!  What the hell could you be doing!  I don't have (auto) "do stuff" turned on.  God I hate this phone!

There's only so much rant that can go on about what my phone's doing, because I have no idea.  It's got a secret agenda that I am not aware of.  I feel like I've got Windows running on my phone.  It's very annoying.  This fall my contract runs out and I will be getting an iPhone, something I can trust, and that doesn't suck.

How can you calculate words on a phone that just wants to calculate its own bulls#@%! ... oh look, I got a new message, now I get to helplessly watch that damned red light flash away my battery.  Really, it's not that much battery - I just needed something to fill in the blank.  And, damn I'm pissed off.



Measuring Life

I know I just posted about this, but I've learned a few things now and need to voice what they are.  I told a friend about (yfd) and was immediately referred to Daytum.  The UI seemed more user-friendly than yfd.  I soon learned why.

Daytum is basically a counter.  If you want to count the drinks you've had, Daytum.  If you want to count the number of apples you eat, Daytum.  Limited, but user-friendly.  Yfd has a more flexible data structure. You can specify the type of data (categorical, event, counter, measurement), and they... tag items better.  This wouldn't seem like a deciding factor, but Daytum, for example, would not let me differentiate watching southpark on jtv and watching basketball on jtv.  Maybe I'm missing something, but it was frustrating, and I just decided I'd put my Measuring Life efforts into yfd - because of the increased flexibility in what and how to measure my life.  Another important (deciding) feature: I couldn't figure out how to track my weight on Daytum - I came up with a hack to do it, but it seemed I would have to download the data myself and view it that way.  Anyway, yfd it is.

I decided that logging my life this way needed to be an extremely simple process.  There's a visualization tool that yfd provides that will pair two "actions" and generate graphs of the duration between those two actions.  So, whenever I start or stop an activity, I use the following rules: Start) "action"-ing, Stop) "action"-ed.  Now whenever I'd like I can pair these actions and keep track of how much time I spend doing them.  For example, I am "blogging" right now, and I will have "blogged" when I'm finished here.
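The start/stop rule is easy to sketch in code.  This is not how yfd does it (I have no idea what they run behind the scenes); it's just my own toy version, with made-up timestamps, a hand-written PAIRS table, and a durations function I named myself.

```python
from datetime import datetime

# Hypothetical log entries: (timestamp, action word), in the order they were logged.
log = [
    ("2010-02-01 07:00", "blogging"),
    ("2010-02-01 07:45", "blogged"),
    ("2010-02-01 18:00", "running"),
    ("2010-02-01 18:30", "ran"),
]

# Which "stop" word closes which "start" word (irregular verbs get listed by hand).
PAIRS = {"blogging": "blogged", "running": "ran"}

def durations(entries, pairs):
    """Pair each start action with the next matching stop action; return minutes per start word."""
    open_starts = {}  # stop word -> (start word, start time)
    results = []
    for stamp, action in entries:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        if action in pairs:
            # A start word: remember it under the stop word that will close it.
            open_starts[pairs[action]] = (action, when)
        elif action in open_starts:
            # A stop word with a pending start: compute the elapsed minutes.
            start_action, started = open_starts.pop(action)
            results.append((start_action, (when - started).total_seconds() / 60))
    return results

for action, minutes in durations(log, PAIRS):
    print(f"{action}: {minutes:.0f} min")
```

A stop with no matching start just gets ignored here, which is about what happens when my phone eats a log entry.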

If my comp's on, then logging this data is pretty simple and takes <10 sec.  If I have to use my (unreliable) BB Storm, then it takes <15 sec, or my phone's not working properly and it doesn't get logged.

Using twitter to do this is pretty simple.  You follow @yfd and create an account, @yfd requests to follow you, and voila!  For what I'm doing, adding data is simple.  Just tweet "d yfd running" when I start running, and "d yfd ran" when I'm done running.  Or sleeping, or cooking, or eating, etc.

More generally, the syntax has a format something like this:

d yfd [action] [value] [unit] [time] #[tag] #[tag] ... #[tag]

... something like that.  For now, I'm keeping it simple.
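If I ever script this, building the message would look something like the sketch below.  The field order just follows the format I wrote above from memory, not anything official from yfd, and the yfd_message function and its arguments are my own invention.

```python
def yfd_message(action, value=None, unit=None, time=None, tags=()):
    """Assemble a 'd yfd ...' direct message, skipping any field that is not given."""
    parts = ["d", "yfd", action]
    for field in (value, unit, time):
        if field is not None:
            parts.append(str(field))
    # Tags go last, each prefixed with '#'.
    parts.extend(f"#{tag}" for tag in tags)
    return " ".join(parts)

print(yfd_message("running"))
print(yfd_message("weigh", 180, "lbs", tags=["morning"]))
```

The first call gives the same "d yfd running" message I tweet by hand; the second is a guess at what a measurement with a tag might look like.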

Sorry, Daytum.  You just don't cut it in this analyst's book.  Next I'll be looking into, whose syntax is a little different.  The functionality doesn't appear (on the surface) to be "better" than yfd, but we'll see.

Pick o' the Post #5: "The Day that Never Comes" by Metallica on Death Magnetic

Personal Data Collections, Analysis, and Visualization

I am extremely excited about this new vein of "research" that I've found.  Collecting and Analyzing/Visualizing personal data.  I'm also writing this post because I've run across enough websites and services that I want some kind of reference so I don't have to remember all of them.

First, I found  I've looked into this the most so far, and have only glanced at other services such as, and  There's a couple of characteristics which will shape my decision as to which service(s) I want to use.

At one end of the spectrum of personal data collection, we can imagine a 1984-like deal, where it's completely open.  On the other side, you can authorize everything.  It seems that these services have started from the authorization side of things.  It seems that a big part of this competition is ease of use... then visualization.  We'll see - I've only been at this for a bit (couple hours now).  See, I could've logged that time, but it's just not convenient enough yet.

Similar data sites:

Pick o' the Post #4: "Up from the Skies" by Jimi Hendrix on Axis: Bold as Love

How much data are we conscious of? ... health care and a song pick too.

As a data analyst, I think about data too much.  I was driving home from work today and a thought crossed my mind... again.  I ponder this too often, but there is so much data that goes uncollected, unobserved, and just taken for granted.  What could we learn from the seemingly unnecessary pieces of reality that just pass us by?  Even if we could collect data in the tremendous quantities that it rolls by us, how would we analyze googles of data - remember, google's a number... .  Well, in all honesty, Google is probably in the best position to answer that question.

The first problem to consider comes from a term that I recently learned from Google's chief economist, Hal Varian... data munging.  This can be considered data prep also, but I think "munging" is meant to be a more general term - not really sure.  This is what gives much of my labor its value.  I'll admit, I'm a youngster in the labor force, so I'm still learning and honing my skills, but if data is not formatted, shaped, reshaped, transformed, merged, appended, dropped, sorted, etc., etc. properly then it might as well just pass right by us.  This can be the most time consuming part of analyzing data, whatever the data happens to be.

Just as a proposition... could data munging be automated?  From a VERY abstract pov, I think this question calls for generalizing a definition of data and the sorts of analysis that are relevant.  There are so many ways to look at data.  I think the answer is, "sure, if you know the source and its natural format."  But what about new data, like the spawning of a never-before-witnessed black market?  There might be ways to collect, munge, and analyze data from current black markets, but how would you generalize that process when a new (unexpected) one shows up?
The black market is only an example, but consider a scenario where every piece of the DGP (Data Generating Process) is observable.  First off, this is impossible - we'd spend more time observing than time exists, so... .  Anyway, imagine it's so, I'd guess that tools along the lines of anomaly detection would be used.  In that sense, we (think we) know how the world is now, and we observe how it changes - pretty simple.  I wasn't going anywhere in particular with this, but these are the little frustrations that roll through my mind on a daily basis.

On a completely different note, I'm gonna talk about health care.  Not what we should or shouldn't do, but a recent perspective I've had on what's been taking place in Washington lately.  Sort of porting my thoughts about data to an optimistic viewpoint on healthcare.  This first came to mind when I'd thought the dems' reform bill was an eventuality, which may not be the case anymore with Scott Brown in the Senate now.  Anyway, aside from all the things that I think are inefficient with our current system, I do want Washington to make major changes to it sometime - I don't care if it's a dem or rep bill, just change it... and, I almost don't care what you change as long as it's broad and relatively major.  Of course, I have different personal views because in the end it's my pocket I care about.

This is about understanding our health system.  Right now we know the rules and how to play the game as it is now.  From a "social consciousness" (and data) perspective, this is like having a data set of a couple hundred thousand observations - one for each person that plays the game.  They each have their own opinions and such.  From day to day, the game changes very little.  If Washington passes legislation that turns football into rugby (metaphorical), then everyone has to adjust... somehow.  Managers can't use the same strategies, players can't either.  It's like giving each person one more observation to consider in their data set - thus a couple hundred thousand more observations.  Now we have some variation, which was entirely absent before.

My point is that if we all have this second observation in our data sets we can develop a better understanding of what's really important in our health care system.  Rather than listening to politicians, economists, opinion leaders, etc. tell us what's important, we can learn first hand.  The problem with this is that it costs resources to adapt, whether it's time, money, or stress - it's not necessarily a pleasant approach, but we are sure to learn a lot about our health care system and ourselves.  Right now, we think we know what's important, doctors think they know what's important, politicians think they know what's important.  We all think we know what's important to play the screwed up game we have right now, but how do we play the game that provides us with the healthiest nation as a whole?  No one knows that because we can't figure out the best rules for that game.

No matter what happens with health care, it's in the American psyche now and we'll think about it more regardless.  If things do change, pay attention and reform your opinions.  Our current system is messy, and any new system will not be an end-all (it may even be messier).  But we need to pay attention to the effects that any of these changes will have on our lives, because we won't get the chance to again until we go through another one of these polarizing health care debates.

Pick o' the Post:  "On Impulse" by Animals As Leaders on Animals As Leaders

This is Animals As Leaders' debut album, and this track is pretty cool.  Oddly enough, I really like the techno-ish beat around 1:20 and 2:50.  This whole album has, just, amazing guitar work on it.  The musicianship reminds me of so many different artists at various times (BTBAM, Guthrie Govan, DT, Meshuggah), but they are very uniquely, Animals As Leaders.  Enjoy.

Calculating Words

Wow! I thought would be too general to be available. Here it is, and I like the title... "countingwords" was taken, but I think I like this better. Either way, it's here to stay.

Hi, my name's Dan Bowen and I'm a data analyst for, hence... Calculating Words. I also find that when I mean to express something in writing, I'm finicky about how it reads - for sure, I'm not alone... and not always perfect about it. I use every character I can think of, no matter how obscure to get the feeling I'm looking for... ellipses, double-ellipses, double-dash (thanks Paul), dashes, ampersands, carets, semicolons, pipes, tildes, etc. ... and often. Sometimes I feel like there's an order of operations for punctuation, or the intended pause isn't long enough, or maybe a certain %haracter pops into my head that I just want to use.

I can think of times when I've spent hours writing something up and just scrapped it as a lost cause, even though I still really want to get the idea out - again, I can't be alone here. Anyway, I calculate words for a living, and my own as an exercise.

Nuff with the intro, my first post (below) is something I plan to keep up on a weekly basis. Aptly named, "Pick of the Week." Thanks to my friend Austin's BlahBlahBlog for inspiring this. I'm going to try to keep these as generally "acceptable" as possible. So, I'll do my best to steer clear of all-out weedles, or straight up death metal. But, I'm not sure how long that's gonna last.

The pick of the week is: (... drumroll ...)

"Desert of Song" by Between the Buried and Me on The Great Misdirect