A couple of American professors did a new study on auction bidding behavior. The data used was collected from a Korean auction site and the auction style is different from eBay’s, but it’s an interesting read nonetheless.
While the data used in the study was all publicly available, the neat thing is that it was provided to the professors by the auction company. This helped the professors by letting them jump straight into the nitty-gritty, rather than wasting time trying to collect data.
All the megaNet companies (eBay, Google, Yahoo!, Amazon) have tons of this data. Because they’re walled communities with privacy concerns, etc., yada-yada, they keep this information under lock and key in some dark closet. (Not really, because internal employees must be analyzing the hell out of the data, but we’re talking about the view from outside the company.) This leaves researchers/professors to fend for themselves in data collecting. Knowing how uppity sites are about their data, they probably wouldn’t take too kindly to researchers “collecting” what the companies consider their data and would probably urge the researchers to stop. So, what are the academics to do?
It’s time we bring a bit of the Old World into the New World. When you wanted to (or were forced to, I should say) do research for a school assignment when you were younger, what did you do? You went to the library.
In the library, there were sections that had reference books that never left the building. You were free to use them all you wanted while at the library, but you couldn’t take them with you. This was a hassle because you had to leave your house, go to the library and look like a nerd. Sometimes though, things worked out and you’d find or bring a friend and things would move faster as you split up tasks. Soon enough, without trying, you’d be having fun and have a better project because of the interaction. If needed, you could copy the pages of info you needed and take them with you so you could continue the research until the wee hours of the morning.
What if these megaNet companies created their own in-house libraries? Places where academics could come and share a common space and perform research on this fat collection of data.
Before you all go off on a rant about the infeasibility of doing such things due to privacy issues, corporate espionage, etc., hear me out. Let’s tackle the obvious issues and agree on solutions.
- The workspace – Lots of talking goes on at companies, not to mention other visible evidence of what’s coming but not out yet. Besides the obvious NDA, you could put all the academics in a single room. This would prevent wandering eyes, fly-on-the-wall eavesdropping, etc. I’m not saying the rooms have to be windowless, single-door jail cells, just set up so that nothing big and secretive is easily viewed.
- The Information – The data these peeps will be after is full of personal information. We can’t possibly allow them access to this stuff and not be sued. Sure we can. We can put a DBA, a network security person and a SOX compliance cop in the room with them. The DBA can be setting up temp tables and views into whatever data the researchers want, while carefully stripping personal data. The network security person can be setting up shared drives for the researchers and monitoring their net usage. The SOX cop can ensure all’s okay from the legal standpoint. (Sure, this isn’t really SOX related, but SOX peeps are good at freaking out about the smallest bit of security and info breaching, so they’re a good fit.)
- The Computers & Networks – The biggest fear is that these researchers won’t be able to control themselves and will start emailing troves of data out of the company and into the public. No problem. Have two networks in the room: one with access to the corporate network (that the employees are monitoring) and one with a direct connection to the internet only. This way, the researcher can query like mad on the corporate computer and then just manually type his results into the other computer.
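For the curious, here’s roughly what the DBA’s job from the Information bullet could look like. This is only a minimal sketch in Python with SQLite; the `bids` table, its column names, the `research_bids` view and the sample rows are all made up for illustration, and a real auction site’s schema would be far messier.

```python
import sqlite3

# Hypothetical columns that count as personal data in this sketch.
PERSONAL_COLUMNS = {"bidder_name", "bidder_email"}


def make_research_db():
    """Build an in-memory database with a raw table and a stripped-down view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Raw table: what the company actually stores (invented schema).
        CREATE TABLE bids (
            bid_id       INTEGER PRIMARY KEY,
            auction_id   INTEGER NOT NULL,
            bidder_name  TEXT NOT NULL,
            bidder_email TEXT NOT NULL,
            amount       REAL NOT NULL
        );
        INSERT INTO bids (auction_id, bidder_name, bidder_email, amount)
        VALUES (1, 'Alice Example', 'alice@example.com', 10.50),
               (1, 'Bob Example',   'bob@example.com',   11.00);

        -- View for the researchers: same rows, personal columns left out.
        CREATE VIEW research_bids AS
            SELECT bid_id, auction_id, amount
            FROM bids;
        """
    )
    return conn


def view_columns(conn, view_name):
    """Return the column names a query against the view exposes."""
    cursor = conn.execute(f"SELECT * FROM {view_name}")
    return [column[0] for column in cursor.description]


if __name__ == "__main__":
    conn = make_research_db()
    print(view_columns(conn, "research_bids"))
    for row in conn.execute("SELECT * FROM research_bids ORDER BY bid_id"):
        print(row)
```

The researchers would only ever be granted access to the view, never the raw table. Note that simply dropping columns is the crudest form of stripping; whether it is enough to keep the lawyers happy is exactly what the SOX cop in the room is for.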
Now, each library would be different. Some would be more aesthetically pleasing, while others would be more intellectually pleasing. Each library would in essence be a miniature version of the host company itself.
Eventually, the companies would see the good work these researchers are doing for their line of business. They would ask the researchers to assist on projects and begin interacting with employees. Heck, I’m sure some companies will even begin to hire the researchers.
The point is that the data we’re holing up needs to come out somehow. I can think of no better way than having academics go in there and help us understand what we do on these sites we practically live on. I know I’m curious, aren’t you?