What did big data say to structured data? “Yes, I’m popular, but none of my fiber links fit me anymore.” Okay, enough with the lame attempt at a big data joke. Lame as the joke was, there is still some truth to the concern over the capacity and security of the network links that support the movement of data.
Large Hadoop Cluster Announced in the Switch SuperNAP
Back in September there were several announcements from members of the group building an EMC, Greenplum, Hadoop cluster in the Switch SuperNAP.
This Hadoop cluster will be used as an analytics workbench with the primary purpose of developing Hadoop. The larger picture is that big data tools and solutions are considered a big deal. The premise, at the highest level, is that the right tools and analysis applied to your company’s unstructured data will unlock information that would otherwise go missing in the traditional combination of structured corporate data stores.
It will be interesting to see how the average enterprise decides to adopt a big data solution.
Within the Switch environment the opportunity for sharing between partners and customers is real, simple, and fast. The Switch technology ecosystem gives this Hadoop development effort a leg up, as it makes testing and evaluating the cluster much easier for the customer. In a typical customer/SP relationship the customer would need to set up a network connection of appropriate bandwidth and/or ship data on disk to the SP. In the SuperNAP it is as simple as requesting one or more 10 Gb connections, getting them cross-connected, and having it done the next day. This model of shared space and network provides some additional security assurances to the customer, and it dramatically reduces the delays associated with requesting a new connection or configuring a VPN link over the internet. It also reduces latency and almost eliminates any cost or performance issues associated with owning a broadband connection.
Good locations for Big-Data-as-a-Service (BDaaS) also need to allow large-scale private deployments to be co-located near the BDaaS. Rather than storing your data in the BDaaS cloud, you keep it in your low-cost storage-as-a-service, private storage, or managed storage environment. Then, large chunks of data to be analyzed can be moved to the on-demand compute and complex tool sets of the BDaaS.
In a Nutshell
Technology ecosystems like the Switch SuperNAP, which support the establishment of secure, dedicated lines at significantly decreased costs, will be the best home for Big Data. The model of charging for bytes transferred in and bytes transferred out will not be acceptable for enterprises that want to use the service. Moving a couple of PB of data on a regular basis under this model could result in tens of thousands of dollars in data transfer charges per month. It’s not a sustainable model. Additionally, time constraints will be a limiting factor. Bandwidth and connectivity at scale with more traditional cost models will need to be embraced.
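To put rough numbers on the cost and time concerns, here is a back-of-the-envelope sketch in Python. The per-GB rate is a hypothetical figure chosen only for illustration, not any provider's actual pricing, and the transfer time assumes a single, fully utilized 10 Gb/s link, which is an idealized upper bound. The function names are made up for this example.

```python
# Decimal units: 1 PB = 1,000,000 GB and 1 GB = 8 gigabits.
GB_PER_PB = 1_000_000
SECONDS_PER_DAY = 86_400


def transfer_charge_usd(petabytes: float, usd_per_gb: float) -> float:
    """Charge for moving `petabytes` of data at a flat per-GB rate."""
    return petabytes * GB_PER_PB * usd_per_gb


def transfer_days(petabytes: float, link_gbps: float, utilization: float = 1.0) -> float:
    """Days needed to move `petabytes` over a link of `link_gbps` gigabits per second.

    `utilization` is the fraction of the link actually available (1.0 = ideal).
    """
    gigabits = petabytes * GB_PER_PB * 8
    seconds = gigabits / (link_gbps * utilization)
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    # Hypothetical rate of $0.01 per GB; real pricing varies by provider.
    print(f"2 PB at $0.01/GB: ${transfer_charge_usd(2, 0.01):,.0f}")
    # One ideal 10 Gb/s link with no protocol overhead (an assumption).
    print(f"2 PB over 10 Gb/s: {transfer_days(2, 10):.1f} days")
```

Under these assumptions, 2 PB works out to about $20,000 in transfer charges and roughly 18.5 days on a single 10 Gb/s link, which is why both per-byte pricing and available bandwidth matter at this scale.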
A Big “Thank you” to Jason Mendenhall for his contribution to this blog article!